PodDisruptionBudgets vs ReplicaSet Scaling
PodDisruptionBudgets (PDBs) keep voluntary disruptions from taking down too many pods at once. Here is the pattern.
Setup
PDBs and ReplicaSet scaling are related but distinct. A PDB limits how many pods voluntary disruptions may take down at once; ReplicaSet scaling sets how many replicas should exist. Use each for its own purpose; conflating them causes operational problems.
What setup looks like:
- minAvailable: 1 or maxUnavailable: 25%: The PDB declares its budget with either minAvailable or maxUnavailable, not both. minAvailable: 1 means an eviction is allowed only if at least 1 healthy pod remains; maxUnavailable: 25% means at most 25% of the pods may be unavailable at once.
- Per Deployment: Each Deployment can have its own PDB. The PDB's label selector matches the Deployment's pods, so the protection is scoped to that workload.
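- Applies to evictions: The PDB is enforced for voluntary evictions requested through the Eviction API, such as kubectl drain. It does not apply to involuntary disruptions such as node failure. It also does not stop pods from being deleted directly, which is how taint-based (NoExecute) eviction removes them.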
- Independent of ReplicaSet: The PDB and the ReplicaSet are separate objects. The Deployment's replica count says how many pods should exist; the PDB says how many of them must stay available during an eviction. Both apply at once, and a percentage-based budget is resolved against the current replica count.
- Reflect availability requirements: Set the PDB's parameters from what the workload actually requires. Critical workloads get stricter PDBs; less critical ones get looser PDBs.
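As a rough illustration of the first two bullets, the sketch below builds a policy/v1 PodDisruptionBudget manifest as a Python dict and prints it as JSON, which kubectl apply -f accepts alongside YAML. The helper build_pdb, the name web-pdb, and the app: web label are hypothetical and chosen only for this example.

```python
import json


def build_pdb(name, match_labels, *, min_available=None, max_unavailable=None,
              namespace="default"):
    """Return a PodDisruptionBudget manifest (hypothetical helper, not a library API)."""
    # A single PDB may set minAvailable or maxUnavailable, but not both.
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")

    # The selector must match the labels on the workload's pods.
    spec = {"selector": {"matchLabels": dict(match_labels)}}
    if min_available is not None:
        spec["minAvailable"] = min_available
    else:
        spec["maxUnavailable"] = max_unavailable

    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


if __name__ == "__main__":
    # Assumed example workload: pods labelled app=web.
    manifest = build_pdb("web-pdb", {"app": "web"}, max_unavailable="25%")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied like any other manifest; the Deployment itself is not referenced, only its pod labels.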
Setup is per-workload. The PDB protects the workload's availability during voluntary disruptions.
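To see how the replica count and the budget combine, here is a simplified model of the arithmetic. It assumes the documented behavior that percentages round up to a whole pod; it is a sketch for reasoning, not the disruption controller's actual code, and the function names are made up for this example.

```python
def scaled(value, total):
    """Resolve an int, or a percentage string such as "25%", against total (rounding up)."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        return -(-percent * total // 100)  # integer ceiling division
    return int(value)


def allowed_disruptions(expected, healthy, *, min_available=None, max_unavailable=None):
    """How many more pods could be evicted right now under this simplified model.

    expected: replica count the workload asks for
    healthy:  pods currently healthy
    """
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")

    if max_unavailable is not None:
        desired_healthy = max(expected - scaled(max_unavailable, expected), 0)
    else:
        desired_healthy = scaled(min_available, expected)

    # Evictions are allowed only while healthy pods exceed the required minimum.
    return max(healthy - desired_healthy, 0)


if __name__ == "__main__":
    print(allowed_disruptions(4, 4, max_unavailable="25%"))
    print(allowed_disruptions(3, 2, min_available=2))
```

In this model, 4 healthy replicas with maxUnavailable: 25% allow one eviction at a time, and scaling to 8 replicas raises that to two. A workload with 3 replicas, minAvailable: 2, and one pod already down allows none until the pod recovers.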
When
Multi-pod services benefit from PDBs. Stateful workloads, leader-elected services, and multi-replica services all need protection from overly aggressive eviction.
- Multi-pod services: Single-pod services gain little from a PDB, because the single pod must come down for node maintenance anyway. With multiple pods, the PDB guards against losing too many at once.
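- Quorum-based services: Quorum-based services (etcd, ZooKeeper, and similar) need a minimum number of members up. A PDB sized to the quorum keeps node drains during a rolling cluster upgrade from breaking quorum, so the service stays available. The workload's own rolling update is not limited by the PDB; its update strategy governs that.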
- StatefulSets: StatefulSets often have ordering requirements. A PDB ensures enough of their pods stay available during voluntary disruption; it limits how many pods are evicted, not which ones.
- Leader-elected services: Services with leader election need stability. A PDB keeps the leader and its followers from being evicted at the same time, so leader election can settle.
- Customer-facing services: User-visible services need bounded disruption. A PDB helps keep enough pods serving traffic during cluster operations.
The when-to-use is broad. Most multi-replica production services benefit from PDBs.
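For the quorum case, the budget can be derived from the member count. The sketch below assumes a simple majority quorum (more than half of the members), which is how etcd and default ZooKeeper ensembles work; the function names are hypothetical.

```python
def quorum_size(members):
    """Smallest majority of a cluster with the given number of voting members."""
    if members < 1:
        raise ValueError("members must be at least 1")
    return members // 2 + 1


def pdb_for_quorum(members):
    """Suggest a minAvailable value that preserves a majority quorum."""
    quorum = quorum_size(members)
    return {
        "minAvailable": quorum,
        # Pods that may be evicted at once while quorum still holds.
        "tolerated_disruptions": members - quorum,
    }


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, pdb_for_quorum(n))
```

A 3-member cluster gets minAvailable: 2 and tolerates one eviction at a time; a 4-member cluster needs 3 and still tolerates only one, which is one reason odd member counts are usual.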
Avoid
Some PDB configurations produce predictable problems. minAvailable: 100% is the most common mistake; learn to recognize it and avoid it.
- minAvailable: 100%: Requiring 100% availability leaves no room for voluntary disruption. The budget allows zero evictions, so drains hang or fail and upgrades stall. maxUnavailable: 0, or minAvailable equal to the replica count, has the same effect.
- Blocks all evictions: When the PDB requires 100% availability, no pod can be evicted. The cluster's voluntary disruption mechanisms cannot proceed, and the safeguard becomes a barrier.
- Drains and upgrades stop dead: Cluster maintenance requires draining nodes. A drain succeeds only when the PDB allows evictions, so an overly tight PDB blocks maintenance.
- Calibrate carefully: The PDB's parameters need calibration. Too tight a budget blocks operations; too loose a budget does not protect. The right setting is the workload's actual minimum.
- Test the PDB: Drain a node and verify that the workload behaves as expected and that the PDB matches the intent.
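The zero-eviction configurations above can be caught before a drain does. The following is a minimal sketch of such a check, assuming all replicas are healthy and that percentages round up to whole pods; blocks_all_evictions is a hypothetical helper, not part of any Kubernetes client.

```python
def _scaled(value, total):
    """Resolve an int, or a percentage string such as "100%", against total (rounding up)."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        return -(-percent * total // 100)  # integer ceiling division
    return int(value)


def blocks_all_evictions(replicas, *, min_available=None, max_unavailable=None):
    """Return True if the budget leaves no pod evictable even when all replicas are healthy."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")

    if max_unavailable is not None:
        # maxUnavailable of 0 or "0%" permits no disruption at all.
        return _scaled(max_unavailable, replicas) == 0
    # minAvailable at or above the replica count also permits none.
    return _scaled(min_available, replicas) >= replicas


if __name__ == "__main__":
    print(blocks_all_evictions(3, min_available="100%"))
    print(blocks_all_evictions(3, min_available=2))
```

A check like this could run in review or CI against each workload's replica count and PDB values. It does not replace an actual drain test, since it ignores pods that are already unhealthy.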
PDB vs ReplicaSet scaling is one of those Kubernetes operational concepts that benefits from clear understanding. Nova AI Ops integrates with cluster operational telemetry, surfaces PDB-related patterns, and supports the team's operational discipline.