Cluster Sizing 2026
Sizing the cluster: nodes, pods per node, headroom.
Pods per node
Cluster sizing is the discipline of matching cluster capacity to workload needs. Too small and pods cannot schedule; too large and the team pays for unused capacity; the right size is calibrated to actual demand.
What pods-per-node looks like:
- Default 110: The kubelet's default maxPods is 110 per node. The value is configurable; many teams run at the default, and some need more.
- Can be raised with prefix delegation: With the AWS VPC CNI, prefix delegation assigns IP prefixes to a node's network interfaces instead of individual addresses, which allows much higher pod density. With maxPods raised to match, a team can run 250 or more pods per node on suitable instance types, and the cluster's effective capacity grows.
- Higher density means fewer nodes: The same pod count fits on fewer nodes when density is higher. Provided the nodes have the CPU and memory to spare, node cost falls roughly in proportion to node count; the saving is most visible at scale.
- Memory and CPU constraints: Beyond the pod count, memory and CPU constrain density. Some node types support many small pods; others fit fewer, larger pods. The workload mix determines the right node type.
- Test the limit: Test pod density under realistic load. Some applications experience contention at high density, so the workable limit may be below the configured maxPods.
Pods per node is the foundation. The number determines how many nodes the team needs.
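As a rough illustration of how these limits interact, the sketch below takes the smallest of three ceilings: the configured maxPods, the number of pods that fit by CPU request, and the number that fit by memory request. The NodeType fields, the function name, and the example figures are hypothetical and not tied to any particular instance type; substitute the node's real allocatable values and the workload's real requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    """Allocatable resources for one node type (illustrative values only)."""
    name: str
    max_pods: int              # the kubelet maxPods setting for this node type
    alloc_cpu_millicores: int  # CPU left for pods after system reservations
    alloc_memory_mib: int      # memory left for pods after system reservations


def effective_pods_per_node(node: NodeType,
                            pod_cpu_millicores: int,
                            pod_memory_mib: int) -> int:
    """Return the tightest of the three limits: maxPods, CPU, and memory."""
    if pod_cpu_millicores <= 0 or pod_memory_mib <= 0:
        raise ValueError("pod requests must be positive")
    by_cpu = node.alloc_cpu_millicores // pod_cpu_millicores
    by_memory = node.alloc_memory_mib // pod_memory_mib
    return min(node.max_pods, by_cpu, by_memory)


# Hypothetical node: 110 maxPods, about 7.9 CPU cores and 29000 MiB allocatable.
example_node = NodeType("example-pool", max_pods=110,
                        alloc_cpu_millicores=7900, alloc_memory_mib=29000)

# Pods requesting 100m CPU / 256 MiB are CPU-bound below maxPods.
print(effective_pods_per_node(example_node, 100, 256))
# Pods requesting 50m CPU / 128 MiB hit the maxPods ceiling first.
print(effective_pods_per_node(example_node, 50, 128))
```

This only models scheduling by requests. Contention at runtime can push the workable density lower still, which is why the measured limit matters more than the computed one.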
Nodes
Node count comes from total pods divided by pods-per-node, plus headroom. The headroom matters; without it, scale-ups and node replacements leave pods waiting for capacity.
- Total pods divided by pods-per-node: The basic math. Divide the total number of pods the cluster runs by the per-node density and round up; the result is the minimum node count.
- Plus 20% headroom: The minimum is not enough. Headroom accommodates autoscaling, replacements, and occasional bursts. 20% is typical; some teams use more, some less.
- Headroom for autoscaling: When pod count grows, capacity must already be available. Without headroom, new pods wait while nodes are provisioned, and workloads are constrained in the meantime.
- Replacements: Node failures and rolling updates take nodes out of the pool. Headroom absorbs these without taking capacity away from workloads.
- Per-node-type sizing: Different node types serve different workloads. A team may run multiple node pools; sizing is done per pool.
Node count is the operational decision. The right count balances cost and capacity.
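The arithmetic above can be sketched in a few lines. The function name, the pool names, and the pod counts below are hypothetical; the 20% default simply mirrors the typical figure mentioned in this section. Integer arithmetic is used so that rounding up is exact.

```python
def required_nodes(total_pods: int, pods_per_node: int,
                   headroom_percent: int = 20) -> int:
    """Minimum node count (rounded up) plus percentage headroom (rounded up)."""
    if total_pods < 0:
        raise ValueError("total_pods must not be negative")
    if pods_per_node <= 0:
        raise ValueError("pods_per_node must be positive")
    if headroom_percent < 0:
        raise ValueError("headroom_percent must not be negative")
    # Ceiling division: the minimum number of nodes that can hold every pod.
    minimum = (total_pods + pods_per_node - 1) // pods_per_node
    # Add headroom on top of the minimum, again rounding up to whole nodes.
    return (minimum * (100 + headroom_percent) + 99) // 100


# Sizing is per pool: each pool has its own pod count and density.
pools = {
    "general": {"total_pods": 1000, "pods_per_node": 110},
    "dense": {"total_pods": 1000, "pods_per_node": 250},
}
for name, pool in pools.items():
    print(name, required_nodes(pool["total_pods"], pool["pods_per_node"]))
```

Here headroom is applied to the node count rather than to the pod count, following the order described above. Rounding up twice errs slightly toward extra capacity, which is usually the cheaper mistake for small pools.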
Plan
Cluster sizing is not static. Workload growth, changing usage patterns, and new applications all shift the math, so the team revisits sizing periodically.
- Quarterly actuals vs. plan: Once per quarter, the team compares actual pod counts to the planned counts. Growth that exceeds plan needs more capacity; capacity that exceeds need can be reduced.
- Adjust sizing: Node pool sizes are adjusted based on the comparison, so capacity tracks actual need and cost stays aligned with usage.
- Drift accumulates: Without periodic review, drift accumulates. Some clusters become significantly over-provisioned; others become constrained. The review catches both.
- Track autoscaling effectiveness: The team tracks how often autoscaling fires. Frequent autoscaling suggests the baseline is too low; rare autoscaling suggests the baseline may be too high.
- Document sizing decisions: The team documents why each cluster is sized as it is. Future maintainers can see the reasoning, and the practice survives team changes.
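A quarterly review along these lines could be sketched as two small checks: one comparing actual pod counts to plan, one classifying how often autoscaling fired. Everything specific here is an assumption made for illustration: the function names, the 10% tolerance, and the weekly thresholds for "frequent" and "rare" are hypothetical and should be replaced with values the team has agreed on.

```python
def compare_to_plan(planned_pods: int, actual_pods: int,
                    tolerance_percent: int = 10) -> str:
    """Return 'grow', 'shrink', or 'hold' for one node pool.

    tolerance_percent is a hypothetical dead band around the plan so that
    small deviations do not trigger resizing.
    """
    if planned_pods <= 0:
        raise ValueError("planned_pods must be positive")
    if actual_pods < 0:
        raise ValueError("actual_pods must not be negative")
    if actual_pods * 100 > planned_pods * (100 + tolerance_percent):
        return "grow"
    if actual_pods * 100 < planned_pods * (100 - tolerance_percent):
        return "shrink"
    return "hold"


def autoscaling_signal(scale_up_events: int, days_observed: int,
                       frequent_per_week: int = 7,
                       rare_per_week: int = 1) -> str:
    """Classify scale-up frequency; both thresholds are assumed values."""
    if days_observed <= 0:
        raise ValueError("days_observed must be positive")
    # Compare events-per-week against the thresholds without using floats:
    # events * 7 / days >= threshold  is the same as  events * 7 >= threshold * days.
    scaled_events = scale_up_events * 7
    if scaled_events >= frequent_per_week * days_observed:
        return "baseline may be too low"
    if scaled_events <= rare_per_week * days_observed:
        return "baseline may be too high"
    return "baseline looks reasonable"


# Hypothetical quarter: 1000 pods planned, 1200 observed, 120 scale-ups in 90 days.
print(compare_to_plan(1000, 1200))
print(autoscaling_signal(120, 90))
```

Recording the inputs and the resulting decision alongside the review notes gives future maintainers the reasoning behind each resize.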
Cluster sizing is one of those Kubernetes operational disciplines that pays off in cost optimization and capacity planning. Nova AI Ops integrates with cluster telemetry, surfaces sizing patterns, and supports the team's quarterly review.