Resource Quota Discipline
Per-namespace quotas prevent runaway consumption.
Setup
Resource quotas cap what each namespace can consume. Without them, one team's runaway workload can starve every other tenant; with them, the blast radius is limited to the offending namespace. The practice pays off most in multi-tenant clusters.
What setup looks like:
- ResourceQuota per namespace: Each namespace has its own ResourceQuota object. The quota declares what the namespace can consume, the API server enforces it, and requests that would exceed it are rejected.
- Limits on CPU, memory, and pod count: The standard quotas cover compute and pod count, bounding total CPU requests, total memory requests, and the number of pods. Once a namespace has a quota on CPU or memory, new pods must declare that resource in their requests or limits (or inherit a default from a LimitRange), or they are rejected.
- Storage quotas: Persistent volume claims can be quota-limited too. Both total requested storage and storage requested per storage class can be capped, which bounds the team's storage costs.
- Object count quotas: Some teams also cap object counts (Services, Secrets, ConfigMaps). This bounds excessive object creation and protects the cluster's API server.
- Per-team quotas: Different teams get different quotas. Each quota reflects the team's allocation, so the limit matches the capacity the team was actually given.
Setup is per-namespace. The discipline is consistent across the cluster.
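The items above can be sketched as a small manifest builder. This is a minimal illustration, not a prescribed tool: the function name `build_resource_quota`, the `<namespace>-quota` naming scheme, and the example values for `team-a` are all assumptions. The keys under `spec.hard` (`requests.cpu`, `requests.memory`, `pods`, `requests.storage`, and `count/<resource>`) are standard ResourceQuota resource names.

```python
import json


def build_resource_quota(namespace, cpu_requests, memory_requests, max_pods,
                         storage_requests=None, object_counts=None):
    """Return a ResourceQuota manifest as a plain dict (hypothetical helper)."""
    hard = {
        "requests.cpu": cpu_requests,        # total CPU requests in the namespace
        "requests.memory": memory_requests,  # total memory requests
        "pods": str(max_pods),               # maximum number of pods
    }
    if storage_requests is not None:
        # Sum of storage requested by all PersistentVolumeClaims
        hard["requests.storage"] = storage_requests
    for resource, count in (object_counts or {}).items():
        # Object-count quotas use the count/<resource> syntax
        hard[f"count/{resource}"] = str(count)
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        # Naming scheme is an assumption made for this example
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {"hard": hard},
    }


# Example values are illustrative only
manifest = build_resource_quota(
    "team-a", "20", "64Gi", 100,
    storage_requests="500Gi",
    object_counts={"services": 20, "secrets": 50, "configmaps": 50},
)
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed output could be applied directly or checked into version control alongside the rationale for each value.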
Size
Sizing is where quotas are calibrated. A quota that is too tight creates operational friction; one that is too loose protects nothing. The right size catches runaway consumption while leaving room for growth.
- Expected usage plus 50% headroom: The quota covers the team's expected usage with margin for growth and bursts. The team can grow within the quota, and runaway consumption is caught before it becomes catastrophic.
- Tight enough to catch runaway: A bug that creates resources without bound hits the quota. The API server rejects further requests, the rejections surface to the team (directly or through monitoring), and the bug is fixed before the cluster is overwhelmed.
- Loose enough not to block legitimate growth: The team's normal growth fits within the quota. Routine work is not blocked, so the quota does not become a source of friction.
- Calibrate per workload: Different namespaces have different needs. Production namespaces might get larger quotas and sandbox namespaces smaller ones; the calibration matches the workload.
- Document the rationale: The team records why each quota is the size it is. Future maintainers can see the reasoning, so the calibration survives staff changes.
Sizing is the discipline. The right size produces real protection without unnecessary friction.
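As a worked example of the 50% headroom rule, the sketch below multiplies expected usage by 1.5 and rounds up. The helper name `size_quota` and the usage figures are hypothetical; only the 50% figure comes from the text above.

```python
import math

HEADROOM = 0.5  # 50% margin above expected usage


def size_quota(expected_usage, headroom=HEADROOM):
    """Scale each expected-usage figure by (1 + headroom), rounding up."""
    return {
        resource: math.ceil(amount * (1 + headroom))
        for resource, amount in expected_usage.items()
    }


# Illustrative expected usage for one namespace
expected = {"cpu_cores": 12, "memory_gib": 40, "pods": 60}
quota = size_quota(expected)
print(quota)  # {'cpu_cores': 18, 'memory_gib': 60, 'pods': 90}
```

Rounding up rather than down keeps the headroom from falling below 50% when expected usage is not a round number.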
Review
Quotas need periodic review. Teams' usage grows; outgrown quotas produce friction; the review keeps quotas aligned with actual needs.
- Quarterly: The team reviews quotas every quarter. Usage trends become visible, quotas that need adjustment are surfaced, and the review becomes part of the operational rhythm.
- Adjust as usage grows: Teams whose usage has legitimately grown need larger quotas. The review captures this and raises their quotas so that growth is supported.
- Outgrown quotas cause real friction: Teams that keep hitting a quota they have outgrown waste time working around it. The review raises such quotas before the friction matters.
- Underutilized quotas can shrink: Teams using significantly less than their quota can have it reduced. This tightens the cluster's overall allocation; adjustments go in both directions.
- Document changes: Each quota change is recorded. The history shows how the team's usage has grown and preserves the reasoning through team changes.
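A quarterly review can start from a simple utilization check like the one sketched below. The 80% and 30% thresholds, the function name `review_quota`, and the sample figures are assumptions chosen for illustration. In practice the `used` and `hard` numbers would come from each ResourceQuota's `status.used` and `status.hard` fields.

```python
def review_quota(used, hard, raise_above=0.8, shrink_below=0.3):
    """Suggest an action for one quota based on utilization (used / hard)."""
    if hard <= 0:
        raise ValueError("hard limit must be positive")
    utilization = used / hard
    if utilization >= raise_above:
        return "raise"   # close to the limit: likely outgrown
    if utilization <= shrink_below:
        return "shrink"  # far below the limit: candidate for reduction
    return "keep"


# Illustrative CPU-request figures per namespace
namespaces = {
    "team-a": {"used": 17, "hard": 18},
    "team-b": {"used": 4, "hard": 20},
    "team-c": {"used": 10, "hard": 20},
}

suggestions = {
    name: review_quota(q["used"], q["hard"]) for name, q in namespaces.items()
}
print(suggestions)  # {'team-a': 'raise', 'team-b': 'shrink', 'team-c': 'keep'}
```

The output is only a starting point for the review; a suggested change still needs a human decision and a documented rationale.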
Resource quota discipline is one of those Kubernetes multi-tenancy practices that pay off across many teams. Nova AI Ops integrates with cluster quota and usage data, surfaces patterns, and supports the team's quota management.