Resource Requests vs Limits: 2026
Requests reserve; limits cap. The pattern that prevents both starvation and OOM.
Requests
Requests and limits are the foundation of the Kubernetes resource model. A request is the amount the scheduler reserves for a container; a limit is the most the container may use. Setting both deliberately gives predictable scheduling and a bounded blast radius.
What requests provide:
- What the pod gets guaranteed: The request is the floor. The scheduler only places a pod on a node with enough unreserved capacity to cover its requests, so the pod can always use at least that much.
- Used for scheduling: The scheduler places pods by their requests, not by their actual usage. The sum of requests across all pods on a node must fit within the node's allocatable capacity.
- Set to expected steady-state usage: The right request matches what the pod actually uses. Requests set too high strand capacity; requests set too low overcommit the node. When requests reflect real usage, the cluster's effective capacity is real.
- Affects QoS: A pod in which every container sets CPU and memory requests equal to its limits gets Guaranteed QoS. A pod with no requests or limits at all gets BestEffort. Everything in between, including requests set below limits, gets Burstable. The QoS class shapes how Kubernetes treats the pod when a node runs short of resources.
- Document the rationale: Each request has a documented basis, such as observed usage or an expected baseline, so the numbers stay data-driven.
Requests are the foundation. The cluster's scheduling and capacity decisions use them.
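The placement rule above can be sketched in a few lines. This is a simplified illustration, not the scheduler's actual code: the `Resources` type, the `fits_on_node` helper, and the node and pod sizes are all hypothetical, and the real scheduler also weighs affinity, taints, and other resource types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu_millicores: int  # 1000 = one full core
    memory_mib: int


def fits_on_node(allocatable: Resources,
                 scheduled_requests: list[Resources],
                 new_request: Resources) -> bool:
    """Return True if the new pod's request fits next to the requests already placed.

    Only requests are summed; actual usage and limits play no part in placement.
    """
    used_cpu = sum(r.cpu_millicores for r in scheduled_requests)
    used_mem = sum(r.memory_mib for r in scheduled_requests)
    return (used_cpu + new_request.cpu_millicores <= allocatable.cpu_millicores
            and used_mem + new_request.memory_mib <= allocatable.memory_mib)


# Hypothetical node with 4 cores and 16 GiB allocatable, two pods already placed.
node = Resources(cpu_millicores=4000, memory_mib=16384)
placed = [Resources(1500, 4096), Resources(2000, 8192)]

print(fits_on_node(node, placed, Resources(500, 2048)))   # True: 4000m CPU and 14336Mi fit
print(fits_on_node(node, placed, Resources(1000, 2048)))  # False: CPU requests would total 4500m
```

Because only requests enter the sum, a request inflated well above real usage blocks placement of other pods even when the node is mostly idle.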
Limits
Limits are the cap. A container cannot exceed its limit because the kernel enforces it, which bounds the blast radius of a misbehaving pod.
- What the pod cannot exceed: The limit is the ceiling. A container that exceeds its memory limit is OOM-killed; a container that reaches its CPU limit is throttled.
- Set 50 to 100% above requests for headroom: The limit allows burst above the request. As a starting point, 50 to 100% headroom accommodates normal bursts; adjust it to the workload's observed pattern.
- Memory vs CPU semantics: Memory limits are a hard cap, and exceeding one gets the container OOM-killed. CPU limits throttle, so the container slows down but is not killed. The difference follows from the resource: CPU is compressible, memory is not.
- Avoid limits without requests: When a container sets only a limit and no default request is applied at admission, Kubernetes copies the limit into the request. The pod then reserves its full limit on the node, whether or not that was intended. Set both explicitly.
- Per-workload tuning: Different workloads have different patterns. Steady workloads tolerate tight limits; bursty workloads need looser ones.
Limits are the safety net. The blast radius of any pod is bounded by its limits.
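As a rough sketch of the headroom rule, the snippet below derives limits from requests and shows what a CPU limit means in scheduler time. The helper names and the example request sizes are hypothetical. The quota arithmetic assumes a Linux node with the default 100 ms CFS period.

```python
import math


def limit_with_headroom(request: int, headroom: float) -> int:
    """Return a limit that sits `headroom` (0.5 = 50%) above the request, rounded up."""
    if request <= 0 or headroom < 0:
        raise ValueError("request must be positive and headroom non-negative")
    return math.ceil(request * (1 + headroom))


def cfs_quota_us(cpu_limit_millicores: int, period_us: int = 100_000) -> int:
    """CPU time in microseconds the container may use per period before it is throttled.

    Assumes the default 100 ms CFS period; clusters can configure a different one.
    """
    return cpu_limit_millicores * period_us // 1000


# Hypothetical steady-state requests taken from observed usage.
cpu_request_m, memory_request_mib = 500, 512

cpu_limit_m = limit_with_headroom(cpu_request_m, 1.0)            # 100% headroom
memory_limit_mib = limit_with_headroom(memory_request_mib, 0.5)  # 50% headroom

# The same shape as the `resources` field of a container spec.
resources = {
    "requests": {"cpu": f"{cpu_request_m}m", "memory": f"{memory_request_mib}Mi"},
    "limits": {"cpu": f"{cpu_limit_m}m", "memory": f"{memory_limit_mib}Mi"},
}

print(resources)
print(cfs_quota_us(cpu_limit_m))  # 100000: one full core's worth of time per 100 ms period
```

CPU gets the looser headroom here because hitting a CPU limit only slows the container down, while hitting a memory limit kills it. That argues for sizing the memory limit from observed peaks instead of a fixed percentage when the data is available.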
Avoid
Some patterns produce predictable problems. Omitting requests and setting limits equal to requests both have costs.
- No requests: A pod with no requests and no limits gives the scheduler nothing to plan with. It is treated as requesting zero, gets BestEffort QoS, and is among the first candidates for eviction under node pressure.
- Node could be oversubscribed: Without requests, multiple pods can land on a node that cannot satisfy them all. Resource contention follows.
- Limits equal to requests: Setting limits equal to requests for both CPU and memory in every container gives Guaranteed QoS but no headroom. The pod cannot burst at all.
- No headroom for bursts: Real workloads have bursts. With limits equal to requests, every CPU burst is throttled and every memory spike above the request risks an OOM kill.
- Document the choice: When the team uses Guaranteed QoS, document the rationale. It should be an intentional choice, not the default.
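The patterns above map onto Kubernetes QoS classes, which can be sketched as a small classifier. This is an illustrative approximation, not the kubelet's code: `qos_class` is a hypothetical helper, it looks only at CPU and memory, and it compares quantities as plain strings, so it assumes requests and limits are written in the same unit.

```python
def qos_class(containers: list[dict]) -> str:
    """Classify a pod from its containers' `resources` dicts (cpu and memory only)."""
    any_set = False
    all_guaranteed = True
    for resources in containers:
        requests = dict(resources.get("requests", {}))
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            # A missing request defaults to the limit when a limit is set.
            if name not in requests and name in limits:
                requests[name] = limits[name]
            if name in requests or name in limits:
                any_set = True
            # Guaranteed needs a limit for each resource, with request == limit.
            # Assumption: string comparison, so "1" and "1000m" would not match.
            if name not in limits or requests.get(name) != limits[name]:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


no_resources = [{}]
with_headroom = [{"requests": {"cpu": "500m", "memory": "512Mi"},
                  "limits": {"cpu": "1000m", "memory": "768Mi"}}]
limits_equal_requests = [{"requests": {"cpu": "500m", "memory": "512Mi"},
                          "limits": {"cpu": "500m", "memory": "512Mi"}}]
limits_only = [{"limits": {"cpu": "500m", "memory": "512Mi"}}]

print(qos_class(no_resources))           # BestEffort
print(qos_class(with_headroom))          # Burstable
print(qos_class(limits_equal_requests))  # Guaranteed
print(qos_class(limits_only))            # Guaranteed, because requests default to limits
```

The last case is the one that tends to surprise teams: a limits-only spec ends up Guaranteed, with the full limit reserved on the node.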
Getting requests and limits right is one of those Kubernetes operational disciplines that pays off in predictable scheduling and usable capacity. Nova AI Ops integrates with cluster resource telemetry, surfaces patterns, and supports the team's resource discipline.