Kubernetes Intermediate By Samson Tanimawo, PhD Published Dec 13, 2026 10 min read

Kubernetes Resource Limits and Requests: The Math Behind QoS Classes

Resource limits look like a tuning knob. They are actually a scheduling-class declaration that decides whether your pod is the first or last to be evicted.

Requests vs limits, restated

Requests are what the scheduler reserves: it places the pod only on a node with at least that much unallocated capacity. Limits are what the kubelet enforces at runtime: exceed a CPU limit and the container is throttled; exceed a memory limit and it is OOM-killed.

What most teams do not realize: setting these values determines a third thing, the pod's QoS class. The class is what the kubelet uses to decide who to evict when a node comes under resource pressure. Same workload, different limits, different eviction priority.
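To make the derivation concrete, here is a minimal sketch of how the class falls out of requests and limits, following the rules in the Kubernetes documentation: every container with CPU and memory requests equal to limits gives Guaranteed, nothing set anywhere gives BestEffort, and everything else is Burstable. The function name `qos_class` and the dictionary shape are hypothetical, and the sketch compares quantity strings literally (it treats "1" and "1000m" as different), which a real implementation would not.

```python
from typing import Dict, List

# Hypothetical shape for one container: {"requests": {...}, "limits": {...}}
Resources = Dict[str, Dict[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Return the QoS class implied by a pod's container resources."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = dict(container.get("requests", {}))
        limits = container.get("limits", {})
        for res in ("cpu", "memory"):
            # An unset request defaults to the limit when a limit is present.
            if res in limits and res not in requests:
                requests[res] = limits[res]
        if requests or limits:
            any_set = True
        for res in ("cpu", "memory"):
            # Assumption: literal string comparison, no quantity parsing.
            if res not in limits or requests.get(res) != limits[res]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


same = {"cpu": "500m", "memory": "256Mi"}
print(qos_class([{"requests": same, "limits": same}]))                  # Guaranteed
print(qos_class([{"requests": same, "limits": {"memory": "512Mi"}}]))   # Burstable
print(qos_class([{}]))                                                  # BestEffort
```

The second call is the same workload with only the limits changed, and it lands in a different class.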

The three QoS classes and what they do

When Burstable makes sense

Burstable is the right pick when peak usage is unpredictable but baseline is steady, which describes most web services. Set the request to the 95th percentile of baseline usage and the limit to 200% of that (twice the request). The pod gets reserved capacity for the common case and headroom for spikes.
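The sizing rule is simple arithmetic, sketched below under a few assumptions: the samples are baseline usage readings you have already exported from your metrics system, the percentile uses the nearest-rank method (other methods give slightly different values), and the names `percentile` and `burstable_sizing` are hypothetical. The result is in whatever unit the samples use.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def burstable_sizing(samples: Sequence[float],
                     limit_factor: float = 2.0) -> Dict[str, float]:
    """Request = 95th percentile of baseline usage; limit = 200% of the request."""
    request = percentile(samples, 95)
    return {"request": request, "limit": request * limit_factor}


# 100 hypothetical memory readings in MiB: 1, 2, ..., 100
readings = [float(v) for v in range(1, 101)]
print(burstable_sizing(readings))  # {'request': 95.0, 'limit': 190.0}
```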

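Guaranteed is right for things you cannot afford to evict, such as a primary database or a stateful queue worker. You pay the cost of fully reserved capacity (requests equal to limits for CPU and memory on every container) to put the pod last in line for eviction. That is a strong priority, not absolute immunity.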

Catching OOMKills before they happen

Antipatterns

What to do this week

Three moves. (1) Audit your top-10 most-restarted pods; check for OOMKilled status. (2) Move latency-sensitive services from CPU-limited to CPU-unlimited (with requests still set). (3) Add an OOMKilled alert to your platform dashboard.
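Move (1) can be scripted. The sketch below assumes you have the parsed output of `kubectl get pods --all-namespaces -o json` and reads the `restartCount` and `lastState.terminated.reason` fields of each container status. The function name `oomkilled_report` and the row layout are hypothetical, and it ranks containers rather than whole pods.

```python
from typing import Any, Dict, List


def oomkilled_report(pod_list: Dict[str, Any], top_n: int = 10) -> List[Dict[str, Any]]:
    """Return the top_n most-restarted containers, flagging those whose
    last termination reason was OOMKilled."""
    rows: List[Dict[str, Any]] = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for cs in statuses:
            # lastState.terminated is absent if the container never restarted.
            last = cs.get("lastState", {}).get("terminated") or {}
            rows.append({
                "namespace": meta.get("namespace", ""),
                "pod": meta.get("name", ""),
                "container": cs.get("name", ""),
                "restarts": cs.get("restartCount", 0),
                "oom_killed": last.get("reason") == "OOMKilled",
            })
    rows.sort(key=lambda row: row["restarts"], reverse=True)
    return rows[:top_n]
```

One way to use it is to pipe the kubectl output into a script that calls `oomkilled_report(json.load(sys.stdin))` and prints the rows. Only the most recent termination is visible this way, so a container that was OOM-killed earlier and later restarted for another reason will not be flagged.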