Preemption Latency

K8s preemption impact.

Overview

Preemption latency captures how Kubernetes preemption affects user-facing latency. Capacity planning addresses the average; preemption produces tail-latency spikes that capacity planning never sees.

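K8s preemption impact. When a higher-priority pod cannot be scheduled, the scheduler evicts lower-priority pods to make room; if the victims serve traffic, the eviction surfaces as user-visible latency.
Eviction latency. Measured per pod, from eviction until a replacement is ready and serving; this is the interval users experience, because they see the recovery, not the eviction itself.
PriorityClass design. Each workload gets a priority; customer-facing services rank higher, so priority matches what is at stake.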
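PodDisruptionBudget plus spot preemption. A PodDisruptionBudget limits how many pods of a deployment can be down at once from voluntary evictions such as node drains; scheduler preemption honours it only on a best-effort basis, and a cloud provider reclaiming a spot node is not bound by it, so spot workloads must tolerate abrupt loss.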
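
As a minimal sketch of the two primitives named above, the Python below builds PriorityClass and PodDisruptionBudget manifests as plain dictionaries and prints them as JSON, which kubectl accepts alongside YAML. The class names, priority values, namespace, and `app` label are hypothetical placeholders, not recommendations.

```python
import json


def priority_class(name, value, description):
    """Build a scheduling.k8s.io/v1 PriorityClass manifest as a dict."""
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": name},
        "value": value,  # higher value = scheduled first, preempted last
        "globalDefault": False,
        "description": description,
    }


def pod_disruption_budget(name, namespace, app_label, max_unavailable):
    """Build a policy/v1 PodDisruptionBudget manifest as a dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "maxUnavailable": max_unavailable,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


# Hypothetical names and values, chosen only to show the ordering.
manifests = [
    priority_class("customer-facing", 100000, "User-facing services"),
    priority_class("batch", 1000, "Batch jobs that may be preempted"),
    pod_disruption_budget("checkout-pdb", "shop", "checkout", 1),
]

# Wrap the manifests in a v1 List so they can be applied as one file.
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Generating the manifests from one place keeps the relative ordering of priorities visible in a single file; hand-written YAML committed to the repo would serve the same purpose.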

The approach

The practical approach: PriorityClass design per workload, PodDisruptionBudget for production deployments, eviction monitoring as routine, spot tolerance only where it fits. Applied consistently, these keep performance predictable under preemption pressure.

PriorityClass design. Customer-facing services get higher priority and batch jobs lower, so priority tracks user impact.
PodDisruptionBudget. Set a maximum disruption per deployment; this guards against accidental mass eviction during voluntary disruptions such as node drains, though scheduler preemption treats it as best effort.
Monitor eviction. Track eviction counts per namespace; most default dashboards do not show this metric, so surface it explicitly.
Spot tolerance. Run only spot-tolerant workloads on spot nodes; the cost saving depends on explicit fit, not blanket placement.
Document the policy. Commit each workload’s priority and PDB to the repo, so operational reviews have a record to check.
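
The per-namespace eviction count can be sketched as follows, assuming events have been exported with `kubectl get events -A -o json`. The reason strings `Preempted` and `Evicted` are assumptions about what the scheduler and kubelet emit, so check them against your own cluster; the sample data is invented for illustration.

```python
from collections import Counter

# Event reasons treated as evictions. These strings are assumptions;
# verify what your cluster version actually emits.
EVICTION_REASONS = {"Preempted", "Evicted"}


def evictions_per_namespace(event_list):
    """Count eviction-like pod events per namespace in a v1 EventList dict."""
    counts = Counter()
    for event in event_list.get("items", []):
        if event.get("reason") not in EVICTION_REASONS:
            continue
        involved = event.get("involvedObject", {})
        if involved.get("kind") != "Pod":
            continue
        namespace = involved.get("namespace", "unknown")
        # Repeated events can be aggregated; "count" holds the repeat total.
        counts[namespace] += event.get("count") or 1
    return dict(counts)


# Invented sample in the shape of `kubectl get events -A -o json` output.
sample = {
    "items": [
        {"reason": "Preempted", "count": 2,
         "involvedObject": {"kind": "Pod", "namespace": "batch", "name": "etl-1"}},
        {"reason": "Evicted",
         "involvedObject": {"kind": "Pod", "namespace": "batch", "name": "etl-2"}},
        {"reason": "Preempted", "count": 1,
         "involvedObject": {"kind": "Pod", "namespace": "web", "name": "api-1"}},
        {"reason": "Scheduled", "count": 1,
         "involvedObject": {"kind": "Pod", "namespace": "web", "name": "api-2"}},
    ]
}

for namespace, count in sorted(evictions_per_namespace(sample).items()):
    print(f"{namespace}: {count}")
```

Events are short-lived (the API server keeps them for one hour by default), so a routine setup would likely run a count like this on a schedule and export the result to whatever metrics system backs the dashboard.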

Why this compounds

Preemption latency discipline compounds across services. Each protected workload preserves user experience; the team’s K8s expertise grows; new services inherit the priority and PDB defaults.

Better user experience. The right priority preserves customer-facing latency; users do not feel the cluster optimisation.
Better cost efficiency. Spot placement for tolerant workloads lowers cost; the cluster runs cheaper without sacrificing user-facing reliability.
Better operational fit. Priorities match what is at stake; the cluster respects the business hierarchy.
Institutional knowledge. Each preemption teaches K8s patterns; the team’s Kubernetes engineering muscle grows.

Managing preemption latency is an operational discipline that pays off over years. Nova AI Ops integrates with K8s telemetry, surfaces patterns, and supports the team’s K8s discipline.