Readiness vs Liveness Probes
Readiness gates traffic; liveness restarts containers. They solve different problems.
Readiness
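Readiness and liveness probes serve different purposes. Readiness controls whether the pod receives traffic; liveness controls whether the kubelet restarts the container. Confusing the two causes avoidable incidents: containers restarted when they only needed a pause in traffic, or unready pods left in rotation.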
What readiness provides:
- Pod ready to serve: The readiness probe answers "is this pod ready to handle traffic?" A pod can be alive but not ready, for example while it is still starting up or while it is draining.
- Removed from endpoints on failure: A failed readiness probe removes the pod from Service endpoints. Traffic stops flowing to it, but the pod is neither deleted nor restarted; it can recover and rejoin.
- Slow start: Applications that take time to start rely on readiness. No traffic reaches the pod until the probe passes, so users never hit a half-started instance.
- Dependency check: Some applications check dependencies in the readiness probe, such as whether the database is reachable or the cache is warm. The pod stays unready until its prerequisites are met. Use this with care: if every replica checks the same shared dependency, a single outage can mark all of them unready at once.
- Automatic recovery: When the probe passes again, the pod rejoins the endpoints without manual intervention.
Readiness is the traffic-routing control: it decides whether a pod receives traffic, and nothing more.
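The readiness behavior described above can be sketched as a small HTTP endpoint. This is a minimal sketch under stated assumptions, not a prescribed implementation: the `/readyz` path, port 8080, and the `database_reachable` and `cache_warm` checks are hypothetical names chosen for illustration. The endpoint returns 200 when every check passes and 503 otherwise; an HTTP probe counts any status from 200 to 399 as success.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple


# Hypothetical dependency checks. Real ones would ping the database,
# inspect cache state, and so on; these stubs always succeed.
def database_reachable() -> bool:
    return True


def cache_warm() -> bool:
    return True


CHECKS: Dict[str, Callable[[], bool]] = {
    "database": database_reachable,
    "cache": cache_warm,
}


def readiness_status(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, str]:
    """Return (HTTP status, body): 200 if every check passes, else 503."""
    failed = sorted(name for name, check in checks.items() if not check())
    if failed:
        return 503, "not ready: " + ", ".join(failed)
    return 200, "ready"


class ReadinessHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Only the (assumed) /readyz path is served.
        if self.path != "/readyz":
            self.send_error(404)
            return
        status, body = readiness_status(CHECKS)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Serve the readiness endpoint; blocks until the process exits."""
    HTTPServer(("", port), ReadinessHandler).serve_forever()
```

Keeping the decision in `readiness_status`, separate from the HTTP handler, makes the readiness logic testable without opening a socket. A failing check only changes the response code; nothing in this endpoint restarts anything, which matches the role of readiness.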
Liveness
Liveness is more drastic. A failed liveness probe restarts the container, so it should be configured carefully.
- Pod alive: The liveness probe answers "is this container still functioning?" A failure means the container is broken in a way that only a restart can fix, such as a deadlock.
- Restart on failure: After the configured number of consecutive failures, the kubelet kills the container and restarts it according to the pod's restart policy.
- Use carefully: Liveness probes are powerful and dangerous. A probe that is too strict, too quick to time out, or dependent on external services can restart containers that were healthy.
- Bad liveness causes restart loops: If the probe fails for a reason a restart cannot fix, the container restarts over and over, typically ending up in CrashLoopBackOff. Verify that the probe checks only conditions a restart actually resolves.
- Some applications do not need liveness: An application that recovers on its own, or that exits when it hits a fatal error, gains little from a liveness probe; the probe only adds the risk of unnecessary restarts.
Liveness is the restart control. Use it sparingly.
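The restart behavior above can be illustrated with a simplified model of how consecutive failures lead to a restart. This is a sketch for reasoning about thresholds, not a reproduction of the kubelet: the `restart_points` function is hypothetical, and it ignores probe timing, `initialDelaySeconds`, and restart back-off. The default of 3 mirrors the Kubernetes default for `failureThreshold`.

```python
from typing import Iterable, List


def restart_points(results: Iterable[bool], failure_threshold: int = 3) -> List[int]:
    """Return the probe indexes at which a restart would be triggered.

    results: probe outcomes in order (True = success, False = failure).
    A restart fires once `failure_threshold` failures occur in a row;
    any success resets the count.
    """
    if failure_threshold < 1:
        raise ValueError("failure_threshold must be at least 1")
    restarts: List[int] = []
    consecutive = 0
    for index, ok in enumerate(results):
        if ok:
            consecutive = 0
            continue
        consecutive += 1
        if consecutive >= failure_threshold:
            restarts.append(index)
            consecutive = 0  # the restarted container starts with a clean count
    return restarts


# A transient blip of two failures never reaches the threshold.
transient = restart_points([True, False, False, True, True])

# A condition that a restart cannot fix keeps failing, so restarts repeat.
persistent = restart_points([False] * 7)
```

Here `transient` is an empty list, while `persistent` contains two restart points (indexes 2 and 5). The second case is the restart loop described above: the probe keeps failing after each restart because the restart never addressed the cause.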
Startup probe
Startup probes handle slow-starting applications. The probe runs only at startup, and the liveness and readiness probes are held off until it succeeds.
- Slow-starting apps: Applications that take significant time to start benefit from startup probes. The startup probe gives them that time; liveness checks begin only after startup completes.
- Replaces initial liveness checks: Without a startup probe, a slow-starting application needs a long initialDelaySeconds on its liveness probe, sized for the worst case. A startup probe is cleaner because it ends as soon as the application is up.
- Long startup window: A startup probe can allow a long window (5 minutes or more). The window is set by failureThreshold multiplied by periodSeconds, not by the per-attempt timeoutSeconds. If the window is exhausted, the container is restarted.
- Switches to liveness after pass: Once the startup probe succeeds, it does not run again for that container, and the liveness and readiness probes take over for normal monitoring.
- Reduces startup-related restart loops: Startup probes prevent the common failure where liveness fires before startup completes and kills a container that was still initializing.
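The startup window mentioned above is simple arithmetic: roughly failureThreshold multiplied by periodSeconds. The sketch below works through that calculation and assembles the matching probe settings as a plain dictionary. The helper names, the `/healthz` path, and port 8080 are assumptions for illustration; the dictionary keys mirror Kubernetes probe field names, and the default values (a 10-second period, a liveness threshold of 3) mirror Kubernetes defaults.

```python
import math
from typing import Any, Dict


def startup_budget_seconds(failure_threshold: int, period_seconds: int) -> int:
    """Approximate time a container has to start before it is restarted."""
    return failure_threshold * period_seconds


def min_failure_threshold(worst_case_startup_seconds: float, period_seconds: int) -> int:
    """Smallest failureThreshold whose budget covers the worst-case startup."""
    if period_seconds < 1:
        raise ValueError("period_seconds must be at least 1")
    return max(1, math.ceil(worst_case_startup_seconds / period_seconds))


def build_probes(
    path: str,
    port: int,
    worst_case_startup_seconds: float,
    period_seconds: int = 10,
) -> Dict[str, Any]:
    """Build startup and liveness probe settings that share one endpoint."""
    http_get = {"path": path, "port": port}
    return {
        # Generous threshold: only has to cover the slow startup phase.
        "startupProbe": {
            "httpGet": dict(http_get),
            "periodSeconds": period_seconds,
            "failureThreshold": min_failure_threshold(
                worst_case_startup_seconds, period_seconds
            ),
        },
        # Tight threshold: takes effect only after the startup probe passes.
        "livenessProbe": {
            "httpGet": dict(http_get),
            "periodSeconds": period_seconds,
            "failureThreshold": 3,
        },
    }


# Hypothetical application that may need up to 5 minutes to start.
probes = build_probes("/healthz", 8080, worst_case_startup_seconds=300)
budget = startup_budget_seconds(
    probes["startupProbe"]["failureThreshold"],
    probes["startupProbe"]["periodSeconds"],
)
```

With a 10-second period, a 300-second worst case needs a `failureThreshold` of 30, giving a 300-second budget. The liveness probe keeps its short threshold, so a hang after startup is still detected within about 30 seconds rather than 5 minutes.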
Readiness vs liveness is one of those Kubernetes pod-design distinctions that pays off once understood. Nova AI Ops integrates with cluster pod telemetry, surfaces probe-related patterns, and supports the team's probe design.