kubectl Cheats for Incident Response
20 kubectl one-liners for incident response. Each with a real use case and what it catches.
Describe and get
describe is the first investigation tool because it surfaces events that a plain kubectl get quietly hides. CrashLoopBackOff, ImagePullBackOff, FailedScheduling: the explanation is in the event stream, not the pod status. Reach for describe before logs, before exec, before anything else.
- kubectl describe pod $p -n prod. Full event log plus current state per pod. Reveals image-pull and scheduling failures the status field omits.
- Describe-first order on CrashLoop or Pending. Diagnostic order matters. Skipping describe means missing the FailedScheduling event five minutes longer.
- kubectl get events -n prod --sort-by=.lastTimestamp. Namespace-wide event stream sorted newest-last. Catches signals across pods that per-pod describe misses.
- Structured -o yaml dump. Full spec and status as YAML. Captures the snapshot for postmortem reconstruction.
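The describe, events, and YAML steps above can be bundled into one capture so the evidence survives pod restarts. This is a minimal sketch, not a standard tool: snapshot_pod and the output directory layout are hypothetical names chosen for illustration, and the prod default simply mirrors the examples above.

```shell
# Sketch: capture describe, events, and YAML for one pod into a directory.
# snapshot_pod and the file names are hypothetical, not part of kubectl.
snapshot_pod() (
  pod="$1"
  ns="${2:-prod}"
  out="${3:-incident-$(date -u +%Y%m%dT%H%M%SZ)}"
  mkdir -p "$out" || exit 1

  # Describe first: events plus current state for the pod.
  kubectl describe pod "$pod" -n "$ns" > "$out/describe.txt"
  # Namespace event stream, newest last.
  kubectl get events -n "$ns" --sort-by=.lastTimestamp > "$out/events.txt"
  # Full spec and status for postmortem reconstruction.
  kubectl get pod "$pod" -n "$ns" -o yaml > "$out/pod.yaml"

  # Print the directory so the caller can attach it to the incident.
  echo "$out"
)
```

Calling snapshot_pod "$p" prod writes three files and prints the directory name, which can be pasted into the incident channel alongside the timeline.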
Logs
Logs is the daily debugging surface and the four switches below cover most incident-response use. Follow, scope by time, read the previous container after a crash, aggregate across replicas.
- kubectl logs -f deployment/api -n prod. Live tail for the deployment. Standard incident view. By default kubectl picks one pod from the deployment rather than tailing every replica, so use the label selector in the last item when you need all of them.
- --previous. Prior-container logs after a crash. Captures the moment-of-crash output that the live container has already lost.
- --since=1h. Time-bounded scope. Cuts noise during long incidents.
- kubectl logs -l app=api --tail=100. Multi-pod aggregate by label. Supports multi-replica investigation in one command.
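The switches above combine naturally into one pass over every replica. This is a hedged sketch: collect_logs is a hypothetical helper, app=api and prod are placeholders taken from the examples, and the 100-line tail is an arbitrary choice. It reads previous-container logs pod by pod, since only pods that have restarted will have any.

```shell
# Sketch: gather current and previous-container logs for one label selector.
# collect_logs is a hypothetical helper, not a kubectl subcommand.
collect_logs() (
  selector="$1"
  ns="${2:-prod}"
  since="${3:-1h}"

  echo "== current containers, last $since =="
  # --prefix tags each line with its pod and container name.
  kubectl logs -l "$selector" -n "$ns" --since="$since" --tail=100 --prefix

  echo "== previous containers =="
  # -o name yields entries like pod/api-1, which kubectl logs accepts directly.
  for pod in $(kubectl get pods -l "$selector" -n "$ns" -o name); do
    kubectl logs "$pod" -n "$ns" --previous --tail=100 \
      || echo "no previous container logs for $pod"
  done
)
```

For example, collect_logs app=api prod 30m prints the last half hour across replicas, followed by whatever each restarted container logged before it died.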
Exec carefully
exec is the last-resort tool. Useful when nothing else gives a clear answer, but it leaves no audit trail by default and any side effects are invisible to the next responder. Use sparingly and document every invocation.
- kubectl exec -it pod/$p -n prod -- /bin/sh. Interactive shell into the pod. Investigation of last resort.
- Exec-cost rule per incident. Production exec leaves no audit trail without extra setup. Default-deny in your head before reaching for it.
- Document every exec. Capture "I ran X for reason Y" in the incident channel. Compliance posture and future investigation both need the breadcrumb.
- Cluster-level audit-policy hook. Kubernetes API server audit logging, enabled through a cluster-level audit policy. Catches "who exec'd into prod when" after the fact even if the operator forgot to document.
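One way to make "document every exec" hard to skip is a wrapper that refuses to run without a reason and records it first. This is a sketch under assumptions: logged_exec and the EXEC_LOG variable are hypothetical names, and the local log file stands in for wherever your team keeps the incident record. It complements cluster-side audit logging rather than replacing it.

```shell
# Sketch: require a reason before exec and record who, what, and why.
# logged_exec and EXEC_LOG are hypothetical names, not kubectl features.
logged_exec() (
  pod="$1"; ns="$2"; reason="$3"
  if [ -z "$pod" ] || [ -z "$ns" ] || [ -z "$reason" ]; then
    echo "usage: logged_exec POD NAMESPACE REASON [COMMAND...]" >&2
    exit 2
  fi
  shift 3
  # Default to an interactive shell when no command is given.
  [ "$#" -gt 0 ] || set -- /bin/sh

  log="${EXEC_LOG:-./exec-audit.log}"
  # Write the breadcrumb before running anything inside the pod.
  printf '%s user=%s pod=%s ns=%s reason="%s" cmd="%s"\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "${USER:-unknown}" \
    "$pod" "$ns" "$reason" "$*" >> "$log"

  kubectl exec -it "pod/$pod" -n "$ns" -- "$@"
)
```

A call such as logged_exec "$p" prod "inspect DNS config" cat /etc/resolv.conf appends one line to the log and then runs the command; omitting the reason exits with a usage error before kubectl is touched.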