The Degraded-Mode Runbook

When the system can't fully serve, what's the safe partial mode? The runbook that defines.

Define modes

Degraded modes are the discipline of choosing partial-but-safe over total failure. Each mode has an explicit feature scope so on-call can read the document at 3am and know exactly what is on, what is off, and what the user-visible difference is.

Triggers

Triggers are explicit and signal-driven. Auto-trigger where the math is clean; manual escalation where the signal is ambiguous and judgment is required.

Recover

Recovery is the matching discipline. Documented criteria, auto-recover where safe, human approval where recovery has its own risk.