Incident Management
Practical
By Samson Tanimawo, PhD
Published May 12, 2026
4 min read
The Degraded-Mode Runbook
When the system can't fully serve, what's the safe partial mode? This runbook defines it.
Define modes
Full: all features. Degraded: read-only. Minimal: cached responses only.
Define the modes per service. Document each one and test it before an incident forces it.
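One way to make the mode definitions testable is to write them down as data. The sketch below assumes three hypothetical operation names (`read`, `write`, `cached_read`) and a hypothetical `is_allowed` helper; a real service would substitute its own operations and keep one such table per service.

```python
from enum import Enum


class Mode(Enum):
    FULL = "full"          # all features
    DEGRADED = "degraded"  # read-only
    MINIMAL = "minimal"    # cached responses only


# Hypothetical operation names, chosen for illustration only.
ALLOWED_OPERATIONS = {
    Mode.FULL: {"read", "write", "cached_read"},
    Mode.DEGRADED: {"read", "cached_read"},
    Mode.MINIMAL: {"cached_read"},
}


def is_allowed(mode: Mode, operation: str) -> bool:
    """Return True if the operation may be served in the given mode."""
    return operation in ALLOWED_OPERATIONS[mode]
```

Because the table is plain data, a unit test can assert that, for example, writes are rejected in Degraded mode, which covers the "tested" half of the requirement.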
Triggers
Specific, named signals push a service into each mode.
Auto-trigger where possible; escalate manually when automation can't decide.
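A trigger rule can be sketched as a pure function from signals to a target mode. Everything named here is an assumption for illustration: the `Signals` fields, the 5% write-error threshold, and the `manual_override` parameter that stands in for a human escalation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signals:
    # Hypothetical signals; a real service would pick its own.
    db_reachable: bool
    write_error_rate: float  # fraction of failed writes, 0.0 to 1.0


# Assumed threshold for illustration, not a recommendation.
WRITE_ERROR_THRESHOLD = 0.05


def target_mode(signals: Signals, manual_override: Optional[str] = None) -> str:
    """Pick 'full', 'degraded', or 'minimal' from the current signals."""
    # A manual escalation always wins over the automatic rules.
    if manual_override is not None:
        return manual_override
    # No database at all: only cached responses are safe.
    if not signals.db_reachable:
        return "minimal"
    # Writes are failing too often: fall back to read-only.
    if signals.write_error_rate > WRITE_ERROR_THRESHOLD:
        return "degraded"
    return "full"
```

Keeping the rule free of side effects means it can be replayed against recorded signals from past incidents to check that it would have chosen the intended mode.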
Recover
Document the recovery criteria in the form 'When X is true for Y minutes, recover.'
Auto-recover where safe; otherwise require human approval.
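The 'true for Y minutes' rule amounts to a hold timer that resets whenever the health check fails. The sketch below is a minimal illustration; the `RecoveryGate` class, its return values, and the choice to pass timestamps in explicitly are all assumptions made for the example.

```python
from typing import Optional


class RecoveryGate:
    """Allow recovery only after the health check holds for a full window."""

    def __init__(self, hold_seconds: float, auto_recover: bool = True):
        self.hold_seconds = hold_seconds
        self.auto_recover = auto_recover
        self._healthy_since: Optional[float] = None

    def observe(self, healthy: bool, now: float) -> str:
        """Return 'stay', 'recover', or 'needs_approval'."""
        if not healthy:
            # Any unhealthy observation resets the hold timer.
            self._healthy_since = None
            return "stay"
        if self._healthy_since is None:
            self._healthy_since = now
        if now - self._healthy_since < self.hold_seconds:
            return "stay"
        # The criterion has held long enough; recover or ask a human.
        return "recover" if self.auto_recover else "needs_approval"
```

For example, `RecoveryGate(hold_seconds=300)` returns `"stay"` until the check has been healthy for five consecutive minutes, and a gate built with `auto_recover=False` returns `"needs_approval"` at that point instead of recovering on its own.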