Incident Management
Practical
By Samson Tanimawo, PhD
Published Mar 28, 2026
The Degraded-Mode Recovery Runbook
Recovering from degraded mode needs its own runbook. These are the steps that prevent re-degradation.
Verify root cause fixed
Do not begin recovery until the root cause is fixed. Recovery layered on top of an unfixed root cause fails again quickly.
The verification step is mandatory: confirm the fix before restoring anything.
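One way to make the verification step hard to skip is to put it in code as a gate. The sketch below is illustrative only: the names `verify_root_cause_fixed` and `may_start_recovery`, and the idea of expressing each check as a callable that returns True or False, are assumptions for this example and not part of any particular tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    name: str
    passed: bool


def verify_root_cause_fixed(checks: Dict[str, Callable[[], bool]]) -> List[CheckResult]:
    """Run every named check and record whether it passed.

    A check that raises is recorded as failed, so a broken probe
    can never be mistaken for a confirmed fix.
    """
    results = []
    for name, check in checks.items():
        try:
            passed = bool(check())
        except Exception:
            passed = False
        results.append(CheckResult(name, passed))
    return results


def may_start_recovery(results: List[CheckResult]) -> bool:
    """Allow recovery only if at least one check ran and all of them passed."""
    return bool(results) and all(r.passed for r in results)
```

An empty set of checks is treated as "not verified", so the gate stays closed when nobody has defined what a fixed root cause looks like.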
Staged recovery
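Restore one feature at a time, and watch the metrics between each restoration.
Staging catches partial-failure modes, because a regression can be traced to the feature that was just restored.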
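The staged approach can be sketched as a loop that restores one feature, waits, checks health, and stops at the first regression. Everything here is hypothetical: `enable`, `disable`, and `healthy` stand in for whatever feature-flag and metrics tooling is actually in use, and `soak_seconds` is an assumed name for the observation window between steps.

```python
import time
from typing import Callable, List, Sequence


def staged_recovery(
    features: Sequence[str],
    enable: Callable[[str], None],
    disable: Callable[[str], None],
    healthy: Callable[[], bool],
    soak_seconds: float = 0.0,
) -> List[str]:
    """Restore features one at a time, in order.

    After each feature is enabled, wait soak_seconds and then ask
    healthy(). On the first unhealthy result, disable the feature that
    was just enabled and stop. Returns the features left restored.
    """
    restored: List[str] = []
    for feature in features:
        enable(feature)
        time.sleep(soak_seconds)  # let metrics settle before judging
        if not healthy():
            disable(feature)  # roll back only the most recent step
            break
        restored.append(feature)
    return restored
```

Stopping at the first regression, instead of continuing with the remaining features, keeps the failure attributable to a single change.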
Comms
Post a status update as each feature comes back.
Customers then see incremental improvement instead of silence until full recovery.
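Incremental updates are easier to keep consistent when they are generated from the recovery state. The following is a minimal sketch, assuming the runbook tracks a list of restored features and a list of features still pending; the function name `status_update` and the message wording are illustrative, not a prescribed format.

```python
from typing import Sequence


def status_update(restored: Sequence[str], pending: Sequence[str]) -> str:
    """Build a short customer-facing update from the current recovery state."""
    total = len(restored) + len(pending)
    if total == 0:
        raise ValueError("no features to report on")

    if not pending:
        headline = "Resolved: all features restored"
    elif not restored:
        headline = "Degraded: recovery starting"
    else:
        headline = f"Recovering: {len(restored)} of {total} features restored"

    lines = [headline]
    if restored:
        lines.append("Restored: " + ", ".join(restored))
    if pending:
        lines.append("Still degraded: " + ", ".join(pending))
    return "\n".join(lines)
```

Calling it after each successful restoration step yields one update per feature, which matches the cadence described above.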