On-Call Game-Day Rehearsals: Practice for Real Incidents
Game days are how teams stay sharp between real incidents. Cadence and structure determine whether they pay off.
Why game days
Real incidents are infrequent (good); game days build muscle memory between them (better).
Without rehearsal, responders meet the tools, runbooks, and escalation paths cold, and real-incident response is slower for it.
Four tiers, four cadences
- Tabletop: walk through a scenario; quarterly.
- Limited blast radius: staging environment; monthly.
- Production game day: controlled production failure; semi-annually.
- Surprise drill: unannounced test; annually.
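The four cadences above can be written down as data, which makes the yearly load easy to check. This is a minimal sketch, not a prescribed tool: the `Tier` class, the `due_in_month` helper, and the rule that a tier runs in every month divisible by its interval are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    interval_months: int  # months between sessions, taken from the list above


TIERS = [
    Tier("tabletop", 3),              # quarterly
    Tier("limited blast radius", 1),  # monthly
    Tier("production game day", 6),   # semi-annually
    Tier("surprise drill", 12),       # annually
]


def sessions_per_year(tier: Tier) -> int:
    """Number of sessions a tier contributes in a 12-month year."""
    return 12 // tier.interval_months


def due_in_month(month: int) -> list[str]:
    """Tiers due in a given month (1-12).

    Assumption: a tier runs in every month divisible by its interval,
    so month 12 carries all four tiers.
    """
    return [t.name for t in TIERS if month % t.interval_months == 0]


total = sum(sessions_per_year(t) for t in TIERS)
print(total)            # 4 + 12 + 2 + 1 sessions
print(due_in_month(6))  # tabletop, limited blast radius, production game day
```

Under these assumptions the schedule adds up to 19 sessions a year, most of them low-cost staging exercises. In practice the surprise drill would not sit on a predictable month; the fixed slot here only counts it.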
Scenario library
Keep 10-15 scenarios in a library and rotate through them.
Scenarios should mirror common past incidents, so the team builds confidence on familiar territory.
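One way to build and rotate such a library is to rank past incident categories by frequency and walk the list round-robin. The sketch below assumes incidents are already tagged with a category; the function names and the sample categories are hypothetical.

```python
from collections import Counter


def build_library(past_incident_categories: list[str], max_size: int = 15) -> list[str]:
    """Return the most common incident categories, most frequent first."""
    counts = Counter(past_incident_categories)
    return [category for category, _ in counts.most_common(max_size)]


def scenario_for_session(library: list[str], session_index: int) -> str:
    """Pick a scenario round-robin so every entry is used before any repeats."""
    if not library:
        raise ValueError("scenario library is empty")
    return library[session_index % len(library)]


# Hypothetical incident history, one category per past incident.
history = [
    "db failover", "expired certificate", "db failover",
    "full disk", "db failover", "expired certificate",
]
library = build_library(history)
for session in range(4):
    print(session, scenario_for_session(library, session))
```

Round-robin is the simplest rotation that guarantees coverage. A team could weight the rotation toward the most frequent categories instead, at the cost of rarely rehearsing the tail.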
Action items
Each game day should produce 2-5 action items, such as a documentation update, a runbook fix, or a tooling improvement.
Track the action items and close them, so the next game day starts from data on what was fixed and what is still open.
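Tracking can be as light as a list of records plus a few checks run before the next session. The following is a sketch under assumed names (`ActionItem`, `closure_rate`, `carry_over`); the three kinds come from the text above, and the sample items are hypothetical.

```python
from dataclasses import dataclass

# The three kinds of action item named in the text.
KINDS = {"documentation", "runbook", "tooling"}


@dataclass
class ActionItem:
    description: str
    kind: str
    closed: bool = False

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind: {self.kind}")


def count_in_range(items: list[ActionItem]) -> bool:
    """True if the game day produced the suggested 2-5 action items."""
    return 2 <= len(items) <= 5


def closure_rate(items: list[ActionItem]) -> float:
    """Fraction of items closed; 0.0 when there are no items."""
    if not items:
        return 0.0
    return sum(item.closed for item in items) / len(items)


def carry_over(items: list[ActionItem]) -> list[ActionItem]:
    """Open items that the next game day should start from."""
    return [item for item in items if not item.closed]


# Hypothetical output of one game day.
items = [
    ActionItem("Add failover steps to the database runbook", "runbook", closed=True),
    ActionItem("Document the paging escalation path", "documentation"),
    ActionItem("Add a dashboard link to the alert template", "tooling"),
]
print(count_in_range(items), closure_rate(items), len(carry_over(items)))
```

A closure rate reviewed at the start of each session is the data the next game day starts from: a low rate suggests fixing the backlog before adding new scenarios.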
Antipatterns
- Game day without action items. Theatre.
- Production game day without rehearsal at lower tiers. High risk.
- One game day annually. Memory fades.
What to do this week
Three moves. (1) Schedule one game day, starting at the tabletop tier, during your next on-call rotation. (2) Survey the team after one cycle. (3) Iterate based on the feedback. The discipline is the cadence.