The On-Call Prep Checklist Before Shift
5-minute prep before going on-call. The checklist that loads context.
Pre-shift review (5 minutes)
The pre-shift review takes 5 minutes and prevents inheriting blind. Active incidents in flight (read the incident channels and understand current state); recent deploys in the last 4 hours because most pages tie to recent changes; open alerts in ‘warning’ state that haven’t fired but often become real pages during the shift.
- Active incidents in flight. Read the incident channels; understand current state; don’t inherit blind.
- Recent deploys (4-hour window). Most pages tie to recent changes; knowing what shipped lets you correlate fast.
- Open warnings. Alerts in warning state often become real pages during your shift.
- Per-shift review record. The review captured in the on-call notes; supports the handoff.
Acknowledge with the outgoing on-call
The handoff is synchronous. “I’m up to speed on these incidents and recent deploys. Taking over.” Outgoing on-call confirms acknowledgement; until then, they are responsible. Five-minute conversation if anything is weird (recent flapping alerts, pending changes, stakeholder concerns) so tribal knowledge transfers verbally.
- Synchronous handoff in chat. “Up to speed, taking over”; the explicit transition.
- Outgoing confirms. Until confirmation, the outgoing on-call is responsible; the handoff is bilateral.
- Five-minute weird-stuff conversation. Flapping alerts, pending changes, stakeholder concerns; tribal knowledge transfers.
- Per-handoff documented record. The handoff captured in chat; supports later review.
Tools loaded and tested
The tooling check is non-negotiable. Pager app open with notifications enabled, phone DND off for the on-call number, test with a synthetic page if uncertain; dashboards bookmarked (service health, deploy log, on-call context) for one click from page to context; permissions tested in non-prod or with a dry-run because finding broken permissions during a real incident is the wrong place to discover.
- Pager app and notifications. Phone DND off for the on-call number; test with a synthetic page if uncertain.
- Dashboards bookmarked. Service health, deploy log, on-call context dashboard; one click from page to context.
- Permissions tested. Test runbook commands in non-prod or with dry-run; not during real incident.
- Per-shift smoke test. Pager test, dashboard load, permissions check; the readiness loop.
Personal readiness
Personal readiness matters. Home network and laptop charged with a backup hotspot if home wifi is flaky; calendar cleared of non-essential meetings during the shift because on-call is the priority; family informed because the pager may go off at 3 AM.
- Home network plus backup. Laptop charged; backup hotspot available if home wifi is flaky.
- Calendar cleared. Non-essential meetings dropped; on-call is the priority during the shift.
- Family informed. They know you’re on-call; the pager may go off at 3 AM.
- Per-shift readiness checklist. The personal items run before each shift; supports the discipline.
Outgoing actions
Handing off well closes the loop. When your shift ends, do the same checklist for the incoming on-call (pay forward the discipline); document anything weird that happened during your shift because the next on-call benefits from context they didn’t observe; update runbooks where you wished they had been clearer because the compounding effect makes the next shift easier.
- Mirror checklist for incoming. Same pre-shift checklist for the next on-call; pay forward the discipline.
- Document weird stuff. The next on-call benefits from context they didn’t observe.
- Update runbooks. Where you wished they had been clearer; the compounding effect makes the next shift easier.
- Per-shift handoff log. Documented at end of shift; supports continuous improvement of on-call quality.