The On-Call Handoff Checklist That Saves Incidents
A 60-second handoff that keeps the incoming on-call from inheriting an incident blindly. Six items, in order, with examples of what each catches.
The six items
1. Currently active incidents and their owner. The on-call coming on must know what is in flight before any ticket lands.
2. Recent deploys in the last 4 hours. The previous shift's deploy is this shift's regression; knowing what shipped keeps the new on-call from rediscovering it in the middle of an investigation.
3. Open changes in flight (feature flags rolling, scheduled maintenance, partial rollouts). The on-call inherits the change context.
4. Known flapping alerts that can be ignored. Without this, the new on-call investigates noise that the previous on-call had context to dismiss.
5. Anyone you should NOT page (vacation, sick, unavailable). Handoff failures often come from paging the wrong people during the new shift.
6. Anything weird that did not become an incident but might. Latent signals matter; the previous on-call has fresh intuition the new one does not.
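The six items above can be captured as a fixed-order record so that a skipped item is visible instead of silently missing. This is a minimal sketch, not a prescribed tool: the field names, the `Handoff` class, and the helper functions are all hypothetical, and the only values taken from the checklist are the item order and the 4-hour deploy window.

```python
from dataclasses import dataclass, fields
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

DEPLOY_WINDOW = timedelta(hours=4)  # item 2: "the last 4 hours"


@dataclass
class Handoff:
    # Field order mirrors the checklist order (hypothetical names).
    # None means "not filled in"; an empty list means "checked, nothing to report".
    active_incidents: Optional[List[str]] = None
    recent_deploys: Optional[List[str]] = None
    changes_in_flight: Optional[List[str]] = None
    ignorable_alerts: Optional[List[str]] = None
    do_not_page: Optional[List[str]] = None
    weird_signals: Optional[List[str]] = None


def deploys_in_window(deploys: List[Tuple[str, datetime]], now: datetime) -> List[str]:
    """Return names of deploys that happened within the last 4 hours."""
    return [name for name, ts in deploys
            if timedelta(0) <= now - ts <= DEPLOY_WINDOW]


def missing_items(h: Handoff) -> List[str]:
    """List checklist items the outgoing on-call has not filled in."""
    return [f.name for f in fields(h) if getattr(h, f.name) is None]


def render(h: Handoff) -> str:
    """Render the handoff as six numbered lines; refuse if any item is missing."""
    missing = missing_items(h)
    if missing:
        raise ValueError(f"handoff incomplete: {missing}")
    lines = []
    for i, f in enumerate(fields(h), start=1):
        items = getattr(h, f.name)
        lines.append(f"{i}. {f.name}: {', '.join(items) if items else 'none'}")
    return "\n".join(lines)
```

Distinguishing None from an empty list is a deliberate choice in this sketch: "no flapping alerts" is information, while a blank item 4 is a handoff that has not happened yet.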
When to do the handoff
5 minutes before the shift change. Long enough to cover the items, short enough to fit the calendar.
Synchronously, in the on-call channel. Async handoff loses the urgency of items 4 and 6.
The outgoing on-call is on the hook until the handoff is acknowledged. No 'I emailed you' shortcuts.
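The timing and ownership rules above are simple enough to write down as two functions. This is a sketch under stated assumptions: the function names and the `acked_at` timestamp are hypothetical, and how an acknowledgement is actually recorded (a reply in the channel, a reaction, a button) is left open.

```python
from datetime import datetime, timedelta
from typing import Optional

HANDOFF_LEAD = timedelta(minutes=5)  # "5 minutes before the shift change"


def handoff_start(shift_change: datetime) -> datetime:
    """When the synchronous handoff should begin."""
    return shift_change - HANDOFF_LEAD


def responsible_oncall(outgoing: str, incoming: str,
                       acked_at: Optional[datetime], now: datetime) -> str:
    """The outgoing on-call stays on the hook until the handoff is acknowledged."""
    if acked_at is not None and acked_at <= now:
        return incoming
    return outgoing
```

Note that responsibility here follows the acknowledgement, not the clock: if the shift change passes with no acknowledgement, `responsible_oncall` still returns the outgoing person.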
Make it a ritual
The same six items, the same order, every shift. Predictable. No deviation based on who is on-call.
If the channel does not have an outgoing handoff message, page the outgoing on-call. The ritual is mandatory.
Ritual catches the cases where someone is too tired to remember. Cognitive offload is the point.
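The mandatory part of the ritual, paging the outgoing on-call when no handoff message exists, can be expressed as a single check. A hedged sketch: `should_page_outgoing` and its parameters are hypothetical, and it assumes the check fires at or after the shift change, which the rule itself does not specify.

```python
from datetime import datetime
from typing import Optional


def should_page_outgoing(handoff_posted_at: Optional[datetime],
                         shift_change: datetime,
                         now: datetime) -> bool:
    """True when the shift has changed and no handoff message is in the channel.

    handoff_posted_at is None when no outgoing handoff message was found.
    Assumption: nothing is paged before the shift change itself.
    """
    return now >= shift_change and handoff_posted_at is None
```

A scheduled job or chat bot could run this check at each shift boundary; wiring it to a real paging system is outside the scope of this sketch.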