Handing Off an Active Incident at Shift Change
Shift change during a hot incident is one of the most error-prone moments in operations. This piece covers the 15-minute overlap, the IC swap protocol, and the mistake of cold handoffs.
The error-prone moment
The departing team is tired. The arriving team is cold. The customer-facing situation is unresolved. The handoff happens in 30 seconds because everyone wants to leave. Two hours later the new team realises a key fact about the incident is missing: it was in the departing IC's head and never got written down.
The structural problem. Incident context is dense and partly tacit. The departing team has built a mental model over 4-8 hours of bridge work; the model includes timeline, theories ruled out, customer commitments, key engineers' current focus, and a hundred other things. Compressing this into a 30-second written handoff loses 80% of the model. The new team operates on the surviving 20% and re-derives the rest, slowly, while customer impact continues.
The fix is to invest in the handoff with the same rigour as the incident itself. The 15-minute overlap rule, the formal IC swap, and the running document aren't ceremony; they're how you preserve enough of the mental model to keep the response coherent across shifts.
The 15-minute overlap rule
Both teams stay on the bridge for 15 minutes. Not 5. The overlap gives the new team time to ask the questions that occur to them, which are different from the questions the old team thought to ask.
The mechanism. The departing team has built a mental model that has converged on certain assumptions. The new team comes in without those assumptions; they ask questions like "wait, did you check X?" that the departing team didn't think to mention because X had been ruled out by silent inference 90 minutes ago. Each question reveals a piece of the model that wasn't written down.
The 15 minutes is also a humanising buffer. Bridge culture is intense; switching from "I'm running this incident" to "I just left" in 30 seconds produces psychological residue (engineers continue to think about the incident for hours after, even when they're off-shift). The overlap creates a real exit ramp; the departing team gets to mentally close the loop before they go.
Formal IC swap
The current IC says: "I'm handing IC to NAME. NAME, do you accept?" The new IC says yes out loud. The bridge topic is updated. The old IC stays as a debugger or rests; the new IC owns coordination from this moment.
The verbal acceptance matters. Without it, both ICs hover ambiguously: the old IC continues to make decisions out of habit, and the new IC defers because they don't feel ownership yet. The bridge has two heads, which is worse than having one. The verbal "I accept" creates a clean transfer of authority.
The post-swap behaviour. The departing IC stops giving orders and starts answering questions. The new IC starts giving orders and asking questions. The role flip is visible to everyone on the bridge, which prevents the senior-engineer-by-default phenomenon where the bridge keeps deferring to the departing IC because they've been driving for hours.
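As a rough illustration, the swap can be modelled as a two-step transition in which authority moves only on an explicit acceptance. This is a minimal Python sketch, not an integration with any real chat or paging tool; the `Bridge` class, its fields, and the topic string format are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Bridge:
    """Hypothetical model of an incident bridge (illustration only)."""
    ic: str
    topic: str = ""
    pending_successor: Optional[str] = None
    log: List[str] = field(default_factory=list)

    def offer(self, current_ic: str, successor: str) -> None:
        # Only the current IC can offer the role.
        if current_ic != self.ic:
            raise PermissionError(f"{current_ic} is not the current IC")
        self.pending_successor = successor
        self.log.append(
            f"{current_ic}: I'm handing IC to {successor}. {successor}, do you accept?"
        )

    def accept(self, successor: str) -> None:
        # Authority moves only when the named successor accepts out loud.
        if successor != self.pending_successor:
            raise ValueError(f"no pending offer for {successor}")
        self.log.append(f"{successor}: I accept.")
        self.ic = successor
        self.pending_successor = None
        self.topic = f"IC: {successor}"  # assumed topic format


bridge = Bridge(ic="alice")
bridge.offer("alice", "bob")   # alice is still IC at this point
bridge.accept("bob")           # ownership transfers here, not before
```

The point of the two steps is that there is never a moment with two ICs or none: until `accept` runs, the old IC still owns coordination.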
The running doc
The IC has been keeping a running summary in the channel. Before swapping, the departing IC posts a final consolidated summary: timeline, theories ruled out, current hypothesis, next step. The new IC reads it before saying "I accept."
The format of the running doc. Sections: Symptom (what customers experience right now), Timeline (key events with timestamps), Theories ruled out (with brief reasoning), Current theory (the one being tested), Current action (who's doing what), Customer commitments (next update due to customers at HH:MM), Open questions. Each section is short; the doc is comprehensive without being long.
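One way to make that format explicit is to write it down as a small data structure. The sketch below is in Python and mirrors the sections listed above; the class name `RunningDoc`, the field names, and the rendering layout are assumptions for illustration, not a prescribed template.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RunningDoc:
    """Hypothetical container for the running-doc sections named in the text."""
    symptom: str = ""
    timeline: List[Tuple[str, str]] = field(default_factory=list)            # (HH:MM, event)
    theories_ruled_out: List[Tuple[str, str]] = field(default_factory=list)  # (theory, reasoning)
    current_theory: str = ""
    current_action: Dict[str, str] = field(default_factory=dict)             # person -> task
    next_customer_update: str = ""                                           # HH:MM
    open_questions: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the doc as short plain text, one block per section."""
        lines = [
            f"Symptom: {self.symptom}",
            "Timeline:",
            *[f"  {ts} {event}" for ts, event in self.timeline],
            "Theories ruled out:",
            *[f"  {theory} ({why})" for theory, why in self.theories_ruled_out],
            f"Current theory: {self.current_theory}",
            "Current action:",
            *[f"  {who}: {task}" for who, task in self.current_action.items()],
            f"Customer commitments: next update due at {self.next_customer_update}",
            "Open questions:",
            *[f"  {q}" for q in self.open_questions],
        ]
        return "\n".join(lines)
```

Posting `doc.render()` to the channel at each status cycle keeps the section order fixed, so an arriving IC always knows where to look.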
The discipline. The IC updates the doc at every 10-minute status cycle. Without the discipline, the doc gets stale; with it, the doc IS the handoff document: the departing IC doesn't have to write a new summary at handoff time, they just need to confirm the doc is current.
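"Current" can be checked mechanically. The following Python sketch treats the doc as stale once more than one status cycle has passed since its last update; the 10-minute cycle comes from the text, while the function names and the exact threshold rule are assumptions made for the example.

```python
from datetime import datetime, timedelta

STATUS_CYCLE = timedelta(minutes=10)  # the 10-minute status cycle from the text


def doc_is_stale(last_updated: datetime, now: datetime,
                 cycle: timedelta = STATUS_CYCLE) -> bool:
    """True if more than one full status cycle has passed since the last update."""
    return now - last_updated > cycle


def doc_can_serve_as_handoff(last_updated: datetime, now: datetime) -> bool:
    """The running doc doubles as the handoff document only while it is current."""
    return not doc_is_stale(last_updated, now)
```

A bot or a pinned reminder could call something like `doc_is_stale` before the swap, so the departing IC finds out the doc has drifted before the new IC starts reading it.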
Cold handoffs
"I'm leaving in 5, can you take over?" The departing IC writes a quick summary; the new IC has questions; the old IC is already on the way home. The new IC inherits a hot bridge with stale context. The next 30 minutes or more are spent recovering what was already known.
The cold handoff is the most common pathology in multi-shift incidents. It happens because the departing engineer's shift is ending and they want to leave; the arriving engineer is logging in and hasn't fully oriented. Both feel like the handoff should be quick. It shouldn't be.
The exit cost of the cold handoff. The new IC takes 30-60 minutes to rebuild the model; during that time, decisions are slower and lower-quality. Customer impact continues. The departing team is also unhappy because they get follow-up questions at home and feel like they didn't actually leave. Both sides pay a cost; the 15-minute overlap eliminates both costs.
When the handoff is really done
The departing IC leaves only when the new IC has run one full status cycle (asked the team, written the consolidated update, posted to customers). If the new IC freezes during the cycle, the departing IC is still there to coach. Once one full cycle is clean, the handoff is real.
The signal that the new IC is ready. They start asking the right questions in the bridge, questions that build on the running doc rather than rebuilding from zero. They start writing the next status update without help. They start naming who's doing what at the next status checkpoint. Each is a signal that the new IC has internalised the model.
What if the new IC isn't ready after 15 minutes? Extend. Stay up to 15 more minutes. The handoff is more important than either engineer's shift ending on time. Most extensions are 5-10 minutes; the rare ones are longer. The cost of a too-fast handoff is dramatically higher than the cost of a 15-minute extension.
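The exit condition for a planned handoff can be sketched as a simple check. This Python example combines the 15-minute overlap, the verbal acceptance, and the one clean status cycle into a single rule; combining them this way is an assumption, and `StatusCycle` and `departing_ic_may_leave` are hypothetical names.

```python
from dataclasses import dataclass

OVERLAP_MINUTES = 15  # minimum overlap for a planned handoff, per the text


@dataclass
class StatusCycle:
    """The three steps of one full status cycle run by the new IC."""
    asked_team: bool = False
    wrote_update: bool = False
    posted_to_customers: bool = False

    def is_clean(self) -> bool:
        return self.asked_team and self.wrote_update and self.posted_to_customers


def departing_ic_may_leave(minutes_overlapped: int,
                           new_ic_accepted: bool,
                           cycle: StatusCycle) -> bool:
    """Planned-handoff rule: overlap served, swap accepted, one clean cycle done."""
    return (
        minutes_overlapped >= OVERLAP_MINUTES
        and new_ic_accepted
        and cycle.is_clean()
    )
```

If the check fails at minute 15 because the cycle is not yet clean, the answer is to extend the overlap and re-check, not to leave anyway.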
Planned handoffs vs emergency handoffs
A planned handoff happens at predictable shift change times (e.g., end of business day for follow-the-sun rotations). An emergency handoff happens when the IC has to leave unexpectedly: illness, family emergency, fatigue.
The planned handoff has time. The 15-minute overlap, formal swap, and running doc are all available. Use them.
The emergency handoff has less time and higher stakes. The protocol: the departing IC names a successor immediately, posts a summary in the channel, and stays on the bridge for as long as physically possible. The IC swap happens with whatever overlap is achievable; if that's only 5 minutes, the new IC inherits a degraded handoff and that's the cost. The team should plan a follow-up handoff once the original IC is available again.
Multi-shift incidents
Some incidents run 12+ hours and require multiple handoffs. Each handoff is an opportunity for context loss; over 3 handoffs, you can lose half the original mental model if discipline isn't strict.
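The compounding is easy to put numbers on. The Python sketch below assumes each handoff keeps the same fraction of the model and that losses multiply; that is a simplification introduced here for illustration, not a measured result, and the function names are hypothetical.

```python
def retained_fraction(per_handoff_retention: float, handoffs: int) -> float:
    """Fraction of the original model left after a number of equal handoffs."""
    return per_handoff_retention ** handoffs


def retention_needed(target_fraction: float, handoffs: int) -> float:
    """Per-handoff retention required to end up with target_fraction overall."""
    return target_fraction ** (1 / handoffs)


# Losing half the model over 3 handoffs corresponds to keeping about 79% each time.
per_handoff = retention_needed(0.5, 3)

# If every handoff were a 30-second cold handoff keeping only 20%,
# under 1% of the original model would survive three of them.
after_three_cold = retained_fraction(0.2, 3)
```

Under these assumptions, even a handoff that feels good (roughly four fifths of the context kept) costs half the model by the third shift.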
The countermeasure. Each handoff updates the running doc; the doc accumulates as the durable record. The third IC reads the doc, not the original IC's memory. The doc is what makes multi-shift incidents tractable.
The other countermeasure. After 12+ hours of incident time, schedule a written debrief with all ICs involved. Not the postmortem (that's later); a meta-debrief specifically about the handoff quality. What context was lost at each handoff? What got added back? Where did decisions repeat themselves because the new IC didn't know the old IC had already considered something? The debrief identifies the handoff failure modes for the team to fix.
What to do this week
Three moves. (1) Document the running-doc format your team uses. Most teams have an implicit format; making it explicit lets new ICs adopt it faster. (2) Practise the formal swap in a tabletop. Have engineers say "I'm handing IC to NAME, NAME do you accept?" out loud. The first time is awkward; practice removes the awkwardness. (3) Pin the 15-minute overlap rule in your on-call channel. The visible rule is what the IC reaches for at 4am when they're tempted to do a 5-minute handoff.