By Samson Tanimawo, PhD. Published Nov 22, 2025.

The On-Call Cool-Down Period

After a major incident, the on-call engineer takes a cool-down. The rest reduces secondary errors.

The cool-down protocol

After a major incident, the on-call gets 30-60 minutes of explicit rest before resuming normal work. The rest is mandatory.

Rest reduces secondary errors. A tired on-call making decisions immediately after a sev 1 is at higher risk of compounding the incident.

Backup on-call covers during cool-down. Not a vacation; an explicit handoff for a bounded window.

When to invoke

After sev 1 incidents. Always.

After sev 2 incidents that ran more than 4 hours. Long durations are draining.

After multiple consecutive sev 3 incidents. The cumulative load is the issue.
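The three triggers above can be written down as a small rule check. This is a minimal sketch, not a prescribed implementation: the `Incident` type, the `needs_cool_down` name, and the `sev3_streak` threshold of 2 are assumptions made for illustration (the text says only "multiple consecutive").

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical record of one incident handled by the on-call."""
    severity: int          # 1 = sev 1, 2 = sev 2, 3 = sev 3
    duration_hours: float


def needs_cool_down(recent: list[Incident], sev3_streak: int = 2) -> bool:
    """Return True if the most recent incident(s) trigger a cool-down.

    `recent` holds the incidents from the current shift, oldest first.
    `sev3_streak` is an assumed threshold for "multiple consecutive" sev 3s.
    """
    if not recent:
        return False

    last = recent[-1]
    if last.severity == 1:
        return True  # sev 1: always
    if last.severity == 2 and last.duration_hours > 4:
        return True  # long-running sev 2

    # Count the unbroken run of sev 3 incidents at the end of the shift.
    streak = 0
    for incident in reversed(recent):
        if incident.severity != 3:
            break
        streak += 1
    return streak >= sev3_streak
```

For example, `needs_cool_down([Incident(2, 5.0)])` evaluates to `True`, while a 3-hour sev 2 on its own does not trigger the rule.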

How long

30 minutes minimum. Longer for severe incidents (multi-hour sev 1, customer-facing data issues).

Up to 2 hours for incidents that involved customer impact or required coordination with leadership.

Half-day for catastrophic incidents (data loss, major outages, security events). The recovery is real.
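The duration tiers above can be sketched as a simple lookup. The function name and the boolean flags are hypothetical, "half-day" is assumed to mean 4 hours, and the middle tier returns the 2-hour ceiling even though the text says "up to"; a team would adjust all three.

```python
from datetime import timedelta


def cool_down_length(
    catastrophic: bool = False,
    customer_impact: bool = False,
    leadership_involved: bool = False,
) -> timedelta:
    """Pick a cool-down length from the tiers described above.

    All parameter names are illustrative assumptions, not an existing API.
    """
    if catastrophic:
        # Data loss, major outage, or security event.
        return timedelta(hours=4)  # assumption: half-day = 4 hours
    if customer_impact or leadership_involved:
        return timedelta(hours=2)  # ceiling of the "up to 2 hours" tier
    return timedelta(minutes=30)   # the stated minimum
```

Checking the most severe tier first means a catastrophic incident with customer impact still gets the half-day, not the shorter window.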

What to do during cool-down

Step away from the keyboard. Walk; eat; rest. Anything except continued incident work.

Brief debrief with the team if it helps process the experience.

Defer the postmortem first draft. The on-call writes the timeline; the analytical work waits until they're rested.

Making it stick

Manager enforcement. Engineers are poor at imposing rest on themselves; managers must require it.

Public norm. The team announces cool-downs openly: 'I'm cooling down for an hour after that sev 1.' This removes the stigma.

Track usage. Cool-downs that aren't taken should be flagged. Engineers powering through is not heroism; it's risk.
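Tracking can be as light as a list of records and a filter. The sketch below assumes a hypothetical `CoolDownRecord` with `required` and `taken` fields; where those records come from (incident tracker, spreadsheet, chat bot) is left open.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CoolDownRecord:
    """Hypothetical log entry: one incident and its cool-down outcome."""
    incident_id: str
    engineer: str
    required: bool   # did the incident trigger a cool-down?
    taken: bool      # did the engineer actually take it?


def skipped_cool_downs(records: list[CoolDownRecord]) -> list[CoolDownRecord]:
    """Return the records where a required cool-down was not taken."""
    return [r for r in records if r.required and not r.taken]


if __name__ == "__main__":
    log = [
        CoolDownRecord("INC-1", "alex", required=True, taken=True),
        CoolDownRecord("INC-2", "sam", required=True, taken=False),
        CoolDownRecord("INC-3", "alex", required=False, taken=False),
    ]
    for record in skipped_cool_downs(log):
        print(f"{record.incident_id}: {record.engineer} skipped a required cool-down")
```

The flagged list is meant for the manager's follow-up, in line with the enforcement point above, not as a penalty report.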