Escalation Policy Design: Three-Tier Pattern
Escalation policies have to be predictable and unambiguous. Three tiers are the minimum that still reaches a human when both the primary and the secondary miss a page.
Why escalation matters
Without escalation: a missed primary page becomes an outage.
With escalation: missed pages reach a human within minutes.
Three-tier pattern
- Tier 1: primary on-call. Paged immediately.
- Tier 2: secondary on-call. Paged after N minutes without an acknowledgement (no-ack).
- Tier 3: manager or broader team. Paged if the secondary also does not acknowledge.
Per-severity timing
SEV1: T1 immediate, T2 +5min, T3 +15min.
SEV2: T1 immediate, T2 +15min, T3 +60min.
Severity decides cadence; a single timing applied to every severity loses signal.
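The timings above can be written down as data, which makes it hard for two people to read them differently. The following is a minimal sketch, not a real paging tool's configuration: the names `ESCALATION_OFFSETS_MIN` and `tiers_paged` are hypothetical, and it assumes each offset is measured from the initial page rather than from the previous tier.

```python
# Hypothetical escalation timings, in minutes after the initial page.
# Values mirror the SEV1/SEV2 rows above. Assumption: offsets are measured
# from the first page, not from the previous tier.
ESCALATION_OFFSETS_MIN = {
    "SEV1": {"T1": 0, "T2": 5, "T3": 15},
    "SEV2": {"T1": 0, "T2": 15, "T3": 60},
}


def tiers_paged(severity: str, minutes_unacked: float) -> list[str]:
    """Return the tiers that should have been paged for a still-unacked page."""
    offsets = ESCALATION_OFFSETS_MIN[severity]
    return [tier for tier, offset in offsets.items() if minutes_unacked >= offset]


print(tiers_paged("SEV1", 6))  # ['T1', 'T2']
print(tiers_paged("SEV2", 6))  # ['T1']
```

Six minutes without an acknowledgement has already reached the secondary for a SEV1 but not for a SEV2, which is the signal a single shared timing would lose.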
No-ack defaults
No-ack default: an unacknowledged page escalates automatically, with no human action required. Nobody has to exercise engineering judgement at 3am.
The policy must escalate without trusting the on-call to manually request it.
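One way to picture the no-ack default is a timer-driven check that escalates on its own and stops only when someone acknowledges. This is a sketch under assumptions: the `Page` record, its field names, and the `tick` function are hypothetical, and offsets are again taken as minutes after the initial page.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Page:
    """A hypothetical page record; field names are illustrative only."""
    offsets_min: dict                      # tier name -> minutes after the initial page
    acked_at_min: Optional[float] = None   # None means nobody has acknowledged yet
    notified: list = field(default_factory=list)


def tick(page: Page, now_min: float) -> list:
    """Run on a timer. Pages every tier whose offset has passed while the page
    is unacknowledged, and returns the tiers newly paged on this tick."""
    if page.acked_at_min is not None and page.acked_at_min <= now_min:
        return []  # acknowledged: escalation stops
    newly = [tier for tier, offset in page.offsets_min.items()
             if offset <= now_min and tier not in page.notified]
    page.notified.extend(newly)
    return newly


page = Page(offsets_min={"T1": 0, "T2": 5, "T3": 15})
print(tick(page, 0))    # ['T1']
print(tick(page, 5))    # ['T2'], with no request from the primary
page.acked_at_min = 7
print(tick(page, 15))   # [], because the page was acknowledged before T3 was due
```

Nothing in `tick` depends on the on-call asking for help; the only input that stops escalation is an acknowledgement.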
Antipatterns
- One-tier escalation. Lost pages.
- Manual escalation. Forgotten in the moment.
- Same timing for all severities. Wastes higher tiers on minor incidents.
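Two of these antipatterns can be caught mechanically when a policy is reviewed. The check below is a rough sketch with a hypothetical policy shape (severity mapped to an ordered list of tier and offset pairs) and a hypothetical function name. Manual escalation is not visible in timing data, so it is not checked here.

```python
def policy_problems(policy: dict) -> list[str]:
    """Flag antipatterns in a policy.

    `policy` maps severity -> ordered list of (tier, minutes after initial page).
    This shape is an assumption for illustration, not any tool's real schema.
    """
    problems = []
    for severity, steps in policy.items():
        if len(steps) < 2:
            problems.append(f"{severity}: one-tier escalation")
    # Collect the distinct timing profiles used across severities.
    timings = {tuple(offset for _, offset in steps) for steps in policy.values()}
    if len(policy) > 1 and len(timings) == 1:
        problems.append("same timing for all severities")
    return problems


good = {
    "SEV1": [("T1", 0), ("T2", 5), ("T3", 15)],
    "SEV2": [("T1", 0), ("T2", 15), ("T3", 60)],
}
flat = {
    "SEV1": [("T1", 0), ("T2", 5), ("T3", 15)],
    "SEV2": [("T1", 0), ("T2", 5), ("T3", 15)],
}
print(policy_problems(good))  # []
print(policy_problems(flat))  # ['same timing for all severities']
```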
What to do this week
Three moves. (1) Apply this practice to your next on-call rotation. (2) Survey the team after one cycle. (3) Iterate based on feedback; the discipline is the cadence.