On-Call Page Volume Targets

Headline target: under 3 pages per shift.

Healthy page volumes

Page volume targets put numbers on what an acceptable on-call shift looks like. Without numbers, fatigue builds unnoticed and engineers leave citing vague reasons.

Under 3 pages per shift. Per-engineer per-shift bar; volume above it signals alert noise or real reliability issues, not heroism.
Under 10 pages per week. Per-engineer per-week bar; beyond it, on-call quality degrades and engineers stop responding cleanly.
Under 1 off-hours page per night. Sleep is a precondition for next-day functioning; off-hours pages compound across the rotation.
Published target per team. Visible bar that the team agreed to; supports buy-in and gives engineers grounds to push back when volume creeps up.
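
The numeric targets above can be checked mechanically. The following is a minimal sketch, not a definitive implementation: the WeekSummary record and its field names are hypothetical, per-shift counts are assumed to include off-hours pages, and "under 1 off-hours page per night" is read here as a nightly average, which is an assumption rather than something the targets spell out.

```python
from dataclasses import dataclass

# Thresholds copied from the targets above; the constant names are illustrative.
MAX_PAGES_PER_SHIFT = 3       # "under 3 pages per shift"
MAX_PAGES_PER_WEEK = 10       # "under 10 pages per week"
MAX_OFF_HOURS_PER_NIGHT = 1   # "under 1 off-hours page per night"


@dataclass
class WeekSummary:
    """Hypothetical per-engineer summary of one on-call week."""
    engineer: str
    pages_per_shift: list[int]            # one entry per shift worked
    off_hours_pages_per_night: list[int]  # one entry per night on call


def target_breaches(week: WeekSummary) -> list[str]:
    """Return a description of every target this week failed to meet."""
    breaches = []

    # "Under 3" means 3 or more is a breach, so compare with >=.
    worst_shift = max(week.pages_per_shift, default=0)
    if worst_shift >= MAX_PAGES_PER_SHIFT:
        breaches.append(f"shift: {worst_shift} pages in the worst shift")

    weekly_total = sum(week.pages_per_shift)
    if weekly_total >= MAX_PAGES_PER_WEEK:
        breaches.append(f"week: {weekly_total} pages in total")

    # Assumption: the nightly target is evaluated as an average over the week.
    nights = week.off_hours_pages_per_night
    nightly_avg = sum(nights) / len(nights) if nights else 0.0
    if nightly_avg >= MAX_OFF_HOURS_PER_NIGHT:
        breaches.append(f"night: {nightly_avg:.1f} off-hours pages per night")

    return breaches
```

An empty result means the week met all three numeric targets; anything else is the list an engineer can point to when pushing back on volume.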

Measurement discipline

Targets without measurement are aspirational. Per-engineer, per-service, and per-time-of-day breakdowns each surface a different aspect of the on-call experience.

Per-engineer per-shift count. Page count per shift, aggregated weekly, trended monthly; surfaces uneven distribution across the rotation.
Per-service page volume. Noise contribution per service; identifies the services driving most of the rotation pain.
Per-time-of-day distribution. Business-hours versus after-hours split; the two patterns call for different responses (alert tuning for after-hours noise, process changes for business-hours volume).
Per-quarter trend chart. Volume trajectory across quarters; drift surfaces faster on a chart than in retrospective complaints.
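
The first three breakdowns above can be produced from a flat list of page records. This is a sketch under stated assumptions: the Page record, its field names, and the business-hours window (09:00 to 18:00, Monday to Friday, in the engineer's local time) are all hypothetical choices, not part of any particular paging tool.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Page:
    """Hypothetical page record; field names are assumptions."""
    engineer: str
    service: str
    shift_id: str
    fired_at: datetime  # assumed to be the on-call engineer's local time


def is_business_hours(ts: datetime) -> bool:
    """Assumed window: 09:00 up to (not including) 18:00, Monday to Friday."""
    return ts.weekday() < 5 and 9 <= ts.hour < 18


def breakdowns(pages: list[Page]) -> dict[str, Counter]:
    """Count pages per engineer-shift, per service, and per time of day."""
    return {
        "per_engineer_shift": Counter((p.engineer, p.shift_id) for p in pages),
        "per_service": Counter(p.service for p in pages),
        "per_time_of_day": Counter(
            "business" if is_business_hours(p.fired_at) else "after-hours"
            for p in pages
        ),
    }
```

Calling most_common() on the per-service counter gives the noisiest services first; the quarterly trend is the same counts computed per quarter and plotted over time.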

Responding to overflow

Overflow has three response levels: rotation, service, and architecture. Each addresses a different cause; pick the right level for the symptom.

Rotation level. Emergency staffing per rotation when volume is unsustainable; backup on-call activated and alert tuning prioritised.
Service level. Tuning sprint for the noisiest services; engineering capacity reallocated until volume drops to target.
Architecture level. Persistently noisy services may indicate architectural problems; tactical fixes do not address them.
Documented response per overflow. Named owner and timeline per overflow event; supports accountability and prevents the response from drifting.
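
A documented response can be as small as one record per overflow event. The sketch below is illustrative only: the OverflowResponse type and its fields are hypothetical, chosen to capture the three response levels plus the named owner and timeline described above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class ResponseLevel(Enum):
    """The three response levels described above."""
    ROTATION = "rotation"
    SERVICE = "service"
    ARCHITECTURE = "architecture"


@dataclass
class OverflowResponse:
    """Hypothetical record of one overflow event and its response."""
    team: str
    level: ResponseLevel
    owner: str
    opened: date
    due: date

    def __post_init__(self) -> None:
        # A response without a named owner or a sane timeline is not documented.
        if not self.owner.strip():
            raise ValueError("an overflow response needs a named owner")
        if self.due < self.opened:
            raise ValueError("due date cannot precede the open date")

    def is_overdue(self, today: date) -> bool:
        """True once the agreed timeline has passed."""
        return today > self.due
```

Rejecting records without an owner at construction time is one way to keep the response from drifting; a review meeting can then list every record for which is_overdue returns True.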

Page budget discipline

Page budgets work like error budgets. Going over budget triggers tuning time; staying under it frees time for feature work; chronic overage triggers staffing or architectural investment.

Weekly page budget per team. Bar set per team; exceeding it pulls engineering time into tuning, staying under it releases time for features.
Quarterly budget-achievement review. Chronically over-budget teams need staffing or architectural investment, not another quarter of tactical fixes.
Major-incident overflow exemption. Incident weeks do not count against the budget; the metric is normal-week volume.
Named budget owner per team. Steward who tracks the budget and escalates when volume runs over it.
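
The budget rules above reduce to a small amount of arithmetic. The following sketch makes two assumptions that the rules do not state: a week exactly at budget counts as within budget, and "chronically over" means more than half of the non-incident weeks in the review period exceeded the budget. The Week type and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Week:
    """Hypothetical weekly page count for one team."""
    pages: int
    major_incident: bool = False  # incident weeks are exempt from the budget


def weekly_action(week: Week, budget: int) -> str:
    """Decide where engineering time goes after a given week."""
    if week.major_incident:
        return "exempt"
    # Assumption: landing exactly on the budget still counts as within budget.
    return "tuning" if week.pages > budget else "features"


def chronically_over(weeks: list[Week], budget: int, threshold: float = 0.5) -> bool:
    """Quarterly review: was the team over budget in most normal weeks?

    The 0.5 threshold is an assumed default, not a recommended value.
    """
    normal_weeks = [w for w in weeks if not w.major_incident]
    if not normal_weeks:
        return False
    over = sum(1 for w in normal_weeks if w.pages > budget)
    return over / len(normal_weeks) > threshold
```

For example, with a budget of 10 and weekly counts of 12, 8, 40 (incident week), 15, and 11, the incident week is dropped and three of the four remaining weeks are over budget, so the quarterly review would flag the team for staffing or architectural investment.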

Link to retention

Page volume is a leading indicator of attrition. The effect compounds: bad on-call drives senior engineers out, mentor pairs stop forming, and the rotation gets worse for the engineers who stay.

Engineers leave bad on-call. Volume-driven attrition is real; page volume predicts departures better than satisfaction surveys.
Healthy on-call retains seniors. Mentor pairs build over time when the rotation is sustainable; bad rotations break that pattern.
Alert-quality investment is retention investment. Quarterly alert tuning produces the same effect as a retention bonus, more durably.
Exit-interview signal. On-call mentions in exit interviews catch root causes that aggregate metrics miss.