Alert Volume Budget
A per-shift alert budget, reviewed weekly, that enforces tuning discipline.
What an alert volume budget is
An alert volume budget is a team-level cap on the number of alerts per on-call shift. Above the cap, the rotation is over capacity, and alerts must be retired or escalated. Typical budgets are 5 sev1 pages per shift and 15 pages per shift in total; above that, response quality drops and engineers burn out. The budget is a forcing function: while the team is over budget, no new alert ships unless an old one is retired.
- Per-shift cap. Team-level; above the cap means over capacity.
- Typical: 5 sev1, 15 total. Above that, response quality drops and burnout begins.
- Forcing function. Over budget blocks new alerts until an old one is retired.
- Per-team budget documented. The cap is committed to the engineering handbook so that stakeholders work from the same number.
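The cap and the over-capacity check can be written down as a small data structure. This is a minimal sketch, not a prescribed implementation: the `AlertBudget` class, its field names, and the `is_over_capacity` helper are hypothetical, and the defaults are the typical figures quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertBudget:
    """Per-shift page caps for one team (hypothetical structure)."""
    team: str
    sev1_per_shift: int = 5    # typical sev1 cap quoted in the text
    total_per_shift: int = 15  # typical total cap quoted in the text


def is_over_capacity(budget: AlertBudget, sev1_pages: int, total_pages: int) -> bool:
    """Return True when a shift exceeded either the sev1 cap or the total cap."""
    return (
        sev1_pages > budget.sev1_per_shift
        or total_pages > budget.total_per_shift
    )
```

Hitting a cap exactly counts as within budget here; only exceeding it marks the shift as over capacity. A team that prefers a stricter reading could change `>` to `>=`.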
Measure actual versus budget
Measurement makes the budget real. Pull the alert history weekly and count pages per shift, per team, and per severity. Plot the counts against the budget on a team-owned dashboard that turns red when the team has been over budget for 2 weeks running. Break the counts out by alert source, because the top 3 sources usually account for 70% of pages.
- Weekly alert history pull. Pages per shift, per team, per severity; the basic count.
- Dashboard with red-band. Red when over budget for 2 weeks running; visible to the team.
- Top-3 source breakdown. Top 3 sources usually account for 70% of pages; the focus list.
- Per-week trend view. The week-over-week trend stays visible to the team, which keeps attention on the budget.
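The weekly measurement can be sketched in a few functions. The `Page` record and its fields are assumptions about what an alert-history export might contain, not the schema of any particular paging tool; the function names are likewise hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One page from the alert history (hypothetical export schema)."""
    shift_id: str
    team: str
    severity: str
    source: str


def pages_per_shift(pages):
    """Count pages keyed by (team, shift_id, severity)."""
    return Counter((p.team, p.shift_id, p.severity) for p in pages)


def top_source_share(pages, n=3):
    """Return the n noisiest sources and the fraction of all pages they produce."""
    counts = Counter(p.source for p in pages)
    top = counts.most_common(n)
    total = sum(counts.values())
    share = sum(c for _, c in top) / total if total else 0.0
    return [source for source, _ in top], share


def is_red(weekly_over_budget, run=2):
    """True when the most recent `run` weeks were all over budget.

    `weekly_over_budget` is a list of booleans, oldest week first.
    """
    recent = weekly_over_budget[-run:]
    return len(recent) == run and all(recent)
```

The list returned by `top_source_share` is the focus list for tuning, and `is_red` encodes the "2 weeks running" rule for the dashboard band.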
When you go over budget
Going over budget triggers a mandatory cleanup before any new alert ships. The team picks 3 alerts to retire or retune. If the team cannot reduce volume, the rotation is understaffed, and headcount or scope must change. Don’t ignore the budget: the next escalation is attrition, and that costs more than the cleanup.
- Mandatory cleanup first. 3 alerts retired or retuned before any new alert ships.
- If you can’t reduce, restaff. The rotation is understaffed; add headcount or shrink scope.
- Attrition is the next escalation. Costs more than the cleanup; the budget exists for a reason.
- Per-overrun runbook. Documented response when budget is exceeded; supports consistent action.
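The over-budget response can be expressed as a small decision function, which is one way to keep the runbook consistent. This is a sketch under assumptions: the `Action` enum, the `overrun_action` function, and its parameters are hypothetical names, and the only number taken from the text is the 3-alert cleanup target.

```python
from enum import Enum

CLEANUP_TARGET = 3  # alerts to retire or retune, per the text


class Action(Enum):
    SHIP_ALLOWED = "new alerts may ship"
    CLEANUP_REQUIRED = "retire or retune alerts first"
    RESTAFF_OR_RESCOPE = "add headcount or shrink scope"


def overrun_action(over_budget: bool, alerts_cleaned: int,
                   still_over_after_cleanup: bool) -> Action:
    """Decide the next step for a team, following the rules described above."""
    if not over_budget:
        return Action.SHIP_ALLOWED
    if alerts_cleaned < CLEANUP_TARGET:
        # Cleanup comes before any new alert ships.
        return Action.CLEANUP_REQUIRED
    if still_over_after_cleanup:
        # Cleanup did not bring volume down: treat it as a staffing problem.
        return Action.RESTAFF_OR_RESCOPE
    return Action.SHIP_ALLOWED
```

For example, a team that is over budget and has retuned only one alert gets `Action.CLEANUP_REQUIRED`; a team that completed the cleanup and is still over budget gets `Action.RESTAFF_OR_RESCOPE`.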
How to set the budget
The budget number comes from the rotation itself. Survey the on-call engineers, asking how many pages per shift feel sustainable, and pick the median. The industry baseline is 2 pages per shift as comfortable, 5 as the upper limit, and 10 or more as burnout territory. Set tighter budgets for smaller rotations, because a 3-person rotation cannot absorb 10 pages a shift while a 10-person rotation can.
- Survey on-call. Ask how many pages per shift feel sustainable; pick the median answer.
- Industry baseline. 2 comfortable, 5 upper limit, 10+ burnout; the calibration.
- Per-rotation size adjustment. Smaller rotations need tighter budgets; absorption capacity scales with rotation size.
- Per-team baseline review. Annual budget review; supports continued fit.
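Picking the median and checking it against the baseline is simple arithmetic. A minimal sketch follows; the function names and band labels are hypothetical. The text gives no formula for the rotation-size adjustment, so the sketch leaves that as a judgment call, and the "above upper limit" label for values between 5 and 10 is an assumption about how to name the gap between the stated bands.

```python
from statistics import median


def budget_from_survey(answers):
    """Median of the 'sustainable pages per shift' survey answers."""
    if not answers:
        raise ValueError("need at least one survey answer")
    return median(answers)


def baseline_band(pages_per_shift):
    """Place a proposed budget against the baseline quoted in the text."""
    if pages_per_shift <= 2:
        return "comfortable"
    if pages_per_shift <= 5:
        return "within upper limit"
    if pages_per_shift < 10:
        return "above upper limit"  # label is an assumption; text names no band here
    return "burnout territory"


# Usage: five engineers answer the survey.
proposed = budget_from_survey([2, 3, 4, 6, 8])
print(proposed, baseline_band(proposed))
```

The median is used rather than the mean so that one engineer with an unusually high or low tolerance does not move the budget.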
Roll out the budget this quarter
The rollout is concrete. Publish the current page count and the proposed budget, and get team sign-off. Hold a cleanup sprint to get under budget by retiring or retuning 5-10 alerts. Make budget compliance a recurring metric, because staying over budget for 2 quarters means the rotation is structurally broken.
- Publish and sign-off. Current page count and proposed budget; team sign-off makes the budget legitimate.
- Cleanup sprint to get under. 5-10 alerts retired or retuned; the rollout pays for itself.
- Recurring compliance metric. Tracked quarterly; over budget for 2 quarters means the rotation is structurally broken.
- Per-quarter health check. Budget compliance reported in engineering reviews; supports continued accountability.