On-Call Rotation Fairness Math
An on-call rotation should be fair, and a few simple metrics can show whether it is.
Metrics that measure fairness
Fairness in an on-call rotation has to be measured. Three metrics give the picture: pages per engineer per quarter, off-hours pages per engineer, and page-minutes per engineer. Off-hours pages weigh more than business-hours pages, and page-minutes capture the difference between a 5-minute page and a 5-hour page.
- Pages per engineer per quarter. The headline metric; the simplest fairness comparison.
- Off-hours pages. The pain metric; off-hours pages weigh more than business-hours pages.
- Page-minutes per engineer. Some pages take 5 minutes, some take 5 hours; minute-count captures the difference.
- Per-rotation tracking. Metrics tracked per rotation, not just org-wide; the cohort matters.
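The three metrics above can be computed from a plain list of page records. The following is a minimal sketch, not a definitive implementation: the `Page` record, the `is_off_hours` helper, and the `quarterly_metrics` function are hypothetical names, and the business-hours window (Monday to Friday, 09:00 to 18:00 in the engineer's local time) is an assumption that a real team would replace with its own definition.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Page:
    """One page handled by one engineer (hypothetical record shape)."""
    engineer: str
    rotation: str
    started_at: datetime   # assumed to be in the engineer's local time
    resolved_at: datetime


def is_off_hours(ts: datetime) -> bool:
    # Assumption: business hours are Mon-Fri, 09:00 up to (not including) 18:00.
    return ts.weekday() >= 5 or not (9 <= ts.hour < 18)


def quarterly_metrics(pages):
    """Aggregate one quarter of pages per (rotation, engineer)."""
    out = defaultdict(
        lambda: {"pages": 0, "off_hours_pages": 0, "page_minutes": 0.0}
    )
    for p in pages:
        m = out[(p.rotation, p.engineer)]
        m["pages"] += 1
        m["off_hours_pages"] += int(is_off_hours(p.started_at))
        m["page_minutes"] += (p.resolved_at - p.started_at).total_seconds() / 60
    return dict(out)


if __name__ == "__main__":
    sample = [
        # Wednesday morning, 5 minutes
        Page("ana", "payments", datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 10, 5)),
        # Saturday night, 5 hours
        Page("ana", "payments", datetime(2024, 1, 6, 3, 0), datetime(2024, 1, 6, 8, 0)),
    ]
    for key, metrics in quarterly_metrics(sample).items():
        print(key, metrics)
```

Keying the result by `(rotation, engineer)` keeps the comparison inside one rotation, which is the cohort the per-rotation bullet asks for.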
Acceptable variance
Some variance is natural. A spread of up to 20% across engineers in the same rotation is acceptable; more than 20% is an imbalance worth investigating. The real issue is imbalance sustained over multiple quarters; random variance within a single quarter is noise.
- 20% variance threshold. Within 20% across engineers is acceptable; more than 20% means imbalance worth investigating.
- Natural variance exists. Some shifts have more pages; some engineers happen to be on for active incidents.
- Sustained vs random. Sustained imbalance over multiple quarters is the issue; random variance over one quarter is noise.
- Per-quarter trend view. Trend lines distinguish sustained imbalance from one-quarter spikes.
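The threshold and the sustained-versus-random distinction can be sketched in a few lines. The text does not say what the 20% is measured against, so this example assumes it is each engineer's deviation from the rotation mean for the quarter. It also assumes "sustained" means over the threshold in each of the last two quarters. The function names and the `min_quarters` parameter are hypothetical.

```python
from statistics import mean

THRESHOLD = 0.20  # the 20% figure from the text


def relative_deviation(counts):
    """Deviation of each engineer from the rotation mean, as a fraction.

    counts: {engineer: pages in one quarter} for a single rotation.
    Assumption: the 20% threshold is measured against this mean.
    """
    avg = mean(counts.values())
    if avg == 0:
        return {e: 0.0 for e in counts}
    return {e: (c - avg) / avg for e, c in counts.items()}


def flagged(counts, threshold=THRESHOLD):
    """Engineers more than `threshold` above or below the mean this quarter."""
    return {e for e, d in relative_deviation(counts).items() if abs(d) > threshold}


def sustained_overload(quarters, min_quarters=2, threshold=THRESHOLD):
    """Engineers above the threshold in each of the last `min_quarters` quarters.

    quarters: list of per-quarter count dicts, oldest first.
    """
    recent = quarters[-min_quarters:]
    if len(recent) < min_quarters:
        return set()
    over = [
        {e for e, d in relative_deviation(q).items() if d > threshold}
        for q in recent
    ]
    return set.intersection(*over)


if __name__ == "__main__":
    q1 = {"ana": 30, "ben": 20, "chi": 10}
    q2 = {"ana": 26, "ben": 20, "chi": 14}
    print(flagged(q1))
    print(sustained_overload([q1, q2]))
```

A single-quarter flag from `flagged` is a prompt to look at context; only the engineers returned by `sustained_overload` would count as a sustained imbalance under these assumptions.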
Common causes of imbalance
Imbalance has predictable causes. Senior engineers carry noisier services; rotation gaps from vacation are not always covered properly; geographic clustering puts certain engineers in bad timezones for off-hours pages. Each cause has a specific remediation.
- Senior-service coupling. Senior engineers carry more pages because they’re trusted with the noisier services; redistribute those services or compensate for the extra load.
- Rotation gaps from leave. Vacation or other leave is not always covered properly; backup on-call coverage should even out the load.
- Geographic clustering. An engineer in an unfavourable timezone gets more off-hours pages; compensate them or rotate the off-hours coverage across timezones.
- Per-cause remediation. Each cause has a specific fix; the diagnosis informs the action.
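Geographic clustering is the cause that is easiest to check from data: convert each page's timestamp to the paged engineer's local time and count how many land outside business hours. The sketch below assumes pages are stored as timezone-aware UTC datetimes and uses fixed UTC offsets per engineer to stay self-contained. Both the `UTC_OFFSET_HOURS` table and the business-hours window are hypothetical, and fixed offsets ignore daylight saving time, so a real check would use proper IANA timezones.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical fixed UTC offsets in hours (ignores daylight saving time).
UTC_OFFSET_HOURS = {"ana": 1, "ben": -8, "chi": 9}


def to_local(fired_at_utc: datetime, engineer: str) -> datetime:
    """Convert a timezone-aware UTC timestamp to the engineer's local time."""
    tz = timezone(timedelta(hours=UTC_OFFSET_HOURS[engineer]))
    return fired_at_utc.astimezone(tz)


def off_hours_by_engineer(pages):
    """Count off-hours pages per engineer.

    pages: iterable of (engineer, fired_at_utc) pairs.
    Assumption: business hours are Mon-Fri, 09:00 up to 18:00 local time.
    """
    counts = Counter()
    for engineer, fired_at in pages:
        local = to_local(fired_at, engineer)
        if local.weekday() >= 5 or not (9 <= local.hour < 18):
            counts[engineer] += 1
    return counts


if __name__ == "__main__":
    # The same alert time in UTC lands differently for each engineer.
    alert = datetime(2024, 1, 3, 17, 0, tzinfo=timezone.utc)
    print(off_hours_by_engineer([("ana", alert), ("ben", alert), ("chi", alert)]))
```

If one engineer's off-hours count is consistently higher for alerts that fire at the same UTC times, the imbalance is likely geographic, which points at the timezone remediation instead of service tuning.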
Responding to imbalance
Responding to imbalance means three things: reshuffle the rotation, compensate the outliers, and address the underlying cause. Reshuffling alone only moves the load around; compensation alone leaves the noise in place; doing both and tuning the noisy service is what actually moves the metric.
- Reshuffle the rotation. Different shift patterns; different engineer pairings; the structural fix.
- Compensation for outliers. Time off, stipend, recognition; the work is real, the compensation should match.
- Address the underlying noise. Noisy services need tuning, not just rotation tricks; the root-cause fix.
- Per-action accountability. Each action has an owner and a deadline, which supports follow-through, not just discussion.
Quarterly fairness review
The quarterly fairness review closes the loop: per-engineer metrics are ranked, outliers are explained (a new engineer ramping up, a service that hit a bad week, vacation coverage), and action items are tracked quarter over quarter. The review is the system that keeps fairness from drifting silently.
- Per-engineer metrics ranked. Pages, off-hours pages, page-minutes; ranked for visibility.
- Outliers explained. New engineer ramping up; service that hit a bad week; vacation coverage; the context matters.
- Action items tracked. Reshuffle, compensate, fix the underlying noise; tracked quarter over quarter.
- Per-quarter accountability cycle. Each review produces named actions with owners; supports continuous improvement of fairness.
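The ranked view for the review can be produced from per-engineer metrics with a simple sort. This is a sketch under assumptions: the metric dictionary shape, the `rank_engineers` and `review_rows` names, and the "unexplained" default note are all hypothetical, chosen so that missing context is visible in the review instead of being silently omitted.

```python
def rank_engineers(metrics, key="pages"):
    """Sort engineers from highest to lowest on one metric.

    metrics: {engineer: {"pages": int, "off_hours_pages": int,
                         "page_minutes": float}}
    """
    return sorted(metrics.items(), key=lambda kv: kv[1][key], reverse=True)


def review_rows(metrics, notes=None, key="pages"):
    """Build review rows: rank, engineer, the three metrics, and context.

    notes: optional {engineer: explanation}, for example "vacation coverage".
    """
    notes = notes or {}
    rows = []
    for rank, (engineer, m) in enumerate(rank_engineers(metrics, key), start=1):
        rows.append((
            rank,
            engineer,
            m["pages"],
            m["off_hours_pages"],
            m["page_minutes"],
            notes.get(engineer, "unexplained"),
        ))
    return rows


if __name__ == "__main__":
    metrics = {
        "ana": {"pages": 12, "off_hours_pages": 7, "page_minutes": 340.0},
        "ben": {"pages": 20, "off_hours_pages": 4, "page_minutes": 410.0},
        "chi": {"pages": 9, "off_hours_pages": 2, "page_minutes": 95.0},
    }
    for row in review_rows(metrics, {"ben": "service hit a bad week"}):
        print(row)
```

Passing `key="off_hours_pages"` or `key="page_minutes"` re-ranks the same data on the other two metrics, which can surface an engineer whose page count looks average but whose off-hours load does not.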