On-Call Noise Tracking
Overview
On-call noise tracking measures per-engineer alert load and per-alert firing rate so investment in alert quality is targeted rather than diffuse. Total alert count is a vanity metric; what matters is how many pages the on-call engineer absorbed during their shift, and which alerts were responsible for most of them. Track the right two numbers and the alert-quality work becomes obvious.
- Per-engineer noise. Pages each on-call engineer receives per shift; the operator-experience metric that predicts burnout.
- Per-alert noise. How often each alert fires; reveals which alerts contribute most to per-engineer load.
- Per-quarter noise budget. A per-team noise SLO (e.g. fewer than 8 pages per shift), reviewed quarterly; when it is exceeded, alert-quality work takes priority over feature work.
- Top noisy alerts plus alert-quality work. Each quarter, the top-N noisiest alerts get the engineering investment; the metric directly drives the work.
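As a rough illustration of the first two metrics, the sketch below counts pages per engineer per shift and firings per alert from a flat list of page events. The `Page` record, its field names, and the sample data are assumptions made for this example; real alert telemetry will have its own schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One page delivered to an on-call engineer (hypothetical schema)."""
    alert: str     # name of the alert that fired
    engineer: str  # on-call engineer who received the page
    shift: str     # shift identifier, e.g. an ISO week label


def pages_per_shift(pages):
    """Per-engineer noise: page count for each (engineer, shift) pair."""
    return Counter((p.engineer, p.shift) for p in pages)


def firing_counts(pages):
    """Per-alert noise: how many pages each alert produced."""
    return Counter(p.alert for p in pages)


# Made-up sample events, for illustration only.
pages = [
    Page("disk_full", "ana", "w01"),
    Page("disk_full", "ana", "w01"),
    Page("api_latency", "ana", "w01"),
    Page("disk_full", "ben", "w02"),
]

per_engineer = pages_per_shift(pages)
per_alert = firing_counts(pages)
```

Keeping both counters derived from the same event list means the per-alert view always explains the per-engineer view: every page an engineer absorbed is attributable to a named alert.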
The approach
The practical approach is per-engineer alert-count tracking on every shift, per-alert firing-rate tracking continuously, a per-team noise SLO that triggers alert-quality work when exceeded, top-N investment per quarter against the noisiest alerts, and a documented policy so the rule survives leadership changes. The metric must drive the work or the metric is theatre.
- Per-engineer tracking. Pages received per shift, per on-caller; the number that anchors the operator-experience conversation.
- Per-quarter budget. Set a per-team noise SLO and review it quarterly; exhausting the budget triggers alert-quality investment.
- Top noisy investment. Each quarter, take the 5-10 noisiest alerts and tune, route, suppress, or delete them; a small number of alerts tends to account for most of the pages (the Pareto pattern).
- Alert-quality work plus documented policy. Alert-quality work is tracked each quarter alongside features, and the per-team noise SLO is documented in the handbook.
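The budget check and the top-N selection described above can be sketched in a few lines. This is a minimal example, not a prescribed implementation: the function names and sample numbers are hypothetical, and the threshold of 8 is only the example SLO mentioned earlier ("fewer than 8 pages per shift").

```python
from collections import Counter

# Example SLO from the text: fewer than 8 pages per shift.
NOISE_SLO_PAGES_PER_SHIFT = 8


def shifts_over_budget(pages_per_shift, budget=NOISE_SLO_PAGES_PER_SHIFT):
    """Return the (engineer, shift) entries that violate the SLO.

    The SLO is "fewer than `budget` pages", so a count equal to the
    budget already counts as a violation.
    """
    return {key: n for key, n in pages_per_shift.items() if n >= budget}


def top_noisy(firing_counts, n=5):
    """Return the n noisiest alerts as (alert, count, cumulative_share).

    cumulative_share is the fraction of all pages explained by this
    alert and every noisier one, which makes the Pareto shape visible.
    """
    total = sum(firing_counts.values())
    if total == 0:
        return []
    ranked = []
    running = 0
    for alert, count in firing_counts.most_common(n):
        running += count
        ranked.append((alert, count, running / total))
    return ranked


# Made-up quarterly numbers, for illustration only.
quarter_shifts = {("ana", "w01"): 11, ("ben", "w02"): 5, ("cy", "w03"): 8}
quarter_alerts = Counter(
    {"disk_full": 40, "api_latency": 30, "cert_expiry": 20, "cron_failed": 10}
)

violations = shifts_over_budget(quarter_shifts)
worst = top_noisy(quarter_alerts, n=2)
```

In this sample, two alerts explain 70% of the quarter's pages, which is the kind of result that makes the case for spending the quarter's alert-quality time on a short list rather than spreading it thin.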
Why this compounds
Noise tracking compounds across quarters. Each tuned alert reduces baseline noise; each lowered baseline raises the bar for what counts as a page worth investigating; signal-to-noise improves quarter over quarter. After a year, the on-call rotation is sustainable rather than punishing, and the alerts that fire mean something.
- Operator experience. Less noise preserves on-call sanity; the rotation becomes a sustainable schedule rather than an attrition driver.
- Signal-to-noise. Real alerts get attention; the on-call engineer investigates rather than triages.
- Operational improvement. Tracked noise drives investment; the budget conversation has data behind it, not anecdotes.
- Institutional knowledge. Each tuning teaches alert design patterns; the team learns to write quiet, useful alerts.
Noise tracking is an operational discipline that pays off across years. Nova AI Ops integrates with alert telemetry, surfaces noise patterns, and supports the team’s on-call discipline.