On-Call Noise vs Coverage

Trade-off in alert tuning.

Overview

Alert tuning is a trade-off between sensitivity and noise. Tighten thresholds and you catch more incidents earlier but live with more false positives. Loosen thresholds and the rotation calms down but coverage gaps appear. The discipline is making that trade-off explicitly per tier rather than letting it drift, and tracking false-positive rate as a first-class metric so the conversation stays grounded in data.

The approach

Three habits keep alert quality high: per-tier thresholds tied to SLO priority, per-alert FP-rate tracking, and a quarterly review that prunes alerts that no longer earn their pages.

Why this compounds

Each tuned alert deposits a little more on-call quality. Retention improves; mean time to detect improves on the alerts that matter; the rotation stops being the place engineers go to burn out.