Alert Deduplication: Noise Reduction That Actually Works
Deduplication (dedup) is often the highest-impact on-call improvement. The four-stage pipeline below can cut page volume by 60-80%.
Why dedup matters
Without dedup: the same incident can fire 50 separate pages.
With dedup: one page, a clear signal, and a faster response.
Four-stage pipeline
- 1. Event-time: the same alert firing again within N minutes counts as one event.
- 2. Label-based: alerts with the same name and the same labels count as one event.
- 3. Similarity-based: an ML similarity score merges alerts that look alike.
- 4. Dependency-aware: a firing parent alert suppresses its child alerts.
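Stages 1 and 2 are simple enough to sketch in a few lines. The example below is a minimal illustration, not how any particular tool implements grouping: the `Alert` class, the `fingerprint` and `dedup` functions, and the 300-second default window are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    name: str
    labels: dict
    timestamp: float  # seconds since epoch (hypothetical schema)


def fingerprint(alert):
    """Stage 2 key: alert name plus its labels in sorted order."""
    return (alert.name, tuple(sorted(alert.labels.items())))


def dedup(alerts, window_seconds=300):
    """Collapse alerts that share a fingerprint (stage 2) and arrive
    within window_seconds of the event's first alert (stage 1)."""
    events = []       # deduplicated events, in order of first occurrence
    open_events = {}  # fingerprint -> most recent event for that key
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = fingerprint(alert)
        event = open_events.get(key)
        if event and alert.timestamp - event["first_seen"] <= window_seconds:
            # Same alert, same labels, inside the window: merge.
            event["count"] += 1
            event["last_seen"] = alert.timestamp
        else:
            # New fingerprint, or the window has expired: new event.
            event = {
                "key": key,
                "first_seen": alert.timestamp,
                "last_seen": alert.timestamp,
                "count": 1,
            }
            open_events[key] = event
            events.append(event)
    return events
```

The window here is fixed from the first alert of each event. Measuring from the last alert instead (a sliding window) merges more aggressively and keeps a flapping alert in one event for as long as it keeps firing.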
Tooling per stage
Stages 1-2: native features of Alertmanager or PagerDuty.
Stages 3-4: ML platforms (Moogsoft, BigPanda, native AIOps).
Open-source: Karma + Alertmanager covers stages 1-2.
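To make stages 3 and 4 concrete before evaluating a platform, here is a rough sketch of the ideas involved. It does not reflect how Moogsoft, BigPanda, or any other product works: plain string similarity from Python's standard `difflib` stands in for an ML model, and the function names, the 0.8 threshold, and the `depends_on` mapping are assumptions made for the example.

```python
from difflib import SequenceMatcher


def group_by_similarity(messages, threshold=0.8):
    """Stage 3 stand-in: put each message in the first group whose
    representative (first member) is at least `threshold` similar."""
    groups = []
    for msg in messages:
        for group in groups:
            if SequenceMatcher(None, group[0], msg).ratio() >= threshold:
                group.append(msg)
                break
        else:
            # No existing group was similar enough: start a new one.
            groups.append([msg])
    return groups


def suppress_children(firing, depends_on):
    """Stage 4: drop any alert that has a firing ancestor.

    depends_on maps a child alert name to its parent alert name
    (hypothetical topology supplied by the caller)."""
    firing = set(firing)

    def has_firing_ancestor(name):
        seen = set()  # guards against cycles in the mapping
        parent = depends_on.get(name)
        while parent is not None and parent not in seen:
            if parent in firing:
                return True
            seen.add(parent)
            parent = depends_on.get(parent)
        return False

    return sorted(a for a in firing if not has_firing_ancestor(a))
```

For example, with `depends_on = {"api-5xx": "db-down", "checkout-errors": "api-5xx"}` and all three alerts firing, only `db-down` would page.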
False-merge audit
Audit: review merged events weekly and verify that no distinct real incidents were merged away.
Without an audit, dedup will occasionally hide real signals.
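One way to keep the weekly review manageable is to pull a sample rather than read every merged event. The sketch below is an assumption about how that could look: the event shape (a dict with a `services` list), the `audit_sample` function, and the rule that multi-service merges always get reviewed are all illustrative choices, not a standard.

```python
import random


def audit_sample(merged_events, sample_size=20, seed=None):
    """Pick merged events for human review.

    Events that merged alerts from more than one service are always
    included, since they are the likeliest false merges. The rest of
    the sample is drawn at random from the remaining events."""
    suspicious = [e for e in merged_events if len(set(e["services"])) > 1]
    rest = [e for e in merged_events if len(set(e["services"])) <= 1]
    rng = random.Random(seed)
    # Fill whatever room is left in the sample, never asking for more
    # events than exist.
    k = min(max(sample_size - len(suspicious), 0), len(rest))
    return suspicious + rng.sample(rest, k)
```

If there are more suspicious events than `sample_size`, all of them are still returned, so the review list can exceed the requested size.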
Antipatterns
- No dedup. Page flood.
- Aggressive ML dedup without audit. Hidden incidents.
- Different dedup rules per source. Inconsistent signal.
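The last antipattern is usually avoided by normalizing every source into one schema before computing a dedup key. The following is a minimal sketch under made-up assumptions: the two source names, their payload field names, and the choice of `service` and `env` as key labels are hypothetical and would need to match your real payloads.

```python
def normalize(source, raw):
    """Map a source-specific payload to (name, labels).

    Both payload shapes below are hypothetical examples."""
    if source == "metrics":
        labels = dict(raw["labels"])
        name = labels.pop("alertname")
    elif source == "logs":
        name = raw["rule"]
        labels = {"service": raw["svc"], "env": raw["environment"]}
    else:
        raise ValueError(f"unknown alert source: {source}")
    return name, labels


def dedup_key(source, raw, key_labels=("service", "env")):
    """Build the same key for every source, using only the shared
    labels so that source-specific extras do not split events."""
    name, labels = normalize(source, raw)
    return (name,) + tuple(labels.get(k, "") for k in key_labels)
```

With this in place, the same incident reported by two sources resolves to one key, so a single set of dedup rules applies to both.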
What to do this week
Three moves. (1) Apply the dedup pipeline to your next on-call rotation. (2) Survey the team after one cycle. (3) Iterate based on feedback; the value comes from repeating the cycle.