Alert Noise by Team Attribution
Some teams' alerts are noisier than others. Attribute the noise to its owners, then act on it.
Why per-team attribution matters
Without attribution, alert noise is SRE's problem. With attribution, it belongs to the team that wrote the rule.
Per-team noise scores reveal which teams generate the most noise per service. Surprising results are the norm.
Attribution drives behavior change faster than any internal training program.
How to attribute
Tag every rule with a team label. Treat owner_team as mandatory metadata.
Aggregate fire counts by team weekly. Publish the ranking, sorted ascending by signal-to-noise ratio so the noisiest teams appear first.
Pull team mappings from the service catalog. A rule without a team tag goes to a default 'unowned' bucket that SRE actively shrinks.
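The attribution steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the Fire record, the "service" label, the shape of the catalog lookup, and the definition of signal-to-noise as the fraction of fires that needed human action are all assumptions made for the example. Only owner_team and the 'unowned' bucket come from the text.

```python
from collections import defaultdict
from dataclasses import dataclass

UNOWNED = "unowned"  # default bucket for rules with no resolvable team


@dataclass(frozen=True)
class Fire:
    rule: str          # name of the alert rule that fired
    actionable: bool   # assumption: True if a human had to act, False counts as noise


def attribute(fires, rule_labels, catalog):
    """Count one week of fires per team.

    rule_labels: {rule_name: {"owner_team": ..., "service": ...}} (hypothetical shape)
    catalog:     {service_name: team} pulled from the service catalog (hypothetical shape)
    """
    totals = defaultdict(lambda: {"fires": 0, "actionable": 0})
    for fire in fires:
        labels = rule_labels.get(fire.rule, {})
        # Prefer the explicit tag, fall back to the catalog, else 'unowned'.
        team = labels.get("owner_team") or catalog.get(labels.get("service")) or UNOWNED
        totals[team]["fires"] += 1
        totals[team]["actionable"] += int(fire.actionable)
    for stats in totals.values():
        # Assumed definition: share of fires that carried real signal.
        stats["snr"] = stats["actionable"] / stats["fires"]
    return dict(totals)


def ranked(scores):
    """Sort ascending by signal-to-noise, so the noisiest teams come first."""
    return sorted(scores.items(), key=lambda item: item[1]["snr"])
```

Watching the fire count in the 'unowned' entry from week to week gives SRE a direct measure of how fast that bucket is shrinking.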
Metrics to publish
Total fires per team per week. Auto-resolved fires per team. Pages per on-call shift per team.
Cost per page (vendor fees + interruption time). Some teams generate 10x the cost of others.
Improvement velocity: change in noise score week-over-week.
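As a rough sketch of how two of these metrics could be computed: the function names, the hourly-rate parameter used to price interruption time, and the sample numbers in the comments are illustrative assumptions, not figures from any real team.

```python
def cost_per_page(vendor_fee, minutes_interrupted, hourly_rate):
    """Vendor fee plus the cost of the responder's interrupted time.

    Assumption: interruption time is priced at a flat hourly rate.
    """
    return vendor_fee + (minutes_interrupted / 60) * hourly_rate


def improvement_velocity(weekly_noise_scores):
    """Week-over-week change in noise score, oldest week first.

    Negative values mean noise is dropping.
    """
    return [
        current - previous
        for previous, current in zip(weekly_noise_scores, weekly_noise_scores[1:])
    ]


# Illustrative inputs only:
# cost_per_page(0.5, 30, 100) prices a page that interrupts someone for 30 minutes.
# improvement_velocity([40, 34, 35]) compares each week with the one before it.
```

Publishing velocity next to the absolute score lets a team that starts out noisy still show visible progress.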
Aligning incentives
Tie SLO budget to noise budget. A team in the noisiest quintile loses change-management privileges until its score drops out of that quintile.
Make new rule creation conditional on the team's existing noise score. High-noise teams must delete one rule per new rule.
Recognize improvement publicly. The team that cut noise 40% in a quarter gets a shoutout.
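The two gating rules in this section could be enforced with a check like the one below. It is a sketch under stated assumptions: "high-noise" is taken to mean the top quintile, a higher noise score means more noise, and the rules_deleted argument is a hypothetical count of rules the team has removed as part of the same change.

```python
import math


def top_quintile(noise_scores):
    """Return the teams whose noise score falls in the highest 20%.

    noise_scores: {team: score}, higher meaning noisier (assumption).
    """
    if not noise_scores:
        return set()
    cutoff = max(1, math.ceil(len(noise_scores) * 0.2))
    noisiest_first = sorted(noise_scores, key=noise_scores.get, reverse=True)
    return set(noisiest_first[:cutoff])


def may_add_rule(team, noise_scores, rules_deleted):
    """One-in, one-out: a high-noise team must delete a rule to add one."""
    if team not in top_quintile(noise_scores):
        return True
    return rules_deleted >= 1
```

A check like this fits naturally in the review pipeline for the rules repository, where it can fail the change with a message pointing at the team's current score.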
Start with a public dashboard
Don't enforce penalties before publishing data. Visibility alone can cut noise by around 30%.
Skip naming-and-shaming language. Frame the score as a team improvement metric, not a leaderboard of failures.
Audit attribution accuracy monthly. Misattributed noise erodes trust in the whole system.
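One possible form for the monthly audit is to compare each rule's owner_team tag against the owner the service catalog lists for that rule's service. The sketch below assumes every rule also carries a "service" label and that the catalog is a plain service-to-team mapping. Both are illustrative assumptions, and a real audit may need extra checks, such as sampling fires by hand.

```python
def audit_attribution(rule_labels, catalog):
    """Compare rule tags with the catalog and report disagreements.

    rule_labels: {rule_name: {"owner_team": ..., "service": ...}} (hypothetical shape)
    catalog:     {service_name: team} (hypothetical shape)
    """
    mismatched, unverifiable, matched = [], [], 0
    for rule, labels in sorted(rule_labels.items()):
        tagged = labels.get("owner_team")
        expected = catalog.get(labels.get("service"))
        if tagged is None or expected is None:
            # Missing tag or unknown service: cannot be checked automatically.
            unverifiable.append(rule)
        elif tagged != expected:
            mismatched.append((rule, tagged, expected))
        else:
            matched += 1
    checked = matched + len(mismatched)
    return {
        "accuracy": matched / checked if checked else None,
        "mismatched": mismatched,
        "unverifiable": unverifiable,
    }
```

Publishing the accuracy figure alongside the noise scores gives teams a reason to trust the numbers, and the mismatch list gives them a concrete way to dispute an attribution.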