Pre-Prod Alert Noise
Pre-prod alerts shouldn't page production on-call.
Where pre-prod noise comes from
Staging clusters reuse production alert configs. They fire on every test, every chaos run, every flaky deploy.
Fewer engineers watch pre-prod, so the page rate per engineer is often higher than in production.
Pre-prod alerts are often misrouted to the production rotation. The on-call gets paged for a staging issue at 2am.
Separate paging for pre-prod
Pre-prod alerts go to a dedicated channel, not the production on-call rotation.
Sev2 and below go to Slack only. Pre-prod sev1 still pages, but to the team's business-hours rotation, not the 24/7 on-call.
Tag every alert with environment. Routing rules use the tag.
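The routing rules above could be expressed roughly as follows. This is a minimal sketch, not any particular alerting tool's configuration: the `Alert` class, the `route` function, the environment values, and the destination strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical destination names; substitute whatever your paging tool uses.
PROD_ONCALL = "pager:production-oncall-24x7"
TEAM_BUSINESS_HOURS = "pager:team-business-hours"
PREPROD_CHANNEL = "slack:preprod-alerts"


@dataclass(frozen=True)
class Alert:
    name: str
    environment: str  # assumed values: "production" or "pre-prod"
    severity: int     # 1 is the most severe


def route(alert: Alert) -> str:
    """Pick a destination from the environment tag and the severity."""
    if alert.environment == "production":
        # Production routing is unchanged and out of scope for this sketch.
        return PROD_ONCALL
    if alert.environment == "pre-prod":
        if alert.severity == 1:
            # Pre-prod sev1 still pages, but only the business-hours rotation.
            return TEAM_BUSINESS_HOURS
        # Sev2 and below stay in Slack.
        return PREPROD_CHANNEL
    # An untagged or unknown environment is a config bug; fail loudly
    # rather than guess and page the wrong rotation.
    raise ValueError(f"alert {alert.name!r} has no usable environment tag")
```

Rejecting untagged alerts is a design choice in this sketch: a silent default to either rotation would hide exactly the misrouting this section is trying to prevent.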
Mute during known events
CI runs, chaos drills, and performance tests should mute pre-prod alerts on the affected services.
Build a maintenance-mode API. CI calls it to start a maintenance window before a destructive test and to end the window afterward.
Without muting, pre-prod alerts train the team to ignore alerts. That habit carries to production.
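One possible shape for the maintenance-mode API is sketched below with an in-memory registry. `MaintenanceRegistry`, `maintenance`, and the `max_seconds` expiry are assumptions made for illustration; a real version would sit behind an HTTP endpoint that both CI and the alert router can reach. The expiry is an addition not described above: it keeps a crashed CI job from leaving a service muted indefinitely.

```python
import time
from contextlib import contextmanager


class MaintenanceRegistry:
    """Tracks which services are currently muted, each with an expiry time."""

    def __init__(self, clock=time.time):
        self._windows = {}   # service name -> expiry timestamp (seconds)
        self._clock = clock  # injectable so tests can control time

    def start(self, service, max_seconds):
        self._windows[service] = self._clock() + max_seconds

    def end(self, service):
        # Ending a window that does not exist is harmless.
        self._windows.pop(service, None)

    def is_muted(self, service):
        expiry = self._windows.get(service)
        return expiry is not None and self._clock() < expiry


@contextmanager
def maintenance(registry, services, max_seconds=1800):
    """Mute the given services for the duration of a destructive test."""
    for service in services:
        registry.start(service, max_seconds)
    try:
        yield
    finally:
        # Always unmute, even if the test itself raised.
        for service in services:
            registry.end(service)
```

A CI job would wrap the destructive step in `with maintenance(registry, ["checkout", "payments"]):` (service names are placeholders), and the alert router would call `registry.is_muted(service)` before notifying anyone.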
Pre-prod gets a noise budget too
Pre-prod alert volume should be 10-20% of production volume. If it's higher, either the configs are too noisy or staging is broken.
A pre-prod page count above the production page count is a red flag. Investigate within the same week.
Review pre-prod alerts on the same quarterly cadence as production.
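The budget check can be automated as part of that review. The sketch below assumes the four counts are already available for the same time period; `check_noise_budget` and its parameter names are hypothetical, and the default ratio uses the upper end of the 10-20% range.

```python
def check_noise_budget(prod_alerts, preprod_alerts,
                       prod_pages, preprod_pages,
                       max_ratio=0.20):
    """Return a list of findings; an empty list means pre-prod is within budget."""
    findings = []
    # Compare by multiplication so zero production alerts does not divide by zero.
    if preprod_alerts > max_ratio * prod_alerts:
        findings.append(
            f"pre-prod alert volume {preprod_alerts} exceeds "
            f"{max_ratio:.0%} of production volume {prod_alerts}"
        )
    if preprod_pages > prod_pages:
        findings.append(
            f"pre-prod pages ({preprod_pages}) exceed "
            f"production pages ({prod_pages}): investigate this week"
        )
    return findings
```

For example, with 200 production alerts and 90 pre-prod alerts the ratio is 45%, well over budget, so the first finding is reported even if the page counts look fine.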
How to fix pre-prod noise
Add an environment label to every alert. Adjust routing so pre-prod doesn't page production on-call.
Add muting hooks to CI for chaos and load tests.
Remove or downgrade alerts that only ever fire in pre-prod. Production alert rules shouldn't be copied into staging without modification.
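One way to avoid running production alerts unmodified is to derive the staging rule set from the production one mechanically. The sketch below assumes rules are plain dictionaries with `name`, `severity`, and `labels` keys; that schema, the `adapt_for_preprod` function, and the lowest severity of 4 are illustrative assumptions, not the format of any specific alerting system.

```python
def adapt_for_preprod(prod_rules, drop=frozenset(), downgrade_by=1, lowest_sev=4):
    """Build pre-prod rules from production rules.

    - Rules named in `drop` are removed entirely.
    - Every remaining rule gets an environment label for routing.
    - Severity is downgraded (a larger number is less severe), capped at lowest_sev.
    """
    adapted = []
    for rule in prod_rules:
        if rule["name"] in drop:
            continue
        new_rule = dict(rule)  # shallow copy so the production rule is untouched
        new_rule["labels"] = {**rule.get("labels", {}), "environment": "pre-prod"}
        new_rule["severity"] = min(rule["severity"] + downgrade_by, lowest_sev)
        adapted.append(new_rule)
    return adapted
```

Generating the staging rules from the production source keeps the two sets from drifting apart while still ensuring nothing reaches staging with its production severity or without an environment label.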