Alerts Depending on Other Incidents
Some alerts shouldn't page anyone while a related upstream incident is already open.
The cascade problem
When the primary database goes down, a hundred services alert at once. Most of those alerts are downstream symptoms; the human only needs the root cause.
Without dependency-aware suppression, the on-call engineer drowns in pages that all describe the same incident. MTTA (mean time to acknowledge) stays poor across the affected services even when the root cause is identified within a minute.
PagerDuty, Opsgenie, and Nova AI Ops support dependency rules; few teams configure them properly.
Building the dependency graph
Start with the service catalog. Backstage, OpsLevel, or a homegrown YAML file all work. Map each service to its critical upstream dependencies.
Use OpenTelemetry traces to validate the graph. Manual catalogs drift; trace-derived graphs reflect real call paths and catch up with changes within hours.
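A minimal sketch of that validation step, assuming spans have already been exported and flattened into plain dicts. The keys `span_id`, `parent_id`, and `service` are hypothetical simplifications of what a real trace export carries, and the catalog here is just a dict of service to declared upstreams.

```python
def edges_from_spans(spans):
    """Derive (downstream, upstream) service edges from parent/child spans.

    Each span is a dict with hypothetical keys: span_id, parent_id, service.
    """
    service_by_span = {s["span_id"]: s["service"] for s in spans}
    edges = set()
    for s in spans:
        parent_service = service_by_span.get(s["parent_id"])
        if parent_service and parent_service != s["service"]:
            # The parent span's service called this span's service,
            # so the parent depends on it.
            edges.add((parent_service, s["service"]))
    return edges


def catalog_drift(catalog, observed_edges):
    """Compare declared edges against edges observed in traces."""
    declared = {(svc, up) for svc, ups in catalog.items() for up in ups}
    return {
        "undeclared": sorted(observed_edges - declared),  # seen, not in catalog
        "unobserved": sorted(declared - observed_edges),  # in catalog, not seen
    }


spans = [
    {"span_id": "1", "parent_id": None, "service": "checkout"},
    {"span_id": "2", "parent_id": "1", "service": "payments"},
    {"span_id": "3", "parent_id": "2", "service": "primary-db"},
    {"span_id": "4", "parent_id": "1", "service": "checkout"},  # internal span
]
catalog = {"checkout": ["payments", "inventory"], "payments": []}

drift = catalog_drift(catalog, edges_from_spans(spans))
print(drift)
```

Undeclared edges are the ones most likely to cause missed suppression. Unobserved edges may simply be rarely exercised call paths, so they deserve a human look rather than automatic removal.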
Update the graph in CI. A deploy of a new service should fail if its dependencies are not declared; without that gate, the graph goes stale.
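The CI gate can be sketched as a small check that returns a non-zero exit code when a deployed service has no declared upstreams. The catalog shape and the `upstreams` key are assumptions for illustration; a real catalog (Backstage, OpsLevel, or a YAML file) would be loaded and normalized into something similar first.

```python
import sys


def undeclared_services(deployed_services, catalog):
    """Return deployed services that have no `upstreams` key in the catalog."""
    return sorted(
        svc for svc in deployed_services
        if "upstreams" not in catalog.get(svc, {})
    )


def main(deployed_services, catalog):
    """Report undeclared services on stderr; return a CI-style exit code."""
    missing = undeclared_services(deployed_services, catalog)
    for svc in missing:
        print(f"{svc}: no upstream dependencies declared", file=sys.stderr)
    return 1 if missing else 0


catalog = {
    "checkout": {"upstreams": ["payments", "primary-db"]},
    "payments": {"upstreams": ["primary-db"]},
    "primary-db": {"upstreams": []},  # explicit empty list: a root service
    "search": {},                     # entry exists, nothing declared
}

# In a real pipeline this value would be passed to sys.exit().
exit_code = main(["checkout", "search", "recs"], catalog)
```

Requiring an explicit empty list for root services distinguishes "has no upstreams" from "nobody filled this in", which is the case the gate is meant to catch.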
Suppression rules
If service A is in incident state and service B depends on A, suppress B's alerts for the duration of A's incident plus a 5-minute cooldown.
Always log the suppression. The on-call should be able to query "what was suppressed during incident X?" for the postmortem.
Never suppress security alerts or data-loss alerts. The cost of a missed signal there outweighs any noise reduction.
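The three rules above can be sketched together as one decision function. This is an illustration under stated assumptions, not any vendor's API: the alert keys `service` and `category`, the category names in `NEVER_SUPPRESS`, and the in-memory `audit_log` list are all hypothetical.

```python
from datetime import datetime, timedelta

COOLDOWN = timedelta(minutes=5)
NEVER_SUPPRESS = {"security", "data-loss"}  # hypothetical category names


def should_suppress(alert, upstreams, incidents, now, audit_log):
    """Decide whether to suppress `alert`, logging the decision if so.

    alert:     dict with hypothetical keys `service` and `category`
    upstreams: service -> list of direct upstream services
    incidents: service -> (incident_id, started_at, resolved_at or None)
    """
    if alert["category"] in NEVER_SUPPRESS:
        return False
    for upstream in upstreams.get(alert["service"], []):
        if upstream not in incidents:
            continue
        incident_id, started, resolved = incidents[upstream]
        # An open incident has no end; a resolved one extends by the cooldown.
        window_end = None if resolved is None else resolved + COOLDOWN
        if started <= now and (window_end is None or now <= window_end):
            audit_log.append({
                "incident_id": incident_id,
                "suppressed_service": alert["service"],
                "category": alert["category"],
                "at": now,
            })
            return True
    return False


def suppressed_during(audit_log, incident_id):
    """Answer 'what was suppressed during incident X?' for the postmortem."""
    return [e for e in audit_log if e["incident_id"] == incident_id]


t0 = datetime(2024, 1, 1, 12, 0)
upstreams = {"checkout": ["primary-db"]}
incidents = {"primary-db": ("INC-1", t0, t0 + timedelta(minutes=20))}
log = []

latency = {"service": "checkout", "category": "latency"}
during = should_suppress(latency, upstreams, incidents, t0 + timedelta(minutes=10), log)
in_cooldown = should_suppress(latency, upstreams, incidents, t0 + timedelta(minutes=24), log)
after = should_suppress(latency, upstreams, incidents, t0 + timedelta(minutes=26), log)
security = should_suppress(
    {"service": "checkout", "category": "security"},
    upstreams, incidents, t0 + timedelta(minutes=10), log,
)
```

The exemption check runs first so that no graph state, stale or otherwise, can hide a security or data-loss alert.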
When suppression backfires
Stale dependency graphs suppress real alerts. Always include a kill switch to disable suppression during a major incident.
Bidirectional dependencies (rare but real) confuse simple rule engines. Map them explicitly or use a graph-aware engine.
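As a rough sketch of what "map them explicitly" and "graph-aware" could mean in practice: the first function lists pairs of services that depend on each other directly, and the second walks upstreams transitively without looping forever on a cycle. Service names are made up, and longer cycles (A to B to C to A) would need a proper strongly-connected-components pass, which this sketch does not attempt.

```python
def mutual_dependencies(upstreams):
    """Return sorted pairs (a, b) where a depends on b and b depends on a."""
    pairs = set()
    for svc, ups in upstreams.items():
        for up in ups:
            if svc != up and svc in upstreams.get(up, []):
                pairs.add(tuple(sorted((svc, up))))
    return sorted(pairs)


def transitive_upstreams(service, upstreams):
    """All services reachable upstream of `service`, tolerating cycles."""
    seen = set()
    stack = list(upstreams.get(service, []))
    while stack:
        node = stack.pop()
        if node in seen or node == service:
            continue  # already visited, or the cycle led back to the start
        seen.add(node)
        stack.extend(upstreams.get(node, []))
    return seen


upstreams = {
    "auth": ["users"],
    "users": ["auth", "primary-db"],  # auth <-> users is bidirectional
    "checkout": ["auth"],
}

print(mutual_dependencies(upstreams))
print(sorted(transitive_upstreams("checkout", upstreams)))
```

Surfacing mutual pairs lets a team decide deliberately which side's alerts, if either, should be suppressed when the other is in incident state.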
Cross-team dependencies need cross-team postmortems. Suppression that hides another team's incident from them is worse than no suppression.
Get started
Pick the top 5 services by page volume. Map each to its 3 most-critical upstreams.
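Picking the top services can be as simple as counting pages per service from an export of page history. The record shape below (a dict with a `service` key) is an assumption; adapt it to whatever your paging tool exports.

```python
from collections import Counter


def top_services_by_pages(pages, n=5):
    """Return the n services with the most pages as (service, count) pairs."""
    counts = Counter(p["service"] for p in pages)
    return counts.most_common(n)


# Hypothetical page history.
pages = (
    [{"service": "checkout"}] * 3
    + [{"service": "search"}] * 2
    + [{"service": "auth"}]
)

print(top_services_by_pages(pages, n=2))
```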
Configure dependency suppression in PagerDuty Event Orchestration or the equivalent feature in your alerting tool.
Run for one month, then review every suppressed alert in a postmortem. Adjust until the false suppression rate is under 1%.
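For the monthly review, the rate can be computed from the reviewers' verdicts. This sketch assumes "false suppression rate" means the share of suppressed alerts that reviewers judged should have paged; the `should_have_paged` field is a hypothetical name for that verdict.

```python
def false_suppression_rate(reviews):
    """Fraction of suppressed alerts that reviewers say should have paged.

    reviews: list of dicts with a boolean `should_have_paged` verdict.
    """
    if not reviews:
        return 0.0
    false_count = sum(1 for r in reviews if r["should_have_paged"])
    return false_count / len(reviews)


# Hypothetical month: 200 suppressed alerts, one wrongly suppressed.
reviews = [{"should_have_paged": False}] * 199 + [{"should_have_paged": True}]

rate = false_suppression_rate(reviews)
meets_target = rate < 0.01
```

With small sample sizes a single wrong suppression swings the rate a lot, so it helps to read the count alongside the percentage.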