Incident vs Alert: Different Things

An alert is a signal. An incident is the response.

Alerts and incidents are different

An alert is a signal: a threshold was crossed or a check failed. It is cheap to fire and cheap to resolve. An incident is the human response: someone is engaged, status is communicated, and customers may be affected. Conflating the two creates two failure modes: either every alert becomes a manual incident (toil), or real incidents are buried in alert noise.

Aim for a low alert-to-incident ratio

The ratio of alerts to incidents is the signal-to-noise health metric for alerting. A healthy figure is roughly 1 incident per 5-10 alerts; fewer incidents per alert than that means the alerts are noisy, and more means you’re likely missing real problems. Track it per team and per service, and watch the trend, which is more useful than the absolute number: when incidents per alert drop, retire alerts; when they rise, look at recent code changes.
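As a rough sketch of how the 5-10 band could be checked per service, the snippet below counts alerts and incidents by service name and labels each one. The function names, the labels, and the choice to treat "alerts but zero incidents" as noisy are assumptions made for illustration, not part of any particular tool.

```python
from collections import Counter

# Healthy band taken from the text: 1 incident per 5-10 alerts.
HEALTHY_MIN_ALERTS_PER_INCIDENT = 5
HEALTHY_MAX_ALERTS_PER_INCIDENT = 10


def classify_ratio(alert_count: int, incident_count: int) -> str:
    """Label one service's alerts-per-incident figure against the 5-10 band."""
    if incident_count == 0:
        # Assumption: alerts that never led to an incident count as noise.
        return "noisy" if alert_count > 0 else "no-data"
    alerts_per_incident = alert_count / incident_count
    if alerts_per_incident > HEALTHY_MAX_ALERTS_PER_INCIDENT:
        return "noisy"  # too few incidents per alert
    if alerts_per_incident < HEALTHY_MIN_ALERTS_PER_INCIDENT:
        return "under-alerting"  # too many incidents per alert
    return "healthy"


def ratio_by_service(alerts, incidents):
    """alerts and incidents are iterables of service names, one entry per event."""
    alert_counts = Counter(alerts)
    incident_counts = Counter(incidents)
    services = set(alert_counts) | set(incident_counts)
    return {s: classify_ratio(alert_counts[s], incident_counts[s]) for s in sorted(services)}
```

Running this over the same window each week and comparing the labels gives the trend, which matters more than any single reading.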

Automate the alert-to-incident step

Auto-create incidents from alerts, but not from every alert. PagerDuty, Incident.io, and FireHydrant can create incidents from alerts based on severity, label, or count over time. The rule must encode explicit escalation criteria; for example, 3 sev1 alerts on the same service within 10 minutes trigger an incident with status page integration.
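The sample rule (3 sev1 alerts on the same service in 10 minutes) can be sketched as a sliding window. This is a minimal illustration, not how any of the named tools implement it: the `EscalationRule` class and its `observe` method are hypothetical, and the sketch assumes alerts arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class EscalationRule:
    """Open an incident after `threshold` alerts of `severity` on one service within `window`."""

    def __init__(self, severity="sev1", threshold=3, window=timedelta(minutes=10)):
        self.severity = severity
        self.threshold = threshold
        self.window = window
        self._recent = defaultdict(deque)  # service -> timestamps of matching alerts

    def observe(self, service: str, severity: str, fired_at: datetime) -> bool:
        """Record one alert; return True if it should open an incident."""
        if severity != self.severity:
            return False  # other severities never escalate under this rule
        recent = self._recent[service]
        recent.append(fired_at)
        # Drop alerts that have aged out of the window (assumes in-order arrival).
        while recent and fired_at - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.threshold:
            recent.clear()  # reset so the same burst does not open a second incident
            return True
        return False
```

Keeping the window per service is what stops unrelated sev1 alerts on different services from adding up to one incident.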

What gets a postmortem

Postmortems are for incidents, not alerts. A noisy alert is a cleanup item, not a postmortem. Sev1 incidents always get a postmortem; sev2 incidents get one when they exceed time-to-resolve targets; customer-impacting events get one regardless of severity. Track incident counts over time: alert counts are noise, incident counts are signal.
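The postmortem criteria above reduce to a small predicate. The sketch below is one possible encoding; the `Incident` fields and the 4-hour sev2 target are placeholders chosen for illustration, since the text does not specify a target.

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical time-to-resolve target; substitute your own.
RESOLVE_TARGETS = {"sev2": timedelta(hours=4)}


@dataclass
class Incident:
    severity: str
    time_to_resolve: timedelta
    customer_impacting: bool


def needs_postmortem(incident: Incident) -> bool:
    """Apply the three criteria: sev1, customer impact, or sev2 over target."""
    if incident.severity == "sev1":
        return True
    if incident.customer_impacting:
        return True  # regardless of severity
    target = RESOLVE_TARGETS.get(incident.severity)
    return target is not None and incident.time_to_resolve > target
```

Alerts never reach this function at all; only records that have already become incidents are candidates.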

How to introduce the distinction

Three steps introduce the distinction. First, pick a tool (PagerDuty’s incident object, Incident.io’s full workflow, or a homegrown table). Second, define the auto-creation rule: which alerts auto-create incidents and which require a human. Third, train the rotation so that the post-shift question is “did this become an incident?”, not “did you get paged?”.