Incident vs Alert: Different Things
An alert is a signal. An incident is the response.
Alerts and incidents are different
An alert is a signal: a threshold was crossed, a check failed. Cheap to fire, cheap to resolve.
An incident is the human response: someone is engaged, status is communicated, customers may be affected.
Conflating them creates two failure modes: every alert becomes a manual incident (toil), or real incidents are buried in alert noise.
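The distinction is easiest to keep when the two are separate objects in your tooling. A minimal sketch, with hypothetical field names (nothing here is a vendor schema): an alert is a cheap machine-generated record, while an incident carries the human-response fields and references the alerts that led to it.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Alert:
    # A signal: a threshold was crossed, a check failed.
    service: str
    severity: str           # e.g. "sev1", "sev2"
    fired_at: datetime
    resolved: bool = False  # cheap to fire, cheap to resolve

@dataclass
class Incident:
    # The human response.
    service: str
    commander: str                 # someone is engaged
    status_update: str             # status is communicated
    alerts: list = field(default_factory=list)  # the signals behind it
```

Keeping them as separate types (or separate tables) is what makes the alert-to-incident ratio below measurable at all.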
Watch the alert-to-incident ratio
Healthy ratio: roughly one incident per 5-10 alerts. Many more alerts per incident than that, and your alerts are noisy. Near one-to-one, and you are either escalating everything by hand or not alerting on real problems.
Track the ratio per team and per service. The trend is more useful than the absolute number.
When the ratio rises, retire noisy alerts. When it drops, more alerts are turning into real incidents: look at recent code changes.
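Computing the ratio per service is a one-liner over whatever event store you have. A sketch, assuming alerts and incidents are simply lists of service names pulled from your tooling (the function name and input shape are illustrative):

```python
from collections import Counter

def alert_to_incident_ratio(alerts, incidents):
    # Alerts per incident, keyed by service name.
    alert_counts = Counter(alerts)
    incident_counts = Counter(incidents)
    return {
        svc: alert_counts[svc] / incident_counts[svc]
        for svc in incident_counts  # services with zero incidents are skipped
    }

ratios = alert_to_incident_ratio(
    alerts=["api"] * 8 + ["db"] * 2,
    incidents=["api", "db"],
)
# ratios == {"api": 8.0, "db": 2.0}
```

Plot this weekly per team and per service; as the text says, the trend matters more than the absolute number.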
Automate the alert-to-incident step
PagerDuty, Incident.io, and FireHydrant can all create incidents from alerts based on rules: severity, labels, or alert counts over a time window.
Don't auto-create incidents for every alert. That defeats the purpose. Create only when escalation criteria are met.
Sample rule: three sev1 alerts on the same service within 10 minutes trigger an incident with status page integration.
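If you are rolling the rule yourself rather than configuring a vendor, it is a sliding-window count. A minimal sketch, not tied to any vendor API; the class and method names are hypothetical:

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)
THRESHOLD = 3  # sev1 alerts within the window

class EscalationRule:
    def __init__(self):
        # service -> timestamps of recent sev1 alerts
        self._recent = defaultdict(deque)

    def on_alert(self, service: str, severity: str, at: datetime) -> bool:
        """Return True when this alert should open an incident."""
        if severity != "sev1":
            return False
        window = self._recent[service]
        window.append(at)
        # Drop alerts that fell outside the 10-minute window.
        while window and at - window[0] > WINDOW:
            window.popleft()
        return len(window) >= THRESHOLD
```

When `on_alert` returns True, your automation opens the incident and posts the status page update; everything below the threshold stays an alert.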
What gets a postmortem
Incidents, not alerts. A noisy alert is a cleanup item, not a postmortem.
Sev1 incidents always. Sev2 incidents that exceed time-to-resolve targets. Customer-impacting events regardless of severity.
Track incident counts over time. Alert counts are noise; incident counts are signal.
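The postmortem criteria above are mechanical enough to encode, which keeps the decision out of the hands of whoever happens to be tired at the end of the incident. A sketch; the four-hour sev2 target is an assumed placeholder, not a standard:

```python
from datetime import timedelta

SEV2_TTR_TARGET = timedelta(hours=4)  # assumed target; tune per team

def needs_postmortem(severity: str, time_to_resolve: timedelta,
                     customer_impacting: bool) -> bool:
    if customer_impacting:
        return True   # customer impact, regardless of severity
    if severity == "sev1":
        return True   # sev1 incidents always
    if severity == "sev2" and time_to_resolve > SEV2_TTR_TARGET:
        return True   # sev2 only when it blew its time-to-resolve target
    return False      # everything else is a cleanup item at most
```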
How to introduce the distinction
Pick a tool. PagerDuty's incident object, Incident.io's full workflow, or a homegrown table.
Define the auto-creation rule: which alerts auto-create incidents, which require a human.
Train the rotation on the distinction. "Did this become an incident?" is the post-shift question, not "did you get paged?"
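If you go the homegrown-table route, the schema only needs to preserve the distinction: alerts and incidents in separate tables, linked many-to-one, so the alert-to-incident ratio falls out of a join. A sketch using SQLite; table and column names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE incidents (
    id          INTEGER PRIMARY KEY,
    service     TEXT NOT NULL,
    severity    TEXT NOT NULL,
    opened_at   TEXT NOT NULL,
    resolved_at TEXT
);
CREATE TABLE alerts (
    id          INTEGER PRIMARY KEY,
    service     TEXT NOT NULL,
    severity    TEXT NOT NULL,
    fired_at    TEXT NOT NULL,
    -- NULL means the alert never escalated to an incident
    incident_id INTEGER REFERENCES incidents(id)
);
""")
```

A nullable `incident_id` makes the post-shift question queryable: alerts with a non-NULL link became incidents; the rest were just signals.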