Frequency vs Severity: Reading the Incident Mix
Counting incidents misses the picture. The mix matters more than the total.
The mix
Incident totals mislead; the severity mix shows actual health. Many small incidents look busy but are usually healthy: lots of small things, nothing systemic. A handful of incidents that are all large looks calm, but it suggests that classification or detection is missing the small ones. Publish the severity histogram every quarter so the team's internal conversation is about the real shape of the data.
- Many sev-3s, zero sev-1s: healthy. Lots of small things, nothing systemic. Quarterly histogram confirms.
- Few sev-3s, many sev-1s: unhealthy. Small things are hiding; classification or detection is missing them, so only the big ones get recorded. Investigate why.
- Published severity histogram. Visible chart each quarter. Supports honest internal conversation about shape.
- Comparison to peer teams. Relative-mix view per team. Catches outlier patterns that team-internal review misses.
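As a rough illustration of the per-quarter histogram described above, here is a minimal Python sketch. It assumes incidents are available as (opened date, severity label) pairs; the record shape, the labels "sev-1" to "sev-3", and the function names are hypothetical and not taken from any particular incident tracker.

```python
from collections import Counter, defaultdict
from datetime import date


def quarter_of(d: date) -> str:
    """Label a date with its calendar quarter, e.g. 2024-Q1."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def severity_histogram(incidents):
    """Count incidents per severity within each quarter.

    `incidents` is an iterable of (opened_date, severity) pairs.
    """
    hist = defaultdict(Counter)
    for opened, severity in incidents:
        hist[quarter_of(opened)][severity] += 1
    return {quarter: dict(counts) for quarter, counts in hist.items()}


def severity_shares(counts):
    """Turn raw counts into shares so teams of different sizes can be compared."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {severity: n / total for severity, n in counts.items()}


# Made-up sample data, for illustration only.
incidents = [
    (date(2024, 1, 15), "sev-3"),
    (date(2024, 2, 3), "sev-3"),
    (date(2024, 3, 20), "sev-2"),
    (date(2024, 4, 2), "sev-1"),
    (date(2024, 5, 9), "sev-3"),
]

histogram = severity_histogram(incidents)
for quarter in sorted(histogram):
    print(quarter, histogram[quarter], severity_shares(histogram[quarter]))
```

Working in shares rather than raw counts is one way to build the relative-mix view for peer comparison, since a larger team will naturally log more incidents in total.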
Read it
Reading the mix is a quarterly practice with a named owner. Use the histogram for shape, the trend chart for drift, and the comparison to the prior quarter for shift detection. Without a named owner the chart decays into "we have a chart somewhere" and nobody actually reads it.
- Quarterly histogram. Severity-count distribution per quarter. Drives the "is the mix healthy?" check.
- Watch for shifts. Comparison to prior quarter. Fewer total incidents at higher severity is a different problem from fewer incidents at every severity.
- Multi-quarter trend chart. Rolling view updated each quarter. Catches slow drift before it becomes the new normal.
- Named chart owner. One responsible analyst each quarter. Catches "we did not actually look."
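The prior-quarter comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed method: the per-quarter counts are plain dicts keyed by severity label, "sev-1" is treated as the only high severity, and the function names are hypothetical.

```python
def high_severity_share(counts, high=("sev-1",)):
    """Fraction of a quarter's incidents that fall in the high-severity labels."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts.get(s, 0) for s in high) / total


def describe_shift(prev, curr, high=("sev-1",)):
    """Compare two quarters: change in total count and in high-severity share."""
    return {
        "total_delta": sum(curr.values()) - sum(prev.values()),
        "high_share_delta": high_severity_share(curr, high)
        - high_severity_share(prev, high),
    }


def fewer_but_worse(prev, curr, high=("sev-1",)):
    """True when the total fell but the high-severity share rose."""
    shift = describe_shift(prev, curr, high)
    return shift["total_delta"] < 0 and shift["high_share_delta"] > 0


# Made-up counts, for illustration only.
prior = {"sev-1": 1, "sev-2": 4, "sev-3": 15}
current = {"sev-1": 3, "sev-2": 3, "sev-3": 6}

print(describe_shift(prior, current))
print("fewer but worse:", fewer_but_worse(prior, current))
```

Keeping the total and the share as separate numbers is what lets the owner tell "fewer but more severe" apart from "fewer everywhere"; a single combined score would blur the two.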
Act
Acting on the mix is shape-driven. More small incidents points to alerting that needs tuning; more large incidents points to systemic fragility that needs architectural investment. Documented response per shift catches the "we noticed the trend but did not act" failure mode; quarterly action review closes the loop.
- More small incidents: tune alerts. Alert calibration per team. Classification is likely too sensitive; many sev-3s are not real incidents.
- More large incidents: systemic fragility. Architectural review per team. Reliability investment, not alert tuning.
- Documented response per shift. Named follow-up per shift. Catches "we noticed but did not act."
- Quarterly action review. Prior-action retro per quarter. Continuous improvement of the response.
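The shape-driven response and the follow-up tracking above could be sketched as follows. This is a minimal illustration: the mapping of "sev-3" to small and "sev-1" to large, the action labels, and the `FollowUp` record are all assumptions made for the example.

```python
from dataclasses import dataclass

SMALL = ("sev-3",)  # assumed labels for "small" incidents
LARGE = ("sev-1",)  # assumed labels for "large" incidents


def count(counts, severities):
    """Sum the incident counts for the given severity labels."""
    return sum(counts.get(s, 0) for s in severities)


def suggest_actions(prev, curr):
    """Map the direction of a quarter-over-quarter shift to a response."""
    actions = []
    if count(curr, LARGE) > count(prev, LARGE):
        actions.append("architectural review")
    if count(curr, SMALL) > count(prev, SMALL):
        actions.append("alert tuning")
    return actions


@dataclass
class FollowUp:
    """One documented response to an observed shift, with a named owner."""
    quarter: str
    action: str
    owner: str
    done: bool = False


def open_follow_ups(follow_ups):
    """Items the quarterly action review still has to close."""
    return [f for f in follow_ups if not f.done]


# Made-up data, for illustration only.
prior = {"sev-1": 1, "sev-3": 10}
current = {"sev-1": 1, "sev-3": 18}

log = [FollowUp("2024-Q2", action, owner="analyst-on-rota")
       for action in suggest_actions(prior, current)]
print(open_follow_ups(log))
```

Recording each response as its own item with an owner gives the quarterly review something concrete to check: anything still open from the prior quarter is a case of "we noticed but did not act."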