Incident Severity Rubric That Survives Real Pressure
Most severity rubrics fall apart in the moment. This four-quadrant model holds up at 3 AM and produces consistent decisions across teams.
The two axes
Customer impact: how many customers are affected, how badly.
Time sensitivity: how long until impact escalates without intervention.
Cross the two axes and you get four quadrants, one per severity level from sev 1 to sev 4. Each quadrant has a clear definition.
Examples per quadrant
Sev 1: many customers, escalating fast (e.g., login is broken globally). All hands; war room.
Sev 2: many customers, stable (e.g., a feature is degraded but not getting worse). Standard incident response.
Sev 3: few customers, escalating fast (e.g., one customer's data is at risk). Targeted response; faster than sev 4.
Sev 4: few customers, stable (e.g., a non-critical feature has a bug). Bugfix queue.
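The quadrant mapping above is small enough to write down as a lookup table. The following is a minimal sketch, not a prescribed implementation: the names Reach, Trend, SEVERITY, RESPONSE, and classify are hypothetical, and it assumes the responder has already judged each axis as one of two values.

```python
from enum import Enum


class Reach(Enum):
    """Customer impact axis (hypothetical name): how many customers are affected."""
    FEW = "few"
    MANY = "many"


class Trend(Enum):
    """Time sensitivity axis (hypothetical name): is impact escalating without intervention."""
    STABLE = "stable"
    ESCALATING = "escalating"


# One entry per quadrant, copied from the examples above.
SEVERITY = {
    (Reach.MANY, Trend.ESCALATING): 1,
    (Reach.MANY, Trend.STABLE): 2,
    (Reach.FEW, Trend.ESCALATING): 3,
    (Reach.FEW, Trend.STABLE): 4,
}

# Response per severity level, as stated in the rubric.
RESPONSE = {
    1: "All hands; war room",
    2: "Standard incident response",
    3: "Targeted response; faster than sev 4",
    4: "Bugfix queue",
}


def classify(reach: Reach, trend: Trend) -> int:
    """Return the severity level for one quadrant."""
    return SEVERITY[(reach, trend)]


# Login broken globally: many customers, escalating fast.
sev = classify(Reach.MANY, Trend.ESCALATING)
print(f"sev {sev}: {RESPONSE[sev]}")
```

Keeping the mapping as a table rather than nested conditionals means a rubric update is a one-line data change, and every team reads the same four entries.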
Consistency across teams
The rubric is the same across all teams. Cross-team incidents are easier when sev definitions match.
Calibrate quarterly: review the last 30 incidents and look for misclassifications, such as sev 2s that should have been sev 1 and sev 4s that should have been sev 3. Update the rubric based on what you find.
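The calibration review can be reduced to a tally of how often the severity assigned during the incident differs from the severity the review settles on. This is a rough sketch under assumptions: the Review record, its field names, the calibration_drift helper, and the sample incident IDs are all hypothetical, not data from a real review.

```python
from collections import Counter
from typing import NamedTuple


class Review(NamedTuple):
    """One reviewed incident (hypothetical record shape)."""
    incident_id: str
    assigned_sev: int   # severity chosen during the incident
    reviewed_sev: int   # severity the quarterly review says it should have been


def calibration_drift(reviews: list[Review]) -> Counter:
    """Count (assigned, reviewed) pairs where the two severities disagree."""
    return Counter(
        (r.assigned_sev, r.reviewed_sev)
        for r in reviews
        if r.assigned_sev != r.reviewed_sev
    )


# Made-up sample; a real review would cover the last 30 incidents.
reviews = [
    Review("INC-101", 2, 1),
    Review("INC-102", 4, 3),
    Review("INC-103", 4, 4),
    Review("INC-104", 2, 1),
    Review("INC-105", 3, 3),
]

for (assigned, reviewed), count in calibration_drift(reviews).most_common():
    print(f"sev {assigned} that should have been sev {reviewed}: {count}")
```

A pair that keeps recurring, such as sev 2 reviewed as sev 1, points at the axis whose definition needs tightening (here, time sensitivity).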
Document the rubric in one place. Link it from every incident channel and post it on the wall in the war room.