Incident Cost vs Prevention Cost
When does prevention pay? The math that's defensible to leadership.
Expected cost
Expected cost is the probability-weighted impact: how often the incident class fires, times what each occurrence costs. Frequency matters as much as severity. The same incident shape can be a $500/year nuisance (rare) or a $50k/year drain (frequent); the math determines whether prevention pays.
- Probability times impact. Expected-value math per incident class. Drives the "is this worth fixing?" answer.
- Frequency as a first-class input. Rare-vs-frequent split changes the answer entirely. Capture both axes per class.
- Documented historical baseline. Per-class incident frequency from the actual archive. Supports honest probability estimates rather than gut numbers.
- Customer-segment weighting. Tier the impact by which customers feel it. Prioritisation follows business impact, not raw count.
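The expected-cost arithmetic above can be sketched in a few lines. This is a minimal illustration, not a prescribed model: the `IncidentClass` name, its fields, and the `segment_weight` multiplier are assumptions introduced here, and the dollar figures are chosen only to reproduce the $500/year and $50k/year contrast.

```python
from dataclasses import dataclass


@dataclass
class IncidentClass:
    """One incident class, with both axes captured (hypothetical structure)."""
    name: str
    incidents_per_year: float    # frequency, taken from the historical archive
    cost_per_incident: float     # impact in dollars per occurrence
    segment_weight: float = 1.0  # assumed multiplier for which customers feel it

    def expected_annual_cost(self) -> float:
        # Probability-weighted impact: frequency times (weighted) severity.
        return self.incidents_per_year * self.cost_per_incident * self.segment_weight


def frequency_from_archive(incident_count: int, years_observed: float) -> float:
    """Documented baseline: observed incidents divided by the observation window."""
    if years_observed <= 0:
        raise ValueError("years_observed must be positive")
    return incident_count / years_observed


# Same incident shape ($1,000 per occurrence), different frequency.
rare = IncidentClass("stale-cache", incidents_per_year=0.5, cost_per_incident=1_000)
frequent = IncidentClass("stale-cache", incidents_per_year=50, cost_per_incident=1_000)

print(rare.expected_annual_cost())      # 500.0
print(frequent.expected_annual_cost())  # 50000.0
```

Keeping frequency and per-incident cost as separate fields, rather than storing only their product, is what lets a later review update one axis without re-estimating the other.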
Prevention cost
Prevention cost is engineering time times burdened rate, plus any recurring operational burden the prevention introduces. Some preventions are config-only and effectively free; some are multi-quarter projects with real ongoing cost. Both axes belong in the math.
- Engineering time times rate. Documented dollar cost per prevention. Pulled from real estimates, not vibes.
- Free preventions exist. Config-only fixes cost essentially nothing, so they are worth doing whenever they reduce expected loss at all.
- Multi-quarter projects exist too. Real cost, honest planning. Do not pretend the prevention is free because it is desirable.
- Recurring operational cost. Per-prevention ongoing burden. Catches "ship and forget" assumptions that hide the real cost.
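A rough sketch of the prevention-cost side, under stated assumptions: the function name, its parameters, and the idea of totalling cost over a fixed `horizon_years` window are illustrative choices, and the hours and rates below are made-up inputs rather than real estimates.

```python
def prevention_cost(
    engineering_hours: float,
    burdened_hourly_rate: float,
    recurring_annual_cost: float = 0.0,
    horizon_years: float = 1.0,
) -> float:
    """Total prevention cost over the horizon (hypothetical helper).

    One-time cost is engineering time times burdened rate; the recurring
    operational burden is added for every year in the horizon.
    """
    if horizon_years <= 0:
        raise ValueError("horizon_years must be positive")
    one_time = engineering_hours * burdened_hourly_rate
    return one_time + recurring_annual_cost * horizon_years


# Config-only fix: two hours of work, no ongoing burden.
config_fix = prevention_cost(engineering_hours=2, burdened_hourly_rate=150)

# Multi-quarter project: 1,000 hours plus $20k/year to operate, over 3 years.
project = prevention_cost(
    engineering_hours=1_000,
    burdened_hourly_rate=150,
    recurring_annual_cost=20_000,
    horizon_years=3,
)

print(config_fix)  # 300
print(project)     # 210000
```

Making the recurring term an explicit parameter means a "ship and forget" estimate shows up as a visible zero instead of a silent omission.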
Decide
Compare prevention cost to expected loss over the same time horizon. Invest when prevention is cheaper; accept the risk explicitly when it is not. Document the rationale per decision and review quarterly, so that a change in incident frequency updates the decision before the next outage does.
- Prevention cheaper than expected loss. Invest decision per class. Math says yes; engineering plans the work.
- Prevention more expensive. Accept-the-risk decision, documented explicitly. Not a default from inattention.
- Documented rationale per decision. Named driver for invest or accept. Catches "we never actually decided" defaults six months later.
- Quarterly decision review. Re-evaluate prior decisions each quarter and re-accept only those that still hold. Catches drift in incident frequency before the math flips silently.
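The decision rule above could be recorded along these lines. This is a sketch under simplifying assumptions: the `Decision` record and `decide` function are hypothetical names, the prevention is assumed to eliminate the incident class entirely, and "quarterly" is approximated as 91 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Decision:
    incident_class: str
    action: str      # "invest" or "accept"
    rationale: str   # named driver, written down at decision time
    decided_on: date
    review_by: date  # next quarterly review


def decide(
    incident_class: str,
    expected_annual_loss: float,
    prevention_cost_total: float,
    horizon_years: float,
    today: date,
) -> Decision:
    # Compare both sides over the same horizon.
    # Assumption: the prevention removes the incident class completely.
    expected_loss = expected_annual_loss * horizon_years
    action = "invest" if prevention_cost_total < expected_loss else "accept"
    rationale = (
        f"prevention ${prevention_cost_total:,.0f} vs "
        f"expected loss ${expected_loss:,.0f} over {horizon_years} year(s)"
    )
    return Decision(
        incident_class=incident_class,
        action=action,
        rationale=rationale,
        decided_on=today,
        review_by=today + timedelta(days=91),  # approximate quarter
    )


d = decide("stale-cache", expected_annual_loss=50_000,
           prevention_cost_total=210_000, horizon_years=3,
           today=date(2024, 1, 1))
print(d.action, "|", d.rationale, "| review by", d.review_by)
```

Because "accept" is an explicit value with a rationale and a review date attached, an accepted risk is distinguishable from one nobody looked at. Rerunning `decide` at each review with the updated frequency is what surfaces a flipped conclusion.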