The Acceptable-Loss Conversation Every SRE Team Must Have
Some failures cannot be prevented at acceptable cost. This conversation surfaces which ones are acceptable, who agrees to that, and how the agreement is documented.
The framing
Reliability is bounded by economics. Some failures cost more to prevent than they cost to absorb; the acceptable-loss conversation makes that explicit.
- Not all failures are preventable. Some require investment that the business will not fund, and that is a legitimate answer.
- Frame the question. Ask which failures are acceptable, at what budget, with what mitigation, and signed off by whom.
- Without it. Every failure feels unacceptable; the team chases impossible standards and burns out.
- Stakeholders. Engineering, product, support, and a finance representative; without finance the trade-off is incomplete.
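One way to put numbers in front of the stakeholders is to compare the expected yearly cost of absorbing a failure with the yearly cost of preventing it. The sketch below is illustrative only: the `FailureMode` class, its field names, and every figure in it are hypothetical, and a real estimate would come from finance and incident history rather than from constants in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """Hypothetical record of one failure class and its estimated costs."""
    name: str
    incidents_per_year: float      # estimated frequency (assumption)
    cost_per_incident: float       # cost to absorb one occurrence (assumption)
    annual_prevention_cost: float  # annualized cost to eliminate it (assumption)

    def expected_annual_loss(self) -> float:
        # Expected loss = frequency x cost of a single occurrence.
        return self.incidents_per_year * self.cost_per_incident

    def cheaper_to_absorb(self) -> bool:
        # True when absorbing the failure costs less than preventing it.
        return self.expected_annual_loss() < self.annual_prevention_cost


# Made-up numbers, purely to show the shape of the comparison.
blip = FailureMode(
    name="5-second failover blip",
    incidents_per_year=4,
    cost_per_incident=2_000.0,
    annual_prevention_cost=150_000.0,
)

print(blip.expected_annual_loss())  # 8000.0
print(blip.cheaper_to_absorb())     # True
```

The comparison is an input to the conversation, not a substitute for it. Failure classes the business treats as never acceptable are deliberately kept out of this arithmetic.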
Examples
The conversation gets concrete only with examples. Four classes from the wild help anchor the discussion when you run it for real.
- 5-second failover blip. Acceptable up to once per quarter; eliminating it would cost months of engineering for marginal benefit.
- Regional outage during vendor incident. Acceptable; multi-cloud mitigation may not justify its operational cost.
- Customer-data exposure of a specific class. Never acceptable; engineering effort here is not capped by a budget, so the cost question does not apply.
- Sub-second degradation during deploy. Acceptable when canary catches the bad rev; the goal is fast detection, not zero impact.
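A tolerance such as "up to once per quarter" is only useful if someone checks it. The following is a minimal sketch of that check, assuming incidents are available as a list of dates; the function names and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def quarter_of(d: date) -> tuple[int, int]:
    """Map a date to (year, quarter), with quarters numbered 1 to 4."""
    return (d.year, (d.month - 1) // 3 + 1)


def quarters_over_tolerance(
    incidents: list[date], max_per_quarter: int
) -> list[tuple[int, int]]:
    """Return the quarters in which incidents exceeded the agreed limit."""
    counts = Counter(quarter_of(d) for d in incidents)
    return sorted(q for q, n in counts.items() if n > max_per_quarter)


# Hypothetical failover blips: one in Q1, two in Q2, one in Q4.
blips = [
    date(2024, 1, 15),
    date(2024, 5, 2),
    date(2024, 5, 20),
    date(2024, 11, 3),
]

print(quarters_over_tolerance(blips, max_per_quarter=1))  # [(2024, 2)]
```

A quarter that shows up in the result is a signal to revisit the agreement, since the failure is now happening more often than the rate that was accepted.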
Document the agreement
Verbal agreements rot. Write the conclusions down so the next on-call inherits them and the next budget cycle can revisit them with context.
- Per-item record. What it is, why it is acceptable, what the mitigation is, what would change the answer.
- Annual review. Risk tolerances shift; the document follows the business, not the other way around.
- Visible to the team. On-call knows what is in scope and what is out of scope before the page fires.
- Sign-off trail. Named approver per item; future arguments resolve by reading the trail, not by re-running the debate.
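The per-item record can live in whatever the team already uses, whether a wiki page, a YAML file, or a ticket. As one possible shape, the sketch below models a record with the fields listed above plus a staleness check for the annual review. The `AcceptedRisk` class, its field names, the approver name, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class AcceptedRisk:
    """Hypothetical record for one documented acceptable-loss decision."""
    what: str            # what the failure is
    why_acceptable: str  # why the business accepts it
    mitigation: str      # what limits the impact
    revisit_if: str      # what would change the answer
    approver: str        # named approver for the sign-off trail
    signed_off: date     # date of the most recent sign-off

    def review_due(self, today: date, max_age_days: int = 365) -> bool:
        # Due when the last sign-off is older than the review interval.
        return today - self.signed_off > timedelta(days=max_age_days)


record = AcceptedRisk(
    what="Regional outage during vendor incident",
    why_acceptable="Multi-cloud mitigation may not justify its operational cost",
    mitigation="Status page update and customer communication",  # assumption
    revisit_if="Vendor incident frequency or contract terms change",  # assumption
    approver="jane.doe",  # hypothetical name
    signed_off=date(2024, 3, 1),
)

print(record.review_due(date(2024, 9, 1)))  # False
print(record.review_due(date(2025, 4, 1)))  # True
```

Keeping the approver and sign-off date on each record, rather than on the document as a whole, lets individual items be revisited without reopening the entire agreement.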