Quantifying Reliability's Impact on Revenue (Without Hand-Waving)
"Reliability matters" loses budget arguments to revenue projections. The way to win the argument is to quantify reliability in revenue terms, not to argue about quality.
The frame
The CFO does not buy "reliability matters." The CFO buys "this incident cost us $X in revenue, and the proposed work reduces the next incident's cost by Y%." Until you can fill in X and Y, you are arguing in adjectives.
The translation problem. Engineering knows reliability matters; finance needs to see it in its own language. The translation isn't optional: every reliability project competes with other engineering investments for budget, and qualitative arguments lose to quantitative ones.
The pleasant surprise. Once the translation is done, reliability projects often look better than they did intuitively. A project that "feels worth doing" might also be the highest-ROI project on the engineering roadmap; the quantification surfaces that.
Two numbers
Revenue per minute during peak hours, and the typical duration of an incident. Multiply. That is the cost of the average incident at peak. Teams that have never run this calculation usually misjudge the figure by a factor of 3-10 in either direction.
The numbers' source. Revenue per minute: total monthly revenue divided by the number of minutes in the month (43,200 for a 30-day month), then multiplied by the peak-hour ratio, meaning the peak revenue rate over the average rate (peaks usually run 2-4x average). Incident duration: the average of last quarter's SEV2+ durations from your incident-management tool.
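A minimal sketch of that calculation in Python. The inputs ($3M monthly revenue, a 30-day month, a 3x peak ratio, a 45-minute average duration) are made-up figures for illustration, and the function name is hypothetical.

```python
def peak_revenue_per_minute(monthly_revenue: float, days_in_month: int, peak_ratio: float) -> float:
    """Average revenue per minute, scaled up by the peak-to-average ratio."""
    minutes_in_month = days_in_month * 24 * 60
    return monthly_revenue / minutes_in_month * peak_ratio


# Illustrative inputs only; substitute your own figures.
rate = peak_revenue_per_minute(monthly_revenue=3_000_000, days_in_month=30, peak_ratio=3.0)
average_duration_min = 45  # assumed mean SEV2+ duration from last quarter
average_peak_incident_cost = rate * average_duration_min

print(f"peak revenue/minute: ${rate:,.0f}")
print(f"average peak incident: ${average_peak_incident_cost:,.0f}")
```

With these inputs the rate comes to roughly $208 per minute and the average peak incident to roughly $9,375.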
The order-of-magnitude check. Most teams discover their average peak-hour incident costs $10k-$50k in revenue. Annualised across a typical incident frequency (10-30 per year), this is $100k-$1.5M/year. The number is large enough to fund significant reliability investment.
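The annualised range is the two low ends multiplied together and the two high ends multiplied together. A small sketch using the ranges quoted above (the function name is hypothetical):

```python
def annual_cost_range(
    cost_per_incident: tuple[float, float],
    incidents_per_year: tuple[int, int],
) -> tuple[float, float]:
    """Multiply the low ends together and the high ends together."""
    low = cost_per_incident[0] * incidents_per_year[0]
    high = cost_per_incident[1] * incidents_per_year[1]
    return low, high


# $10k-$50k per incident, 10-30 incidents per year.
low, high = annual_cost_range((10_000, 50_000), (10, 30))
print(f"${low:,.0f} to ${high:,.0f} per year")
```

The spread is wide because both inputs are ranges; narrowing either one with your own data narrows the result.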
Attach revenue to specific incidents
Pull last quarter's incidents. For each, estimate the revenue impact using the two numbers above plus the affected user fraction. The total often surprises leadership; "we had eight incidents and they cost us $1.4M" lands differently than "we had eight incidents."
The exercise. List incidents from last quarter (typically 5-15 SEV2+). For each, compute: revenue/minute × duration × affected fraction. The sum is the quarterly cost. Most teams haven't done this; the number usually changes the conversation.
The output. A one-page summary: incident, date, duration, affected fraction, estimated revenue impact. Total at the bottom. Send to your VP of engineering and the CFO. The conversation that follows is dramatically different from "we had a tough quarter."
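The one-page summary can be produced with a few lines of Python. This is a sketch under stated assumptions: the incident names, dates, durations, affected fractions, and the $200/minute rate are all invented for illustration, and the `Incident` class is a hypothetical structure, not an export format from any incident-management tool.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    name: str
    date: str
    duration_min: float
    affected_fraction: float  # 0.0 to 1.0


def revenue_impact(incident: Incident, revenue_per_min: float) -> float:
    """revenue/minute x duration x affected fraction."""
    return revenue_per_min * incident.duration_min * incident.affected_fraction


# Hypothetical rate and incidents, for illustration only.
REVENUE_PER_MIN = 200.0
incidents = [
    Incident("checkout-latency", "2024-01-12", 90, 0.25),
    Incident("auth-outage", "2024-02-03", 40, 1.00),
    Incident("search-degraded", "2024-03-21", 120, 0.05),
]

rows = [(inc, revenue_impact(inc, REVENUE_PER_MIN)) for inc in incidents]
total = sum(cost for _, cost in rows)

# Print the summary table with the total at the bottom.
print(f"{'Incident':<18}{'Date':<12}{'Min':>5}{'Affected':>10}{'Impact':>12}")
for inc, cost in rows:
    print(f"{inc.name:<18}{inc.date:<12}{inc.duration_min:>5.0f}"
          f"{inc.affected_fraction:>10.0%}{cost:>12,.0f}")
print(f"{'Total':<45}{total:>12,.0f}")
```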
Segment by customer
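An incident affecting your top 10 customers costs more than one affecting the long tail, even at the same minute count. The multiplier is the segment's revenue share divided by its user share: if your enterprise tier is 60% of revenue but, say, 10% of users, an incident confined to enterprise customers costs roughly 6x what the affected-user fraction alone suggests.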
The segmentation's lever. Engineering investments can prioritize "incidents that affect enterprise" over "incidents that affect free tier." The investment that prevents an enterprise-affecting incident is worth more than one that prevents a free-tier incident, even if the technical work is similar.
The data needed. Customer segment definitions (free, paid, enterprise). Revenue share per segment. Incident analysis showing which segments were affected. Most teams have all three but haven't connected them.
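One way to connect the three inputs is to define a segment's weight as its revenue share divided by its user share. That definition is an assumption made for this sketch, and the shares below are invented; the enterprise row is chosen so that a 60% revenue share yields a 6x weight.

```python
# Hypothetical segment data: share of revenue and share of users per segment.
SEGMENTS = {
    "enterprise": {"revenue_share": 0.60, "user_share": 0.10},
    "paid":       {"revenue_share": 0.35, "user_share": 0.30},
    "free":       {"revenue_share": 0.05, "user_share": 0.60},
}


def segment_weight(segment: str) -> float:
    """Revenue per user in this segment relative to the overall average."""
    shares = SEGMENTS[segment]
    return shares["revenue_share"] / shares["user_share"]


for name in SEGMENTS:
    print(f"{name:<12}{segment_weight(name):.2f}x")
```

A weight above 1 means an incident confined to that segment costs more than its user count suggests; a weight below 1 means it costs less.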
The simple model
Cost = revenue/min × duration × affected fraction × segment weight. Four numbers. Two you have (revenue/min, duration). Two you estimate (affected fraction, segment weight). The estimates do not need to be precise; the model is more honest than the alternative of not modelling at all.
The estimation discipline. Affected fraction is often hard to compute precisely; use rough buckets (5%, 25%, 75%, 100%). Segment weight is the multiplier from the segment analysis. Both are approximate; the product is order-of-magnitude correct.
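The four-number model with bucketed estimates, as a short sketch. The function names and the example inputs are hypothetical; the buckets are the ones listed above.

```python
BUCKETS = (0.05, 0.25, 0.75, 1.00)


def snap_to_bucket(estimate: float) -> float:
    """Round a rough affected-fraction estimate to the nearest bucket."""
    return min(BUCKETS, key=lambda bucket: abs(bucket - estimate))


def incident_cost(
    revenue_per_min: float,
    duration_min: float,
    affected_fraction: float,
    segment_weight: float = 1.0,
) -> float:
    """Cost = revenue/min x duration x affected fraction x segment weight."""
    return revenue_per_min * duration_min * snap_to_bucket(affected_fraction) * segment_weight


# Illustrative: $200/min, 60 minutes, "about 30%" of users, segment weight of 2.
cost = incident_cost(200, 60, 0.30, segment_weight=2.0)
print(f"estimated cost: ${cost:,.0f}")
```

Snapping to buckets keeps the estimate from implying more precision than the data supports: the "about 30%" guess is treated as 25%.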
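The model's robustness. The estimates are multiplied, so their errors compound: a 50% error on one estimate moves the output by at most a factor of 2, while 50% errors on both in the same direction move it by about 2.25x on the high side (1.5 × 1.5) or 4x on the low side (0.5 × 0.5). That is still the right order of magnitude, which is plenty for budget conversations; better data improves the precision, but the order of magnitude is what matters.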
Moves that move the number
Faster mitigation (reduces duration). Better isolation (reduces affected fraction). Higher-tier prioritisation in incident response (reduces segment weight on the worst incidents). Each one becomes a project with a quantified business case.
The faster-mitigation move. Improving MTTR from 60 minutes to 30 minutes cuts incident cost in half. The work to achieve this (better runbooks, better tooling, faster on-call response) translates directly to revenue protection. Quantify the protection; pitch it.
The better-isolation move. Improving from "incident affects all users" to "incident affects 25% of users" cuts cost by 75%. Architecture work (multi-region, sharding, circuit breakers) often shows up in this category. The investment is large but the protection scales.
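Both moves can be checked against the model directly. A minimal sketch, assuming a hypothetical $200/minute rate and a baseline incident that lasts 60 minutes and affects all users:

```python
def modelled_cost(revenue_per_min, duration_min, affected_fraction, segment_weight=1.0):
    """Cost = revenue/min x duration x affected fraction x segment weight."""
    return revenue_per_min * duration_min * affected_fraction * segment_weight


def reduction(before: float, after: float) -> float:
    """Fraction of the baseline cost that the change removes."""
    return 1 - after / before


baseline = modelled_cost(200, 60, 1.00)           # 60-minute incident, all users
faster_mitigation = modelled_cost(200, 30, 1.00)  # MTTR cut from 60 to 30 minutes
better_isolation = modelled_cost(200, 60, 0.25)   # blast radius cut to 25% of users

print(f"faster mitigation: {reduction(baseline, faster_mitigation):.0%} lower")
print(f"better isolation:  {reduction(baseline, better_isolation):.0%} lower")
```

Because the model is a plain product, the percentage reduction does not depend on the revenue rate; the rate only sets the dollar value of the saving.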
Presenting to leadership
The presentation is short: three slides. First slide: last quarter's incident cost (the surprising number). Second slide: the proposed reliability investment. Third slide: projected impact on next quarter's cost.
The pitfall to avoid. Engineering presents 12 slides of technical detail. The CFO's eyes glaze over. Stick to three slides. Make the dollar amounts prominent; make the asks specific; make the impact projections honest (with explicit uncertainty ranges).
The follow-up. After the project ships, present the actuals. Did the projected impact materialise? Honest accounting builds credibility for the next budget request. Most teams skip this step and lose credibility over time.
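The follow-up comparison is simple arithmetic. A sketch with invented figures (a $1.4M baseline quarter, a projected $900k, an actual $1.05M) and a hypothetical function name:

```python
def impact_review(baseline_cost: float, projected_cost: float, actual_cost: float) -> dict:
    """Compare the saving that was promised with the saving that materialised."""
    projected_saving = baseline_cost - projected_cost
    actual_saving = baseline_cost - actual_cost
    return {
        "projected_saving": projected_saving,
        "actual_saving": actual_saving,
        # 1.0 means the projection was met exactly; below 1.0 means a shortfall.
        "realised_fraction": actual_saving / projected_saving,
    }


# Illustrative figures only.
review = impact_review(baseline_cost=1_400_000, projected_cost=900_000, actual_cost=1_050_000)
print(f"projected saving: ${review['projected_saving']:,.0f}")
print(f"actual saving:    ${review['actual_saving']:,.0f}")
print(f"realised:         {review['realised_fraction']:.0%}")
```

Reporting the realised fraction, including when it falls short, is the honest accounting that makes the next request credible.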
Common antipatterns
The qualitative-only pitch. "Reliability is a competitive advantage." True; loses to quantitative pitches. Always quantify.
Over-precision. 12-decimal-place revenue impact estimates. Hides the model's actual uncertainty. Use round numbers; make the uncertainty explicit ("approximately $1.4M ± 30%").
Cost without revenue framing. "This will save us 100 engineer-hours per quarter." Saves engineering time; doesn't show revenue impact. Translate engineer-hours to dollar value AND mention revenue protection.
One-time presentation. Costs get quantified for one budget cycle and the actuals are never measured. Credibility erodes. Track post-implementation impact and report quarterly.
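For the over-precision antipattern, one option is to round every reported figure to two significant digits and print the range alongside it. A sketch, assuming a ±30% uncertainty and a made-up raw estimate; both function names are hypothetical.

```python
import math


def round_sig(value: float, sig: int = 2) -> float:
    """Round to `sig` significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig - 1 - exponent)


def headline(estimate: float, uncertainty: float = 0.30) -> str:
    """Round the estimate and its low/high bounds for presentation."""
    low = round_sig(estimate * (1 - uncertainty))
    high = round_sig(estimate * (1 + uncertainty))
    return f"approximately ${round_sig(estimate):,.0f} (${low:,.0f} to ${high:,.0f})"


# A raw model output with false precision, for illustration.
print(headline(1_412_337.18))
# approximately $1,400,000 ($990,000 to $1,800,000)
```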
What to do this week
Three moves. (1) Compute revenue/minute during peak. The number is one division and one multiplication in a spreadsheet; it takes 10 minutes. (2) Pull last quarter's SEV2+ incidents; estimate revenue impact for each. The exercise takes a few hours and produces a one-page report. (3) Send the report to your VP of engineering with a one-paragraph summary. The conversation it triggers is what unlocks the next reliability project.