SLO & Reliability Practical · By Samson Tanimawo, PhD · Published Jun 11, 2025 · 4 min read

SLO Confidence Intervals

SLO measurements have uncertainty.

Idea

Every SLO number you publish is a sample, not a population. When you say "checkout was 99.95% available last month", what you really observed is some count of successful requests over some count of total requests, and the true underlying availability of the service is some other number that you cannot know exactly. The width of the gap between observed and true is a function of how many requests you saw.
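The sample-vs-population point is easy to demonstrate with a simulation. Holding a hypothetical "true" availability fixed and drawing a year of monthly traffic, the observed number moves around on its own; the rate, volume, and seed below are made-up values for illustration:

```python
import random

def simulate_month(true_availability: float, requests: int, rng: random.Random) -> float:
    """Observed availability for one month, given the (unknowable) true rate."""
    failures = sum(rng.random() > true_availability for _ in range(requests))
    return 1 - failures / requests

# Twelve simulated months of a low-volume service whose true rate never changes:
rng = random.Random(7)
for month in range(1, 13):
    observed = simulate_month(0.9995, 20_000, rng)
    print(f"month {month:2d}: {observed:.4%}")
```

Every month-to-month difference in that output is pure sampling noise; the underlying service never changed.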

The honest way to report this is a confidence interval: publish the observed rate together with a range that reflects how far the true availability could plausibly sit from it, given your request volume.
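As a minimal sketch of such an interval, here is the Wilson score interval (the method this article mentions later), taking a successes/total pair as input:

```python
import math

def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total == 0:
        return (0.0, 1.0)  # no data: the true rate could be anything
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
    return (max(0.0, center - half), min(1.0, center + half))

# 19,990 successes out of 20,000 requests: the point estimate is 99.95%.
lo, hi = wilson_interval(19_990, 20_000)
print(f"observed 99.95%, 95% CI [{lo:.4%}, {hi:.4%}]")
```

At that volume the 95% interval spans roughly 99.91% to 99.97%: the single number "99.95%" is claiming more precision than 20,000 requests can deliver.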

Most teams skip this because point estimates fit on a dashboard and CIs do not. The cost of skipping it shows up later, when a low-volume service reports 99.4% one month and 100% the next and nobody can tell whether anything actually changed.
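To make the low-volume case concrete, here is a Wilson-interval comparison of two hypothetical months shaped like the 99.4%-vs-100% example above; the ~500 requests/month figure is an assumed volume chosen for illustration:

```python
import math

def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
    return (max(0.0, center - half), min(1.0, center + half))

# Hypothetical low-volume service, ~500 requests/month:
# month 1: 497/500 successes (99.4%); month 2: 500/500 (100%).
m1 = wilson_interval(497, 500)
m2 = wilson_interval(500, 500)
print(f"month 1: 99.4% observed, 95% CI [{m1[0]:.2%}, {m1[1]:.2%}]")
print(f"month 2: 100%  observed, 95% CI [{m2[0]:.2%}, {m2[1]:.2%}]")
```

The two intervals overlap, so the jump from 99.4% to 100% is not evidence that anything about the service changed.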

When

Confidence intervals matter most when sample size is small relative to the precision you are claiming. Three failure modes to watch for: low-volume services, where a handful of failed requests swings the monthly number by tenths of a percent; short measurement windows, which shrink the sample even for busy services; and high-nines targets, where the error budget contains so few requests that the observed rate is mostly noise.

The rule of thumb: compare the nines in your SLO target against the log10 of your monthly request count; once the nines get within about one of that log10, you are overclaiming. A service with 100,000 requests a month (log10 = 5) cannot honestly claim 99.99% (4 nines) at month resolution: the entire monthly error budget is 10 requests, and the confidence interval around 10 observed failures runs from roughly half that to nearly double. The math will not let you.
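One way to sketch that rule as code is to ask how many nines a given volume can support if the error budget must contain a minimum number of requests; the 30-event floor and the function name here are assumptions for illustration, not a standard:

```python
import math

def max_supportable_nines(monthly_requests: int, min_budget_events: int = 30) -> int:
    """Most nines a monthly SLO can honestly carry, if the error budget
    must contain at least `min_budget_events` requests (an assumed
    threshold, chosen for illustration)."""
    if monthly_requests < 10 * min_budget_events:
        return 0
    return int(math.log10(monthly_requests / min_budget_events))

for requests in (10_000, 100_000, 10_000_000):
    nines = max_supportable_nines(requests)
    target = 1 - 10 ** -nines
    print(f"{requests:>10,} req/month -> at most {nines} nines ({target:.4%})")
```

Under this threshold, 100,000 requests a month supports at most 3 nines (99.9%), so a 99.99% target at that volume gets flagged, matching the example above.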

Display

Putting confidence intervals on dashboards is the part most teams resist, because intervals are visually busier than point estimates. The fix is a small UI investment that pays back in calibration and trust.

Showing CIs on the SLO dashboard is the difference between a reliability practice that is rigorous and one that is theatre. Nova AI Ops computes Wilson intervals on every SLI by default, plots the band on every chart, and flags services whose claimed precision exceeds what their request volume can statistically support, so you stop overclaiming numbers your data cannot back up.