SLO Historical Data: Use It
Past performance informs the target.
Baseline
The most common mistake in SLO setting is picking a number that sounds good in a meeting. The right way is to anchor the target on what the service has actually been doing, then decide how much to stretch beyond that. Historical data is the foundation under any honest SLO target, and the team that skips this step usually ends up with a target the architecture cannot sustain.
What a useful baseline looks like:
- 90 days minimum: Anything shorter misses seasonality. A service running clean for 30 days might be a service that has not yet seen a quarter-end batch load, a holiday traffic spike, or a regional weather event. 90 days catches most of the predictable variability.
- Capture seasonality explicitly: Plot the data and look for repeating patterns. Daily, weekly, monthly, quarterly. A service with a clear weekly cycle has a different SLO posture than one with daily variability. The baseline analysis names the cycles and accounts for them.
- Per-service, not per-platform: Each service gets its own baseline. Platform-wide aggregates hide the variation between services and lead to one-size-fits-all targets that fit no service well. The 30-line query that produces a per-service baseline is worth running.
- Exclude known anomalies, document them: A baseline that includes the day a major dependency had a 4-hour outage will artificially lower the target. Exclude the anomaly, document why it was excluded, and compute the baseline against the remaining data. The exclusion log is part of the baseline analysis.
- Show the distribution, not just the mean: A service that hits 99.9% on the median day and 99.0% on the worst day is different from one that hits 99.45% every day. The SLO target is set against the worst-day reality, not against the average.
The baseline takes a day to compute and the conversation around it sets the foundation for the next year of SLO discussions.
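The steps above can be sketched in a short script. Everything here is illustrative: the dates, availability numbers, and incident ID are hypothetical, and a real analysis would pull 90 days of per-service data from the metrics store rather than a hand-written dict.

```python
from statistics import mean

# Hypothetical daily availability samples for one service (fraction of
# successful requests per day; a real baseline uses 90+ days).
daily_availability = {
    "2024-01-01": 0.9991,
    "2024-01-02": 0.9987,
    "2024-01-03": 0.9612,  # day of a major dependency outage
    "2024-01-04": 0.9990,
    "2024-01-05": 0.9945,
}

# Exclusion log: part of the baseline analysis, not a silent filter.
exclusions = {
    "2024-01-03": "4-hour outage of an upstream dependency (hypothetical INC-1234)",
}

# Compute the baseline against the remaining data.
included = {d: a for d, a in daily_availability.items() if d not in exclusions}
values = sorted(included.values())

baseline = {
    "days": len(values),
    "mean": mean(values),
    "worst_day": values[0],   # the target is set against this, not the mean
    "best_day": values[-1],
}

for date, reason in exclusions.items():
    print(f"excluded {date}: {reason}")
print(baseline)
```

Note that the worst included day (99.45%) sits well below the mean (99.78%); a target read off the mean alone would be missed on days like it.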
Aspire
Once you have the baseline, the question is how much to stretch beyond it. The right answer is "enough to be ambitious, not so much that the target is unreachable." Most teams either set the target equal to the baseline (which builds in zero growth) or pick a number that requires architectural rework to hit (which leads to a year of misses).
- Beat the baseline by 10 to 20%: If the 90-day baseline is 99.85% availability, a target of 99.9% is a meaningful stretch (a one-third reduction in error budget) without requiring a redesign. If the baseline is 99.5%, a target of 99.6% is a similar stretch (a 20% reduction in error budget).
- Account for known investments: If reliability work in the next quarter will improve the underlying signal (better caching, redundant dependencies, faster rollback), bake that into the target. The target is what the service should be after the planned work, not just what it has been historically.
- Stretch in the dimension that matters: If the baseline shows availability is fine but latency is the user-felt problem, stretch on latency, not availability. The dimension that gets stretched is the one that the team is willing to invest in.
- Don't stretch all dimensions at once: Picking a target that improves availability AND latency AND correctness simultaneously usually produces a plan in which nothing improves. Pick one dimension to stretch per cycle and hold the others at baseline.
- Sanity-check against the dependency math: If your target requires upstream services to be 99.99% reliable and they are at 99.9%, your target is mathematically impossible regardless of your own work. The dependency math is the ceiling; aspirational stretching cannot exceed it.
Aspirational targets driven by data are achievable. Aspirational targets driven by ambition alone are not. The 10 to 20% rule keeps the team in the zone where the target requires real work but is not a fantasy.
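Both the stretch arithmetic and the dependency ceiling are simple enough to sketch. The helper names below are illustrative, not from any particular library, and the ceiling calculation assumes hard dependencies sit serially in the request path (so availabilities multiply):

```python
def error_budget(target: float) -> float:
    """Unavailability allowed by a target, e.g. 0.999 -> 0.001."""
    return 1.0 - target

def budget_reduction(baseline: float, target: float) -> float:
    """Fraction of the error budget removed by moving baseline -> target."""
    return 1.0 - error_budget(target) / error_budget(baseline)

def serial_dependency_ceiling(deps: list[float]) -> float:
    """Best achievable availability when every hard dependency is in the
    serial request path: the product of their availabilities."""
    ceiling = 1.0
    for availability in deps:
        ceiling *= availability
    return ceiling

# The 99.85% -> 99.9% example from the text: a one-third budget cut.
print(budget_reduction(0.9985, 0.999))

# Sanity check: behind hard dependencies at 99.9% and 99.95%, the ceiling
# is about 99.85%; any target above it is mathematically impossible.
print(serial_dependency_ceiling([0.999, 0.9995]))
```

The ceiling function is deliberately pessimistic: redundant or soft dependencies change the math, but for serial hard dependencies the product is the hard upper bound.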
Track
An SLO target is a hypothesis: "given our investment plan and our dependency tree, we believe this number is the right one." The hypothesis is tested every quarter against the actual performance. The discipline of tracking is what keeps the SLO honest.
- Quarterly, compare actual vs target: At the end of each quarter, plot the rolling SLO performance against the published target. Three columns: target, actual, delta. The numbers tell the story; the team writes the analysis around them.
- Adjust if persistently off: If the actual is above target by a wide margin for three consecutive quarters, the target is too loose; tighten it. If the actual is below target for three consecutive quarters, the target is unreachable given the current architecture; either invest more or relax the target. Either way, do not let the gap persist.
- Be honest about misses: A miss is information about the system, not a failure of the team. Document why it missed (a specific incident, a dependency degradation, an underinvested team), what was done in response, and whether the response is expected to close the gap.
- Target changes are a public event: When the SLO target changes, document it in the engineering log, update the customer-facing SLA if there is one, and explain the reasoning. Quietly tightening or loosening targets undermines the practice; openly adjusting them based on data strengthens it.
- Year-over-year is the real metric: A team whose SLO target has tightened year over year for three years is a team with maturing reliability. A team whose target stays flat or loosens is one that has stalled. The trajectory matters more than the current number.
SLO targets driven by historical data and adjusted on real performance are the SLOs that survive contact with reality. Nova AI Ops computes baselines per service over rolling windows, suggests target ranges based on observed performance and dependency math, and tracks actual versus target quarter over quarter so the SLO conversation is anchored in evidence rather than opinion.