The SLI Revision Cadence That Keeps Targets Honest
SLIs and SLOs drift away from what users actually experience, so revisit them quarterly. This piece covers the format of the review, the questions to ask, and what teams typically change in their second year.
Quarterly agenda
Per service: what are the current SLI and SLO? Did the target fit the last quarter, or was it too strict or too loose?
Customer feedback: are customers complaining about reliability dimensions the SLI does not measure?
Cost: is the team paying to over-deliver on the SLO? Could a relaxed SLO save engineering time?
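The "was it right?" and "are we over-delivering?" questions can be answered from the quarter's request counts. The following is a minimal sketch, assuming a request-based SLI (good events over total events). The names QuarterStats and review, and the 25% headroom threshold used to flag over-delivery, are illustrative assumptions rather than anything prescribed here.

```python
from dataclasses import dataclass


@dataclass
class QuarterStats:
    service: str
    slo_target: float  # e.g. 0.999 means 99.9% of events must be good
    good_events: int
    total_events: int


def review(stats: QuarterStats, headroom: float = 0.25) -> dict:
    """Summarize one service's quarter against its SLO.

    `headroom` is a hypothetical threshold: using less than this fraction
    of the error budget is treated as a sign of over-delivering.
    """
    if not 0.0 < stats.slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if stats.total_events <= 0:
        raise ValueError("total_events must be positive")

    achieved = stats.good_events / stats.total_events
    allowed_bad = (1.0 - stats.slo_target) * stats.total_events
    actual_bad = stats.total_events - stats.good_events
    budget_used = actual_bad / allowed_bad  # 1.0 means the budget is exhausted

    if budget_used > 1.0:
        verdict = "missed"
    elif budget_used < headroom:
        verdict = "over-delivering"
    else:
        verdict = "on-target"

    return {
        "service": stats.service,
        "achieved": achieved,
        "budget_used": budget_used,
        "verdict": verdict,
    }


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(review(QuarterStats("checkout", 0.999, 999_900, 1_000_000)))
```

A verdict of "over-delivering" is a prompt for the cost question, not an automatic reason to relax the target; customer feedback still has to agree.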
Common changes in year two
Add a latency SLI. Most teams start with availability alone; in year two they add latency for user-facing services.
Tighten or loosen the SLO. The first SLO is usually wrong; the data tells you which way.
Split SLIs by user segment. 'P99 latency for premium customers' becomes a separate SLI when stakes differ.
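Splitting a latency SLI by segment amounts to grouping samples before taking the percentile. The following is a rough sketch, assuming latency samples arrive as (segment, latency_ms) pairs and using the nearest-rank definition of P99. The segment names and the helpers p99 and p99_by_segment are hypothetical; a production system would more likely compute this in its metrics backend.

```python
from collections import defaultdict


def p99(samples):
    """Nearest-rank 99th percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # ceil(0.99 * n) using integer arithmetic to avoid float rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def p99_by_segment(requests):
    """Group (segment, latency_ms) pairs and return P99 per segment."""
    by_segment = defaultdict(list)
    for segment, latency_ms in requests:
        by_segment[segment].append(latency_ms)
    return {segment: p99(values) for segment, values in by_segment.items()}


if __name__ == "__main__":
    # Synthetic data, for illustration only.
    requests = [("premium", ms) for ms in range(1, 101)]
    requests += [("free", ms * 10) for ms in range(1, 101)]
    print(p99_by_segment(requests))
```

Computing the percentile per segment rather than over the combined traffic matters because a small premium segment can be hidden entirely inside the aggregate P99.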
Avoid
Loosening SLOs just to make them easier to hit. Loosen a target when the data shows it was wrong, not because the team missed it; the point is to be honest, not to look good.
Adding too many SLIs. 3-5 per service is plenty; more becomes noise.
Skipping the cadence. Without ritual, SLIs stop matching reality.
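Two of these pitfalls, too many SLIs and a skipped review, are easy to check mechanically. Below is a minimal sketch, assuming a hypothetical in-memory catalog that records each service's SLI names and last review date. The catalog shape, the audit function, and the 92-day interval standing in for "one quarter" are all assumptions made for illustration.

```python
from datetime import date, timedelta

MAX_SLIS = 5                           # upper end of the 3-5 guideline
REVIEW_INTERVAL = timedelta(days=92)   # assumed stand-in for one quarter


def audit(catalog, today):
    """Return human-readable findings for a {service: entry} catalog.

    Each entry is assumed to have "slis" (list of names) and
    "last_review" (datetime.date).
    """
    findings = []
    for service, entry in catalog.items():
        count = len(entry["slis"])
        if count > MAX_SLIS:
            findings.append(f"{service}: {count} SLIs, more than {MAX_SLIS}")
        if today - entry["last_review"] > REVIEW_INTERVAL:
            findings.append(
                f"{service}: last review on {entry['last_review']} is overdue"
            )
    return findings


if __name__ == "__main__":
    # Made-up catalog, for illustration only.
    catalog = {
        "checkout": {
            "slis": ["availability", "latency_p99"],
            "last_review": date(2024, 3, 1),
        },
        "search": {
            "slis": ["a", "b", "c", "d", "e", "f"],
            "last_review": date(2023, 9, 1),
        },
    }
    for finding in audit(catalog, today=date(2024, 4, 1)):
        print(finding)
```

A check like this can only flag a missed review or an overgrown SLI list; it cannot tell whether a change to an SLO was honest, which still takes the discussion in the review itself.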