By Samson Tanimawo, PhD. Published Dec 29, 2025.

SLO Cascade: Service Dependencies

A downstream service's SLO can be no better than its upstreams allow.

Math

Cascading SLO failures come down to arithmetic that quietly invalidates ambitious reliability targets. If your service depends on N upstream services, and a request can only succeed when every upstream call succeeds, your maximum achievable SLO is the product of the upstream SLOs, assuming the upstreams fail independently. Every dependency you add lowers that ceiling, and the math gets worse fast.

The numbers you cannot escape:
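
As a minimal sketch of that arithmetic, the Python below multiplies upstream availabilities for a few cases. The upstream counts and availability values are assumptions chosen for illustration, and the function name `slo_ceiling` is hypothetical. It assumes upstream failures are independent and ignores your service's own faults.

```python
from math import prod


def slo_ceiling(upstream_availabilities):
    """Upper bound on availability when every upstream call must succeed.

    Assumes independent upstream failures and a service with no faults of its own.
    """
    return prod(upstream_availabilities)


# Illustrative inputs only: N identical upstreams at the same availability.
for availability in (0.999, 0.9999):
    for n in (1, 3, 5, 10):
        ceiling = slo_ceiling([availability] * n)
        print(f"{n:>2} upstreams at {availability:.2%} -> ceiling {ceiling:.3%}")
```

Under these assumptions, ten upstreams at 99.9% each cap the service at roughly 99.0% before it has made a single mistake of its own.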

The first move when designing any SLO is walking the dependency tree, multiplying the upstream availabilities, and asking whether the target you want to commit to is mathematically possible. Most of the time it is not, and the conversation has to shift from "how reliable do we want to be" to "what does the architecture allow."
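
The walk described above might be sketched as follows. The service names, the availability figures, and the `GRAPH` layout are all hypothetical. Each service carries an "intrinsic" availability that excludes its own upstreams, every dependency is treated as hard, and failures are assumed independent. A shared upstream is counted once, even when it is reached by more than one path.

```python
from math import prod

# Hypothetical graph: service -> (intrinsic availability, hard upstreams).
GRAPH = {
    "checkout": (0.9999, ["payments", "inventory"]),
    "payments": (0.9995, ["auth", "ledger-db"]),
    "inventory": (0.999, ["auth"]),
    "auth": (0.9999, []),
    "ledger-db": (0.9995, []),
}


def reachable(graph, root):
    """Return root plus every service it transitively depends on, each once."""
    seen, stack = set(), [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(graph[name][1])
    return seen


def closure_ceiling(graph, root):
    """Product of intrinsic availabilities over the whole dependency closure."""
    return prod(graph[name][0] for name in reachable(graph, root))


def is_feasible(graph, root, target):
    """True when the target does not exceed what the architecture allows."""
    return closure_ceiling(graph, root) >= target


ceiling = closure_ceiling(GRAPH, "checkout")
print(f"ceiling for checkout: {ceiling:.3%}")
print("99.9% feasible:", is_feasible(GRAPH, "checkout", 0.999))
print("99.5% feasible:", is_feasible(GRAPH, "checkout", 0.995))
```

With these made-up numbers the ceiling lands a little under 99.8%, so a 99.9% target is out of reach while 99.5% is not.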

Design

The architectural fix for cascading SLO failures is to reduce your effective dependency on each upstream. You cannot make a 99% upstream produce a 99.99% downstream by being more careful. You can soften the dependency so that upstream failures degrade you partially instead of totally.
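
One common way to soften a dependency is a fallback wrapper: if the upstream call fails, serve a degraded default instead of failing the whole request. The sketch below is illustrative only. `with_fallback`, `fetch_recommendations`, and `render_product_page` are hypothetical names, and the "upstream" is a stub that always times out.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("soft_dependency")


def with_fallback(call: Callable[[], T], fallback: T, name: str) -> T:
    """Run an upstream call; on any failure, log it and return a degraded default."""
    try:
        return call()
    except Exception as exc:  # broad on purpose: any upstream error degrades the page
        log.warning("upstream %s failed, serving fallback: %s", name, exc)
        return fallback


def fetch_recommendations() -> list:
    """Hypothetical upstream call, stubbed here to simulate an outage."""
    raise TimeoutError("recommendations service timed out")


def render_product_page() -> dict:
    """The page still renders; it just has no recommendations."""
    recs = with_fallback(fetch_recommendations, fallback=[], name="recommendations")
    return {"status": 200, "product": "sku-123", "recommendations": recs}


page = render_product_page()
print(page)
```

The request succeeds with an empty recommendations list, so the outage costs you a feature rather than error budget. Whether a degraded response should count as "good" for your SLI is a separate decision this sketch does not make for you.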

The architectural goal is not zero dependencies. It is fewer hard dependencies on the request path, with soft fallback for the ones that remain. The ratio of hard to soft dependencies is the lever that determines whether your SLO is realistic.
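
To see how much moving dependencies from hard to soft can matter, here is a rough model. It is an assumption, not a formula from this article: hard dependencies multiply as before, while a soft dependency only fails the request when both the upstream and its fallback fail. The function name, the five 99.9% upstreams, and the 99% fallback success rate are all illustrative.

```python
from math import prod


def effective_ceiling(hard, soft, fallback_success=1.0):
    """Availability ceiling with a mix of hard and soft dependencies.

    hard: availabilities of upstreams that must succeed.
    soft: availabilities of upstreams whose failure a fallback can absorb.
    fallback_success: assumed probability that the fallback itself works.
    Assumes all failures are independent.
    """
    hard_part = prod(hard)
    soft_part = prod(a + (1 - a) * fallback_success for a in soft)
    return hard_part * soft_part


deps = [0.999] * 5  # five hypothetical upstreams at 99.9% each

all_hard = effective_ceiling(hard=deps, soft=[])
two_hard = effective_ceiling(hard=deps[:2], soft=deps[2:], fallback_success=0.99)

print(f"5 hard, 0 soft: {all_hard:.3%}")
print(f"2 hard, 3 soft: {two_hard:.3%}")
```

In this toy setup, converting three of the five dependencies to soft ones lifts the ceiling from about 99.5% to about 99.8%, with no change in any upstream's reliability.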

Monitor

Even with the best design, upstream failures will show up in your SLO. The question is whether you find out fast enough to act and whether you have receipts when stakeholders ask why this quarter's budget burned.

Cascading SLO failures look like your team's failure but are usually structural. Nova AI Ops tracks every outbound call by destination, computes per-dependency burn rate, and shows the contribution of each upstream to your own SLO so you can renegotiate dependency contracts with the teams whose reliability is capping yours.
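
For readers who want to approximate the per-dependency view by hand, the sketch below attributes failed requests to the upstream blamed for them and expresses each share as a burn rate (observed error rate divided by the error budget, 1 minus the SLO target). It is not the Nova AI Ops implementation. The record format, the `burn_by_dependency` name, the "own-code" label, and the sample numbers are all assumptions.

```python
from collections import Counter


def burn_by_dependency(requests, slo_target):
    """Split the burn rate across the upstreams blamed for failures.

    requests: iterable of (ok, culprit) pairs, where culprit names the upstream
    blamed for a failed request, or None when the failure was our own.
    Returns {culprit: burn rate contribution}; contributions sum to the total burn rate.
    """
    requests = list(requests)
    total = len(requests)
    if total == 0:
        return {}
    budget = 1.0 - slo_target
    failures = Counter(culprit or "own-code" for ok, culprit in requests if not ok)
    return {name: (count / total) / budget for name, count in failures.items()}


# Hypothetical window: 10,000 requests against a 99.9% target.
window = (
    [(True, None)] * 9960
    + [(False, "payments")] * 30
    + [(False, "auth")] * 5
    + [(False, None)] * 5
)

burn = burn_by_dependency(window, slo_target=0.999)
for name, rate in sorted(burn.items(), key=lambda item: -item[1]):
    print(f"{name:<10} burn rate {rate:.2f}x")
```

A breakdown like this is the kind of receipt that makes a dependency renegotiation concrete: in the sample window, one upstream accounts for three quarters of the burn.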