SLO & Reliability Practical
By Samson Tanimawo, PhD
Published Oct 3, 2025 · 4 min read

Aggregate SLOs vs Per-User

Aggregate hides individual experience.

Aggregate

The standard way to compute an SLO is across all requests in aggregate: total successes divided by total requests. The result is easy to compute, easy to display, and easy to game. The biggest weakness of aggregate SLOs is that they hide the case where most users have a great experience and a small subset have a terrible one. Both populations contribute to the same number; neither is visible separately.
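As a minimal sketch of that computation (the request records and user names here are illustrative, not from any real system), the aggregate SLI is just total successes over total requests:

```python
def aggregate_slo(requests):
    """Aggregate SLI: total successes / total requests."""
    successes = sum(1 for _, ok in requests if ok)
    return successes / len(requests)

# Illustrative (user_id, success) request records.
requests = [
    ("alice", True), ("alice", True), ("bob", True),
    ("bob", False), ("carol", True), ("carol", True),
]

print(f"aggregate availability: {aggregate_slo(requests):.4f}")
```

Note that the user IDs play no role at all here: one user could account for every failure and the number would not change.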

What aggregate SLOs are good for and bad for:

- Good for: a single top-line health number that is easy to compute, display, and trend over time.
- Bad for: investigation. They cannot tell you which users or tenants are having a bad experience, only that some requests failed somewhere.

Aggregate SLOs are necessary, not sufficient. They are the right top-line metric and the wrong investigation tool.

Per-user

The complement to aggregate SLOs is per-user SLI tracking. Instead of computing reliability across all requests, compute it per user (or per tenant, per region, per cohort) and look at the distribution. The tail of that distribution tells the story the aggregate hides.

Per-user SLOs are the diagnostic tool. They surface the failure modes that aggregate metrics hide, especially in multi-tenant SaaS where customers experience the platform very differently from each other.
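A sketch of per-user tracking, using an illustrative request log where nine users are perfectly healthy and one tenant fails 60% of the time (all names and numbers are made up to show the effect):

```python
from collections import defaultdict

def per_user_sli(requests):
    """Success rate per user (or tenant/region/cohort) from (user_id, success) records."""
    totals, wins = defaultdict(int), defaultdict(int)
    for user, ok in requests:
        totals[user] += 1
        wins[user] += ok
    return {u: wins[u] / totals[u] for u in totals}

# Nine healthy users at 100% success, one tenant at 40%.
requests = [(f"user{i}", True) for i in range(9) for _ in range(100)]
requests += [("tenant-x", j < 40) for j in range(100)]

slis = per_user_sli(requests)
aggregate = sum(ok for _, ok in requests) / len(requests)
worst = min(slis, key=slis.get)

print(f"aggregate: {aggregate:.2f}")                 # 0.94 -- looks tolerable
print(f"worst user: {worst} at {slis[worst]:.2f}")   # tenant-x at 0.40
```

The aggregate reads 94% while one tenant sees 40% availability: exactly the failure mode the distribution's tail surfaces and the single number hides.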

Layer

The right answer is not aggregate vs per-user. It is both, layered. The aggregate SLO is the headline; the per-user SLI is the investigation tool. Each answers a different question; using both is what makes the practice robust.
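The layering can be sketched as one check with two gates: the headline aggregate target and a coverage target on the per-user distribution. The specific thresholds below (99% aggregate, 95% per user, 99% of users) are illustrative assumptions, not recommendations:

```python
from collections import defaultdict

def layered_check(requests, agg_target=0.99, user_target=0.95, user_coverage=0.99):
    """Two-layer SLO check over (user_id, success) records.

    Layer 1 (headline): aggregate success ratio meets agg_target.
    Layer 2 (tail): at least user_coverage of users individually meet user_target.
    """
    agg = sum(ok for _, ok in requests) / len(requests)

    totals, wins = defaultdict(int), defaultdict(int)
    for user, ok in requests:
        totals[user] += 1
        wins[user] += ok
    slis = {u: wins[u] / totals[u] for u in totals}
    frac_meeting = sum(s >= user_target for s in slis.values()) / len(slis)

    return {
        "aggregate": agg,
        "aggregate_ok": agg >= agg_target,
        "tail_ok": frac_meeting >= user_coverage,
    }

# Fifty healthy users plus one small tenant whose requests all fail:
# the headline passes while the per-user layer flags the tail.
reqs = [(f"u{i}", True) for i in range(50) for _ in range(100)]
reqs += [("tenant-x", False) for _ in range(10)]
result = layered_check(reqs)
print(result["aggregate_ok"], result["tail_ok"])  # True False
```

Because the failing tenant sends only 10 of 5,010 requests, the aggregate layer stays green; the per-user layer is what turns the incident visible.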

Aggregate plus per-user is the SLO architecture that scales from a single-tenant API to a multi-thousand-tenant SaaS platform. Nova AI Ops computes both layers in parallel, surfaces the per-user distribution alongside the aggregate, and identifies the cohorts that are pulling down the tail so the team's reliability investment targets the cases that matter most.