Best Practices · Advanced · By Samson Tanimawo, PhD · Published Mar 24, 2026 · 6 min read

Composite SLOs vs Per-Service: When Each Makes Sense

Composite SLOs (one budget across multiple services) feel cleaner on paper but break in practice. Per-service SLOs feel noisy but end up holding teams accountable. Pick deliberately.

The two shapes

A composite SLO multiplies the availabilities of underlying services to produce one user-facing number. A per-service SLO sets independent targets for each service. The same incident shows up differently in each model, and pretending the difference does not matter has cost teams entire planning cycles.

The composite shape: 0.999 × 0.998 × 0.9995 ≈ 0.9965. The user-facing service is "available" only when all three dependencies are. The composite captures the user experience but obscures which service caused what.

The per-service shape: each service has its own target (99.9%, 99.8%, 99.95%), measured independently. Each team owns their target. The shape captures team accountability but doesn't reflect the user's actual experience.
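The two shapes can be sketched in a few lines of Python. The availabilities come from the example above; the multiplication assumes the services fail independently:

```python
from functools import reduce

def composite_availability(availabilities):
    """Composite availability of a serial dependency chain,
    assuming the services fail independently."""
    return reduce(lambda a, b: a * b, availabilities, 1.0)

# Per-service shape: three independent targets, each owned by one team.
per_service = [0.999, 0.998, 0.9995]

# Composite shape: one user-facing number derived from the chain.
print(f"{composite_availability(per_service):.4f}")  # → 0.9965
```

Every service can hit its own target while the composite sits meaningfully below all of them; that gap is the whole tension this article is about.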

Composite SLOs

One target. One budget. One conversation with leadership. Composites are cleaner on dashboards and easier to align with revenue. They make sense when the user-facing SLO is genuinely a chain of dependencies and a failure in any link is a failure for the user.

The reasons to choose composite. The user doesn't care which service was down — they care that their checkout failed. Revenue impact correlates with the composite, not with any single service. Leadership wants one number to track. Composite SLOs map cleanly to business impact.

The cost of composite. Per-team accountability is weak: when the budget burns, no team owns the burn directly. Engineering managers can claim "it was the upstream service," and there's no SLI that proves them wrong. Composite SLOs work organisationally only when a strong incident commander or platform team owns the composite.

Per-service SLOs

One target per service. Owners are accountable for their own number. Per-service SLOs scale with team count and force conversations about which services are actually critical. They make sense when teams ship independently.

The accountability benefit. When Service A's SLO is missed, Team A owns it. The team can't deflect to upstream services because the SLO is on their service specifically. Per-service SLOs create clear ownership.

The cost. The user experience is invisible from per-service dashboards. Service A at 99.9% and Service B at 99.9% can produce a user experience at 99.8% (combined), but no per-service dashboard shows that 99.8% number. Leadership tracking per-service SLOs may think reliability is fine while customers churn because the composite (which they're not tracking) is bad.

How composites hide failures

If Service A is 99.99% and Service B is 99.9%, and they fail independently, the composite is roughly 99.89%. A brief outage in B that would stand out on B's own dashboard barely moves the composite. You will not page; you will not learn; the team that owns B will not get the signal that they have a problem.

The math. A 5-minute outage of Service B is about 0.012% of a 30-day month — well within Service B's 0.1% budget. The same 5 minutes moves the composite from 99.89% to roughly 99.88% — within the composite's noise. Both metrics say "fine" while customers experienced 5 minutes of outage.
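The arithmetic is worth checking directly. A sketch, assuming a 30-day month and approximating the composite's dip as a straight subtraction of the downtime fraction:

```python
MONTH_MIN = 30 * 24 * 60          # 43,200 minutes in a 30-day month

outage_min = 5
burn = outage_min / MONTH_MIN     # fraction of the month spent down
print(f"outage = {burn:.4%} of the month")        # 0.0116%

# Service B's 99.9% target leaves a 0.1% monthly error budget.
b_budget = 0.001
print(f"budget consumed: {burn / b_budget:.1%}")  # 11.6% of B's budget

# Composite of A (99.99%) and B (99.9%), before and after the outage.
composite_before = 0.9999 * 0.999
composite_after = composite_before - burn         # straight-subtraction approximation
print(f"{composite_before:.4%} -> {composite_after:.4%}")  # 99.8900% -> 99.8784%
```

Both numbers land inside their respective "fine" zones, which is exactly how the incident disappears.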

The downstream consequence. Team B doesn't get the page. They don't postmortem. They don't add the regression test. The same outage happens again next month. The cycle repeats until somebody notices the cumulative trend, which often takes 6+ months because no individual incident triggers escalation.

How per-service becomes noisy

Once you have 50 services with 50 budgets, the dashboard is unreadable and most tiles are green on any given day. Leadership stops looking. Worse, a service that is critical to the user experience but runs at 99.9% availability looks the same as a backwater nobody uses.

The signal-to-noise problem. With 50 services, on any given day 1-2 will have minor budget issues. The dashboard always shows a few yellow tiles. Leadership learns to ignore yellow because it's always there. The yellow that actually matters gets lost in the noise.

The other failure mode. Per-service SLOs hide cross-service issues that don't fit any single service's responsibility. A latency cascade across three services may stay within each service's individual SLO while creating user-perceptible degradation. No per-service alert fires; no team owns the issue.
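The latency-cascade case can be made concrete. A minimal sketch with hypothetical numbers, using the simplification that worst-case user latency is the sum of per-hop p99s (real tail latencies don't add this cleanly, but the direction of the problem is the same):

```python
# Hypothetical per-hop p99 latencies for a three-service request path.
per_hop_p99_ms = [180, 190, 170]   # each under its own 200 ms SLO
slo_ms = 200

all_green = all(p <= slo_ms for p in per_hop_p99_ms)
user_latency_ms = sum(per_hop_p99_ms)   # worst-case serial path

print(all_green)         # True: every per-service SLO passes
print(user_latency_ms)   # 540 ms of user-visible latency, and no alert fires
```

Every tile is green, the user waits over half a second, and no team's SLI moved.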

The hybrid most teams adopt

One composite at the user-facing level (the number leadership tracks) plus per-service SLOs for the engineering teams that own each service. Two views of the same data. The composite catches the user perspective; the per-service catches the team accountability. Maintain both, accept the duplication.

The hybrid's cost: more dashboards, more alerting, more configuration. The hybrid's benefit: leadership sees one user-facing number, engineering sees team-specific accountability. Both audiences served.

The discipline of the hybrid. Composite SLOs for the 3-5 most user-visible journeys (login, search, checkout, etc.). Per-service SLOs for every service. The composite SLOs are owned by the platform team or a designated incident commander; the per-service SLOs are owned by the service team. Different audiences, different ownership.
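One way to make the split-ownership discipline explicit is a small SLO registry that records both views and who owns each. A sketch with illustrative names and targets, not a standard schema:

```python
# Illustrative journey/service names, targets, and team names; not a real schema.
SLOS = {
    "composite": {
        "checkout-journey": {"target": 0.9965, "owner": "platform-team"},
        "login-journey":    {"target": 0.9990, "owner": "platform-team"},
    },
    "per_service": {
        "payments-api": {"target": 0.9990, "owner": "payments-team"},
        "auth-service": {"target": 0.9995, "owner": "identity-team"},
    },
}

def owner_of(kind: str, name: str) -> str:
    """Answer 'who owns this burn?' without a meeting."""
    return SLOS[kind][name]["owner"]

print(owner_of("composite", "checkout-journey"))  # platform-team
```

The point of writing it down is that "different audiences, different ownership" stops being folklore and becomes a lookup.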

Common antipatterns

Just composite, no per-service. Looks clean, kills accountability. Service teams have no individual targets, so individual services can degrade without any team-level pressure to fix them.

Just per-service, no composite. Strong accountability, no visibility into user experience. Multi-service incidents don't trigger any clear ownership; cumulative degradation goes unnoticed.

Composite computed once daily. A daily composite is too coarse to drive incident response. The composite must be computed at the same cadence as the per-service SLIs (per-minute aggregates).

Each team picks their own SLO target without coordination. Service A picks 99.99%; Service B picks 99.9% because that's what their team is comfortable with. The composite ends up at 99.89% — but leadership wanted 99.95%. The targets weren't designed; they were accreted.
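Designing the targets top-down instead of accreting them bottom-up is a one-line calculation. A sketch, assuming independent services and an equal budget split across the chain:

```python
def per_service_target(composite_target: float, n_services: int) -> float:
    """Equal per-service availability targets whose product meets the
    composite target, assuming independent failures."""
    return composite_target ** (1.0 / n_services)

# Leadership wants 99.95% across a three-service chain:
t = per_service_target(0.9995, 3)
print(f"{t:.5f}")   # 0.99983: each service must hit ~99.983%, not 99.9%
```

Unequal splits are fine too (give the flakiest dependency the biggest slice of budget); what matters is that the per-service targets are derived from the composite, not the other way around.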

When each is the right call

Composite-only is right when: the team is small (under 30 engineers), services are tightly coupled, the user journey IS the service. Composite captures everything; per-service is overhead.

Per-service-only is right when: services are genuinely independent, teams ship independently, the "user journey" is unclear or varies widely across customers. Per-service maps to team structure cleanly.

Hybrid is right when: 30+ engineers, multiple critical user journeys, mature SRE practice with capacity to maintain multiple SLO views. Hybrid is the most expensive option but produces the best operational signal at scale.

What to do this week

Three moves. (1) Inventory your current SLOs. Are they composite, per-service, or hybrid? Most teams have an accidental mix that nobody designed. (2) Identify your top 3 user journeys. Define a composite SLO for each (it doesn't have to be implemented immediately; the definition is the first step). (3) For your largest service, ensure a per-service SLO exists with clear ownership. Pair the per-service with the composite next quarter; the conversation is easier when both exist.
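Move (1) can be partially automated if your SLO definitions record which services each SLI touches. A sketch against a hypothetical export format (the key names here are assumptions, not any real tool's schema):

```python
# `slos` stands in for a hypothetical export from your monitoring system.
slos = [
    {"name": "checkout", "services": ["cart", "payments", "inventory"]},
    {"name": "payments-api", "services": ["payments"]},
]

for slo in slos:
    kind = "composite" if len(slo["services"]) > 1 else "per-service"
    print(f"{slo['name']}: {kind}")
```

Running something like this over your real definitions usually surfaces the accidental mix quickly: a handful of composites nobody owns and a long tail of per-service targets nobody designed.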