SLOs by Customer Segment

How to set different SLOs for different customer segments, and when it is worth doing.

Why segment SLOs

Customers are not equal. A Premium customer paying $5k/month expects tighter availability than a Free-tier customer paying nothing, and a single SLO across all customers averages their experience and hides that asymmetry. Segmented SLOs do two things: they let you sell reliability tiers in pricing (pay more, get tighter availability), and they let operational priority follow the segment (Premium degradation demands all hands; free-tier degradation may be acceptable).
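To make the averaging effect concrete, here is a minimal sketch using hypothetical request counts (the traffic numbers and the `availability` helper are illustrative assumptions, not measurements). Premium traffic is a small share of the total, so a bad day for Premium barely moves the aggregate.

```python
# Hypothetical one-day request counts per segment: (total, failed).
traffic = {
    "premium": (100_000, 500),
    "free": (9_900_000, 1_000),
}


def availability(total: int, failed: int) -> float:
    """Fraction of requests that succeeded."""
    return (total - failed) / total


# Availability computed separately for each segment.
per_segment = {name: availability(t, f) for name, (t, f) in traffic.items()}

# Availability computed over all requests, ignoring segment.
all_total = sum(t for t, _ in traffic.values())
all_failed = sum(f for _, f in traffic.values())
aggregate = availability(all_total, all_failed)

for name, value in per_segment.items():
    print(f"{name}: {value:.3%}")
print(f"aggregate: {aggregate:.3%}")
```

With these numbers Premium sits at 99.5% while the aggregate is 99.985%, so a single 99.95% SLO would look healthy even though Premium is well below that target.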

How to structure segment SLOs

Structure starts at the edge. Tag every request with its customer segment there: the authentication layer or API gateway adds a header, and downstream metrics carry it as a label. Then define availability and latency SLOs per segment, for example Premium at 99.95% with p99 < 200ms, Standard at 99.9% with p99 < 500ms, and Free at 99% with best-effort p99. Keep the number of segments small: 3-5 works, while 10 creates sprawl.
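The tagging and per-segment targets could be sketched as follows. The targets come from the text above; the header name `X-Customer-Segment`, the helper names, and the choice to map unknown plans to `free` are assumptions made for illustration, not a prescribed design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SegmentSLO:
    availability: float              # target fraction of successful requests
    p99_latency_ms: Optional[float]  # None means best-effort (no latency target)


# Targets from the text: Premium, Standard, Free.
SLOS = {
    "premium": SegmentSLO(0.9995, 200.0),
    "standard": SegmentSLO(0.999, 500.0),
    "free": SegmentSLO(0.99, None),
}

SEGMENT_HEADER = "X-Customer-Segment"  # hypothetical header name


def tag_request(headers: dict, plan: str) -> dict:
    """Return a copy of the headers with the segment set, as the edge would.

    Assumption: unknown plans fall back to 'free' so the set of label
    values stays bounded.
    """
    segment = plan if plan in SLOS else "free"
    return {**headers, SEGMENT_HEADER: segment}


def meets_slo(segment: str, availability: float, p99_ms: float) -> bool:
    """Check measured availability and p99 latency against the segment's targets."""
    slo = SLOS[segment]
    if availability < slo.availability:
        return False
    return slo.p99_latency_ms is None or p99_ms < slo.p99_latency_ms
```

For example, `meets_slo("premium", 0.9996, 250.0)` fails on latency alone, while the same measurements would pass for `standard`.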

Operating segment SLOs in production

Operating segment SLOs needs per-segment infrastructure. Build a dashboard per segment that shows that segment’s SLO health, and set burn-rate alerts per segment. Plan capacity per segment, because Premium may need dedicated capacity, separate connection pools, or prioritised queue lanes. In incident triage, look at per-segment impact first: sev 1 if Premium is degraded, sev 2 if only Standard is.
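A per-segment burn-rate check and the triage rule above might look like the sketch below. Burn rate here is the observed error rate divided by the error budget (1 minus the availability target). The paging threshold of 14.4 is an assumed fast-burn value, and treating Free-only degradation as "no page" is also an assumption; tune both to your own alerting policy.

```python
from typing import Optional

# Availability targets per segment, as in the example tiers.
TARGETS = {"premium": 0.9995, "standard": 0.999, "free": 0.99}

PAGE_THRESHOLD = 14.4  # assumed fast-burn threshold, not taken from the text


def burn_rate(segment: str, error_rate: float) -> float:
    """Observed error rate divided by the segment's error budget.

    1.0 means the budget is being spent exactly on schedule.
    """
    return error_rate / (1.0 - TARGETS[segment])


def severity(error_rates: dict) -> Optional[str]:
    """Map per-segment error rates to an incident severity.

    sev1 if Premium is burning fast, sev2 if only Standard is,
    None otherwise (assumption: Free-only degradation does not page).
    """
    burning = {
        segment
        for segment, rate in error_rates.items()
        if burn_rate(segment, rate) >= PAGE_THRESHOLD
    }
    if "premium" in burning:
        return "sev1"
    if "standard" in burning:
        return "sev2"
    return None
```

The same 1% error rate gives a burn rate of roughly 20 for Premium, 10 for Standard, and 1 for Free, which is why alerting on one aggregate threshold does not work across segments.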

Trade-offs and gotchas

Three trade-offs deserve attention. First, maintenance burden: each segment is another set of dashboards, alerts, and runbooks, so don’t add segments unless they correspond to real commercial differences. Second, internal-only segments are usually a mistake, because engineering convenience is not a customer commitment. Third, metric cardinality cost: the segment label multiplies the time series count, which becomes a real increase in the observability bill at high volume.
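The cardinality multiplier is simple arithmetic. The sketch below estimates an upper bound on time series for one metric as the product of its label cardinalities; the label names and counts are hypothetical.

```python
import math


def series_count(label_cardinalities: dict) -> int:
    """Upper bound on time series for one metric: product of label cardinalities."""
    return math.prod(label_cardinalities.values())


# Hypothetical labels on a request metric before segmentation.
base = {"service": 40, "endpoint": 25, "status_class": 5}

# Adding a segment label multiplies the count by the number of segments.
with_3_segments = {**base, "segment": 3}
with_10_segments = {**base, "segment": 10}

print(series_count(base))              # 5000
print(series_count(with_3_segments))   # 15000
print(series_count(with_10_segments))  # 50000
```

This is an upper bound, since not every label combination occurs in practice, but it shows why going from 3 segments to 10 is not a small change for the metrics backend.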

When to add segment SLOs

Add segment SLOs when pricing tiers carry reliability commitments; without those, segmentation is optics, not policy. Add them when premium customers complain about reliability that aggregate metrics show as fine. Don’t add them for engineering convenience, and don’t add them when the team is already drowning in operational complexity. Otherwise, wait until pricing or contracts demand it.