Region-Specific SLOs
Set and track SLOs per region instead of relying on a single global target.
Why per-region SLOs
Regions are not interchangeable. us-east-1 has a different traffic shape, dependency latency, and infrastructure quality than ap-southeast-2. A global SLO averages these and hides regional issues.
Customers experience the reliability of the region that serves them. A global 99.9% SLO can be met while one region sits at 99.5%, and that is still a real customer-facing problem for users in that region.
Regional SLOs surface infrastructure issues that global aggregates hide. A noisy AZ, a slow link to a regional dependency, a vendor's regional outage: all visible per-region, invisible globally.
How to define them
Define a baseline SLO that applies globally, then per-region modifiers. Global: 99.9%. Major regions: same. Minor regions or recent expansions: looser, with explicit communication.
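A minimal sketch of the baseline-plus-modifiers idea, assuming targets are kept in a simple lookup table. The 99.9% baseline comes from the text above; the 99.5% override for ap-southeast-2 and the names GLOBAL_BASELINE, REGION_OVERRIDES, slo_target, and error_budget are hypothetical.

```python
# Global baseline from the text: 99.9%.
GLOBAL_BASELINE = 0.999

# Hypothetical per-region modifiers. Only regions that deviate from the
# baseline are listed; the 99.5% figure is illustrative, not a recommendation.
REGION_OVERRIDES = {
    "ap-southeast-2": 0.995,
}


def slo_target(region: str) -> float:
    """Return the region's SLO target, falling back to the global baseline."""
    return REGION_OVERRIDES.get(region, GLOBAL_BASELINE)


def error_budget(region: str) -> float:
    """Fraction of requests allowed to fail under the region's target."""
    return 1.0 - slo_target(region)
```

Listing only the exceptions keeps major regions on the baseline by default, so a looser target is always an explicit, reviewable entry.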
Tag every request with the region the user hit. Edge or load balancer adds the label. Metrics carry it through.
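The tagging step can be sketched as counters keyed by region. This is an in-memory illustration only: the class RegionalRequestMetrics and its methods are hypothetical, and a real system would use whatever metrics library is already in place, with the region label attached at the edge or load balancer.

```python
from collections import Counter
from typing import Optional


class RegionalRequestMetrics:
    """Hypothetical in-memory counters keyed by the region label."""

    def __init__(self) -> None:
        self.total = Counter()
        self.good = Counter()

    def record(self, region: str, success: bool) -> None:
        """Count one request against the region that served it."""
        self.total[region] += 1
        if success:
            self.good[region] += 1

    def availability(self, region: str) -> Optional[float]:
        """Good / total for the region, or None if it has seen no traffic."""
        total = self.total[region]
        if total == 0:
            return None
        return self.good[region] / total
```

Returning None for a region with no traffic avoids reporting a misleading 0% or 100% for a region that simply has no data yet.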
Per-region dashboards with the same panels. Operators compare like with like. An anomalous region stands out against its peers.
Rolling up to global
Roll up as a traffic-weighted average, not an unweighted mean across regions. A region with 1% of traffic shouldn't move the global figure as much as a region with 50%.
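A short sketch of the difference between the two rollups, assuming each region reports a (good requests, total requests) pair. Summing good and total across regions is the same as weighting each region's availability by its traffic. The function names and the sample numbers are illustrative.

```python
from typing import Dict, Tuple

# region -> (good_requests, total_requests); sample numbers are made up.
RegionCounts = Dict[str, Tuple[int, int]]


def weighted_availability(regions: RegionCounts) -> float:
    """Traffic-weighted rollup: total good requests over total requests."""
    total = sum(t for _, t in regions.values())
    if total == 0:
        raise ValueError("no traffic in any region")
    good = sum(g for g, _ in regions.values())
    return good / total


def unweighted_mean(regions: RegionCounts) -> float:
    """Naive rollup: every region counts equally, regardless of traffic."""
    ratios = [g / t for g, t in regions.values() if t > 0]
    if not ratios:
        raise ValueError("no traffic in any region")
    return sum(ratios) / len(ratios)


sample = {
    "us-east-1": (999_500, 1_000_000),  # 99.95% on most of the traffic
    "ap-southeast-2": (9_500, 10_000),  # 95% on about 1% of the traffic
}
```

With this sample the weighted rollup is roughly 99.90%, while the unweighted mean is about 97.5%. The small region drags the naive figure far more than its traffic share justifies, and the weighted figure shows why the per-region view is still needed: it barely registers a region running at 95%.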
Show both rollup and per-region. Stakeholders see both views. The conversation about an underperforming region happens because the per-region view exists.
Per-region burn rate alerts on top of global burn rate alerts. The two layers catch different failure modes.
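A sketch of the two alert layers, taking burn rate as the observed error rate divided by the error budget (1 minus the SLO target). The 14.4 threshold is an assumption borrowed from common fast-burn alerting practice for a one-hour window on a 30-day SLO; the function names and sample counts are hypothetical.

```python
from typing import Dict, List, Tuple


def burn_rate(errors: int, total: int, slo_target: float) -> float:
    """Observed error rate divided by the error budget (1 - target)."""
    if total == 0:
        return 0.0
    return (errors / total) / (1.0 - slo_target)


def firing_alerts(
    window: Dict[str, Tuple[int, int]],  # region -> (errors, total) in one window
    targets: Dict[str, float],           # per-region SLO targets
    global_target: float,
    threshold: float = 14.4,             # assumed fast-burn threshold
) -> List[str]:
    """Return 'global' and/or the regions whose burn rate crosses the threshold."""
    alerts = []
    all_errors = sum(e for e, _ in window.values())
    all_total = sum(t for _, t in window.values())
    if burn_rate(all_errors, all_total, global_target) >= threshold:
        alerts.append("global")
    for region, (errors, total) in sorted(window.items()):
        target = targets.get(region, global_target)
        if burn_rate(errors, total, target) >= threshold:
            alerts.append(region)
    return alerts


window = {
    "us-east-1": (100, 1_000_000),  # 0.01% errors
    "ap-southeast-2": (400, 10_000),  # 4% errors on a small share of traffic
}
```

With these sample counts the global burn rate is about 0.5, well under the threshold, while ap-southeast-2 burns at roughly 40 times its budget. Only the regional alert fires, which is the failure mode the global layer alone would miss.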
Communicating regional SLOs
Status page per region. Customers see the truth for their region, not a global average that hides their experience.
Pricing tiers may include region-specific guarantees. A premium tier in us-east-1 may guarantee 99.95%, while the same tier in ap-southeast-2 guarantees 99.9%, reflecting each region's actual capability.
When opening a new region, set a soft SLO for the first 90 days. Build data; tighten when capability is proven. Customers in the new region can accept the trade when it is stated up front.
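One way to sketch the 90-day soft period, assuming the target switches automatically on a fixed date. In practice the tightening would follow a review of the collected data rather than a calendar alone. The function name and the 99.5% and 99.9% values are hypothetical.

```python
from datetime import date, timedelta


def effective_target(
    launch: date,
    today: date,
    soft_target: float = 0.995,    # hypothetical soft SLO for a new region
    steady_target: float = 0.999,  # hypothetical target once capability is proven
    soft_days: int = 90,           # soft period from the text
) -> float:
    """Return the soft target during the launch window, the steady one after."""
    if today < launch + timedelta(days=soft_days):
        return soft_target
    return steady_target
```

Keeping the launch date next to the target makes the end of the soft period visible to anyone reading the configuration, so the looser number does not quietly become permanent.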
Operating regional SLOs
Per-region on-call rotation if scale supports it. Otherwise global on-call with regional context loaded into the dashboard.
Multi-region failover playbooks reference per-region SLOs. The trigger to fail over: local SLO breached, peer region healthy.
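The failover trigger can be sketched as a two-part check. This assumes "healthy" means the peer is currently meeting its own SLO target; a real playbook would likely add peer capacity and a minimum observation window. The function name and parameters are hypothetical.

```python
def should_fail_over(
    local_availability: float,
    local_target: float,
    peer_availability: float,
    peer_target: float,
) -> bool:
    """Fail over only when the local SLO is breached and the peer meets its own."""
    local_breached = local_availability < local_target
    peer_healthy = peer_availability >= peer_target
    return local_breached and peer_healthy
```

Checking the peer against its own target matters when regions have different SLOs: a peer running at 99.6% is healthy under a 99.5% target but not under 99.9%.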
Quarterly review per region. Some regions trend up; some plateau. Investment decisions follow the data.