Region-Specific SLOs
Set and track a reliability target for each region instead of relying on a single global number.
Why per-region SLOs
Regions are not interchangeable. us-east-1 has different traffic shape, dependency latency, and infrastructure quality than ap-southeast-2; a global SLO averages these and hides regional issues. Customers in a specific region experience that region’s reliability, so a global 99.9% with one bad region at 99.5% is still a real customer-facing problem.
- Regions differ. Traffic shape, dependency latency, infrastructure quality; us-east-1 versus ap-southeast-2.
- Global average hides issues. A bad region at 99.5% inside a global 99.9% is still real customer impact.
- Regional issues invisible globally. Noisy AZ, slow regional dependency, vendor regional outage; visible per-region, hidden globally.
- Per-region surface. The customer experience matches the region they hit; the SLO must too.
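The arithmetic behind the "99.5% inside 99.9%" case can be sketched in a few lines. The request counts below are invented for illustration, and eu-west-1 is added only as a hypothetical third region so the totals work out; nothing here reflects real traffic.

```python
# Hypothetical 30-day request counts per region (illustrative numbers only).
REGION_COUNTS = {
    "us-east-1": {"total": 600_000, "failed": 180},
    "eu-west-1": {"total": 300_000, "failed": 270},
    "ap-southeast-2": {"total": 100_000, "failed": 500},
}


def availability(total: int, failed: int) -> float:
    """Fraction of requests that succeeded."""
    return (total - failed) / total


def global_availability(counts: dict) -> float:
    """Availability over all requests, regardless of region."""
    total = sum(c["total"] for c in counts.values())
    failed = sum(c["failed"] for c in counts.values())
    return availability(total, failed)


def regions_below(target: float, counts: dict) -> list:
    """Regions whose own availability misses the target."""
    return sorted(
        region
        for region, c in counts.items()
        if availability(c["total"], c["failed"]) < target
    )


if __name__ == "__main__":
    print(f"global: {global_availability(REGION_COUNTS):.4%}")
    print("below 99.9%:", regions_below(0.999, REGION_COUNTS))
```

With these assumed counts the global figure is 99.905%, which meets a 99.9% target, while ap-southeast-2 on its own sits at 99.5%.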
How to define them
Definition starts with a baseline plus per-region modifiers: a global target of 99.9%, the same target for major regions, and a looser, explicitly communicated target for minor regions or recent expansions. Tag every request with the region the user hit (the edge or load balancer adds the label, and metrics carry it through), and build per-region dashboards with the same panels so operators compare like with like.
- Baseline plus modifiers. Global 99.9%; major regions same; minor or new regions looser with explicit communication.
- Per-request region label. Edge or load balancer adds the label; metrics carry it through.
- Per-region dashboards. Same panels per region; operators compare like with like.
- Per-region documented baseline. Commit the baseline and the rationale for each modifier; this supports later review.
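One possible shape for "baseline plus modifiers" with a per-request region label is sketched below. The names `RegionModifier`, `MODIFIERS`, `target_for`, and `record_request` are hypothetical, the 99.5% modifier is an assumed value, and the in-memory counter stands in for whatever metrics system actually carries the label.

```python
from collections import Counter
from dataclasses import dataclass

GLOBAL_BASELINE = 0.999  # the global 99.9% target from the text


@dataclass(frozen=True)
class RegionModifier:
    target: float   # region-specific target that overrides the baseline
    rationale: str  # why the region deviates; kept for later review


# Hypothetical modifier: the 99.5% figure is an assumption, not a recommendation.
MODIFIERS = {
    "ap-southeast-2": RegionModifier(
        target=0.995,
        rationale="Recent expansion; looser target communicated to customers.",
    ),
}


def target_for(region: str) -> float:
    """Regions without a modifier inherit the baseline."""
    modifier = MODIFIERS.get(region)
    return modifier.target if modifier else GLOBAL_BASELINE


# Request outcomes keyed by (region, outcome), as if the edge had labelled them.
outcomes = Counter()


def record_request(region: str, ok: bool) -> None:
    """Count one request under the region label it arrived with."""
    outcomes[(region, "ok" if ok else "error")] += 1


def measured_availability(region: str) -> float:
    """Success fraction for one region; 1.0 when no traffic was recorded."""
    ok = outcomes[(region, "ok")]
    total = ok + outcomes[(region, "error")]
    return ok / total if total else 1.0
```

Keeping the rationale next to the target means the documented baseline and the value operators alert on cannot drift apart silently.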
Rolling up to global
Rollup needs the right math. Weight the average by traffic: a region carrying 1% of traffic shouldn’t move the global number as much as one carrying 50%. Show both the rollup and the per-region figures so stakeholders see both views and an underperforming region actually gets discussed. Layer per-region burn-rate alerts on top of global burn-rate alerts, because the two catch different failure modes.
- Weighted average by traffic. Don’t take unweighted mean; the per-region weight matters.
- Show both views. Rollup plus per-region; the underperforming-region conversation needs the per-region view.
- Per-region burn-rate alerts. On top of global; the two layers catch different failure modes.
- Rollup methodology documented. The math committed to the SLO doc; supports investigation.
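The rollup math and the two alert layers can be sketched as follows. The availability and traffic figures are invented, eu-west-1 is a hypothetical third region, and the burn rate here is the usual ratio of observed error rate to the error rate the SLO allows, computed over a single window for simplicity. The alert threshold of 1.0 is an assumption; real alerting would typically combine several windows and thresholds.

```python
# Hypothetical per-region availability and request volume for one window.
AVAIL = {"us-east-1": 0.9997, "eu-west-1": 0.9991, "ap-southeast-2": 0.995}
TRAFFIC = {"us-east-1": 600_000, "eu-west-1": 300_000, "ap-southeast-2": 100_000}


def weighted_rollup(availability: dict, traffic: dict) -> float:
    """Traffic-weighted mean: sum(w_r * a_r) / sum(w_r)."""
    total_weight = sum(traffic[r] for r in availability)
    return sum(availability[r] * traffic[r] for r in availability) / total_weight


def unweighted_mean(availability: dict) -> float:
    """Plain mean, shown only to contrast with the weighted rollup."""
    return sum(availability.values()) / len(availability)


def burn_rate(availability: float, target: float) -> float:
    """Observed error rate divided by the error rate the SLO allows."""
    return (1.0 - availability) / (1.0 - target)


def burning_scopes(availability: dict, target: float, threshold: float) -> list:
    """Scopes (regions or 'global') whose burn rate exceeds the threshold."""
    return sorted(
        scope
        for scope, value in availability.items()
        if burn_rate(value, target) > threshold
    )


if __name__ == "__main__":
    rollup = weighted_rollup(AVAIL, TRAFFIC)
    print(f"weighted: {rollup:.5f}  unweighted: {unweighted_mean(AVAIL):.5f}")
    print(burning_scopes({**AVAIL, "global": rollup}, target=0.999, threshold=1.0))
```

With these assumed numbers the global burn rate stays just under 1 while ap-southeast-2 burns its budget at roughly five times the allowed rate, which is the kind of failure a global-only alert would miss.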
Communicating regional SLOs
Communication mirrors the regional reality. Publish a status page per region so customers see the truth for their region, not a global average. Pricing tiers may include region-specific guarantees (premium-us-east-1 may be 99.95% while premium-ap-southeast-2 is 99.9%). New regions get a soft SLO for the first 90 days, tightened when capability is proven.
- Status page per region. Customers see the truth for their region; not a global average that hides experience.
- Pricing tier alignment. Region-specific guarantees reflect actual capability; the SKU matches reality.
- Soft SLO for new regions. First 90 days are a build window; tighten when capability is proven.
- Per-region disclosure. Customers in new regions understand the trade; supports trust during expansion.
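The soft-SLO rule for new regions could be encoded roughly as below. This is one interpretation, not the only one: it assumes the soft target applies for at least 90 days and stays in place until someone explicitly marks capability as proven. The function name, the `capability_proven` flag, and the 99.5% soft value are all hypothetical.

```python
from datetime import date, timedelta

SOFT_WINDOW = timedelta(days=90)  # the 90-day build window from the text
STEADY_TARGET = 0.999             # target once the region is established
SOFT_TARGET = 0.995               # assumed soft value; pick what fits the region


def effective_target(launched: date, today: date, capability_proven: bool) -> float:
    """Return the target that applies to a region on a given day."""
    in_window = today - launched < SOFT_WINDOW
    if in_window or not capability_proven:
        # Still in the build window, or not yet proven: keep the soft target.
        return SOFT_TARGET
    return STEADY_TARGET
```

Publishing the launch date and the current effective target per region gives customers in a new region the disclosure described above.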
Operating regional SLOs
Operating regional SLOs needs three disciplines. Run a per-region on-call rotation if scale supports it; otherwise use global on-call with regional context in the dashboard. Have multi-region failover playbooks reference per-region SLOs (trigger: local SLO breached, peer region healthy). Review each region quarterly, because some trend up and some plateau.
- Per-region rotation when possible. If scale supports it; otherwise global on-call with regional context loaded.
- Failover playbook references SLOs. Trigger: local SLO breached, peer region healthy; the SLO drives the action.
- Quarterly per-region review. Some regions trend up, some plateau; investment decisions follow the data.
- Per-region investment record. Documented decisions per region; supports continued capability building.
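The failover trigger (local SLO breached, peer region healthy) can be written as a small check. The function name and inputs are hypothetical, and choosing the peer with the most headroom over its own target is an assumption made for this sketch; a real playbook would also weigh capacity, latency, and data residency.

```python
from typing import Optional


def pick_failover_peer(local: str, measured: dict, targets: dict) -> Optional[str]:
    """Return a peer region to fail over to, or None if the trigger is not met.

    measured and targets map region name to availability and SLO target.
    """
    if measured[local] >= targets[local]:
        return None  # local SLO is being met, so there is no trigger

    # A peer counts as healthy when it meets its own per-region target.
    healthy = [r for r in measured if r != local and measured[r] >= targets[r]]
    if not healthy:
        return None  # no healthy peer, so failing over would not help

    # Assumed tie-break: prefer the peer with the most headroom over its target.
    return max(healthy, key=lambda r: measured[r] - targets[r])


if __name__ == "__main__":
    measured = {"us-east-1": 0.9997, "eu-west-1": 0.9991, "ap-southeast-2": 0.995}
    targets = {region: 0.999 for region in measured}
    print(pick_failover_peer("ap-southeast-2", measured, targets))
```

Because each region is compared against its own target, a region running under a looser modifier is judged by that modifier rather than by the global baseline.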