AWS us-east-1 2021

Multi-service outage.

Overview

The December 2021 us-east-1 outage was a multi-hour AWS incident that impaired many core services in the region, along with third-party tools that ran there. Some of AWS's own internal tooling depended on us-east-1, so even the console became unreachable for many users during the event. The case study reshaped how teams design for regional resilience: multi-AZ is not enough when the failure domain is the region itself.

The approach

The response has five parts: run critical services in multiple regions (active-active or hot standby); keep an alternative to the console, such as CLI access from outside the affected region; map dependencies deeply enough to catch transitive failures; run region-aware monitoring from a different region than the one it watches; and hold game-day exercises that rehearse regional failure before it happens.
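
Dependency mapping that catches transitive failures can be sketched as a graph walk. The example below is a minimal illustration, not a real inventory: the service names, the DEPENDS_ON and REGIONS tables, and both function names are hypothetical, and it assumes every dependency is a hard one (the caller fails if the dependency is down).

```python
from collections import deque

# Hypothetical dependency map: each service lists what it calls directly.
DEPENDS_ON = {
    "checkout": ["payments", "sessions"],
    "payments": ["auth"],
    "sessions": ["cache"],
    "auth": [],
    "cache": [],
    "status-page": [],
}

# Hypothetical placement: the regions each service is deployed in.
REGIONS = {
    "checkout": {"us-east-1", "us-west-2"},
    "payments": {"us-east-1", "us-west-2"},
    "sessions": {"us-east-1", "us-west-2"},
    "auth": {"us-east-1"},  # single-region: the hidden risk
    "cache": {"us-east-1", "us-west-2"},
    "status-page": {"eu-west-1"},
}


def transitive_deps(service, depends_on):
    """Return every direct and indirect dependency of a service."""
    seen = set()
    queue = deque(depends_on.get(service, []))
    while queue:
        dep = queue.popleft()
        if dep in seen:  # already visited; also guards against cycles
            continue
        seen.add(dep)
        queue.extend(depends_on.get(dep, []))
    return seen


def impacted_by_region_loss(region, depends_on, regions):
    """Map each affected service to the single-region services that sink it."""
    # A service goes down outright if it has no deployment outside the region.
    down = {svc for svc, placed in regions.items() if not (placed - {region})}
    impacted = {}
    for service in depends_on:
        causes = transitive_deps(service, depends_on) & down
        if service in down:
            causes.add(service)
        if causes:
            impacted[service] = sorted(causes)
    return impacted


if __name__ == "__main__":
    report = impacted_by_region_loss("us-east-1", DEPENDS_ON, REGIONS)
    for service, causes in sorted(report.items()):
        print(f"{service}: fails because of {', '.join(causes)}")
```

In this made-up inventory, checkout is deployed in two regions yet still fails when us-east-1 is lost, because it reaches the single-region auth service through payments. A review of direct dependencies alone would miss that path.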

Why this compounds

Each architecture review that applies the lesson reduces regional-dependency risk. A multi-region design can also help meet enterprise compliance requirements, which in turn can open regulated markets. The team's resilience muscle grows from "we hope us-east-1 stays up" to deliberate failure-domain design.