Multi-Region Data Pattern: Active-Passive vs Active-Active
Data is the hardest part of going multi-region. This article walks through the main patterns and their trade-offs.
Active-passive
Multi-region data patterns determine how the system handles writes across regions. The two basic patterns are active-passive (one region writes, others replicate) and active-active (all regions write, conflict resolution merges). Each has distinct trade-offs; mature stacks often use both for different data classes.
What active-passive looks like:
- Single writeable region.: One region is the primary; all writes go there. The other regions are read replicas; they receive replicated data but do not accept writes. The simplicity comes from the single source of truth for writes.
- Read replicas elsewhere.: Other regions serve reads from local replicas. The latency for reads is low (no cross-region trip); the replica lag is the only freshness concern. For most read-heavy workloads, the lag is acceptable.
- Simpler.: Active-passive is significantly simpler than active-active. There are no write conflicts to resolve; the consistency model is the familiar single-writer model. The team's mental model matches the system's behavior.
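- Consistent.: The active region is strongly consistent. The replicas are eventually consistent and trail the primary by the replica lag. The lag is observable and usually small, but under asynchronous replication it is not strictly bounded, so it should be monitored and alerted on. The consistency model is well understood.
- Failover requires DNS or LB shift.: When the active region fails, traffic must be redirected through a DNS update, a load balancer reconfiguration, or an application-level routing change, and a replica must be promoted to accept writes. Failover typically takes minutes, driven by failure detection, promotion, and DNS TTL expiry; that window is the practical RTO.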
Active-passive is the right pattern for most transactional workloads where strong consistency and operational simplicity matter more than write latency for distant users.
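The routing rules above can be sketched in a few lines of Python. This is a minimal illustration, not a production router: the `Region` and `ActivePassiveRouter` classes, the region names, and the 5-second freshness threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    name: str
    healthy: bool = True
    replica_lag_s: float = 0.0  # seconds behind the primary (0 for the primary itself)


@dataclass
class ActivePassiveRouter:
    """Hypothetical router: one writable primary, read replicas elsewhere."""
    primary: str
    regions: dict[str, Region] = field(default_factory=dict)

    def route_write(self) -> str:
        # Single-writer rule: every write goes to the primary, wherever the client is.
        if not self.regions[self.primary].healthy:
            raise RuntimeError("primary is down; fail over before accepting writes")
        return self.primary

    def route_read(self, client_region: str, max_lag_s: float = 5.0) -> str:
        # Prefer the local replica when it is healthy and fresh enough.
        local = self.regions.get(client_region)
        if local and local.healthy and local.replica_lag_s <= max_lag_s:
            return local.name
        # Otherwise pay the cross-region trip and read from the primary.
        return self.primary

    def fail_over(self) -> str:
        # Promote the healthy replica with the least lag (least data to lose).
        candidates = [
            r for r in self.regions.values()
            if r.healthy and r.name != self.primary
        ]
        if not candidates:
            raise RuntimeError("no healthy replica to promote")
        new_primary = min(candidates, key=lambda r: r.replica_lag_s)
        new_primary.replica_lag_s = 0.0
        self.primary = new_primary.name
        return self.primary


router = ActivePassiveRouter(
    primary="us-east",
    regions={
        "us-east": Region("us-east"),
        "eu-west": Region("eu-west", replica_lag_s=0.8),
        "ap-south": Region("ap-south", replica_lag_s=12.0),
    },
)
```

With this setup, a client in `eu-west` reads locally, a client in `ap-south` is sent to the primary because its replica is too stale, and every write lands in `us-east`. In a real system the `fail_over` step would also involve the DNS or load balancer change described above; here it only updates the in-memory primary.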
Active-active
Active-active multi-region accepts writes in every region and merges them. The pattern provides the lowest write latency for distributed users but at the cost of significant complexity in conflict handling.
- Multi-region writeable.: Every region accepts writes locally. Users get the lowest possible write latency because they always write to their nearest region. The geographic distribution of users is matched by the geographic distribution of writes.
- CRDT or last-write-wins resolution.: When concurrent writes target the same data, the system needs a resolution strategy. Conflict-free Replicated Data Types (CRDTs) are data structures whose merge operation is designed so that replicas converge automatically, but they constrain how the data can be modeled. Last-write-wins keeps the write with the most recent timestamp and silently discards the other, so it depends on reasonably synchronized clocks and can lose updates.
- Complex.: Active-active is significantly more complex than active-passive. Conflict handling, replication topology, network partitions, and schema evolution all become harder, and the team's operational burden rises with them.
- Eventually consistent.: The system is eventually consistent. Strong consistency across regions requires cross-region coordination on every write, which reintroduces the latency the pattern is meant to avoid. Applications that require strong consistency must avoid this pattern or fall back to single-region operations for those flows.
- Best for globally distributed users.: When users are geographically distributed and write latency matters (collaborative editing, social, gaming), active-active is the right pattern. The latency benefit justifies the complexity.
Active-active is powerful but expensive. Most teams underestimate the complexity until they live with it.
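To make the two resolution strategies concrete, here is a minimal sketch of each. The `LWWRegister` and `GCounter` classes are illustrative names, not a specific library's API; the region name used as a tie-breaker and the assumption of roughly synchronized wall clocks are choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LWWRegister:
    """Last-write-wins register: the newest timestamp wins, the other write is lost."""
    value: str
    timestamp: float  # wall-clock time of the write; assumes reasonably synced clocks
    region: str       # tie-breaker so every replica picks the same winner

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep the write with the larger (timestamp, region) pair.
        return max(self, other, key=lambda r: (r.timestamp, r.region))


@dataclass
class GCounter:
    """Grow-only counter CRDT: one slot per region, merged with element-wise max."""
    counts: dict[str, int] = field(default_factory=dict)

    def increment(self, region: str, n: int = 1) -> None:
        # Each region only ever increments its own slot.
        self.counts[region] = self.counts.get(region, 0) + n

    def merge(self, other: "GCounter") -> "GCounter":
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        regions = self.counts.keys() | other.counts.keys()
        return GCounter(
            {r: max(self.counts.get(r, 0), other.counts.get(r, 0)) for r in regions}
        )

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Two regions update the same profile field concurrently: one write is discarded.
us_write = LWWRegister("alice@old.example", timestamp=100.0, region="us-east")
eu_write = LWWRegister("alice@new.example", timestamp=100.5, region="eu-west")
winner = us_write.merge(eu_write)

# Two regions count page views concurrently: no increments are lost.
us_views, eu_views = GCounter(), GCounter()
us_views.increment("us-east", 3)
eu_views.increment("eu-west", 2)
total = us_views.merge(eu_views).value
```

The contrast is the point: the register converges by throwing one write away, while the counter converges by construction and keeps every increment. The counter only works because "count" has a merge that never conflicts; most business data does not, which is where the complexity comes from.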
Hybrid
Most production stacks end up hybrid. Some data classes are active-passive (for consistency); others are active-active (for latency). The split matches the data's actual access pattern.
- Active-passive for transactional data.: Financial records, ordering, inventory, billing. The data classes that require strong consistency and where occasional cross-region latency is acceptable. The simpler model fits the requirements.
- Active-active for cache and session state.: Caches, ephemeral sessions, presence data. The data classes where local-region performance matters most and where eventual consistency is acceptable or even ideal. For these classes, the latency benefit justifies the extra complexity.
- Most production stacks.: Few production stacks are pure active-passive or pure active-active. The hybrid is the realistic answer that captures the benefit of each pattern where it applies.
- Match the pattern to the data class.: Each data class is evaluated separately. What is the read latency requirement? What is the consistency requirement? What is the write latency requirement? The answers determine which pattern fits.
- Document the boundaries.: The team documents which data classes use which pattern. New developers understand which APIs they can call from any region and which require the active-region path. The documentation prevents accidental cross-pattern operations.
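One way to keep the boundaries documented is to encode them next to the code. The sketch below is a simplified, hypothetical decision rule: the `DataClass` fields, the `choose_pattern` function, and the example data classes are assumptions for illustration, and a real evaluation would also weigh read latency and other requirements.

```python
from dataclasses import dataclass
from enum import Enum


class Pattern(Enum):
    ACTIVE_PASSIVE = "active-passive"
    ACTIVE_ACTIVE = "active-active"


@dataclass(frozen=True)
class DataClass:
    name: str
    needs_strong_consistency: bool
    write_latency_sensitive: bool


def choose_pattern(dc: DataClass) -> Pattern:
    # Consistency wins: if strong consistency is required, use a single writer.
    if dc.needs_strong_consistency:
        return Pattern.ACTIVE_PASSIVE
    # Eventual consistency is acceptable; go active-active only if local writes matter.
    if dc.write_latency_sensitive:
        return Pattern.ACTIVE_ACTIVE
    # Otherwise default to the simpler model.
    return Pattern.ACTIVE_PASSIVE


# Hypothetical inventory of data classes, evaluated one by one.
DATA_CLASSES = [
    DataClass("billing", needs_strong_consistency=True, write_latency_sensitive=False),
    DataClass("inventory", needs_strong_consistency=True, write_latency_sensitive=True),
    DataClass("sessions", needs_strong_consistency=False, write_latency_sensitive=True),
    DataClass("presence", needs_strong_consistency=False, write_latency_sensitive=True),
    DataClass("audit_log", needs_strong_consistency=False, write_latency_sensitive=False),
]

# The documented boundary: which pattern each data class uses.
BOUNDARIES = {dc.name: choose_pattern(dc) for dc in DATA_CLASSES}


def writable_from_any_region(data_class_name: str) -> bool:
    # Active-passive data must be written through the active-region path.
    return BOUNDARIES[data_class_name] is Pattern.ACTIVE_ACTIVE
```

A table like `BOUNDARIES` can back both the written documentation and a runtime guard, so a write to `billing` from a passive region fails fast instead of silently crossing patterns.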
Multi-region data pattern selection is one of the highest-leverage architectural decisions in any distributed system. Nova AI Ops integrates with multi-region observability data, surfaces replica lag and conflict rates, and helps teams understand whether their pattern choice is producing the expected behavior.