GLOSSARY · N

Network Partition

A failure mode where part of the network can't reach another part: the worst-case scenario for distributed-systems correctness.

Definition

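A network partition is a failure mode in which some nodes in a distributed system can no longer communicate with the others, even though the nodes themselves are still up. Some literature calls this "split-brain"; strictly, split-brain is the consequence, where each side keeps operating as if it were the whole system. Causes include physical-link failures, cloud-provider outages, misconfigured firewalls, and BGP route flaps. The CAP theorem says that during a partition a system can preserve either consistency or availability, not both. Many production systems pick availability (returning potentially stale data) and try to reconcile when the partition heals; others, such as quorum-based coordination services, pick consistency and reject requests on the minority side.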
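The consistency-versus-availability choice can be made concrete with a small sketch. This is illustrative only: the `Replica` class, its fields, and the `Unavailable` exception are hypothetical names invented for this example, not part of Nova or any real library, and a real consistency-first read would also have to fetch the value from a majority rather than just count reachable peers.

```python
from dataclasses import dataclass
from typing import Optional


class Unavailable(Exception):
    """Raised when a consistency-first replica cannot reach a majority."""


@dataclass
class Replica:
    # All names here are hypothetical, chosen only for illustration.
    cluster_size: int            # total replicas in the cluster
    reachable_peers: int         # peers this replica can currently contact
    local_value: Optional[str]   # last value this replica saw (may be stale)

    def has_quorum(self) -> bool:
        # Count this replica itself plus the peers it can still reach,
        # and require a strict majority of the whole cluster.
        return (self.reachable_peers + 1) > self.cluster_size // 2

    def read_consistent(self) -> Optional[str]:
        # Consistency-first: refuse to answer from the minority side
        # rather than risk returning stale data.
        if not self.has_quorum():
            raise Unavailable("no majority reachable")
        return self.local_value

    def read_available(self) -> Optional[str]:
        # Availability-first: always answer, accepting possible staleness.
        return self.local_value


# A replica cut off with one peer in a five-node cluster (minority side).
minority = Replica(cluster_size=5, reachable_peers=1, local_value="v1")
stale_ok = minority.read_available()   # answers with its local copy
try:
    minority.read_consistent()
    refused = False
except Unavailable:
    refused = True                     # the consistency-first path refuses
```

The same replica gives two different behaviors depending on which guarantee the system chose to keep; neither is wrong, but the choice has to be made deliberately.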

Why it matters

Many distributed-systems bugs in production are network-partition bugs that nobody tested for. A test suite that runs everything on one host can't surface them; chaos engineering that injects partitions can. Designing for partitions (idempotent operations, conflict-free replicated data types, explicit reconciliation) is what lets a system survive when the network does its worst.
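As one example of a conflict-free data structure, here is a minimal sketch of a grow-only counter (a G-Counter). It is an assumption-laden illustration, not a description of how Nova reconciles state: the `GCounter` class and the node IDs are hypothetical. Each node increments only its own slot, and merging takes the per-node maximum, so merges are idempotent and order-independent.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: a simple conflict-free replicated data type."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}   # per-node increment totals

    def increment(self, amount: int = 1) -> None:
        # A G-Counter can only grow; negative amounts would break merge.
        if amount < 0:
            raise ValueError("G-Counter cannot decrease")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum over every node's slot.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Per-node maximum: applying the same merge twice changes nothing,
        # and merging in either order yields the same state.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two replicas accept writes independently while partitioned.
a = GCounter("node-a")
b = GCounter("node-b")
a.increment(3)
b.increment(2)

# When the partition heals, each side merges the other's state.
a.merge(b)
b.merge(a)
a.merge(b)  # a repeated merge is harmless
```

After the merges both replicas report the same total, and no write from either side of the partition is lost. This only works because the operation (addition of non-negative amounts) was chosen so that concurrent updates never conflict.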

How Nova handles it

See the part of the platform that handles network partitions in production.

Nova service map
