BGP Basics for SREs: What You Need to Know
Most SREs do not need BGP expertise, but all SREs benefit from BGP literacy. The four concepts below cover roughly 80% of what comes up in incident reviews.
Why SREs need BGP basics
Cloud network outages often trace to BGP. Understanding the words means understanding the postmortem.
Without basics, you wait for the network team to translate.
Four concepts
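- AS (Autonomous System): a network, or group of networks, run under one routing policy and identified by an AS number (ASN).
- Prefix: an IP range in CIDR notation (e.g., 10.0.0.0/24).
- Path: the sequence of ASes a route announcement has crossed on its way to you for a given prefix (the AS path).
- Policy: the rules each AS applies when accepting routes from its neighbors and sending routes to them.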
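To make the four concepts concrete, here is a minimal Python sketch that models a route as a prefix plus an AS path and applies a small import policy to it. The `Route` class, the `accept` function, and the `max_prefix_len` parameter are hypothetical names for illustration only, and the AS numbers come from the range reserved for documentation. Only the loop check (rejecting a route whose path already contains your own AS) reflects standard BGP behavior; the prefix-length filter is just one example of a policy an operator might choose.

```python
from dataclasses import dataclass
from ipaddress import IPv4Network, ip_network
from typing import Tuple


@dataclass(frozen=True)
class Route:
    prefix: IPv4Network        # Prefix: the IP range being advertised
    as_path: Tuple[int, ...]   # Path: ASes crossed, nearest neighbor first, origin last

    @property
    def origin_as(self) -> int:
        # The AS that originated the announcement is the last entry in the path.
        return self.as_path[-1]


def accept(route: Route, local_as: int, max_prefix_len: int = 24) -> bool:
    """Hypothetical import policy: decide whether to accept a received route."""
    # Loop prevention: a path that already contains our own AS is rejected.
    if local_as in route.as_path:
        return False
    # Example policy choice (an assumption): ignore very specific prefixes.
    if route.prefix.prefixlen > max_prefix_len:
        return False
    return True


# AS numbers 64496-64511 are reserved for documentation.
route = Route(ip_network("10.0.0.0/24"), (64501, 64502, 64500))
too_specific = Route(ip_network("10.0.0.0/28"), (64501, 64500))
```

With these values, `accept(route, local_as=64510)` returns `True`, while the same route seen from AS 64501 is rejected because 64501 already appears in the path.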
When to escalate
BGP issues cross team boundaries. Cloud-provider BGP belongs to the cloud provider's network team, not yours.
Document the escalation path and rehearse it in tabletop exercises.
Anycast (BGP application)
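Anycast: the same prefix is advertised from multiple locations, and BGP sends each client to the location with the best path. That is usually, but not always, the geographically closest one, because BGP ranks routes by policy and AS-path length, not by distance or latency.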
Used by major DNS, CDN, and global API providers.
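The selection behind anycast can be sketched in a few lines of Python. This is a deliberate simplification: it ranks candidates by AS-path length only, whereas a real BGP decision process checks other attributes (such as local preference) first. The site names, the `pick_site` function, and the AS numbers are hypothetical.

```python
# Hypothetical view from one client network: three sites announce the same
# prefix from origin AS 64500, and each is reachable over a different AS path.
paths_seen_by_client = {
    "frankfurt": (64501, 64500),
    "singapore": (64502, 64503, 64504, 64500),
    "virginia": (64505, 64506, 64500),
}


def pick_site(paths):
    """Return the site with the shortest AS path (simplified best-path rule)."""
    return min(paths, key=lambda site: len(paths[site]))


chosen = pick_site(paths_seen_by_client)

# If one site withdraws its announcement, traffic shifts to the next-best path
# without any change on the client side.
remaining = {s: p for s, p in paths_seen_by_client.items() if s != "frankfurt"}
failover = pick_site(remaining)
```

Here `chosen` is `"frankfurt"` and `failover` is `"virginia"`. The failover step is why anycast doubles as a resilience mechanism: withdrawing a route takes a site out of service.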
Antipatterns
- Treating BGP as ‘the network team’s problem.’ Result: postmortems you cannot read.
- Configuring BGP without expertise. Result: an outage.
- Ignoring ‘route hijacks’ in the news. The same thing could happen to your prefixes.
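A route hijack, in the simplest case, is your prefix (or a more specific slice of it) being announced by an AS that should not originate it. The check below is a rough sketch of that idea, assuming you already have a feed of observed announcements from some monitoring source. The `EXPECTED_ORIGINS` table and the `suspicious` function are hypothetical, and the prefix and AS numbers are documentation values.

```python
from ipaddress import ip_network

# Hypothetical table: prefixes we own and the AS that should originate them.
EXPECTED_ORIGINS = {
    ip_network("203.0.113.0/24"): 64500,
}


def suspicious(prefix: str, origin_as: int) -> bool:
    """Flag an announcement that covers our address space from an unexpected origin."""
    net = ip_network(prefix)
    for ours, expected_as in EXPECTED_ORIGINS.items():
        # subnet_of() is True for the same prefix and for more specific ones.
        if net.version == ours.version and net.subnet_of(ours):
            if origin_as != expected_as:
                return True
    return False


legit = suspicious("203.0.113.0/24", 64500)
same_prefix_wrong_origin = suspicious("203.0.113.0/24", 64511)
more_specific_wrong_origin = suspicious("203.0.113.128/25", 64511)
unrelated = suspicious("198.51.100.0/24", 64511)
```

The more-specific case matters because routers prefer the longest matching prefix, so a /25 announced by someone else can attract traffic even while your own /24 is still advertised.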
What to do this week
Three moves. (1) Apply the four concepts to your highest-risk network path: name the ASes, prefixes, and policies it depends on. (2) Measure how often that path fails before and after any change you make. (3) Document the change so the next incident responder inherits the knowledge.