BGP Fundamentals for SREs
Interdomain routing.
Overview
SREs need enough BGP to investigate incidents that touch the global internet. Memorising protocol details is not the goal; what matters during the rare BGP incident is being able to read a looking glass and recognise a hijack.
- Interdomain routing. How the internet’s autonomous systems (ASes) exchange routes; this is the level at which most external routing incidents are investigated.
- Path attributes. AS_PATH, LOCAL_PREF, MED; these drive the route-decision process and explain why traffic flows where it does.
- Common failure modes. Hijacks, leaks, withdrawals; knowing them lets the team recognise the pattern during an incident.
- Looking glass tools plus RPKI. Looking glasses show BGP state from different regions for investigation; RPKI adds cryptographic origin validation for modern protection.
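To make the path-attribute bullet concrete, here is a minimal sketch of how LOCAL_PREF, AS_PATH length, and MED rank competing routes for one prefix. It is a deliberate simplification: the real BGP decision process has more steps (origin type, eBGP versus iBGP, IGP cost, router ID), and MED is normally compared only between routes learned from the same neighbouring AS, whereas this sketch always compares it. The Route class, the function names, and the documentation-range prefixes and ASNs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical, stripped-down view of one BGP route."""
    prefix: str
    as_path: tuple[int, ...]  # leftmost AS is the neighbour, rightmost is the origin
    local_pref: int = 100     # assumed default; higher is preferred
    med: int = 0              # lower is preferred


def preference_key(route: Route) -> tuple[int, int, int]:
    """Sort key where the smallest tuple is the preferred route.

    Simplified order: highest LOCAL_PREF, then shortest AS_PATH,
    then lowest MED (always compared here, which is an assumption).
    """
    return (-route.local_pref, len(route.as_path), route.med)


def best_path(routes: list[Route]) -> Route:
    """Pick the preferred route among candidates for the same prefix."""
    if not routes:
        raise ValueError("no candidate routes")
    return min(routes, key=preference_key)


candidates = [
    Route("203.0.113.0/24", as_path=(64496, 64497, 64500)),
    Route("203.0.113.0/24", as_path=(64498, 64500)),
    Route("203.0.113.0/24", as_path=(64499, 64501, 64502, 64500), local_pref=200),
]

winner = best_path(candidates)
print(winner.as_path, winner.local_pref)
```

The third candidate wins despite having the longest AS_PATH, because LOCAL_PREF is evaluated first. This is a common source of surprise when traffic takes a path that looks longer from the outside.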
The approach
The practical approach: investigate via looking glass during incidents, learn from public BGP postmortems, monitor your own prefix announcements, deploy RPKI for your prefixes, and document the topology. Practised routinely, these habits make investigation fast when the rare BGP incident happens.
- Looking glass tools. bgp.he.net, RIPE Atlas; support investigation by showing routes from external vantage points.
- Read major BGP postmortems. Facebook 2021, Cloudflare 2022 lessons; grows expertise without needing to live through the incident.
- Monitor your prefixes. Per-prefix announcement tracking; catches hijacks that are only visible from outside your own network.
- RPKI plus documented topology. Cryptographic origin validation lets validating networks reject announcements of your prefixes from the wrong origin AS; documenting each prefix’s upstreams speeds investigation.
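Prefix monitoring and origin validation can be sketched together. The example below classifies observed announcements as valid, invalid, or not-found, following the route origin validation logic described in RFC 6811: an announcement is valid if some covering ROA matches its origin AS and its prefix length does not exceed the ROA's maximum length, invalid if it is covered but nothing matches, and not-found if no ROA covers it. The Roa class, the validate_origin function, and the sample data are hypothetical; in practice the ROA list would come from an RPKI validator and the observations from an external BGP feed.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Roa:
    """Hypothetical in-memory form of a validated ROA payload."""
    prefix: str
    max_length: int
    origin_asn: int


def validate_origin(prefix: str, origin_asn: int, roas: list[Roa]) -> str:
    """Return 'valid', 'invalid', or 'not-found' for one announcement."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if announced.version != roa_net.version:
            continue  # never compare IPv4 against IPv6
        if not announced.subnet_of(roa_net):
            continue  # this ROA does not cover the announced prefix
        covered = True
        if roa.origin_asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    return "invalid" if covered else "not-found"


# Assumed setup: we originate 203.0.113.0/24 from AS64500 and nothing longer.
roas = [Roa("203.0.113.0/24", max_length=24, origin_asn=64500)]

# Announcements as they might be seen from an external vantage point.
observed = [
    ("203.0.113.0/24", 64500),  # our own announcement
    ("203.0.113.0/24", 64511),  # same prefix, unexpected origin
    ("203.0.113.0/25", 64500),  # our origin, but more specific than max_length
    ("192.0.2.0/24", 64501),    # no covering ROA
]

for prefix, asn in observed:
    state = validate_origin(prefix, asn, roas)
    print(f"{prefix} from AS{asn}: {state}")
```

An alert on any invalid result for your own prefixes is a reasonable starting point for monitoring. Origin validation does not detect a forged AS_PATH that keeps the legitimate origin, so it complements external monitoring rather than replacing it.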
Why this compounds
BGP discipline compounds across years. Each investigation grows the team’s networking expertise; the next BGP incident is investigated faster because the team recognises the patterns.
- Faster internet investigation. BGP fluency gets the team to root cause sooner; reduces MTTR for incidents that touch external routing.
- Better network security. RPKI plus monitoring catches hijacks; the team is protected ahead of time rather than only responding after the fact.
- Better incident response. Looking glass tools support real-time investigation; the team has signal during the incident.
- Institutional knowledge. Each investigation teaches networking; the team’s networking muscle grows.
BGP fluency is an operational discipline that pays off across years. Nova AI Ops integrates with networking telemetry, surfaces patterns, and supports the team’s network engineering discipline.