Network Monitoring is the network-layer slice of observability: per-link bandwidth and packet loss, per-flow retransmits, per-VPC traffic split, and per-DNS-zone failure rate. Network issues often show up first as confusing service degradations; this page surfaces them as the network problems they are.
There are four primary signals: per-link bandwidth (and saturation), per-flow packet retransmits and packet loss, per-DNS-zone failure rates, and per-NAT-gateway port-allocation saturation. Each signal is correlated with the services that depend on the underlying resource, so a network issue shows up as "payments-api is degraded because its NAT gateway is full."
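As a rough illustration of what "saturation" means for a link or a NAT gateway, the sketch below flags any resource whose usage is at or above a fraction of its capacity. The `Utilization` type, the resource names, and the 0.9 threshold are all assumptions made for this example; they are not the product's data model or defaults.

```python
from dataclasses import dataclass


@dataclass
class Utilization:
    """One usage sample for a network resource (hypothetical shape)."""
    resource: str    # e.g. a link or NAT gateway identifier
    used: float      # bandwidth in use, or ports currently allocated
    capacity: float  # link capacity, or total allocatable ports

    @property
    def ratio(self) -> float:
        # Guard against a zero or missing capacity value.
        return self.used / self.capacity if self.capacity > 0 else 0.0


def saturated(samples, threshold=0.9):
    """Return the resources whose utilization is at or above the threshold."""
    return [s.resource for s in samples if s.ratio >= threshold]


samples = [
    Utilization("nat-gw-a", used=950, capacity=1000),  # ports allocated
    Utilization("peer-link-7", used=4.0, capacity=10.0),  # Gbps
]
print(saturated(samples))
```

The same ratio works for both bandwidth and port allocation, which is why the sketch treats them as one kind of sample.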
Every network signal is automatically tied to the services it affects. A packet-loss spike on a peer link surfaces on the affected services' incident pages with "network: 2.4% loss on this peer." The agent fleet sees the network signal alongside service signals so runbooks consider both.
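One way to picture this tie-in is a lookup from network resource to the services that depend on it. The sketch below is illustrative only: the `NetworkSignal` type, the dependency map, and the resource names are hypothetical, and the real product presumably derives these relationships automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkSignal:
    """A single network observation (hypothetical shape)."""
    resource: str  # link, NAT gateway, or similar
    kind: str      # e.g. "loss"
    value: float
    unit: str


# Hypothetical map of service -> network resources it depends on.
DEPENDENCIES = {
    "payments-api": {"peer-link-7", "nat-gw-a"},
    "search-api": {"peer-link-7"},
    "billing-worker": {"nat-gw-b"},
}


def annotate_services(signal, dependencies):
    """Return {service: note} for every service that uses the affected resource."""
    note = f"network: {signal.value:g}{signal.unit} {signal.kind} on {signal.resource}"
    return {
        service: note
        for service, resources in sorted(dependencies.items())
        if signal.resource in resources
    }


spike = NetworkSignal("peer-link-7", "loss", 2.4, "%")
for service, note in annotate_services(spike, DEPENDENCIES).items():
    print(f"{service}: {note}")
```

Services that do not touch the affected resource, such as `billing-worker` here, get no annotation.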
There are two source options. eBPF: a kernel probe on each host captures every flow, which gives the best fidelity. Flow logs: ingestion of AWS, GCP, or Azure flow logs, which is less precise but works without host agents. Pick one or both; when both are present, reconciliation catches gaps in either.
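Reconciliation between the two sources can be thought of as a set comparison over flow identifiers. The sketch below is a simplified illustration under stated assumptions: the `FlowKey` fields are hypothetical, and real reconciliation would also need to align time windows and account for flow-log sampling or aggregation.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Identifier for a flow (hypothetical; real keys may carry more fields)."""
    src: str
    dst: str
    dst_port: int
    proto: str


def reconcile(ebpf_flows, flow_log_flows):
    """Report flows seen by only one of the two sources."""
    ebpf, logs = set(ebpf_flows), set(flow_log_flows)
    return {
        "only_ebpf": sorted(ebpf - logs),       # possibly missing from flow logs
        "only_flow_logs": sorted(logs - ebpf),  # possibly a host without a probe
    }


ebpf_seen = [
    FlowKey("10.0.1.5", "10.0.2.9", 443, "tcp"),
    FlowKey("10.0.1.5", "10.0.3.2", 5432, "tcp"),
]
log_seen = [
    FlowKey("10.0.1.5", "10.0.2.9", 443, "tcp"),
    FlowKey("10.0.4.7", "10.0.2.9", 443, "tcp"),
]
gaps = reconcile(ebpf_seen, log_seen)
print(gaps["only_ebpf"])
print(gaps["only_flow_logs"])
```

A flow that appears only in flow logs may point at a host with no probe installed; a flow that appears only in eBPF data may point at a gap in flow-log coverage. Both readings are heuristics, not conclusions.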
DNS gets its own subtab because DNS issues are especially painful. It shows per-zone failure rate, per-resolver latency, and recent NXDOMAIN and SERVFAIL spikes. When DNS goes weird, this view tells you which zone, which resolver, and which downstream service is feeling it.
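A per-zone failure rate can be sketched as failed responses divided by total responses, grouped by zone. In the example below the record shape (zone, response code) and the zone names are hypothetical, and counting NXDOMAIN as a failure alongside SERVFAIL is an assumption; some deployments treat NXDOMAIN as a normal answer.

```python
from collections import Counter, defaultdict

# Assumption: both response codes count as failures for this sketch.
FAILURE_RCODES = {"SERVFAIL", "NXDOMAIN"}


def zone_failure_rates(records):
    """records: iterable of (zone, rcode) pairs. Returns per-zone stats."""
    totals = Counter()
    failures = defaultdict(Counter)
    for zone, rcode in records:
        totals[zone] += 1
        if rcode in FAILURE_RCODES:
            failures[zone][rcode] += 1
    return {
        zone: {
            "failure_rate": sum(failures[zone].values()) / totals[zone],
            "by_rcode": dict(failures[zone]),
        }
        for zone in totals
    }


records = [
    ("corp.example", "NOERROR"),
    ("corp.example", "SERVFAIL"),
    ("corp.example", "NXDOMAIN"),
    ("corp.example", "NOERROR"),
    ("cdn.example", "NOERROR"),
]
for zone, stats in zone_failure_rates(records).items():
    print(zone, stats)
```

Keeping the breakdown by response code makes it possible to tell an NXDOMAIN spike from a SERVFAIL spike within the same zone.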
Network monitoring stops the "we spent two hours debugging the app, it was DNS" pattern.