Reliability Engineering

Every container, every dependency,
one graph that updates as you deploy

Container Graph is the live dependency map of your container fleet: every pod, every service, every call edge. Use it to see what would break if you drained a node, which pods upcoming maintenance will affect, and which services have no redundancy. The graph updates as deploys happen, so there are no manual edges to maintain.

app.novaaiops.com / container-graph
  • Live: auto-discovered
  • < 30s: edge propagation
  • eBPF or service-mesh source
  • What-if: drain simulation
Auto-Discovery

No manual edges

Container Graph discovers edges automatically from two sources: eBPF probes on the nodes, which see every TCP flow whether or not a mesh is present, and service-mesh telemetry (Linkerd, Istio, Consul) when a mesh is installed. The two sources are reconciled, and an edge appears in the graph only if at least one of them observed it. New deploys show up within 30 seconds of their first traffic.

  • eBPF + service mesh: two independent sources that reconcile; an edge is real only if at least one observes it
  • 30-second propagation: new pods and new edges appear in the graph within 30s of the first traffic
  • No manual config: no annotations and no IaC declarations; discovery is purely observational
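
The reconciliation rule above (an edge counts if at least one source observed it) can be sketched in a few lines of Python. This is an illustration only, not Container Graph's implementation: the `Edge` type, the `reconcile` function, and the service names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str  # calling service (hypothetical name)
    dst: str  # called service (hypothetical name)


def reconcile(ebpf_edges: set[Edge], mesh_edges: set[Edge]) -> dict[Edge, set[str]]:
    """Keep every edge seen by at least one source, tagged with who saw it."""
    confirmed: dict[Edge, set[str]] = {}
    for edge in ebpf_edges:
        confirmed.setdefault(edge, set()).add("ebpf")
    for edge in mesh_edges:
        confirmed.setdefault(edge, set()).add("mesh")
    return confirmed


# Assumed sample data: the mesh only sees meshed traffic, eBPF sees both flows.
ebpf = {Edge("checkout", "payments"), Edge("checkout", "redis")}
mesh = {Edge("checkout", "payments")}
graph = reconcile(ebpf, mesh)
```

Recording which source confirmed each edge, rather than taking a plain set union, makes it easy to spot traffic that bypasses the mesh.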
What-If Drain

Simulate a node drain before you click drain

Hover any node and the graph highlights every pod that would have to reschedule. Hover any pod and the graph highlights every service that would lose a replica. Use it before maintenance: see what is about to break, decide whether you have enough headroom, then drain.

  • Hover-to-highlight: hover a node → highlights affected pods; hover a pod → highlights services that lose replicas
  • Replica count overlay: each service shows current replicas / minimum required so headroom is visible
  • No-redundancy badge: services with one replica get a red badge so you do not drain their host without thinking
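
As a rough model of what the drain simulation computes, the sketch below counts the replicas each service would lose if one node were drained and compares the remainder with a minimum. The `Pod` type, the `simulate_drain` function, the default minimum of 1, and the sample fleet are assumptions made for illustration; the product's actual logic is not documented here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    service: str
    node: str


def simulate_drain(pods, min_replicas, node):
    """Report, per affected service, the replicas left while pods reschedule."""
    current = Counter(p.service for p in pods)
    lost = Counter(p.service for p in pods if p.node == node)
    report = {}
    for service, n_lost in lost.items():
        minimum = min_replicas.get(service, 1)  # assumed default minimum
        remaining = current[service] - n_lost
        report[service] = {
            "current": current[service],
            "remaining": remaining,
            "minimum": minimum,
            "below_minimum": remaining < minimum,
            "no_redundancy": current[service] == 1,  # the red-badge case
        }
    return report


# Hypothetical fleet: three api replicas across two nodes, one db replica.
pods = [
    Pod("api-1", "api", "node-a"),
    Pod("api-2", "api", "node-b"),
    Pod("api-3", "api", "node-b"),
    Pod("db-1", "db", "node-a"),
]
report = simulate_drain(pods, {"api": 2, "db": 1}, "node-a")
```

In this sample, draining `node-a` leaves `api` at its minimum of two replicas, while `db` drops to zero and is flagged as having no redundancy.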
Service Health Overlay

Color the graph by SLO compliance

Toggle the service-health overlay and every node is colored by its SLO compliance: green for healthy, yellow for burning error budget fast, red for over budget. The graph becomes a map of where reliability work is concentrated. It is useful in weekly reviews for seeing whether the unhealthy services are clustered (pointing at one team) or scattered (pointing at a platform-wide problem).

  • SLO color per node: pulls live from Service Health Matrix; same color encoding everywhere
  • Cluster pattern detection: when unhealthy nodes cluster around one service, that service is a likely root cause
  • Alternate overlays: cost per service, traffic share, or replica count; same graph, different colors
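
To make the color encoding and the cluster heuristic concrete, here is a minimal sketch. The inputs (fraction of error budget consumed, burn rate), the 2.0 fast-burn threshold, and both function names are assumptions for illustration; the real overlay takes its colors from Service Health Matrix.

```python
from collections import Counter


def slo_color(budget_consumed: float, burn_rate: float,
              fast_burn_threshold: float = 2.0) -> str:
    """Map SLO state to a color. The threshold is an assumed value."""
    if budget_consumed >= 1.0:
        return "red"      # over budget
    if burn_rate >= fast_burn_threshold:
        return "yellow"   # fast-burning
    return "green"        # healthy


def likely_root_cause(edges, colors):
    """Return the callee with the most unhealthy callers, or None."""
    unhealthy_callers = Counter(
        callee for caller, callee in edges
        if colors.get(caller, "green") != "green"
    )
    if not unhealthy_callers:
        return None
    return unhealthy_callers.most_common(1)[0][0]


# Hypothetical call edges (caller, callee) and overlay colors.
edges = [("web", "auth"), ("cart", "auth"), ("cart", "redis"), ("search", "index")]
colors = {"web": "red", "cart": "yellow", "search": "green"}
suspect = likely_root_cause(edges, colors)
```

Here two unhealthy callers share `auth` as a dependency, so `auth` is the first place to look. A shared dependency is only a hint, not proof of cause.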
Time Travel

Replay the graph at the time of any incident

For postmortems, replay the graph at the moment of an incident: the pods that were running, the edges that were active, and the service-health overlay at that minute. The replay is sourced from the same eBPF and mesh data as the live graph, so it shows what was actually observed rather than a reconstruction.

  • Per-minute replay: rewind to any minute in the last 30 days; longer windows on request
  • True state: replay reads the same observation source, so no reconstruction artifacts
  • Cite from postmortems: every incident page links to the graph replay for the timestamp at which the incident was opened
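
Per-minute replay with a 30-day window implies that an incident timestamp is rounded down to its minute and checked against the retention limit. The sketch below illustrates that lookup; `replay_key`, `RETENTION`, and the sample timestamps are hypothetical and do not describe the product's API.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # default window stated above


def replay_key(incident_time: datetime, now: datetime) -> datetime:
    """Truncate to the minute; reject times outside the replay window."""
    if incident_time > now or now - incident_time > RETENTION:
        raise ValueError("outside the 30-day replay window")
    return incident_time.replace(second=0, microsecond=0)


# Assumed example: an incident opened 11 days before "now".
now = datetime(2024, 5, 31, 12, 0, tzinfo=timezone.utc)
opened = datetime(2024, 5, 20, 9, 41, 37, tzinfo=timezone.utc)
key = replay_key(opened, now)
```

The resulting key identifies the minute-level snapshot to load for the postmortem.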

Stop discovering dependencies during incidents

The graph is the answer to "what depends on this?" before you take it down, not after.
