System Gauge is the resource-utilization page. It shows CPU, memory, disk, network, connection counts, and queue depths, per host and per service, with the threshold lines from your alert rules drawn directly on the gauges. Use it as the first stop when an SLO is burning hot and you need to know which resource is the bottleneck.
The page tracks CPU, memory, disk, network, connection counts, and queue depth, captured every 10 seconds per host and rolled up per service. The metrics come from the agents already deployed for log and metric collection, so there is no new sidecar to install. Bring-your-own-Prometheus also works: the page reads from your existing Prometheus if you do not want Nova's collectors.
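As a rough illustration of what "rolled up per service" can mean, the sketch below groups one 10-second tick of per-host readings by service. The sample shape, the `roll_up` helper, and the choice of mean and max as summaries are assumptions made for this example; the page does not state which aggregation Nova applies.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample shape: (service, host, cpu_percent) for one 10-second tick.
samples = [
    ("api", "api-1", 62.0),
    ("api", "api-2", 78.0),
    ("db", "db-1", 41.0),
]


def roll_up(samples):
    """Group per-host readings by service and summarize each group.

    Mean and max are illustrative choices, not Nova's documented rollup.
    """
    by_service = defaultdict(list)
    for service, _host, value in samples:
        by_service[service].append(value)
    return {
        service: {
            "mean": mean(values),
            "max": max(values),
            "hosts": len(values),
        }
        for service, values in by_service.items()
    }


rollup = roll_up(samples)
print(rollup["api"])
```

Keeping the max next to the mean is a deliberate choice in this sketch: a service-level average can hide a single saturated host.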
Every gauge shows the warn and page thresholds from your alert rules as horizontal lines. A crossed threshold is visible at a glance, so there is no need to ask "is 78% bad?" Because the thresholds come from your existing alert rules, the page agrees with whatever fires your pager.
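The threshold lines answer the "is 78% bad?" question by comparing a reading against the two levels in your alert rule. A minimal sketch of that comparison follows; the `classify` helper and the 75/90 values are hypothetical, chosen only for illustration, and are not Nova's or anyone's default thresholds.

```python
def classify(value, warn, page):
    """Return the state of a reading relative to warn and page thresholds.

    Assumes a "higher is worse" metric such as CPU or disk utilization.
    """
    if warn > page:
        raise ValueError("warn threshold must not exceed page threshold")
    if value >= page:
        return "page"
    if value >= warn:
        return "warn"
    return "ok"


# Hypothetical thresholds for a CPU gauge, in percent.
WARN, PAGE = 75.0, 90.0

state = classify(78.0, WARN, PAGE)
print(state)
```

With these assumed thresholds, a 78% reading sits above the warn line and below the page line.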
Saturation rarely lives alone. The page highlights cross-resource correlations, for example a CPU spike on the API host that coincides with a climb in connection count on the database host. The correlations come from the same engine that drives Cross-Signal Correlation; the gauge view is just the resource-only slice of it.
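To make the idea of a cross-resource correlation concrete, the sketch below scores two aligned time series with a Pearson coefficient. This is a stand-in chosen for illustration: the page does not describe how the Cross-Signal Correlation engine computes its scores, and the series values are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series, in [-1, 1]."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation signal
    return cov / sqrt(var_x * var_y)


# Hypothetical readings taken at the same timestamps.
api_cpu_percent = [40, 45, 55, 70, 85, 90]
db_connections = [100, 110, 135, 170, 210, 220]

r = pearson(api_cpu_percent, db_connections)
print(round(r, 3))
```

A coefficient close to 1 means the two resources climbed together over the window, which is the kind of pairing worth surfacing side by side. It does not by itself say which one caused the other.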
Each gauge has a small forecast line drawn from the recent slope of the metric, with a readout such as "At current rate, this fills in 14 days." Use it for capacity planning: when does this disk need to be bigger, when does this pool need to be wider. The forecast updates every hour so it tracks reality, not last quarter.
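A forecast "drawn from the recent slope" can be approximated with a straight-line fit over recent samples, extrapolated to the capacity limit. The sketch below assumes a least-squares fit and a limit of 100%; the function names and the sample data are hypothetical, and the method Nova uses may differ.

```python
def slope_per_day(days, values):
    """Least-squares slope of values against time in days."""
    n = len(days)
    if n != len(values) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_t = sum(days) / n
    mean_v = sum(values) / n
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(days, values))
    sxx = sum((t - mean_t) ** 2 for t in days)
    if sxx == 0:
        raise ValueError("timestamps must not all be identical")
    return sxy / sxx


def days_until_full(days, values, capacity=100.0):
    """Days until the latest value reaches capacity at the fitted rate.

    Returns None when usage is flat or shrinking, since it never fills.
    """
    rate = slope_per_day(days, values)
    if rate <= 0:
        return None
    return max(0.0, (capacity - values[-1]) / rate)


# Hypothetical disk usage in percent, one sample per day.
days = [0, 1, 2, 3, 4]
disk_used = [73.0, 74.5, 76.0, 77.5, 79.0]

remaining = days_until_full(days, disk_used)
print(f"At current rate, this fills in {remaining:.0f} days.")
```

Here the disk grows 1.5 points per day and has 21 points of headroom, so the estimate is 14 days. Refitting on a schedule, as the hourly update does, keeps the estimate tied to the current growth rate.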
Most incidents are a saturated resource somewhere. System Gauge tells you which one in seconds.