Instances is the running-process view of the agent fleet. Each agent type runs as one or more instances across regions. This page shows them all: CPU, memory, queue depth, last heartbeat, version pinning, owning region. Use it to spot a stuck instance, an under-provisioned region, or an instance running an outdated version.
For each instance, the page reports CPU usage, memory usage, queue depth (tasks waiting for this instance), p95 task latency, last heartbeat time, and the agent version pin. The data refreshes every 5 seconds. Color coding matches the rest of the platform: green, yellow, or red depending on threshold breach.
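As a rough illustration of the color coding, the sketch below maps a metric value to a color using two thresholds. The function name, the warning/critical split, and the numbers in the usage lines are assumptions made for the example; the page's actual thresholds are not stated here.

```python
def status_color(value, warn, crit):
    """Map a metric reading to a color (hypothetical thresholds).

    Assumes higher is worse, as with CPU %, memory %, or queue depth.
    """
    if value >= crit:
        return "red"      # critical threshold breached
    if value >= warn:
        return "yellow"   # warning threshold breached
    return "green"        # within normal range


# Example with assumed thresholds: warn at 70% CPU, critical at 90%.
for cpu in (38, 75, 92):
    print(cpu, status_color(cpu, warn=70, crit=90))
```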
Stuck instances happen. The page has a one-click restart per instance with graceful semantics: the instance drains its queue (or hands work to a sibling), exits, and is then replaced. The whole loop usually takes under 30 seconds. Each restart is logged in Agent Ledger and Audit Logs so the postmortem has a trail.
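The restart sequence can be pictured with a small in-memory model. This is a sketch only: the `Instance` class, the `graceful_restart` function, and the event labels are hypothetical names invented for the example, not the platform's API, and "processing" a task is reduced to removing it from the queue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Instance:
    # Hypothetical stand-in for a running agent instance.
    name: str
    queue: deque = field(default_factory=deque)
    running: bool = True


def graceful_restart(instance, sibling=None):
    """Drain or hand off queued work, stop the instance, return a replacement."""
    events = []
    if sibling is not None:
        # Hand remaining tasks to a sibling, preserving their order.
        handed = len(instance.queue)
        sibling.queue.extend(instance.queue)
        instance.queue.clear()
        events.append(f"handoff:{handed}")
    else:
        # No sibling available: work through the queue before exiting.
        drained = 0
        while instance.queue:
            instance.queue.popleft()  # stand-in for processing one task
            drained += 1
        events.append(f"drain:{drained}")
    instance.running = False
    events.append("exit")
    replacement = Instance(name=instance.name)  # fresh instance, empty queue
    events.append("replace")
    return replacement, events
```

The returned event list is the kind of ordered trail a ledger entry for the restart might record.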
Each region's instance count is shown alongside its load. A region with high queue depth and no spare instances is under-provisioned. The page surfaces this directly: "us-east is at 92% saturation across 4 instances, eu-west has 1 instance at 38%." Recommendations include a one-click "scale us-east +2" action.
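One way such a recommendation could be derived is sketched below. The sizing rule is an assumption, not the platform's documented logic: a region is flagged once saturation reaches a hypothetical alert level of 85%, and the suggested instance count is whatever brings the same load down to a hypothetical 70% target.

```python
import math


def scale_recommendations(regions, alert_pct=85, target_pct=70):
    """regions: dict of region name -> (instance_count, saturation_pct).

    alert_pct and target_pct are assumed values chosen for illustration.
    """
    recs = []
    for name, (count, sat_pct) in sorted(regions.items()):
        if sat_pct < alert_pct:
            continue  # region still has headroom
        # Instances needed to carry the same load at the target saturation.
        needed = math.ceil(count * sat_pct / target_pct)
        # Guard for callers who set alert_pct below target_pct.
        extra = max(needed - count, 1)
        recs.append(f"scale {name} +{extra}")
    return recs


fleet = {"us-east": (4, 92), "eu-west": (1, 38)}
print(scale_recommendations(fleet))  # ['scale us-east +2']
```

With these assumed thresholds, the figures from the example above yield a +2 recommendation for us-east, while eu-west at 38% is left alone.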
Each instance carries a version pin. Mixed-version fleets cause subtle bugs, so the page highlights version drift: when most instances are on v12 but one is still on v11 (or one has moved ahead of the rest), the outlier is flagged. Drift detection runs continuously; flagged instances also surface in the daily report.
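A minimal sketch of outlier-based drift detection follows, assuming the rule is "flag any instance whose pin differs from the version held by a strict majority." The function name and the majority rule are assumptions for illustration; the actual detection logic is not described here.

```python
from collections import Counter


def find_version_drift(pins):
    """pins: dict of instance name -> version pin. Returns sorted outlier names."""
    if not pins:
        return []
    counts = Counter(pins.values())
    majority_version, majority_count = counts.most_common(1)[0]
    if majority_count * 2 <= len(pins):
        # No version holds a strict majority, so no instance is a clear outlier.
        return []
    return sorted(name for name, v in pins.items() if v != majority_version)


fleet = {"agent-1": "v12", "agent-2": "v12", "agent-3": "v12", "agent-4": "v11"}
print(find_version_drift(fleet))  # ['agent-4']
```

Because the check compares against the majority rather than the newest version, it flags an outlier whether it is behind or ahead of the rest of the fleet.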
Agents are processes. Processes get stuck. Instances is the page that shows you which one and lets you restart it without SSH.