Feature: Agentic Loop

New core feature.

Overview

The Nova agentic loop drives incident investigation and remediation through a disciplined hypothesis-evidence-action cycle. The loop is what turns a single LLM call into a system that can operate against production.

Hypothesis, evidence, action. Each iteration generates a hypothesis, gathers evidence to test it, and proposes the next action. The loop terminates when the hypothesis is confirmed or rejected.
Tool-using agent. Telemetry queries, runbook lookups, and code reads are all tool calls the agent makes inside the loop, not free-text reasoning.
Human-in-the-loop on destructive actions. Read actions run automatically; writes that change production state require operator approval.
Audit trail. Every hypothesis, every tool call, every action is logged. The audit trail is what makes the loop reviewable after the fact.
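
As a minimal sketch of the cycle described above, the loop could look roughly like the Python below. This is not Nova's implementation: the names Hypothesis, AuditLog, run_loop, and the propose/gather/judge callbacks are all hypothetical, and the max_iterations cap is an assumption added so the sketch always terminates.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    statement: str
    confidence: float  # hypothetical 0.0 to 1.0 scale


@dataclass
class AuditLog:
    """Append-only record of hypotheses, tool calls, and actions."""
    entries: List[dict] = field(default_factory=list)

    def record(self, kind: str, detail: str) -> None:
        self.entries.append(
            {"step": len(self.entries) + 1, "kind": kind, "detail": detail}
        )


def run_loop(
    propose: Callable[[List[str]], Hypothesis],          # hypothetical: model call
    gather: Callable[[Hypothesis], str],                 # hypothetical: read-only tool call
    judge: Callable[[Hypothesis, str], Optional[bool]],  # True, False, or None (undecided)
    log: AuditLog,
    max_iterations: int = 5,                             # assumed safety cap
) -> Optional[bool]:
    """Return True if confirmed, False if rejected, None if still undecided."""
    evidence_so_far: List[str] = []
    for _ in range(max_iterations):
        # 1. Hypothesis: generated from the evidence collected so far.
        hypothesis = propose(evidence_so_far)
        log.record("hypothesis", hypothesis.statement)

        # 2. Evidence: gathered through a tool call, not free-text reasoning.
        evidence = gather(hypothesis)
        log.record("tool_call", evidence)
        evidence_so_far.append(evidence)

        # 3. Action: stop on a verdict, otherwise iterate again.
        verdict = judge(hypothesis, evidence)
        if verdict is not None:
            log.record("action", "confirmed" if verdict else "rejected")
            return verdict
        log.record("action", "gather more evidence")
    return None
```

Each iteration writes three audit entries (hypothesis, tool call, action), so a run can be replayed step by step from log.entries after the fact.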

The approach

Four properties make the loop safe for production: hypotheses are explicit and ranked, tools are wrapped with bounded scope, writes are gated behind operator approval, and the agent’s scope is documented. None of the four is optional.

Explicit hypotheses. Each loop iteration emits a ranked hypothesis list with confidence scores. The ranking is what the operator reviews.
Bounded tool surface. Six narrow tools, each with default scope limits. The agent never sees a raw provider API.
Approval gate on writes. Restart, scale, rollback, and schema change all stop at the approval gate. Reads do not.
Documented boundaries. The agent’s scope is written down. Out-of-scope situations escalate rather than improvise.
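
The properties above can be illustrated with a small sketch, under stated assumptions. Tool, ToolSurface, rank, the approve callback, and the max_rows limit are hypothetical names chosen for illustration, not Nova's API. Only two example tools appear in the usage note rather than the six the loop exposes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


class ApprovalRequired(Exception):
    """Raised when a write is attempted without operator approval."""


class OutOfScope(Exception):
    """Raised when the agent asks for a tool outside its documented scope."""


@dataclass(frozen=True)
class Tool:
    name: str
    handler: Callable[..., object]
    is_write: bool = False  # writes change production state
    max_rows: int = 1000    # hypothetical default scope limit for reads


class ToolSurface:
    """The only way the agent reaches a provider; no raw API is exposed."""

    def __init__(self, tools: List[Tool], approve: Callable[[str, dict], bool]):
        self._tools: Dict[str, Tool] = {t.name: t for t in tools}
        self._approve = approve  # hypothetical operator-approval callback

    def call(self, name: str, **kwargs):
        tool = self._tools.get(name)
        if tool is None:
            # Out-of-scope requests escalate rather than improvise.
            raise OutOfScope(f"{name} is not a registered tool; escalate to an operator")
        if tool.is_write:
            # Writes stop at the approval gate.
            if not self._approve(name, kwargs):
                raise ApprovalRequired(f"{name} was not approved")
        else:
            # Reads run automatically, clamped to the default scope limit.
            kwargs["limit"] = min(kwargs.get("limit", tool.max_rows), tool.max_rows)
        return tool.handler(**kwargs)


def rank(hypotheses: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Order (statement, confidence) pairs from most to least confident."""
    return sorted(hypotheses, key=lambda h: h[1], reverse=True)
```

In this sketch, a surface built from a read tool such as Tool("query_telemetry", handler) and a write tool such as Tool("restart", handler, is_write=True) would clamp an oversized read limit to max_rows, while the restart call raises ApprovalRequired unless the approve callback returns True.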

Why this compounds

The loop pays back through faster MTTR and a growing library of resolved incidents the agent has handled before. Each clean run is also a training case for the next.

Faster MTTR. Agent-driven first triage routinely shaves 30 percent off median resolution time on the incident classes the agent has seen.
Reusable runs. Each completed loop becomes a worked example the next loop can reference. The library compounds.
Trust through approval. Operators stay in control of writes; trust accumulates without giving up safety.
Year-one investment, year-two habit. The first quarter sets up the eval suite and the tool wrappers. By year two the loop is routine and the team feels the leverage.