Stopping Criteria for Iterative SRE Agents

When does the agent stop? Goal achieved. Goal unreachable. Budget exhausted. Confidence threshold hit. This section covers the four-criterion stop policy and the order in which the criteria fire.

The four criteria

An iterative agent without explicit stopping criteria runs until cost catches up with it. The four criteria below cover every case an agent loop has to handle.

Goal achieved. The agent self-reports it has met the success criterion. Stop and return the result.
Goal unreachable. The agent self-reports it cannot make further progress. Stop and escalate with what it has.
Budget exhausted. Tokens, time, or actions have hit their cap. Stop and escalate with partial state.
Confidence threshold. The agent’s confidence in its current hypothesis has crossed an upper or a lower threshold. Above the upper threshold, stop and act; below the lower one, stop and escalate.

Order they fire

The order matters because a confused order produces a confused agent. Budget is hard, confidence is fast, and self-report fires last when nothing else has resolved the run.

Budget first. A budget-exceeded run cannot be saved by goal-achieved. The cap is hard and checked before every step.
Confidence next. High confidence ends the run early; low confidence escalates early. Both save iteration cost when the verdict is already clear.
Goal achieved or unreachable last. These are the agent’s self-reports, and they fire only when no earlier criterion has decided the run.
Race-safe ordering. Each step evaluates the criteria in this fixed order so two simultaneous triggers resolve deterministically.
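
The fixed order above can be sketched as a single function that the loop calls before every step. This is a minimal illustration, not a reference implementation: the names StepState, Policy, and check_stop, and all default caps and thresholds, are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Stop(Enum):
    BUDGET_EXHAUSTED = "budget_exhausted"
    CONFIDENT = "confident"
    LOW_CONFIDENCE = "low_confidence"
    GOAL_ACHIEVED = "goal_achieved"
    GOAL_UNREACHABLE = "goal_unreachable"


@dataclass
class StepState:
    tokens_used: int
    steps_taken: int
    confidence: float  # agent's confidence in its current hypothesis, 0..1
    status: str        # self-report: "in_progress", "done", or "blocked"


@dataclass
class Policy:
    # All defaults are illustrative assumptions, not recommended values.
    max_tokens: int = 50_000
    max_steps: int = 10
    act_above: float = 0.9
    escalate_below: float = 0.2


def check_stop(state: StepState, policy: Policy) -> Optional[Stop]:
    """Evaluate the criteria in a fixed order; the first match wins."""
    # 1. Budget: hard cap, checked before anything else.
    if (state.tokens_used >= policy.max_tokens
            or state.steps_taken >= policy.max_steps):
        return Stop.BUDGET_EXHAUSTED
    # 2. Confidence: high ends the run early, low escalates early.
    if state.confidence >= policy.act_above:
        return Stop.CONFIDENT
    if state.confidence <= policy.escalate_below:
        return Stop.LOW_CONFIDENCE
    # 3. Self-report: consulted only when nothing above has fired.
    if state.status == "done":
        return Stop.GOAL_ACHIEVED
    if state.status == "blocked":
        return Stop.GOAL_UNREACHABLE
    return None  # keep iterating
```

Because the checks are a plain if-chain, a step that is both over budget and self-reported as done always resolves to the budget stop, which is the deterministic behaviour the ordering is meant to guarantee.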

How agents self-report goal status

Self-report is what makes the loop legible. A structured status field forces the model to evaluate its own progress at every step rather than burying it in prose.

Structured output. Each step emits a status field: in_progress, done, or blocked. The loop reads the field and dispatches accordingly.
Forced reflection. Requiring the field every step forces the model to slow down and ask whether it is done. This is a feature, not overhead.
Self-reports are sometimes wrong. The eval suite tests goal-status accuracy directly, and the prompt is tuned over time so that the agent reports its own progress honestly.
Tie to citations. A “done” status that does not cite the evidence is rejected; the loop demands the agent point to what convinced it.
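
One way the loop might read and validate the status field is sketched below. It assumes each step emits a JSON object. The status values come from the text above, while the "evidence" key and the function name read_status are hypothetical choices for this example.

```python
import json
from typing import List, Tuple

VALID_STATUSES = {"in_progress", "done", "blocked"}


def read_status(raw_step_output: str) -> Tuple[str, List[str]]:
    """Parse one step's output and return (status, evidence).

    Raises ValueError when the status is missing or unknown, or when
    a "done" status arrives without any cited evidence.
    """
    step = json.loads(raw_step_output)  # JSONDecodeError is a ValueError
    if not isinstance(step, dict):
        raise ValueError("step output must be a JSON object")
    status = step.get("status")
    if status not in VALID_STATUSES:
        raise ValueError(f"missing or unknown status: {status!r}")
    evidence = list(step.get("evidence") or [])  # hypothetical field name
    if status == "done" and not evidence:
        # The loop demands that the agent point to what convinced it.
        raise ValueError("'done' rejected: no evidence cited")
    return status, evidence
```

A rejected "done" can be fed back to the agent as an error on the next step, so the run continues instead of ending on an unsupported claim.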

Confidence threshold trade

Threshold choice is product-specific. Stopping too late wastes budget; stopping too early acts on uncertainty. Pick by use case rather than by default.

High threshold: stop above 0.9. The agent stops only when very sure. Reduces false positives at the cost of longer runs.
Low threshold: stop above 0.6. The agent stops earlier. Reduces run length but allows more uncertain conclusions through.
Triage tolerates lower confidence. Output will be human-reviewed; a 0.7 hypothesis with strong evidence pointers is useful.
Auto-remediation needs high confidence. Acting on uncertainty is dangerous; the threshold for acting alone sits above 0.9.
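
The per-use-case choice can be expressed as a small lookup table. The sketch below reuses the 0.6 and 0.9 figures from the list above. The use-case keys, the ConfidencePolicy type, and the should_stop helper are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfidencePolicy:
    stop_above: float      # stop once confidence is strictly above this
    human_reviewed: bool   # whether a person checks the output before action


# Hypothetical use-case names; thresholds taken from the list above.
POLICIES = {
    "triage": ConfidencePolicy(stop_above=0.6, human_reviewed=True),
    "auto_remediation": ConfidencePolicy(stop_above=0.9, human_reviewed=False),
}


def should_stop(use_case: str, confidence: float) -> bool:
    """Return True when confidence clears the threshold for this use case."""
    return confidence > POLICIES[use_case].stop_above
```

With this table, a 0.7 hypothesis ends a triage run, since a human will review it, but does not end an auto-remediation run.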

Evaluating the stop policy

Stop-policy regressions are silent. The eval cases below catch them before they ship by exercising each direction in which the policy can fail.

Should stop at step 3. Cases where the answer is plain after a few steps. Pass if the agent stops; fail if it keeps iterating.
Should escalate at step 5. Cases where progress stalls. Pass if the agent escalates; fail if it spins to budget.
Should hit budget. Cases that genuinely require the full budget. Pass if the run ends exactly at the cap; fail if the agent stops short without reporting it.
Should ignore noise. Cases where a transient signal could trip a low confidence threshold. Pass if the agent waits for the signal to stabilise.
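
The four case types above can be exercised with scripted traces instead of a live model. The harness below is a sketch under several assumptions: each trace is a hand-written list of per-step (confidence, status) pairs, run_policy is a simplified stand-in for the real loop, and "stabilise" is modelled as confidence staying above the threshold for two consecutive steps.

```python
from dataclasses import dataclass
from typing import List, Tuple

Trace = List[Tuple[float, str]]  # scripted (confidence, status) per step


def run_policy(trace: Trace, max_steps: int = 6, stop_above: float = 0.9,
               stable_steps: int = 2) -> Tuple[str, int]:
    """Replay a scripted trace and return (stop_reason, step_number)."""
    streak = 0  # consecutive steps with confidence above the threshold
    for step, (confidence, status) in enumerate(trace, start=1):
        if step > max_steps:              # budget is checked before each step
            return ("budget", max_steps)
        streak = streak + 1 if confidence > stop_above else 0
        if streak >= stable_steps:        # assumed debounce against noise
            return ("confident", step)
        if status == "done":
            return ("done", step)
        if status == "blocked":
            return ("escalate", step)
    return ("incomplete", len(trace))     # trace ended with no decision


@dataclass
class EvalCase:
    name: str
    trace: Trace
    expected: Tuple[str, int]


IP = "in_progress"
CASES = [
    EvalCase("should stop at step 3",
             [(0.5, IP), (0.95, IP), (0.97, IP), (0.97, IP)],
             ("confident", 3)),
    EvalCase("should escalate at step 5",
             [(0.3, IP)] * 4 + [(0.3, "blocked")],
             ("escalate", 5)),
    EvalCase("should hit budget",
             [(0.5, IP)] * 10,
             ("budget", 6)),
    EvalCase("should ignore noise",
             [(0.95, IP), (0.4, IP), (0.92, IP), (0.95, IP)],
             ("confident", 4)),
]


def evaluate(cases: List[EvalCase]) -> List[str]:
    """Return the names of failing cases; an empty list means all pass."""
    return [c.name for c in cases if run_policy(c.trace) != c.expected]
```

Asserting on the step number as well as the stop reason is what catches the silent regressions: a policy that still stops, but two steps late, fails the case.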