When SRE Agents Hallucinate Tool Output (and How to Detect It)

Agents sometimes invent tool results that the tool never returned. This article covers the detection harness, the most common provocations, and the prompt-level fixes that work.

How it happens

Hallucinated tool output happens because the model is trained to produce confident answers, even when it has no tool result to base them on. Three structural patterns make it worse, and each has a fix.

Detection harness

The harness wraps tool calls and cross-checks the model’s output against what the tool actually returned. Without it, hallucinations are invisible until they cause a downstream incident.
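The wrap-and-cross-check idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the harness itself: the names `DetectionHarness`, `ToolCallRecord`, `unsupported_quotes`, and the stand-in tool `fake_get_pods` are all hypothetical, and the sketch assumes the agent is asked to quote tool output verbatim and that some earlier step has already extracted those quotes from its answer.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable


def _normalize(text: str) -> str:
    # Collapse whitespace and case so formatting differences do not cause false alarms.
    return " ".join(text.split()).lower()


@dataclass
class ToolCallRecord:
    tool: str
    args: dict
    output: str


@dataclass
class DetectionHarness:
    """Hypothetical wrapper: records what each tool really returned."""
    tools: dict[str, Callable[..., Any]]
    log: list[ToolCallRecord] = field(default_factory=list)

    def call(self, tool: str, **args: Any) -> str:
        # Run the real tool and keep its output as the ground truth.
        raw = self.tools[tool](**args)
        output = raw if isinstance(raw, str) else json.dumps(raw, sort_keys=True)
        self.log.append(ToolCallRecord(tool, args, output))
        return output

    def unsupported_quotes(self, quoted: list[str]) -> list[str]:
        # Return every quoted snippet that appears in no recorded tool output.
        recorded = [_normalize(r.output) for r in self.log]
        return [q for q in quoted if not any(_normalize(q) in out for out in recorded)]


# Usage with a stand-in tool (an assumption for illustration, not a real CLI call).
def fake_get_pods(namespace: str) -> str:
    return "api-7d9f   1/1   Running   0   3h"


harness = DetectionHarness(tools={"get_pods": fake_get_pods})
harness.call("get_pods", namespace="prod")

# Snippets the agent claims to be quoting from tool output.
claims = ["api-7d9f 1/1 Running", "api-7d9f 0/1 CrashLoopBackOff"]
flagged = harness.unsupported_quotes(claims)
print(flagged)  # the second claim never appeared in any tool output
```

Substring matching is deliberately crude: it only catches fabricated verbatim quotes, not paraphrased or numerically distorted claims. A stricter check would likely need structured tool output and field-level comparison.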

Common provocations

Three prompt patterns provoke hallucination reliably. Knowing them by name makes them easier to spot and avoid in design review.

Prompt-level fixes

Three prompt-level changes catch most hallucinations before they ship. None require model changes.

The cost of detection

The cost of detection is mostly compute, not engineering effort. The harness pays for itself through the regressions it catches.