Debugging an Agent That Made the Wrong Call

Debug a wrong call with five questions. Was the tool result wrong? Was the prompt missing context? Was the model confused? Was the plan flawed? Was the output mis-parsed? Ask them in this order: the first question answered "yes" usually points to the bug.

The five-question rubric

The rubric isolates the wrong call to one of five layers. Walking it in order shortens the debugging path because the most common bugs sit in the first two questions.
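The walk can be sketched as an ordered list of checks that stops at the first layer reporting a fault. This is a minimal sketch, not a definitive implementation: the layer names, the `first_failing_layer` helper, and the shape of the `trace` dictionary are all assumptions made for illustration.

```python
from typing import Callable, Dict, Optional

# The five layers, in the order the rubric asks about them.
# The short layer names are hypothetical labels, not from any library.
RUBRIC = (
    ("tool_result", "Was the tool result wrong?"),
    ("prompt", "Was the prompt missing context?"),
    ("model", "Was the model confused?"),
    ("plan", "Was the plan flawed?"),
    ("parsing", "Was the output mis-parsed?"),
)


def first_failing_layer(
    trace: dict, checks: Dict[str, Callable[[dict], bool]]
) -> Optional[str]:
    """Return the first layer whose check reports a fault, or None.

    A check returns True when its layer is at fault. Layers without
    a registered check are skipped.
    """
    for layer, _question in RUBRIC:
        check = checks.get(layer)
        if check is not None and check(trace):
            return layer
    return None


# Hypothetical trace: the tool failed and, as a result, nothing was parsed.
trace = {"tool_result": {"status": "error"}, "parsed": None}
checks = {
    "tool_result": lambda t: t["tool_result"]["status"] != "ok",
    "parsing": lambda t: t["parsed"] is None,
}
print(first_failing_layer(trace, checks))  # -> tool_result
```

Both checks would fire on this trace, but the walk reports only the earlier layer, which is the point of asking in a fixed order: a downstream symptom is not investigated until the upstream cause has been ruled out.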

Why this order

The order is empirical. Walking it in this sequence shortens the average debug time by half compared to starting with prompt rewrites.

Artefacts you need to debug

Without logged artefacts for each layer, the rubric collapses into guesswork. Make them required logging on every agent run rather than collecting them on demand.
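One way to make this logging unconditional is to append a record per agent step to a JSON Lines file. The sketch below assumes one field per rubric question; the `StepRecord` fields and the `append_record` helper are hypothetical names chosen for illustration, not a prescribed schema.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class StepRecord:
    """One agent step, with enough detail to answer each rubric question."""

    run_id: str
    step: int
    prompt: str                # exact text sent to the model
    plan: Optional[str]        # the plan the agent was following, if any
    tool_name: Optional[str]   # tool called in this step, if any
    tool_args: Optional[dict]  # arguments passed to the tool
    tool_result: Any           # raw tool result, before post-processing
    raw_output: str            # model output before parsing
    parsed_output: Any         # what the parser produced from raw_output
    timestamp: float = field(default_factory=time.time)


def append_record(path: str, record: StepRecord) -> None:
    """Append the record as one JSON line; called on every step."""
    with open(path, "a", encoding="utf-8") as f:
        # default=str keeps logging from failing on values json cannot encode.
        f.write(json.dumps(asdict(record), default=str) + "\n")
```

Keeping the raw and parsed outputs side by side is a deliberate choice in this sketch: the fifth question (was the output mis-parsed?) can only be answered if both are available for the same step.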

Once you know which step is wrong

The fix depends on which question failed. Each step has a canonical remedy that scales without rewriting the whole agent.
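The mapping from failed question to fix can be kept as a small lookup table, so the remedy is chosen by layer rather than improvised. The remedies listed here are illustrative placeholders, not a canonical list, and the layer names and `remedy_for` helper are hypothetical.

```python
# Illustrative placeholders only: substitute the remedies your team uses.
REMEDIES = {
    "tool_result": "Fix or validate the tool, then replay the step.",
    "prompt": "Add the missing context to the prompt for this step.",
    "model": "Simplify or disambiguate the instruction the model received.",
    "plan": "Revise the planning step that produced the flawed plan.",
    "parsing": "Fix the parser or tighten the expected output format.",
}


def remedy_for(layer: str) -> str:
    """Return the remedy for a layer, rejecting unknown layer names."""
    try:
        return REMEDIES[layer]
    except KeyError:
        raise ValueError(f"unknown layer: {layer!r}") from None


print(remedy_for("parsing"))
```

Each entry touches only its own layer, which is what lets a fix land without rewriting the whole agent.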

Add an eval case after the fix

The reproducer is the artefact that turns a one-off debug into a regression guard. Always commit the case alongside the fix.
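A reproducer can be as small as the captured inputs of the failing step plus the action the agent should have taken. The sketch below assumes the agent's decision logic can be called as a function of the prompt and tool result; `EvalCase`, `run_case`, and the stub `decide` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class EvalCase:
    """Inputs captured from the failing run and the expected decision."""

    name: str
    prompt: str
    tool_result: Any
    expected_action: str


def run_case(case: EvalCase, decide: Callable[[str, Any], str]) -> bool:
    """Replay the captured inputs and compare against the expected action."""
    return decide(case.prompt, case.tool_result) == case.expected_action


# Hypothetical case: the agent should retry when the tool reports an error.
case = EvalCase(
    name="retry_on_tool_error",
    prompt="Look up the order status.",
    tool_result={"status": "error"},
    expected_action="retry",
)


def decide(prompt: str, tool_result: Any) -> str:
    """Stub standing in for the agent's real decision step."""
    return "retry" if tool_result.get("status") == "error" else "answer"


assert run_case(case, decide)
```

Committed next to the fix, a case like this fails if the old behaviour returns, which is what turns the one-off debug into a regression guard.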