Agentic SRE Advanced
By Samson Tanimawo, PhD
Published Jul 30, 2026 · 5 min read

The Decision Tree Trap in Early Agent Designs

Hand-coded decision trees feel safe and end up brittle. The four signs your agent has degenerated into a decision tree, and how to refactor back toward genuine reasoning.

How to recognise the trap

An agent that has degenerated into a decision tree usually has one tell: every prompt change adds a clause for a new edge case. The prompt has 12 if/then branches. The new failure mode that just shipped becomes branch 13. The pattern is a nested if-statement masquerading as reasoning.

If you can express the agent's behaviour as a tree, the model is not adding value. A static tree, hand-coded in Python, would be cheaper, faster, and more reliable. The model exists to handle the cases the tree cannot enumerate, not to replace the tree.
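To make the point concrete, here is a hypothetical alert-routing "agent" whose entire behaviour is tree-shaped. The service names and actions are illustrative, but if your agent's outputs can be reproduced by a function like this, the model call is pure overhead:

```python
# Hypothetical routing logic often hidden inside an agent prompt.
# If the whole behaviour fits this shape, a model adds no value:
# a static tree is cheaper, faster, and fully testable.
def route_alert(alert: dict) -> str:
    if alert["service"] == "payments":
        if alert["metric"] == "latency_p99":
            return "page-payments-oncall"
        return "ticket-payments-backlog"
    if alert["severity"] == "critical":
        return "page-primary-oncall"
    return "ticket-triage-queue"
```

A tree like this also gets unit tests for free, which a prompt branch never does.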

Other tells: a regression in production every time you add a branch (the new branch leaks into other cases), a prompt that grows monotonically, and evals that pass in development but fail in production on slightly different inputs.

Why teams fall in

The trap is incremental. The agent worked. A new edge case appeared. The fastest fix was a prompt branch. Repeat 30 times. Each individual fix was correct; the cumulative result is a brittle tree.

The cultural cause: prompt edits are cheaper than code edits. Engineers reach for the prompt because the iteration loop is fast. They do not reach for the codebase because that requires PRs, reviews, deploys.

The technical cause: the model is good enough that the tree-shape often works. "It mostly does the right thing" is the seductive zone where the trap deepens. The pain only shows up at the long tail.

Refactor back toward genuine reasoning

Step one: enumerate the branches in the prompt. Most agent prompts can be parsed into 5-15 distinct conditions. Make the list explicit; this alone often shows the absurdity.
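A crude heuristic can surface the branch list automatically. This sketch just scans a prompt for if-style clauses; it is not a parser, and the sample prompt is invented, but it is usually enough to make the count explicit:

```python
import re

# Heuristic sketch: pull if/then-style clauses out of a prompt so the
# branch count becomes explicit. Matches "if ..." up to the next
# sentence or line break; expect some noise on real prompts.
def enumerate_branches(prompt: str) -> list[str]:
    pattern = re.compile(r"\bif\b[^.\n]*", re.IGNORECASE)
    return [m.group(0).strip() for m in pattern.finditer(prompt)]

prompt = (
    "If the alert is from the payments service, escalate immediately. "
    "Otherwise, check the error rate. If the error rate is above 5%, page on-call."
)
for branch in enumerate_branches(prompt):
    print(branch)
```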

Step two: promote the deterministic branches into code. "If service is X, do Y" is a switch statement, not a reasoning step. Move it. The prompt loses 40% of its length.

Step three: rephrase the remaining prompt in terms of what the agent should reason about, not which branches it should take. "Given the alert and metrics, identify the most likely cause among A, B, C, D." Reasoning, not branching.
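One way the before/after might look, with invented branch text on both sides:

```python
# Before: branch-shaped instructions (the trap).
BRANCHY = """If CPU is high, say the cause is CPU saturation.
If error rate is high, say the cause is a bad deploy.
If latency is high and CPU is normal, say the cause is a slow dependency."""

# After: the same knowledge framed as something to reason about.
REASONING = """Given the alert and the attached metrics, identify the most
likely cause among: CPU saturation, bad deploy, slow dependency, other.
Explain which signals support your choice and which contradict it."""

def build_prompt(alert: str, metrics: str) -> str:
    return f"{REASONING}\n\nAlert:\n{alert}\n\nMetrics:\n{metrics}"
```

Note that the rephrased prompt still names the candidate causes; it stops dictating which signals map to which cause.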

How to tell if the refactor worked

Eval scores should hold or improve. If they drop, you removed a branch the model was actually using. Add it back as code, not as a prompt clause.

Prompt size should drop materially. If it does not, you have not refactored, you have rearranged.

Latency and cost should improve. The model has less to read and less to decide. A 30% reduction in both is realistic for a thoroughly trapped prompt.

The policy that prevents recurrence

Treat every prompt branch addition as technical debt. Log it; review monthly; refactor the worst offenders into code.

Run a prompt-length budget. "Triage prompt must stay under 1500 tokens." When a new branch breaks the budget, you cannot just add it; you must refactor first.
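A budget like this can be enforced as a CI gate. This is a sketch: real token counts should come from your model's tokenizer, and the word-count multiplier used here is a crude stand-in:

```python
# Sketch of a prompt-budget gate for CI. Assumes a rough
# words * 1.3 token estimate; swap in your model's tokenizer.
TRIAGE_BUDGET_TOKENS = 1500

def estimated_tokens(prompt: str) -> int:
    return int(len(prompt.split()) * 1.3)

def check_budget(prompt: str, budget: int = TRIAGE_BUDGET_TOKENS) -> None:
    used = estimated_tokens(prompt)
    if used > budget:
        raise SystemExit(
            f"Prompt budget exceeded: ~{used} tokens > {budget}. "
            "Refactor a branch into code before adding another."
        )
```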

Pair-review prompt changes the way you pair-review code. The second pair of eyes asks: "is this a branch or a reasoning step?" That question alone catches half the trap.

What to do this week

Open your largest agent prompt. Highlight every if/then branch. If there are more than 8, you are in the trap. Promote the deterministic ones into a Python switch. Re-run evals. The smaller prompt should hold quality and improve cost.