The Coordinator Agent Pattern: When You Actually Need One
Coordinators add latency and cost. Here are the signals that you actually need one, and the simpler patterns that work most of the time.
Signals you need a coordinator
Three signals indicate that a coordinator pays for itself: three or more specialists involved in a single workflow (below that, code-based orchestration is simpler); dynamic dependencies (which specialist runs next depends on what the previous one found, which static orchestration cannot express); and multi-step plans that branch (plan A pursues one path, plan B another, and the coordinator picks per run).
- 3+ specialists per workflow. Below that, code-based orchestration is simpler.
- Dynamic dependencies. The next specialist depends on the previous output; static orchestration can’t express this.
- Branching multi-step plans. Plan A vs B picked per run; coordinator decides.
- Per-workflow signal evaluation. Test each workflow against the three signals on its own, rather than deciding once for the whole system.
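The per-workflow check can be written down as a small function. This is a minimal sketch, not a prescribed method: `WorkflowProfile`, its field names, and `coordinator_signals` are hypothetical, and the threshold of 3 follows the list above. The function only reports which signals fire; how many are enough to justify a coordinator is left to the reader.

```python
from dataclasses import dataclass


@dataclass
class WorkflowProfile:
    """Hypothetical description of one workflow (field names are illustrative)."""
    specialist_count: int
    has_dynamic_dependencies: bool
    has_branching_plans: bool


def coordinator_signals(profile: WorkflowProfile, min_specialists: int = 3) -> list[str]:
    """Return the names of the signals this workflow exhibits."""
    signals = []
    if profile.specialist_count >= min_specialists:
        signals.append("specialist_count")
    if profile.has_dynamic_dependencies:
        signals.append("dynamic_dependencies")
    if profile.has_branching_plans:
        signals.append("branching_plans")
    return signals


# Two specialists in a fixed order: no signal fires, so plain code is enough.
simple = WorkflowProfile(2, False, False)
# Four specialists whose order depends on earlier findings: two signals fire.
dynamic = WorkflowProfile(4, True, False)
```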
Simpler patterns that work most of the time
Three patterns work without a coordinator: pipeline (agent A, then B, then C, in a fixed order; this is a chain of function calls in code, no coordinator needed); fan-out (agent A hands off to B, C, and D, which run in parallel; code-based, no coordinator); and conditional (if the result of A is X, run B, otherwise run C; a switch statement, no coordinator).
- Pipeline. A to B to C fixed order; function call chain.
- Fan-out. A to (B, C, D parallel); code-based; no LLM coordination.
- Conditional. Result of A is X then B else C; switch statement.
- Per-pattern code preference. When code suffices, use code; LLM coordination is the fallback.
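All three patterns fit in a few lines of ordinary code. The sketch below assumes each agent can be treated as a plain function from text to text; the `Agent` alias and the function names are illustrative, and an if/else stands in for the switch statement.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Assumption: an agent is any callable that takes text and returns text.
Agent = Callable[[str], str]


def pipeline(agents: list[Agent], task: str) -> str:
    """A to B to C in fixed order: each agent consumes the previous result."""
    result = task
    for agent in agents:
        result = agent(result)
    return result


def fan_out(first: Agent, workers: list[Agent], task: str) -> list[str]:
    """A runs first, then every worker runs on A's result in parallel."""
    intermediate = first(task)
    with ThreadPoolExecutor(max_workers=max(1, len(workers))) as pool:
        # pool.map returns results in the same order as `workers`.
        return list(pool.map(lambda worker: worker(intermediate), workers))


def conditional(first: Agent, is_x: Callable[[str], bool],
                if_x: Agent, otherwise: Agent, task: str) -> str:
    """Run A, then B if the result matches X, else C."""
    result = first(task)
    return if_x(result) if is_x(result) else otherwise(result)
```

None of these involve an LLM deciding what runs next; the control flow is fixed in code, which is what makes them cheaper to run and easier to test than a coordinator.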
Cost of a coordinator
A coordinator has real costs. It is itself an LLM call, which adds latency (1-3 seconds per coordination decision); it adds tokens (the coordinator has to read the state of all sub-agents, so token use grows with their number); and it adds failure modes (the coordinator can mis-route, deadlock, or loop, so evals and observability are required).
- 1-3 second latency per decision. The coordinator is an LLM call; adds round-trip time.
- Token cost grows with sub-agents. The coordinator reads their state; cost scales with the number of sub-agents.
- New failure modes. Mis-route, deadlock, loop; eval and observability required.
- Per-coordinator monitoring. Give the coordinator its own SLO so routing latency and failures are tracked directly.
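A back-of-the-envelope estimate makes the overhead concrete. This sketch takes the 1-3 second range from the text and assumes, for illustration only, that the tokens read per decision grow linearly with the number of sub-agents. The function name, parameters, and example figures are hypothetical, not measurements.

```python
def coordinator_overhead(decisions: int, sub_agents: int,
                         tokens_per_agent_state: int,
                         base_prompt_tokens: int = 0):
    """Estimate added latency (seconds, low/high) and input tokens per run.

    Assumption: every decision re-reads one state summary per sub-agent,
    so tokens per decision scale linearly with the sub-agent count.
    """
    latency_range = (decisions * 1.0, decisions * 3.0)  # 1-3 s per decision
    tokens_per_decision = base_prompt_tokens + sub_agents * tokens_per_agent_state
    return latency_range, decisions * tokens_per_decision


# Hypothetical run: 5 routing decisions over 4 sub-agents,
# 300 tokens of state per sub-agent and a 200-token base prompt.
latency, tokens = coordinator_overhead(5, 4, 300, base_prompt_tokens=200)
# latency == (5.0, 15.0) seconds, tokens == 7000
```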
Design the coordinator narrowly
The coordinator’s only job is routing: not reasoning, not analysis, not action. Its input is the goal, the available specialists, and the current state; its output is which specialist to invoke next and with what scope. The coordinator does not see specialist internals; it reads only the structured output of each specialist.
- Routing only. Not reasoning, not analysis, not action; holding to that is the discipline.
- Goal-specialists-state input. Three inputs; bounded surface.
- Specialist-and-scope output. Two-field output; structured.
- Sees only structured output. No specialist internals; the abstraction holds.
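One way to hold that boundary is to fix the coordinator's input and output as types. The sketch below is an assumption about how this might look, not a reference design: all class and field names are hypothetical, and `validate_decision` is an illustrative guard that rejects a route to a specialist that was never offered.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SpecialistResult:
    """What the coordinator may see: structured output only, no internals."""
    specialist: str
    output: dict[str, Any]


@dataclass(frozen=True)
class CoordinatorInput:
    """The three inputs: goal, available specialists, current state."""
    goal: str
    available_specialists: tuple[str, ...]
    state: tuple[SpecialistResult, ...] = ()


@dataclass(frozen=True)
class RoutingDecision:
    """The two-field output: which specialist, with what scope."""
    specialist: str
    scope: str


def validate_decision(decision: RoutingDecision,
                      coordinator_input: CoordinatorInput) -> RoutingDecision:
    """Reject any route to a specialist outside the offered set."""
    if decision.specialist not in coordinator_input.available_specialists:
        raise ValueError(f"unknown specialist: {decision.specialist!r}")
    return decision
```

Parsing the coordinator's LLM response into `RoutingDecision` and validating it before dispatch keeps a malformed or out-of-scope route from ever reaching a specialist.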
Trust the coordinator carefully
Three constraints keep the coordinator safe: read-only by default (the coordinator can route but cannot act; specialists do the acting); bounded loops (the coordinator can iterate up to N times and must escalate beyond that; N is small, typically 5); and a separate eval suite, because the coordinator’s routing decisions are first-class outputs that need their own tests.
- Read-only by default. Routes but doesn’t act; specialists act.
- N=5 loop bound. Up to 5 iterations; beyond that escalate.
- Separate eval suite. Routing decisions are first-class outputs; need their own tests.
- Per-coordinator audit log. Every routing decision is captured, so a mis-route can be investigated after the fact.
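The loop bound, the read-only rule, and the audit log can all live in the driver code around the coordinator. The following is a minimal sketch under stated assumptions: `route` stands in for the coordinator's LLM call and is assumed to return a `(specialist, scope)` pair, or `None` when the goal is met; `EscalationRequired` and `run_coordinated` are hypothetical names.

```python
from typing import Callable, Optional

MAX_ITERATIONS = 5  # the small N from the text

Route = Callable[[str, tuple], Optional[tuple[str, str]]]


class EscalationRequired(Exception):
    """Raised when the coordinator hits the loop bound without finishing."""


def run_coordinated(goal: str, route: Route,
                    specialists: dict[str, Callable[[str], dict]],
                    max_iterations: int = MAX_ITERATIONS):
    state: list[dict] = []
    audit_log: list[dict] = []
    for iteration in range(max_iterations):
        # The coordinator gets an immutable snapshot: it can route, not act.
        decision = route(goal, tuple(state))
        audit_log.append({"iteration": iteration, "decision": decision})
        if decision is None:  # assumed "done" signal
            return state, audit_log
        name, scope = decision
        # Only the specialist acts; its structured output joins the state.
        state.append({"specialist": name, "output": specialists[name](scope)})
    raise EscalationRequired(f"no result after {max_iterations} routing decisions")
```

Because `route` is an ordinary argument, the eval suite can call it directly with recorded states and assert on the decisions, without running any specialist.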