The Agent Toolbox: How to Decide Which APIs an Agent Can Call

More tools mean more capability and more risk. This article lays out a four-axis framework for deciding which APIs an agent gets, with worked examples for triage, remediation, and audit roles.

The four axes that decide

Reversibility. Tools that can be undone are safer than tools that cannot. "Restart pod" is reversible (worst case the pod restarts again); "drop table" is not. Reversibility is the first filter.

Blast radius. Tools that affect one resource are safer than tools that affect many. A scoped "restart pod X" is fine; a "restart all pods" is a separate, escalated tool with extra approvals.

Read vs write. Read-only tools are nearly always safe. Write tools require the other three checks. The default for new agents should be read-only until each write tool earns its place.

Cost and rate-limit. Some tools cost money per call (AWS Cost Explorer, certain SaaS APIs); some have low rate limits. Treat these as scarce resources with explicit budgets.
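One way to make an "explicit budget" concrete is a small guard that a tool wrapper consults before every call. The following is a minimal sketch, not a prescribed implementation: the class name CallBudget, its parameters, and the choice to count money in integer cents are all assumptions made for illustration.

```python
import time
from typing import Callable


class CallBudget:
    """Caps calls per time window and total spend for one tool (hypothetical helper)."""

    def __init__(self, max_calls_per_window: int, window_seconds: float,
                 max_spend_cents: int, cost_per_call_cents: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls_per_window = max_calls_per_window
        self.window_seconds = window_seconds
        self.max_spend_cents = max_spend_cents
        self.cost_per_call_cents = cost_per_call_cents
        self.spent_cents = 0
        self._clock = clock
        self._recent_calls = []  # timestamps of calls still inside the window

    def try_spend(self) -> bool:
        """Record one call if both limits allow it; return False otherwise."""
        now = self._clock()
        # Forget calls that have aged out of the rate-limit window.
        self._recent_calls = [t for t in self._recent_calls
                              if now - t < self.window_seconds]
        if len(self._recent_calls) >= self.max_calls_per_window:
            return False  # rate limit reached
        if self.spent_cents + self.cost_per_call_cents > self.max_spend_cents:
            return False  # money budget exhausted
        self._recent_calls.append(now)
        self.spent_cents += self.cost_per_call_cents
        return True
```

A tool wrapper would call try_spend() first and refuse the call (or surface the refusal to the agent) when it returns False. Integer cents avoid floating-point drift in the spend comparison.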

The triage agent toolset

Read-only by design. The toolset includes: metric query, log search, recent-events lookup, runbook search, related-incident retrieval. No tool can change state. Triage is reasoning, not acting.

The triage tools should be cheap and fast: aim for p99 latency under 500 ms per tool. The agent will call them many times, so slow tools dominate run latency. Cache aggressively where the data is stable.
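For stable data such as runbooks, caching can be as simple as a time-to-live wrapper around the read tool. This is a sketch under assumptions: ttl_cached and runbook_search are hypothetical names, the backend call is stubbed, and the 300-second TTL is an arbitrary illustration.

```python
import functools
import time
from typing import Any, Callable


def ttl_cached(ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
    """Cache a read-only tool's results per argument tuple for ttl_seconds."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        cache: dict = {}

        @functools.wraps(fn)
        def wrapper(*args: Any) -> Any:
            now = clock()
            entry = cache.get(args)
            if entry is not None and now - entry[0] < ttl_seconds:
                return entry[1]  # fresh hit, skip the backend call
            value = fn(*args)
            cache[args] = (now, value)
            return value
        return wrapper
    return decorator


@ttl_cached(ttl_seconds=300)
def runbook_search(query: str) -> list:
    # Hypothetical slow backend call, stubbed here for illustration.
    return [f"runbook matching {query!r}"]
```

Only read-only tools are candidates for this wrapper. Live signals such as current metrics would need a much shorter TTL or no cache at all.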

Output of triage is a hypothesis, not an action. The hypothesis is the input to the remediation agent (a separate agent with a separate, narrower toolset).

The remediation agent toolset

Carefully scoped. Each remediation tool has a tight allowlist of resources it can affect. "Restart connection pool" works only on databases the agent owns; it cannot touch random RDS instances.
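A tight allowlist can be enforced in the tool wrapper itself, so the agent cannot talk its way past it. The sketch below assumes hypothetical names throughout (scoped, OutOfScopeError, the database identifiers) and stubs the actual restart.

```python
from typing import Callable, Iterable


class OutOfScopeError(PermissionError):
    """Raised when a tool is asked to touch a resource outside its allowlist."""


def scoped(action: Callable[[str], str], allowed: Iterable[str]) -> Callable[[str], str]:
    """Wrap an action so it only runs against allowlisted resource IDs."""
    allowlist = frozenset(allowed)  # fixed at construction time

    def tool(resource_id: str) -> str:
        if resource_id not in allowlist:
            raise OutOfScopeError(f"{resource_id} is not in this tool's allowlist")
        return action(resource_id)

    return tool


def _restart_pool(db_id: str) -> str:
    # Stub standing in for the real restart call.
    return f"connection pool restarted on {db_id}"


# Hypothetical databases owned by the agent's team.
restart_connection_pool = scoped(_restart_pool, allowed={"orders-db", "billing-db"})
```

The agent only ever sees restart_connection_pool. The unscoped _restart_pool is never registered as a tool.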

Approvals are required for the irreversible subset. The agent can propose, but a second pair of eyes (a human or a second agent) must sign off before anything runs. Never let a single agent execute an irreversible action without a gate.
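The propose-then-approve gate might look like the following sketch. Proposal, approve, execute, and the contents of IRREVERSIBLE_TOOLS are illustrative assumptions, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical set of tools classified as irreversible during review.
IRREVERSIBLE_TOOLS = frozenset({"drop_table"})


@dataclass
class Proposal:
    tool_name: str
    resource_id: str
    proposed_by: str
    approved_by: Optional[str] = None


def approve(proposal: Proposal, approver: str) -> None:
    """Record a sign-off; the proposer may not approve its own proposal."""
    if approver == proposal.proposed_by:
        raise PermissionError("the proposer cannot approve its own proposal")
    proposal.approved_by = approver


def execute(proposal: Proposal, action: Callable[[str], str]) -> str:
    """Run the action, refusing irreversible tools that lack a sign-off."""
    if proposal.tool_name in IRREVERSIBLE_TOOLS and proposal.approved_by is None:
        raise PermissionError(f"{proposal.tool_name} needs a second sign-off")
    return action(proposal.resource_id)
```

Reversible tools pass straight through execute, so the gate adds friction only where an action cannot be undone.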

Verify after every action. Each remediation tool must be paired with a verification tool. "Did the action complete?" and "Did the metric move?" are inputs to the loop's next decision.
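Pairing an action with its verification can be expressed as one step that returns both answers. This is a minimal sketch: Outcome, act_and_verify, and the four callables are hypothetical, and a real loop would likely wait for the metric to settle before reading it again.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Outcome:
    completed: bool      # did the action complete?
    metric_moved: bool   # did the target metric improve?


def act_and_verify(action: Callable[[], None],
                   check_completed: Callable[[], bool],
                   read_metric: Callable[[], float],
                   improved: Callable[[float, float], bool]) -> Outcome:
    """Run one remediation action and report both verification answers."""
    before = read_metric()
    action()
    after = read_metric()
    return Outcome(completed=check_completed(),
                   metric_moved=improved(before, after))
```

The loop then branches on the Outcome: a completed action with no metric movement suggests the hypothesis was wrong, which is a different next step from an action that failed to complete.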

The audit agent toolset

Audit agents read everything: action logs, change history, metric history, deploy timelines. They write nothing except their own report. The toolset is broad but read-only.

Audit tools are the cheapest to allowlist because read-only tools cannot change state. They are also the most useful for the team because they aggregate signal across systems.

The audit agent's report is itself an action: it gets posted to Slack or filed as a ticket. That posting tool is the only write capability the agent has, and it is bounded to one channel or one project.
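The "one write capability, one destination" rule is easy to check mechanically when the toolset is registered. The sketch below assumes each tool is described by a plain dict with hypothetical keys name, writes, and destinations.

```python
def audit_toolset_problems(tools: list) -> list:
    """Return rule violations for an audit agent's toolset (empty list if none)."""
    writers = [t for t in tools if t["writes"]]
    problems = []
    if len(writers) > 1:
        names = ", ".join(t["name"] for t in writers)
        problems.append(f"more than one write tool: {names}")
    for t in writers:
        # The posting tool must be pinned to one channel or one project.
        if len(t.get("destinations", [])) != 1:
            problems.append(f"{t['name']} must be bound to exactly one destination")
    return problems


# Hypothetical toolset for illustration.
audit_tools = [
    {"name": "action_log_read", "writes": False},
    {"name": "deploy_timeline_read", "writes": False},
    {"name": "post_report", "writes": True, "destinations": ["#incident-audit"]},
]
print(audit_toolset_problems(audit_tools))  # []
```

Running this check wherever toolsets are defined (for example in CI) keeps a second write tool from slipping in unnoticed.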

The toolbox-approval policy

Every new tool requires a one-page review: what it does, what it can affect, what it costs, what could go wrong. The review goes to the team owning the agent and the team owning the underlying system.

The default decision is no. Every agent starts read-only, and its toolset expands incrementally. Each expansion is a deliberate decision, not drift.

Tools have owners. The owner is responsible for the tool's behaviour, its failure modes, and its retirement when it is no longer needed. Ownerless tools are technical debt.
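The policy above can be sketched as a review record plus a default-deny check. Everything here is an assumption for illustration: the ToolReview fields mirror the four review questions, and the approver labels agent_team and system_team stand in for the two teams that receive the review.

```python
from dataclasses import dataclass, fields
from typing import FrozenSet, Optional

# Hypothetical labels for the two teams that must sign off.
REQUIRED_APPROVERS = frozenset({"agent_team", "system_team"})


@dataclass(frozen=True)
class ToolReview:
    tool_name: str
    what_it_does: str
    what_it_can_affect: str
    what_it_costs: str
    what_could_go_wrong: str
    owner: str
    approved_by: FrozenSet[str] = frozenset()


def is_allowed(review: Optional[ToolReview]) -> bool:
    """Default deny: a tool is allowed only with a complete, fully approved review."""
    if review is None:
        return False  # no review on file: the default decision is no
    text_fields = [getattr(review, f.name) for f in fields(review)
                   if f.name != "approved_by"]
    if not all(value.strip() for value in text_fields):
        return False  # incomplete review, or an ownerless tool
    return REQUIRED_APPROVERS <= review.approved_by
```

Because owner is a required non-empty field, an ownerless tool fails the same check as an unreviewed one.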

What to do this week

List every tool your agent can call. Mark each by reversibility, blast radius, read-vs-write, and cost. The tools that score risky on at least three of the four axes are the ones to scrutinise. Most teams find one tool that should be removed entirely and one that should be gated more tightly.
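The inventory exercise fits in a few lines of code. This is a sketch with made-up data: ToolEntry and to_scrutinise are hypothetical names, and the risk flags assigned to each tool are illustrative judgments rather than facts about any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolEntry:
    name: str
    irreversible: bool
    wide_blast_radius: bool
    writes: bool
    costly_or_rate_limited: bool

    def risky_axes(self) -> int:
        """Count how many of the four axes this tool scores risky on."""
        return sum([self.irreversible, self.wide_blast_radius,
                    self.writes, self.costly_or_rate_limited])


def to_scrutinise(inventory: list, threshold: int = 3) -> list:
    """Names of tools that are risky on at least `threshold` axes."""
    return [t.name for t in inventory if t.risky_axes() >= threshold]


# Illustrative inventory; the flag values are assumptions.
inventory = [
    ToolEntry("metric_query", False, False, False, False),
    ToolEntry("restart_pod", False, False, True, False),
    ToolEntry("restart_all_pods", False, True, True, False),
    ToolEntry("drop_table", True, True, True, False),
]
print(to_scrutinise(inventory))  # ['drop_table']
```

Lowering the threshold to 2 also surfaces restart_all_pods, which is a reasonable second pass once the worst offenders are handled.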