Agentic SRE Advanced By Samson Tanimawo, PhD Published Jul 25, 2026 5 min read

The Read-Only First Rule for New SRE Agents

Ship every new agent in read-only mode for 30 days before letting it act. The metrics to track, the graduation criteria, and the bugs this rule has caught at three different companies.

The rule, simply stated

Every new SRE agent ships in read-only mode for at least 30 days. It can observe, reason, and recommend. It cannot act. After 30 days, with eval data and team consensus, individual write tools graduate one at a time.

The rule is simple and the discipline is hard. There is always pressure to skip the read-only phase; "the agent is ready, why wait." The 30-day window catches the bugs you do not yet know about.

The cost of the rule is small (a month of slower rollout). The benefit is large (no agent-caused incidents while the agent is finding its feet). The trade is asymmetric in the right direction.

Metrics to track in the read-only window

Recommendation quality. For each recommendation the agent makes, did the human on-call agree, partially agree, or disagree? Aim for 80%+ agree by the end of week four.

Hallucination rate. How often does the agent reference data the tools never returned? Should be near zero by week four. Anything above 1% blocks graduation.

Coverage. Of the incidents the agent should have triaged, how many did it actually attempt? Low coverage means the input contract is too narrow; high coverage means the agent is reaching beyond its scope.

Graduation criteria, in order

Recommendation quality plateaued at the target level. Hallucination rate at zero for the last week. Coverage stable. No outstanding eval regressions. Two team leads have signed off.

Pick the lowest-risk write tool first. "Tag the alert as triaged" before "restart the pod." Watch for two weeks. If clean, graduate the next tool.

Never graduate two tools at once. The 30-day window proved the agent's read behaviour; each new write tool has its own incubation. Be patient.

Bugs the read-only window has caught

An agent that confidently recommended restarting a service that was already restarting. Read-only window caught the mismatch between the agent's view of the world and reality. A write-mode bug would have caused a cascading restart.

An agent that retrieved a runbook from the wrong service and recommended an inapplicable action. Read-only caught the misclassification. A write-mode bug would have applied the action.

An agent that understated the blast radius of an action it was recommending. The human on-call caught it; the read-only review surfaced the prompt issue. The agent now estimates blast radius explicitly.

How to make the rule stick

Document the policy. Have it signed by leadership. "Read-only for 30 days" is the norm, not a debate point.

Build the read-only path into the agent infrastructure from day one. The agent should support a flag: capabilities=read. Toggling that flag is one PR review, not a refactor.

Celebrate read-only graduations. "Agent X graduated tool Y this week with eval scores Z" is the kind of changelog entry that reinforces the discipline.

What to do this week

Audit your existing agents. Which ones skipped the read-only window? Pick the one that would benefit most from a retroactive 30-day read-only audit. Run it; see if any prior actions look risky in hindsight. Use the audit findings to formalise the rule going forward.