Agentic SRE | Advanced | By Samson Tanimawo, PhD | Published Mar 3, 2026

Onboarding On-Call Engineers to Work Alongside Agents

The on-call has to trust the agent. This article covers the 30-day onboarding curriculum, the shadow-mode period, and the first agent decisions a human should be expected to override.

Week 1: shadow only

The agent runs read-only. The on-call sees the agent's hypothesis alongside their own.

No expectation that the on-call defers to the agent. They form their own opinion; the agent is a peer, not an authority.

Hold a daily debrief: where did the agent and the on-call agree, where did they disagree, and who was right in each case?
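The debrief questions can be answered from a simple tally. The following is a minimal sketch, assuming each shadowed incident is recorded with the agent's hypothesis, the on-call's hypothesis, and the cause confirmed after resolution. The `ShadowIncident` type and its field names are hypothetical, not part of any specific tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowIncident:
    # Hypothetical record of one incident handled in shadow mode.
    incident_id: str
    agent_hypothesis: str
    oncall_hypothesis: str
    confirmed_cause: str  # assumed to be filled in after resolution


def debrief(incidents):
    """Tally agreement and who was right across shadowed incidents."""
    tally = Counter()
    for inc in incidents:
        agreed = inc.agent_hypothesis == inc.oncall_hypothesis
        tally["agreed" if agreed else "disagreed"] += 1
        if inc.agent_hypothesis == inc.confirmed_cause:
            tally["agent_right"] += 1
        if inc.oncall_hypothesis == inc.confirmed_cause:
            tally["oncall_right"] += 1
    return dict(tally)


incidents = [
    ShadowIncident("INC-1", "db", "db", "db"),
    ShadowIncident("INC-2", "network", "deploy", "deploy"),
]
summary = debrief(incidents)
```

Exact string matching on hypotheses is a simplification; in practice a human would likely judge whether two hypotheses name the same cause.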

Weeks 2-3: agent-first triage

The agent triages first; the on-call reads the triage and decides whether to follow or override it.

In most cases the on-call follows the agent's triage. Each override is logged with a reason, and those reasons feed back into agent improvement.

The on-call still performs the action; the agent does not act in this phase.
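One way to keep overrides honest is to refuse to log one without a reason. This is a sketch under assumptions: the `TriageDecision` record and `record_decision` helper are hypothetical names, and the log is an in-memory list rather than a real store.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TriageDecision:
    # Hypothetical record of one agent-first triage outcome.
    incident_id: str
    agent_recommendation: str
    oncall_action: str
    override_reason: str = ""
    decided_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def is_override(self) -> bool:
        # The on-call overrode the agent if they did something else.
        return self.oncall_action != self.agent_recommendation


def record_decision(log, decision):
    """Append a decision; an override without a reason is rejected."""
    if decision.is_override and not decision.override_reason.strip():
        raise ValueError("an override must be logged with a reason")
    log.append(decision)
    return decision


log = []
record_decision(log, TriageDecision("INC-7", "rollback", "rollback"))
record_decision(
    log,
    TriageDecision("INC-8", "rollback", "scale_up", "traffic spike, not a bad deploy"),
)
```

The logged reasons are the raw material for later prompt and eval work, so making them mandatory at write time is cheaper than chasing them afterwards.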

Week 4: agent action with monitoring

The agent takes specific low-risk actions. The on-call monitors and intervenes if needed.

The action allowlist starts narrow: tag the alert, post a Slack update, create a ticket. It grows over the following weeks.

Confidence in the agent's actions builds gradually. Trust is earned per action class.
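The allowlist can be enforced with a small gate in front of every agent action. The sketch below is illustrative only: `ActionGate`, `ActionNotAllowed`, and the action class names are assumptions, and the handlers are stand-ins for real integrations.

```python
from typing import Callable, List, Set, Tuple


class ActionNotAllowed(Exception):
    """Raised when the agent attempts an action class not yet trusted."""


class ActionGate:
    def __init__(self, allowlist: Set[str]):
        self.allowlist = set(allowlist)
        # Every attempt is recorded so the monitoring on-call can review it.
        self.audit: List[Tuple[str, bool]] = []

    def allow(self, action_class: str) -> None:
        # Trust is extended one action class at a time.
        self.allowlist.add(action_class)

    def run(self, action_class: str, handler: Callable[[], str]) -> str:
        allowed = action_class in self.allowlist
        self.audit.append((action_class, allowed))
        if not allowed:
            raise ActionNotAllowed(action_class)
        return handler()


# Starting allowlist from the text; the identifiers are hypothetical.
gate = ActionGate({"tag_alert", "post_slack_update", "create_ticket"})
result = gate.run("tag_alert", lambda: "tagged")
```

Keeping the allowlist keyed by action class rather than by incident type mirrors the idea that trust is earned per class: adding `restart_service` later is a single, reviewable `allow` call.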

First overrides matter

The first time the on-call has to override a wrong agent decision, their trust in the agent is tested.

Make the override visible. The team sees: "agent said X, on-call did Y, on-call was right because Z." Transparency builds trust.

Track override patterns. Repeated overrides with the same cause mean the prompt needs work.
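Spotting repeated causes can be as simple as counting them. This is a minimal sketch, assuming overrides are available as `(incident_id, cause)` pairs with causes already normalized to short labels; the function name and the threshold of 3 are arbitrary illustrative choices.

```python
from collections import Counter


def repeated_override_causes(overrides, threshold=3):
    """Return (cause, count) pairs seen at least `threshold` times.

    `overrides` is an iterable of (incident_id, cause) pairs.
    """
    counts = Counter(cause for _, cause in overrides)
    return [(cause, n) for cause, n in counts.most_common() if n >= threshold]


overrides = [
    ("INC-1", "stale runbook"),
    ("INC-2", "noisy alert"),
    ("INC-3", "stale runbook"),
    ("INC-4", "stale runbook"),
]
flagged = repeated_override_causes(overrides)
```

Any cause that crosses the threshold is a candidate for a prompt change or a new eval case rather than yet another one-off override.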

When the on-call has "graduated"

A graduated on-call is comfortable defaulting to the agent on standard cases and confident enough to override it when needed.

They add new cases to the eval suite from their own experience. The on-call becomes an active participant in agent improvement.

The agent becomes part of the team's tooling, not a curiosity. That is the success state.