Onboarding On-Call Engineers to Work Alongside Agents

On-call engineers have to trust the agent before they will work alongside it. This guide covers the 30-day onboarding curriculum, the shadow-mode period, and the first agent decisions a human should expect to override.

Week 1: shadow only

The first week is observation. The agent runs read-only, and the on-call sees the agent’s hypothesis alongside their own. There is no expectation that the on-call defers: they form their own opinion, and the agent is a peer, not an authority. A daily debrief surfaces where the agent agreed, where it disagreed, and which of the two was right in each case.
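The daily debrief can be backed by a simple tally. The sketch below is one possible shape, not a prescribed tool: the ShadowCase record, its field names, and the bucket labels are all hypothetical, and it assumes hypotheses are normalized to comparable strings once the incident's actual cause is known.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowCase:
    """One incident seen in shadow mode (hypothetical record layout)."""
    incident_id: str
    agent_hypothesis: str
    oncall_hypothesis: str
    actual_cause: str  # filled in after the incident is resolved


def debrief(cases):
    """Count where agent and on-call agreed, disagreed, and who was right."""
    tally = Counter()
    for case in cases:
        agent_right = case.agent_hypothesis == case.actual_cause
        oncall_right = case.oncall_hypothesis == case.actual_cause
        if case.agent_hypothesis == case.oncall_hypothesis:
            key = "agreed_both_right" if agent_right else "agreed_both_wrong"
        elif agent_right:
            key = "disagreed_agent_right"
        elif oncall_right:
            key = "disagreed_oncall_right"
        else:
            key = "disagreed_both_wrong"
        tally[key] += 1
    return dict(tally)
```

The buckets map directly onto the debrief questions: where the two agreed, where they disagreed, and which side was right.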

Weeks 2-3: agent-first triage

The next two weeks shift the order. The agent triages first; the on-call reads the triage and decides whether to follow or override it. In most cases the on-call follows the agent. Each override is logged with a reason, and those reasons feed back into agent improvement. The on-call still carries out the action, because the agent does not act in this phase.
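One way to make "override is logged with a reason" enforceable is to reject an override record that has no reason attached. This is a minimal sketch under that assumption; TriageDecision, its fields, and follow_rate are hypothetical names, and a real log would likely live in a ticketing or incident system rather than in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TriageDecision:
    """What the agent proposed and what the on-call did (hypothetical schema)."""
    incident_id: str
    agent_triage: str
    oncall_action: str
    override_reason: str = ""
    decided_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self):
        # An override without a reason cannot feed back into agent improvement.
        if self.overridden and not self.override_reason.strip():
            raise ValueError("an override must be logged with a reason")

    @property
    def overridden(self):
        return self.oncall_action != self.agent_triage


def follow_rate(decisions):
    """Fraction of decisions where the on-call followed the agent."""
    if not decisions:
        return 0.0  # assumption: an empty log is reported as 0.0
    followed = sum(1 for d in decisions if not d.overridden)
    return followed / len(decisions)
```

Tracking the follow rate over the two weeks gives a rough check on the expectation that most triage cases follow the agent.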

Week 4: agent action with monitoring

Week 4 is the first action phase. The agent takes specific low-risk actions while the on-call monitors and intervenes if needed. The action allowlist starts narrow (tag the alert, post a Slack update, create a ticket) and expands over the following weeks. Confidence builds gradually because trust is earned per action class.
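The allowlist can be sketched as a gate that every proposed action passes through. The action-class names below are hypothetical labels for the three starting actions named above, and the gate and widen helpers are illustrative rather than any particular framework's API.

```python
from enum import Enum


class Verdict(Enum):
    EXECUTE = "execute"    # agent may act; the on-call monitors
    ESCALATE = "escalate"  # agent must hand the action to the on-call


# Hypothetical names for the narrow starting allowlist.
WEEK_4_ALLOWLIST = frozenset({"tag_alert", "post_slack_update", "create_ticket"})


def gate(action_class, allowlist=WEEK_4_ALLOWLIST):
    """Allow only action classes that have been explicitly approved."""
    return Verdict.EXECUTE if action_class in allowlist else Verdict.ESCALATE


def widen(allowlist, action_class):
    """Return a new allowlist with one more action class approved."""
    return allowlist | {action_class}
```

Because widen returns a new set instead of mutating the old one, each expansion is an explicit, reviewable step, which fits the idea that trust is earned one action class at a time.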

First overrides matter

The first override is a trust test: it is the first time the agent is wrong about something and the on-call has to step in. Make the override visible (“agent said X, on-call did Y, on-call was right because Z”), because transparency builds trust. Track override patterns as well: repeated overrides with the same cause mean the prompt needs work.
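Both halves of this advice are easy to mechanize. The sketch below assumes each override has been tagged with a short cause label; the function names, the dictionary key, and the threshold of three are hypothetical choices, not values taken from the curriculum.

```python
from collections import Counter


def format_override(agent_said, oncall_did, why):
    """Render an override in the visible 'X / Y / Z' form."""
    return (
        f"agent said {agent_said}, on-call did {oncall_did}, "
        f"on-call was right because {why}"
    )


def repeated_causes(overrides, threshold=3):
    """Return causes that appear in at least `threshold` overrides.

    `overrides` is an iterable of dicts with a "cause" key (assumed layout).
    """
    counts = Counter(o["cause"] for o in overrides)
    return {cause: n for cause, n in counts.items() if n >= threshold}
```

Any cause returned by repeated_causes is a candidate for a prompt fix rather than another one-off override.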

When the on-call has “graduated”

Three signs mark graduation. First, the on-call is comfortable defaulting to the agent on standard cases and confident enough to override when needed. Second, they add new cases to the eval suite from their own experience, becoming an active participant in agent improvement. Third, the agent has become part of the team’s tooling, not a curiosity; that is the success state.