On-Call Ramp-Up for New Engineers
Engineers should not carry a full on-call load from day one. They need a ramp.
The problem
Putting new engineers on call cold is risky. They lack context and call senior staff at 3am for routine fixes; the burnout from seniors carrying junior shifts is a quiet drain that compounds across the team. Ramp-up should be a workflow, not an ad-hoc effort; the path must be built explicitly.
- Cold ramp-up is risky. New engineers lack context and call senior staff at 3am for routine fixes.
- Senior burnout is a drain. The cost of carrying junior shifts compounds as the team grows.
- Workflow, not ad-hoc. The path must be built explicitly; ad-hoc ramp-up produces inconsistent results.
- Per-engineer ramp record. Each engineer’s ramp is documented, which supports accountability and continuous improvement.
The ramp
The ramp has three phases. Week 1: shadow shifts (the new engineer attends incidents but does not act, and reads runbooks). Weeks 2-3: secondary on-call (pages route through the primary, the new engineer is backup; real exposure at low risk). Week 4+: primary on-call with explicit escalation rules (page a senior engineer on the third issue the runbooks do not cover, not the first).
- Week 1: shadow shifts. Attends incidents but does not act; reads runbooks; asks questions afterward.
- Weeks 2-3: secondary on-call. Pages route through the primary first; the new engineer is backup; real exposure at low risk.
- Week 4+: primary on-call. Explicit escalation rules; page a senior engineer on the third issue the runbooks do not cover, not the first.
- Per-phase exit criteria. Each phase has a documented exit criterion, which keeps progression consistent.
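The phase boundaries and the escalation rule above can be sketched in a few lines of Python. This is only an illustration: the names `ramp_phase` and `ShiftEscalation` are hypothetical, and counting uncovered issues per shift is an assumption, since the text does not say over what window the "third issue" is counted.

```python
from dataclasses import dataclass


def ramp_phase(week: int) -> str:
    """Map a 1-based ramp week to its phase."""
    if week < 1:
        raise ValueError("week must be >= 1")
    if week == 1:
        return "shadow"      # attends incidents, does not act
    if week <= 3:
        return "secondary"   # backup behind the primary
    return "primary"         # week 4 onward


@dataclass
class ShiftEscalation:
    """Tracks issues without a runbook during one shift (assumed window)."""
    threshold: int = 3       # page a senior on the third uncovered issue
    uncovered: int = 0

    def record_issue(self, has_runbook: bool) -> bool:
        """Record an issue; return True if a senior should be paged for it."""
        if has_runbook:
            return False
        self.uncovered += 1
        return self.uncovered >= self.threshold
```

A team that wants a different threshold, or a weekly rather than per-shift window, would change only `threshold` or the lifetime of the `ShiftEscalation` object.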
What to teach
Three knowledge pillars cover most of the ramp-up. Service architecture (top 5 services, top 5 dependencies, top 3 third-party integrations); incident process (ack flow, comms templates, decision authority during a crisis); runbook navigation (how to find runbooks, update them, and flag stale ones).
- Service architecture. Top 5 services, top 5 dependencies, top 3 third-party integrations.
- Incident process. Ack flow, comms templates, decision authority during a crisis.
- Runbook navigation. How to find runbooks, update them, and flag stale ones.
- Per-pillar curriculum. A documented curriculum per pillar keeps ramp content consistent across hires.
Checkpoints
Three checkpoints close the ramp loop. End of week 2: a simulated incident drill in which a senior engineer runs a fake outage and the new engineer leads the response. End of week 4: a review with the manager (confidence rating, gaps, graduate or extend). Quarterly: a refresher drill, because skills decay without practice.
- Week 2 drill. A senior engineer runs a fake outage; the new engineer leads the response; this is the first applied test.
- Week 4 review. Manager review; confidence rating; gaps identified; graduate or extend.
- Quarterly refresher. Skills decay; periodic practice maintains them across the rotation.
- Per-checkpoint deliverable. Each checkpoint produces a documented decision that goes into the ramp record.
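As a rough sketch, the checkpoint dates can be derived from an engineer's ramp start date. The function name `checkpoint_schedule` is hypothetical, and two choices here are assumptions rather than part of the plan above: "end of week N" is taken as exactly N weeks after the start date, and a quarter is taken as 13 weeks counted from the week 4 review.

```python
from datetime import date, timedelta


def checkpoint_schedule(start: date, refreshers: int = 2) -> list[tuple[str, date]]:
    """Return (checkpoint name, due date) pairs for one engineer's ramp."""
    schedule = [
        ("week 2 drill", start + timedelta(weeks=2)),
        ("week 4 review", start + timedelta(weeks=4)),
    ]
    # Assumption: refreshers recur every 13 weeks after the week 4 review.
    for n in range(1, refreshers + 1):
        due = start + timedelta(weeks=4 + 13 * n)
        schedule.append((f"quarterly refresher {n}", due))
    return schedule


if __name__ == "__main__":
    for name, due in checkpoint_schedule(date(2024, 1, 1)):
        print(f"{due.isoformat()}  {name}")
```

Each date is a natural place to attach the checkpoint's documented decision.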
Apply to your team
The application is concrete. Document the ramp-up plan in one page that new engineers read on day 1; run one simulated incident per quarter for the whole rotation (cheap practice, high payoff); track time-to-first-solo-shift per new hire with a target of 4 weeks for senior hires and 8 weeks for junior.
- One-page ramp plan. New engineers read it on day 1; the discipline is documented and visible.
- Quarterly simulated incident. Whole rotation participates; cheap practice, high payoff.
- Time-to-solo target. 4 weeks for senior hires, 8 weeks for junior; a concrete target keeps the ramp accountable.
- Per-hire ramp metric. Time-to-first-solo-shift is tracked per hire, which supports continuous improvement.
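A minimal way to track the time-to-first-solo-shift metric against the 4-week and 8-week targets might look like the following. The names `TARGET_WEEKS`, `weeks_to_solo`, and `met_target` are hypothetical, and measuring from the start date to the date of the first solo shift in calendar days is an assumption about how the metric is defined.

```python
from datetime import date

# Targets from the plan: 4 weeks for senior hires, 8 weeks for junior.
TARGET_WEEKS = {"senior": 4, "junior": 8}


def weeks_to_solo(start: date, first_solo_shift: date) -> float:
    """Weeks elapsed between the start date and the first solo shift."""
    if first_solo_shift < start:
        raise ValueError("first solo shift cannot precede the start date")
    return (first_solo_shift - start).days / 7


def met_target(level: str, start: date, first_solo_shift: date) -> bool:
    """True if the hire reached a solo shift within the target for their level."""
    return weeks_to_solo(start, first_solo_shift) <= TARGET_WEEKS[level]
```

For example, `met_target("senior", date(2024, 1, 1), date(2024, 2, 12))` evaluates a six-week ramp against the four-week senior target and returns `False`; the same dates meet the junior target.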