Best Practices By Samson Tanimawo, PhD Published Feb 25, 2025 9 min read

The On-Call Rotation Playbook for Teams of 5–50 Engineers

On-call is the fastest way to burn out a platform team. Here is how to run a rotation that people don't dread.

Shape of a healthy rotation

A healthy rotation is predictable, paired, short, and paid. Unpredictable shifts kill planning. Unpaired shifts burn the primary. Long shifts destroy sleep. Unpaid shifts breed quiet resentment.

If any one of the four is missing, the rotation degrades within a quarter, usually faster than anyone says out loud.

Shift length and primary/secondary

One week is the default. Shorter shifts create too much handoff cost. Longer shifts cause cumulative sleep debt.

For teams under 6 engineers, rotate every two weeks so nobody is back on primary within the same calendar month.
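The shift-length rule above can be sketched as a small round-robin scheduler. This is a minimal illustration, not a prescribed tool. It reads "rotate every two weeks" as two-week shifts for teams under 6. It also assumes the secondary for each shift is the engineer next in line for primary, which the article does not specify. The names `build_rotation` and `shift_length_weeks` and the engineer names are hypothetical.

```python
from datetime import date, timedelta


def shift_length_weeks(team_size):
    """One week by default; two weeks for teams under 6 (one reading of the rule)."""
    return 2 if team_size < 6 else 1


def build_rotation(engineers, start, shifts, shift_weeks=1):
    """Return a list of (shift_start, primary, secondary) tuples.

    Assumption: the secondary is whoever is next in line for primary.
    """
    if len(engineers) < 2:
        raise ValueError("need at least two engineers to pair primary and secondary")
    schedule = []
    for i in range(shifts):
        shift_start = start + timedelta(weeks=i * shift_weeks)
        primary = engineers[i % len(engineers)]
        secondary = engineers[(i + 1) % len(engineers)]
        schedule.append((shift_start, primary, secondary))
    return schedule


# Hypothetical five-person team, so shifts are two weeks long.
team = ["ana", "ben", "chi", "dev", "eli"]
for shift_start, primary, secondary in build_rotation(
    team, date(2025, 3, 3), shifts=5, shift_weeks=shift_length_weeks(len(team))
):
    print(shift_start.isoformat(), "primary:", primary, "secondary:", secondary)
```

Generating the schedule a full cycle ahead is what makes the rotation predictable: everyone can see their next primary shift before planning time off.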

The 10-minute handoff

At shift end, the outgoing primary runs a 10-minute sync with the incoming primary. Template:

Record it. Async follow-ups are fine; the live sync is what makes handoffs feel owned rather than thrown over a wall.

Compensation isn't optional

Some combination of: an extra PTO day per week of on-call, a 10–15% pay differential for the week, or automatic time off the day after a busy night. Pick one and commit. The exact mechanism matters less than that it exists.

Teams that run on-call without compensation lose their best senior engineers first. They are the ones with enough leverage to leave.

Two metrics to watch

Pages per primary per week, and nights interrupted (any page between 11pm and 7am) per month. If either crosses a threshold for two consecutive weeks, something has to give.

Reasonable thresholds for a mature team: under 5 pages/primary/week, under 2 interrupted nights/primary/month. If your numbers are above these, the fix is tuning alerts and SLOs, not adding more people to the rotation.
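As a rough sketch, both metrics can be computed from a flat export of pages. The input shape, a list of `(primary, timestamp)` pairs with timestamps already in the primary's local time, is an assumption, as are all function names. The code also assumes that a page after midnight belongs to the night that began the previous evening, so two pages in one night count once.

```python
from collections import Counter
from datetime import datetime, timedelta

MAX_PAGES_PER_WEEK = 5    # target: under 5 pages/primary/week
MAX_NIGHTS_PER_MONTH = 2  # target: under 2 interrupted nights/primary/month


def is_night_page(ts):
    """Night window from the article: 11pm to 7am, local time."""
    return ts.hour >= 23 or ts.hour < 7


def pages_per_week(pages):
    """Count pages keyed by (primary, ISO year, ISO week)."""
    counts = Counter()
    for primary, ts in pages:
        iso = ts.isocalendar()
        counts[(primary, iso[0], iso[1])] += 1
    return counts


def interrupted_nights_per_month(pages):
    """Count distinct interrupted nights keyed by (primary, year, month)."""
    nights = set()
    for primary, ts in pages:
        if is_night_page(ts):
            # Assumption: a 2am page belongs to the night that started yesterday.
            night = ts.date() if ts.hour >= 23 else ts.date() - timedelta(days=1)
            nights.add((primary, night))
    counts = Counter()
    for primary, night in nights:
        counts[(primary, night.year, night.month)] += 1
    return counts


def over_threshold(counts, limit):
    """Return the keys whose count is not under the limit."""
    return sorted(key for key, n in counts.items() if n >= limit)


# Illustrative data only, not real incident history.
pages = [
    ("ana", datetime(2025, 3, 3, 23, 30)),
    ("ana", datetime(2025, 3, 4, 2, 0)),
    ("ana", datetime(2025, 3, 4, 14, 0)),
    ("ana", datetime(2025, 3, 6, 3, 15)),
]
print(over_threshold(pages_per_week(pages), MAX_PAGES_PER_WEEK))
print(over_threshold(interrupted_nights_per_month(pages), MAX_NIGHTS_PER_MONTH))
```

Counting distinct nights rather than night pages keeps the second metric about lost sleep, which is what the threshold is meant to protect.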

1 week: default shift length
<2: interrupted nights per month target

Rotation health check

Once a quarter, pull four numbers: pages per primary per week, interrupted nights per primary per month, percentage of pages that led to a real action, and mean acknowledgement time.

If any of those four is drifting in the wrong direction for two consecutive quarters, the fix is tuning alerts and SLOs, not hiring. Teams that hire their way out of on-call pain almost never reverse the alert drift that caused it.
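The quarterly check can be sketched as a comparison across the last three quarters. It assumes "drifting for two consecutive quarters" means a metric got worse in each of the last two quarter-over-quarter changes. It also assumes a higher share of actionable pages is better, while lower is better for the other three. `QuarterStats`, `drifting_metrics`, and the sample values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class QuarterStats:
    pages_per_primary_week: float
    interrupted_nights_per_primary_month: float
    actionable_page_pct: float
    mean_ack_minutes: float


# Assumption: only the actionable-page share improves as it rises.
HIGHER_IS_BETTER = {"actionable_page_pct"}


def drifting_metrics(history):
    """Return metric names that worsened in each of the last two quarterly changes.

    history is a list of QuarterStats, oldest first; at least three are needed.
    """
    if len(history) < 3:
        return []
    oldest, middle, newest = history[-3:]
    drifting = []
    for field in fields(QuarterStats):
        a = getattr(oldest, field.name)
        b = getattr(middle, field.name)
        c = getattr(newest, field.name)
        if field.name in HIGHER_IS_BETTER:
            worsened = a > b > c
        else:
            worsened = a < b < c
        if worsened:
            drifting.append(field.name)
    return drifting


# Illustrative numbers only.
history = [
    QuarterStats(3.0, 1.0, 80.0, 4.0),
    QuarterStats(4.0, 1.0, 70.0, 4.0),
    QuarterStats(5.0, 1.0, 60.0, 3.0),
]
print(drifting_metrics(history))
```

A strict comparison keeps flat quarters from being flagged, so a metric only counts as drifting when it gets worse twice in a row.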

The best signal that a rotation is healthy is boring: engineers volunteer to swap shifts without drama, and nobody feels the need to explain it to HR.