SLO Handoff Between Teams
Service ownership moves; SLOs that do not move with it become orphans. The handoff is mechanical but rarely done.
Why SLOs orphan
Team A owns service X and its SLO. Team B inherits service X, but the SLO stays in A’s alerts, so A’s on-call gets paged for B’s service. The SLO must move with the service; in practice, nobody moves it.
- Service moves, SLO doesn’t. Service catalog updates; alert routing keeps pointing at the previous owner.
- Wrong team paged. A’s on-call wakes up for B’s outage but cannot fix it; the page is noise for A and delay for B.
- No catalog audit. Without a quarterly check, mismatches accumulate; the alerts become advisory, then ignored.
- The fix. A four-step handoff: document the SLO, update alert routing, update dashboard ownership, run a joint-ownership period. Mechanical, but rarely done.
Four-step handoff
- 1. Document the SLO and its history.
- 2. Update alert routing to the new team.
- 3. Update dashboard ownership.
- 4. Run a joint-ownership period during the transition.
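The four steps can be tracked as an explicit checklist so that a half-finished handoff stays visible. The following is a minimal sketch, not a prescribed tool; the `Handoff` class, the step identifiers, and the team names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers for the four handoff steps, in order.
STEPS = (
    "document_slo",
    "update_alert_routing",
    "update_dashboard_ownership",
    "joint_ownership_period",
)


@dataclass
class Handoff:
    """Tracks which handoff steps are complete for one service."""
    service: str
    old_team: str
    new_team: str
    done: set = field(default_factory=set)

    def complete(self, step):
        # Reject typos so a misspelled step cannot silently count as done.
        if step not in STEPS:
            raise ValueError(f"unknown step: {step}")
        self.done.add(step)

    def remaining(self):
        # Steps still open, in the order they should be carried out.
        return [s for s in STEPS if s not in self.done]

    def is_finished(self):
        return not self.remaining()


if __name__ == "__main__":
    handoff = Handoff("service-x", old_team="team-a", new_team="team-b")
    handoff.complete("document_slo")
    print(handoff.remaining())
```

A handoff whose `remaining()` list is non-empty after the planned cutover is a candidate orphan.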
Joint-ownership transition
Old team and new team jointly own the service for one quarter. Old team mentors; new team learns the SLO’s context. Cleaner than instant handoff; preserves institutional memory.
- Old team mentors. Walk the new team through historical incidents, SLO threshold reasoning, known-flaky alerts.
- New team shadows. Joins the old team’s on-call rotation for one quarter; sees the SLO in motion.
- Joint pages. Both teams are paged on every alert during the transition; the old team can disengage once the new team handles pages comfortably.
- Cutover. At the end of the quarter, alert routing flips fully to the new team; the old team stays on record as an escalation contact.
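The paging behaviour across the transition can be expressed as a small date-based rule. This is a sketch under stated assumptions: one quarter is approximated as 90 days, and the function name `routing_for` and its return shape are hypothetical, not the configuration format of any paging tool.

```python
from datetime import date, timedelta


def routing_for(today, handoff_start, old_team, new_team, joint_days=90):
    """Return who is paged and who is an escalation contact on a given day.

    Assumption: the joint period lasts `joint_days` days (90 approximates
    one quarter) starting at `handoff_start`.
    """
    cutover = handoff_start + timedelta(days=joint_days)
    if today < handoff_start:
        # Before the handoff begins, only the old team is paged.
        return {"page": [old_team], "escalation": []}
    if today < cutover:
        # Joint period: both teams are paged on every alert.
        return {"page": [old_team, new_team], "escalation": []}
    # After cutover: the new team is paged, the old team is escalation only.
    return {"page": [new_team], "escalation": [old_team]}


if __name__ == "__main__":
    start = date(2024, 1, 1)
    for day in (date(2023, 12, 15), date(2024, 2, 15), date(2024, 6, 1)):
        print(day, routing_for(day, start, "team-a", "team-b"))
```

Encoding the cutover date up front keeps the joint period from extending indefinitely by default.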
Quarterly audit
Quarterly review: list every service, its owning team, and the team its SLO alerts route to. Any mismatch gets an explicit re-handoff. Without the audit, drift accumulates.
- Service catalog. Source of truth for ownership; every service has a current owner; no orphans.
- Alert routing. PagerDuty / Opsgenie configuration verified against the catalog; mismatches flagged.
- Dashboard ownership. Who fixes a broken Grafana board? Same team as the alert; mismatch is a smell.
- Quarterly cadence. Slow enough that the audit is not busywork; fast enough that drift stays bounded.
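The audit above amounts to comparing two configurations against the catalog. The following is a minimal sketch, assuming the service catalog, alert routing, and dashboard ownership have each been exported to a plain service-to-team mapping. The names `audit_ownership` and `Mismatch` and the sample teams are hypothetical, and pulling the data out of PagerDuty, Opsgenie, or Grafana is left out of scope.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mismatch:
    service: str
    kind: str                # "alert_routing", "dashboard", or "uncatalogued"
    expected: Optional[str]  # owner per the catalog, None if not catalogued
    actual: Optional[str]    # team currently configured, None if missing


def audit_ownership(catalog, alert_routes, dashboard_owners):
    """Flag every place where routing or dashboards disagree with the catalog."""
    mismatches = []
    # The catalog is the source of truth: check each service against it.
    for service, owner in sorted(catalog.items()):
        for kind, mapping in (
            ("alert_routing", alert_routes),
            ("dashboard", dashboard_owners),
        ):
            actual = mapping.get(service)
            if actual != owner:
                mismatches.append(Mismatch(service, kind, owner, actual))
    # Alerts that route somewhere for a service the catalog does not list.
    for service in sorted(set(alert_routes) - set(catalog)):
        mismatches.append(
            Mismatch(service, "uncatalogued", None, alert_routes[service])
        )
    return mismatches


if __name__ == "__main__":
    catalog = {"checkout": "team-b", "search": "team-c"}
    alert_routes = {"checkout": "team-a", "search": "team-c"}
    dashboard_owners = {"checkout": "team-b", "search": "team-c"}
    for m in audit_ownership(catalog, alert_routes, dashboard_owners):
        print(m)
```

Each reported mismatch is a candidate for an explicit re-handoff rather than a quiet config edit.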
Antipatterns
- Silent handoff. Old team continues to own the SLO (or nobody does).
- Instant handoff with no joint period. New team flying blind.
- No catalog audit. Mismatches accumulate.
What to do this week
Three moves. (1) Apply the four-step handoff to your most impactful service. (2) Measure adherence for 30 days. (3) Rewrite the policy or the SLO if the gap persists.