The SRE Staffing Model That Actually Scales
Embedded, central, and platform: the three patterns, when each works, and the model most companies converge on by year three.
Embedded SREs
Of the three models, the embedded one puts SREs closest to product engineering. Each SRE sits inside a product team and owns that team's reliability surface end to end.
- Shape. One SRE per product team; reports to the product team's manager, or by dotted line to an SRE leader.
- Strength. Deep context on the team's domain; fast iteration on alerts, runbooks, and SLOs.
- Best fit. Fast-moving product teams with distinct domains; the SRE adapts to the team's stack.
- Risk. Embedded SREs drift apart; reliability practices fragment unless a community of practice ties them together.
Central SRE team
Central SRE works well at small to mid scale. One team owns shared infrastructure, sets standards, and carries the on-call rotation for the whole company.
- Shape. One SRE team for the company; product teams file requests for reliability work.
- Strength. Consistent practices across services; one rotation, one runbook style, one set of dashboards.
- Best fit. Companies with fairly uniform infrastructure or a small number of services.
- Risk. The central team becomes a bottleneck; product teams queue behind it for changes they should be able to self-serve.
Platform SRE
Platform SRE is the model most companies converge on by year three. A platform team builds the leverage; embedded SREs apply it inside product teams.
- Shape. Central platform team builds and runs the reliability platform; embedded SREs in product teams consume it.
- Strength. Combines context (embedded) with consistency (platform); standards live in tooling, not memos.
- Best fit. Scale-ups and large organisations where neither pure embedded nor pure central fits.
- Convergence. Most companies arrive here regardless of where they started; the trick is recognising early that the transition is under way.
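To make "standards live in tooling, not memos" concrete, here is a minimal sketch of a check a platform team might ship for embedded SREs to run in CI. Everything in it is an assumption for illustration: the `ServiceConfig` fields, the `check_standards` function, and the three rules themselves are hypothetical, not a description of any particular platform.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServiceConfig:
    """Hypothetical reliability metadata a product team declares per service."""
    name: str
    slo_target: Optional[float] = None      # availability as a fraction, e.g. 0.999
    runbook_url: Optional[str] = None
    oncall_rotation: Optional[str] = None


def check_standards(cfg: ServiceConfig) -> List[str]:
    """Return a list of standards violations; an empty list means the service passes."""
    problems: List[str] = []
    if cfg.slo_target is None:
        problems.append("missing SLO target")
    elif not 0.0 < cfg.slo_target < 1.0:
        problems.append("SLO target must be a fraction between 0 and 1")
    if not cfg.runbook_url:
        problems.append("missing runbook link")
    if not cfg.oncall_rotation:
        problems.append("missing on-call rotation")
    return problems


if __name__ == "__main__":
    # Illustrative service names and URL only.
    services = [
        ServiceConfig("checkout", 0.999, "https://example.invalid/runbooks/checkout", "checkout-primary"),
        ServiceConfig("search", slo_target=1.5),
    ]
    for svc in services:
        issues = check_standards(svc)
        status = "ok" if not issues else "; ".join(issues)
        print(f"{svc.name}: {status}")
```

The design point is the division of labour: the platform team owns the rules in one place, while each embedded SREs decides how their team's service satisfies them.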