The SRE Staffing Model That Actually Scales
Embedded vs central vs platform. The three patterns, when each works, and the model most teams converge on by year three.
Embedded SREs
One SRE per product team. Deep context, fast iteration, owns the team's reliability.
Works well for: fast-moving product teams with distinct domains. Less well for: cross-cutting concerns like observability.
Risk: SREs become disconnected from each other; reliability practices drift across teams.
Central SRE team
One SRE team for the company. Sets standards, owns shared infrastructure, runs the on-call rotation.
Works well for: smaller companies, or companies with fairly uniform infrastructure. Less well for: companies with many distinct domains.
Risk: bottleneck. Product teams wait on the central SRE team for changes.
Platform SRE
A central team that builds and maintains the platform; product teams self-serve. Embedded SREs in product teams use the platform.
Works for: scale-ups and large organisations. The platform is leverage.
Most companies converge here by year three. Embedded SREs handle product specifics; platform handles cross-cutting concerns.