Latency Budgets per Service: The Math That Holds
End-to-end latency goals decompose into per-service budgets. The math is simple; the discipline of tracking those budgets is rare.
Why budgets per service
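Suppose the user-facing goal is a p99 page load under 2 s, and a page load calls 8 services in series. An even split gives each service a p99 budget of 250 ms (2000 ms / 8). Percentiles do not add exactly, so treat the per-hop numbers as an allocation to manage against, and confirm the result with the measured end-to-end p99.
Without per-service budgets, no team owns its share of the total, and the end-to-end goal is missed.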
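To make the arithmetic concrete, here is a minimal sketch that computes the even split for the 2 s goal and 8 hops above, then compares the sum of per-hop p99s with the p99 of the summed latency. The latency model (independent log-normal hops with a median near 60 ms) and the helper names are assumptions for illustration, not measurements from a real system.

```python
import random
import statistics

TOTAL_BUDGET_MS = 2000  # end-to-end p99 goal from the text
HOPS = 8                # services in series


def even_split(total_ms: float, hops: int) -> float:
    """Give every hop the same share of the end-to-end budget."""
    return total_ms / hops


def p99(samples: list[float]) -> float:
    """99th percentile: the last of the 99 cut points for n=100."""
    return statistics.quantiles(samples, n=100)[98]


def simulate(hops: int, requests: int, seed: int = 0) -> tuple[float, float]:
    """Return (sum of per-hop p99s, p99 of the end-to-end latency)."""
    rng = random.Random(seed)
    # Assumed model: every hop is an independent log-normal, median ~60 ms.
    per_hop = [
        [rng.lognormvariate(4.1, 0.5) for _ in range(requests)]
        for _ in range(hops)
    ]
    end_to_end = [sum(hop[i] for hop in per_hop) for i in range(requests)]
    return sum(p99(hop) for hop in per_hop), p99(end_to_end)


per_hop_budget = even_split(TOTAL_BUDGET_MS, HOPS)
sum_of_p99s, end_to_end_p99 = simulate(HOPS, requests=20_000)

print(f"per-hop budget:      {per_hop_budget:.0f} ms")
print(f"sum of per-hop p99s: {sum_of_p99s:.0f} ms")
print(f"end-to-end p99:      {end_to_end_p99:.0f} ms")
```

In this simulation the end-to-end p99 comes out below the sum of the per-hop p99s, because independent hops rarely hit their tails on the same request. That depends on the independence assumption: when hops slow down together (a shared database, a noisy host), the gap narrows, which is why the end-to-end number still has to be measured.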
Four-step decomposition
1. Identify the user journey.
2. Trace the call graph.
3. Allocate a budget per hop.
4. Measure each hop independently.
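The four steps can be sketched in a few lines. This is one possible allocation rule, not the only one: it splits the end-to-end goal in proportion to each hop's currently measured p99, so slower hops get a larger share. The journey, the service names, and the measured numbers are hypothetical, and the path is shortened to four hops for readability.

```python
def allocate(total_ms: float, weights: dict[str, float]) -> dict[str, float]:
    """Split total_ms across hops in proportion to each hop's weight."""
    if total_ms <= 0 or not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("budget and weights must be positive")
    scale = total_ms / sum(weights.values())
    return {hop: w * scale for hop, w in weights.items()}


# Steps 1-2: a hypothetical "checkout page" journey and its serial call path,
# weighted by each hop's measured p99 in ms (made-up numbers).
measured_p99_ms = {"gateway": 40, "auth": 60, "cart": 150, "pricing": 250}

# Step 3: allocate the 2000 ms goal across the hops.
budgets = allocate(2000, measured_p99_ms)

# Step 4: each hop is judged against its own budget line, independently.
for hop, budget in budgets.items():
    print(f"{hop:8s} budget {budget:6.0f} ms  measured {measured_p99_ms[hop]:4d} ms")
```

Proportional allocation rewards whoever is slowest today, so some teams prefer an even split or a negotiated table instead. Whatever the rule, the budgets should sum to the end-to-end goal.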
Tracking pattern
Give each service a dashboard that plots its measured p99 against its budget line; the team that owns the service owns the hop.
Each quarter, re-allocate budgets as services change.
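The check behind such a dashboard is small. The sketch below computes a nearest-rank p99 over a window of latency samples and compares it with the hop's budget. The `HopStatus` type, the `check` helper, and the sample data are hypothetical; a real setup would more likely read percentiles from its metrics system than from raw samples.

```python
import math
from dataclasses import dataclass


@dataclass
class HopStatus:
    service: str
    p99_ms: float
    budget_ms: float

    @property
    def headroom_ms(self) -> float:
        """Positive when under budget, negative when over."""
        return self.budget_ms - self.p99_ms

    @property
    def breached(self) -> bool:
        return self.p99_ms > self.budget_ms


def nearest_rank_p99(samples_ms: list[float]) -> float:
    """Nearest-rank p99: smallest sample with at least 99% of samples <= it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = math.ceil(99 * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def check(service: str, samples_ms: list[float], budget_ms: float) -> HopStatus:
    """Summarize one hop's window of samples against its budget."""
    return HopStatus(service, nearest_rank_p99(samples_ms), budget_ms)


# Hypothetical window: 100 requests, two of them slow.
samples = [120.0] * 98 + [300.0, 900.0]
status = check("cart", samples, budget_ms=250)
print(status.service, status.p99_ms, status.headroom_ms, status.breached)
```

Reporting headroom as well as the breach flag helps at re-allocation time: hops with consistent headroom are the natural donors.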
Renegotiation cadence
When a service breaches its budget, either invest in that service or renegotiate the budget. The conversation has to happen.
Without renegotiation, missing budgets become the new normal.
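One way to keep the conversation from being skipped is to make both halves mechanical: a rule that says when a breach is sustained, and a transfer that moves budget between hops without changing the end-to-end total. The sketch below assumes weekly p99 readings and a three-week threshold; both are illustrative choices, and the service names and numbers are hypothetical.

```python
def sustained_breach(weekly_p99_ms: list[float], budget_ms: float,
                     weeks: int = 3) -> bool:
    """True when the last `weeks` weekly p99s all exceeded the budget."""
    recent = weekly_p99_ms[-weeks:]
    return len(recent) == weeks and all(p > budget_ms for p in recent)


def renegotiate(budgets: dict[str, float], donor: str, recipient: str,
                amount_ms: float) -> dict[str, float]:
    """Move amount_ms from donor to recipient; the total stays the same."""
    if amount_ms <= 0 or amount_ms >= budgets[donor]:
        raise ValueError("amount must be positive and below the donor's budget")
    updated = dict(budgets)  # leave the agreed budgets untouched
    updated[donor] -= amount_ms
    updated[recipient] += amount_ms
    return updated


# Hypothetical state: three hops with equal budgets; pricing keeps missing.
budgets = {"auth": 250.0, "cart": 250.0, "pricing": 250.0}
pricing_weekly_p99 = [240.0, 270.0, 280.0, 290.0]

needs_conversation = sustained_breach(pricing_weekly_p99, budgets["pricing"])
if needs_conversation:
    # Outcome of the (human) negotiation: auth gives up 50 ms of headroom.
    new_budgets = renegotiate(budgets, donor="auth", recipient="pricing",
                              amount_ms=50.0)
    print(new_budgets, sum(new_budgets.values()))
```

The code only records the outcome. Whether to invest in the breaching service or move budget to it remains a decision between the teams involved.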
Antipatterns
- End-to-end goal without decomposition. Nobody owns it.
- Static budget for years. Service mix changes; budget rots.
- Budget without consequence. Decoration.
What to do this week
Three moves. (1) Apply this pattern to your slowest production endpoint. (2) Measure p99 before/after. (3) Document the win and ship the runbook so the team can reproduce.