The Reliability Budget Meeting (Monthly)
Once a month, leadership and engineers meet to review reliability budgets and decide priorities. This page covers the agenda, the outcomes, and why a fixed meeting beats ad-hoc decisions.
Agenda (45 min)
The agenda is fixed at 45 minutes across four time-boxed sections: SLO review, burn-rate review, next-month prioritisation, backlog grooming. Time-boxing is what keeps the meeting from becoming a status update.
- 10 min: SLO performance review. Per-service month-over-month performance trend. Material changes get attention.
- 15 min: burn-rate review. Per-service error-budget consumption. Services in red explain why.
- 10 min: prioritise next month’s work. Per-team reliability investment based on the burn data.
- 10 min: backlog grooming. Defer or accelerate decisions. Real prioritisation, not status reporting.
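The burn-rate review depends on one number per service: the share of the month's error budget already consumed. The sketch below shows one way to compute it from request counts. The `ServiceMonth` type, the sample figures, and the amber/red thresholds are illustrative assumptions, not something this process prescribes; teams that define SLOs over time windows rather than requests would adapt the inputs.

```python
from dataclasses import dataclass


@dataclass
class ServiceMonth:
    """One service's request counts for the month (hypothetical shape)."""
    name: str
    slo_target: float      # e.g. 0.999 means 99.9% of requests must succeed
    total_requests: int
    failed_requests: int


def budget_consumed(s: ServiceMonth) -> float:
    """Fraction of the month's error budget spent (1.0 = fully spent)."""
    allowed_failures = (1.0 - s.slo_target) * s.total_requests
    if allowed_failures <= 0:
        # A 100% target leaves no budget: any failure overspends it.
        return float("inf") if s.failed_requests else 0.0
    return s.failed_requests / allowed_failures


def status(consumed: float, amber_at: float = 0.75, red_at: float = 1.0) -> str:
    """Map consumption to a colour. Thresholds are assumed, tune per org."""
    if consumed >= red_at:
        return "red"
    if consumed >= amber_at:
        return "amber"
    return "green"


def burn_report(services):
    """Return (name, consumed, status) rows, worst burn first."""
    rows = [(s.name, budget_consumed(s)) for s in services]
    rows.sort(key=lambda r: r[1], reverse=True)
    return [(name, round(c, 2), status(c)) for name, c in rows]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    month = [
        ServiceMonth("checkout", 0.999, 1_000_000, 1_500),
        ServiceMonth("search", 0.995, 2_000_000, 4_000),
    ]
    for row in burn_report(month):
        print(row)
```

Sorting worst-first puts the services that must explain themselves at the top of the list, which helps keep the 15-minute slot on the services in red.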
Outcomes
The meeting produces four concrete outcomes: a per-service priority for the next month, leadership-engineering alignment on burn data, a visible signal that the budget conversation is real, and published notes that carry decisions forward.
- Per-service reliability priority. Named priority for the next month. Documented and tracked.
- Leadership sees burn data. Aligned conversation about feature velocity versus reliability.
- Engineers see that leadership cares. Budget-is-real signal lands across the team. Not theoretical.
- Published notes per month. Documented outcomes support continuity across attendees and across quarters.
Avoid
Three failure modes destroy the practice: status updates that inflate the meeting, oversized invite lists that devolve into status reporting, and skipped months that erode trust in the data. A named owner per org is the safeguard against the last one.
- Long meeting becomes status. Strict 45-minute box. Status updates kill the meeting.
- Inviting everyone. 6-8 person cap. Larger meetings devolve into status reports.
- Skipping months. No-skip rule. Cadence is what makes the data trustworthy month over month.
- Safeguard: named owner per org. A responsible scheduler catches the “we forgot this month” failure mode.