SLO Dashboard Design: Five Must-Haves
An SLO dashboard is the most-glanced surface in engineering. Design it to answer questions in seconds.
What the dashboard answers
The design imperative is to answer three questions in seconds; everything else is drill-down.
- Healthy now? Current SLO compliance versus target; the headline at the top.
- Trending right? Direction of travel over the period; up or down matters more than the absolute number.
- Burn-rate state? Multi-window burn rate; is the budget at risk this hour, this day, this week?
- Beyond that. Anything else requires drill-down; the dashboard does not try to answer everything.
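The burn-rate question can be made concrete with a small calculation. The sketch below assumes the common definition of burn rate as the observed error rate divided by the error rate the SLO allows, and assumes good/total request counts are already available for each window. The function name, window labels, and sample counts are hypothetical.

```python
def burn_rate(good: int, total: int, slo_target: float) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    1.0 means the budget is being spent exactly on schedule;
    above 1.0 means it runs out before the SLO period ends.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total == 0:
        return 0.0  # assumption: no traffic is treated as no burn
    error_rate = (total - good) / total
    allowed_error_rate = 1.0 - slo_target
    return error_rate / allowed_error_rate


# Hypothetical (good, total) request counts for each window.
windows = {
    "1h": (9_950, 10_000),
    "1d": (239_520, 240_000),
    "7d": (1_678_320, 1_680_000),
}

rates = {
    name: burn_rate(good, total, slo_target=0.999)
    for name, (good, total) in windows.items()
}
for name, rate in rates.items():
    print(f"{name}: burn rate {rate:.1f}x")
```

Showing the short and long windows side by side is what lets the reader tell a brief spike from a sustained burn.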
Five must-have panels
- 1. SLO compliance percentage with target line.
- 2. Error budget remaining with sparkline.
- 3. Burn rate trend over multiple windows.
- 4. Top contributors to budget consumption.
- 5. Recent incidents tagged to budget impact.
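As a rough illustration of what panels 2 and 4 compute, the sketch below derives the remaining error budget and the top contributors from raw counts. It assumes a request-based SLO and a per-source count of failed requests; the function names, source names, and numbers are hypothetical.

```python
def budget_remaining(good: int, total: int, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_bad = (1.0 - slo_target) * total
    if allowed_bad == 0:
        return 0.0  # assumption: no traffic or a 100% target leaves no budget
    actual_bad = total - good
    return 1.0 - actual_bad / allowed_bad


def top_contributors(bad_by_source: dict[str, int], n: int = 3) -> list[tuple[str, float]]:
    """Return the n sources with the most failures, with their share of all failures."""
    total_bad = sum(bad_by_source.values())
    if total_bad == 0:
        return []
    ranked = sorted(bad_by_source.items(), key=lambda item: item[1], reverse=True)
    return [(name, count / total_bad) for name, count in ranked[:n]]


# Hypothetical period: 100,000 requests, 50 failures, 99.9% target.
remaining = budget_remaining(good=99_950, total=100_000, slo_target=0.999)
contributors = top_contributors({"checkout": 30, "search": 15, "login": 5}, n=2)

print(f"Error budget remaining: {remaining:.0%}")
for name, share in contributors:
    print(f"{name}: {share:.0%} of budget consumption")
```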
Visual idioms
The visual choices decide whether the dashboard works under pressure. Polished design beats baroque every time; the on-call should parse it in seconds.
- Big number with trend. Top-of-dashboard metric shown as a single large number plus a trend sparkline.
- Red/yellow/green. Status colours convey state without reading; the eye picks them up in milliseconds.
- No slow charts. Charts that take more than a second to read fail the dashboard's purpose.
- Polished beats baroque. Restraint in chart count and styling; the surface should feel calm, not busy.
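One way to drive the red/yellow/green idiom is a simple threshold mapping on the remaining error budget. The sketch below is only an illustration: the thresholds of 50% and 10% are assumptions, not recommendations from this article, and each team should pick its own.

```python
def status_colour(
    budget_remaining: float,
    yellow_below: float = 0.5,  # assumed threshold, tune per service
    red_below: float = 0.1,     # assumed threshold, tune per service
) -> str:
    """Map the unspent fraction of the error budget to a status colour."""
    if budget_remaining < red_below:
        return "red"
    if budget_remaining < yellow_below:
        return "yellow"
    return "green"


for remaining in (0.80, 0.30, 0.05):
    print(f"{remaining:.0%} budget left -> {status_colour(remaining)}")
```

Keeping the mapping in one small function means every panel on the dashboard agrees on what each colour means.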
Incident-resilient layout
The dashboard must hold up during a real incident. Bookmarked URL, fast load, dark-mode-friendly, clear sectioning; the basics that survive 3am stress.
- Bookmarked URL. On-call's first move on an SLO alert is to open the dashboard; the URL is canonical.
- Loads in under 2 seconds. Slow dashboards lose the on-call to other tools; speed is reliability.
- Dark mode. 3am stress on light-mode dashboards is its own problem; dark mode is not optional.
- Clear sectioning. Headline metrics, then drill-downs, then context; reading order matches investigation order.
Antipatterns
- 20 panels. Too many to scan under pressure; the headline metrics get buried.
- Pretty visualisations that take effort to read. They slow incident response.
- Dashboard never reviewed for use. It drifts from what the on-call actually needs.
What to do this week
Three moves. (1) Apply the five-panel pattern to your most-impactful service. (2) Measure adherence for 30 days. (3) If the gap persists, rewrite the policy or the SLO.