Recording Rules: The Pattern for Fast Dashboards
Recording rules pre-compute expensive queries. This post covers the pattern, the trade-offs, and the dashboards that become instant instead of slow.
The idea
Recording rules are Prometheus's mechanism for pre-computing common queries. Instead of computing the same expensive aggregation every time a dashboard loads, the recording rule computes it periodically and stores the result as a new time series. Dashboards query the pre-computed series; the cost shifts from read to write; the dashboard loads quickly.
What the pattern looks like:
- Common queries are pre-computed.: The team identifies queries that are run frequently and are expensive. The query is moved into a recording rule; Prometheus evaluates the rule on a schedule (typically every 30 seconds or every minute).
- Stored as new time series.: The recording rule's output is a new time series with a new name. The series name follows convention (the level:metric:operations format is common, for example service:http_requests:rate5m). Dashboards reference the new series.
- Dashboards query the pre-computed series.: Instead of computing sum by (service) (rate(http_requests_total[5m])) at every dashboard load, the dashboard queries the pre-computed series. The pre-computed series is much cheaper to query because the aggregation has already been done.
- Not the raw metric.: The dashboard does not read the raw metric for these queries. The pre-computed view is what feeds the dashboard; the raw metric remains available for ad-hoc queries that do not match the recording rule pattern.
- Trade compute at write for query at read.: The recording rule runs whether or not anyone queries it. The write-side cost is constant; the read-side cost drops dramatically. The trade-off pays off when the read frequency is high.
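As a minimal sketch of the steps above, a Prometheus rule file for the per-service request rate might look like the following. The file name, group name, and 30-second interval are illustrative assumptions; the recorded series name follows the level:metric:operations convention from the Prometheus documentation, with the _total suffix dropped because the expression applies rate().

```yaml
# rules/http_recording.yml (hypothetical file name)
groups:
  - name: http_recording_rules   # hypothetical group name
    interval: 30s                # evaluation schedule for this group (assumed value)
    rules:
      # level = service, metric = http_requests, operations = rate5m
      - record: service:http_requests:rate5m
        expr: sum by (service) (rate(http_requests_total[5m]))
```

With a rule like this loaded, a dashboard panel would query service:http_requests:rate5m directly instead of repeating the aggregation. A file of this shape can be validated before deployment with promtool check rules.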
The pattern is simple but powerful. Common dashboard queries are exactly the queries where the trade-off pays off.
When it pays
Recording rules are not free. The cost is real (additional series, ongoing computation); the savings come from amortizing across many reads. Understanding when the pattern pays off prevents over-application.
- Queries that take more than 1 second to render.: Slow queries are the candidates. Sub-second queries do not need recording rules; the optimization would not produce visible improvement.
- Dashboards loaded multiple times per day.: The amortization works when the dashboard is loaded often. A dashboard loaded 100 times per day amortizes the recording rule's cost across 100 reads; the per-read savings are large.
- Heavy aggregations across many series.: Queries that sum or aggregate across thousands of series are expensive. The pre-computation does the aggregation once; reads see the aggregated result.
- The pre-computation amortizes across reads.: The recording rule's cost is paid once per evaluation; the savings accrue to every read. The more reads, the greater the amortization. The pattern is the right choice for popular dashboards.
- Alert rules also benefit.: Alerts that use the same heavy queries also benefit. The alert evaluates against the pre-computed series; alert evaluation cost drops; the rest of the time series database has more capacity for ad-hoc queries.
The pattern pays off in the high-leverage cases. Outside those cases, raw queries work fine.
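The alerting case can be sketched the same way. The example below assumes a recorded series named service:http_requests:rate5m already exists (a per-service request rate); the alert name, threshold, and duration are hypothetical values chosen only for illustration.

```yaml
groups:
  - name: http_alerts            # hypothetical group name
    rules:
      - alert: HighRequestRate   # hypothetical alert name
        # Evaluates against the pre-computed series, not the raw counter
        expr: service:http_requests:rate5m > 1000   # threshold is an assumption
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Request rate above threshold for {{ $labels.service }}"
```

Because the alert expression is a simple comparison on an already-aggregated series, each alert evaluation avoids re-running the aggregation across the raw series.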
Limits
Recording rules have real costs. Understanding the limits prevents over-application; the team adopts recording rules deliberately rather than reflexively.
- Pre-computed series add cardinality.: Each recording rule produces new time series. The cardinality of the metric database grows. For databases with cardinality budgets, the recording rules consume budget.
- Watch the budget.: A team with many recording rules can consume significant cardinality. The team tracks the cardinality impact; rules that produce excessive cardinality are reviewed and possibly tightened.
- Recording rules need maintenance.: Like any code, recording rules can become stale. The query they replace might change; the dashboards using them might be retired; the rule continues running without producing value.
- Stale rules accumulate.: Without periodic review, stale rules accumulate. The list of rules grows; the cost grows; the value does not. The accumulation is silent; it shows up only in growing infrastructure costs.
- Quarterly audit.: Once per quarter, the team reviews the recording rules. Are they still being queried? Are the dashboards they serve still active? Stale rules are removed; the rule list stays current.
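One possible way to keep the cardinality cost visible between audits is to record how many series the recording rules themselves produce. The sketch below assumes every recorded series follows the colon naming convention (Prometheus reserves colons in metric names for recording rules); the group name, rule name, and interval are hypothetical. The selector scans all series names, so a long evaluation interval is assumed here.

```yaml
groups:
  - name: recording_rule_audit   # hypothetical group name
    interval: 5m                 # assumed; the selector below touches every series
    rules:
      # Total number of series whose name contains a colon,
      # i.e. series assumed to come from recording rules
      - record: cluster:recorded_series:count   # hypothetical name
        expr: count({__name__=~".+:.+"})
```

For the audit itself, running count by (__name__) ({__name__=~".+:.+"}) as an ad-hoc query breaks the same total down per recorded series name, which shows which rules consume the most of the budget.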
The recording rule pattern for fast dashboards is one of those Prometheus disciplines that pays off in proportion to dashboard usage. Nova AI Ops integrates with Prometheus and observability platforms, surfaces query patterns and recording rule effectiveness, and produces the per-rule audit that drives the quarterly cleanup.