Metrics Cost Optimization
High cardinality drives cost.
Overview
Observability cost is dominated by metric cardinality, not metric count. A single metric with a user-id tag can produce more billable series than a hundred well-scoped metrics combined. The discipline starts with cardinality awareness, not metric pruning.
- Cardinality drives cost. Many observability vendors charge per unique time series. Each tag multiplies a metric's series count by up to its number of distinct values, so high-cardinality tags inflate the count fastest.
- Per-tag awareness. Each tag’s cardinality contribution is the actionable signal. user_id, request_id, and trace_id are the usual culprits.
- Drop unused metrics. Sweep each quarter for metrics that no dashboard or alert references. Default to deletion.
- Aggregation tier plus quarterly audit. Match aggregation level to actual use; audit the inventory each quarter to catch drift.
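The multiplication effect described above can be sketched with a few lines of arithmetic. This is a minimal illustration, not a billing model: the function name estimate_series, the tag names, and every cardinality figure are hypothetical, and the product is an upper bound because not every tag combination occurs in practice.

```python
from math import prod


def estimate_series(tag_cardinalities: dict[str, int]) -> int:
    """Upper bound on series for one metric: product of distinct values per tag."""
    # A metric with no tags is exactly one series (prod of nothing is 1).
    return prod(tag_cardinalities.values())


# Hypothetical cardinalities, chosen only to illustrate the effect.
well_scoped = {"service": 20, "status_class": 5}
per_user = {"service": 20, "status_class": 5, "user_id": 50_000}

# One hundred well-scoped metrics versus a single metric tagged with user_id.
hundred_metrics_total = 100 * estimate_series(well_scoped)
single_metric_total = estimate_series(per_user)

print(hundred_metrics_total)  # 10000
print(single_metric_total)    # 5000000
```

Under these assumed numbers, removing the user_id tag from the one metric brings it back to 100 series, which is the kind of change an aggregation-level review is meant to surface.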
The approach
Three habits keep metrics cost from compounding: per-tag awareness in design, drop-unused as standing practice, and a quarterly audit that catches drift before it shows on the bill.
- Per-tag awareness. The metrics-instrumentation review asks “what is the cardinality of each tag here?” before merge.
- Drop unused metrics. Each quarter, find metrics not referenced by any dashboard or alert. Delete them by default.
- Aggregation tier per metric. Use coarse aggregation for high-volume metrics and fine-grained aggregation for low-cardinality ones. Avoid a one-size-fits-all retention setting.
- Documented policy. The team-wide metrics policy lives in the wiki, and new instrumentation references it.
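Two of these habits reduce to simple set operations, sketched below. This assumes you can export a metric's observed label sets and the metric names referenced by dashboards and alerts from your own tooling. The function names, metric names, and sample data are hypothetical and do not correspond to any vendor API.

```python
from collections import defaultdict


def tag_cardinalities(series: list[dict[str, str]]) -> dict[str, int]:
    """Count distinct observed values per tag across one metric's label sets."""
    values: dict[str, set[str]] = defaultdict(set)
    for labels in series:
        for tag, value in labels.items():
            values[tag].add(value)
    return {tag: len(seen) for tag, seen in values.items()}


def unused_metrics(emitted: set[str], referenced: set[str]) -> list[str]:
    """Metrics that are emitted but referenced by no dashboard or alert."""
    return sorted(emitted - referenced)


# Hypothetical label sets for a single metric (review-time question:
# "what is the cardinality of each tag here?").
observed = [
    {"service": "checkout", "user_id": "u1"},
    {"service": "checkout", "user_id": "u2"},
    {"service": "search", "user_id": "u3"},
]
print(tag_cardinalities(observed))  # {'service': 2, 'user_id': 3}

# Hypothetical inventory for the quarterly sweep.
emitted = {"http_requests_total", "cache_hits_total", "legacy_queue_depth"}
referenced = {"http_requests_total", "cache_hits_total"}
print(unused_metrics(emitted, referenced))  # ['legacy_queue_depth']
```

The sweep output is a list of deletion candidates, not a verdict: a metric used only in ad hoc queries would also appear here, so the policy should say how such cases are confirmed before removal.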
Why this compounds
Each correctly scoped metric saves money every month for as long as it exists. Compounded across a fleet of services, the savings reshape the observability bill.
- Cost efficiency. The right cardinality matches what investigations actually need. Most teams over-tag and pay for it.
- Operational fit. The right metrics surface the right signals: less noise, better dashboards.
- Cost-aware culture. Engineers learn what cardinality costs. New metrics ship right-sized from the start.
- Year-one investment, year-two habit. The first audit is a heavy lift. By year two the cadence runs itself and the savings continue to compound.