Set Up Prometheus

From zero to dashboard.

Overview

Setting up Prometheus is the discipline of bringing metrics observability online quickly. Feature parity with Datadog is not the goal; getting useful metrics in front of engineers within 30 minutes is.

From zero to dashboard. Helm install plus a default Grafana stack; one cluster, one afternoon, real signal.
Helm install. kube-prometheus-stack chart; ships Prometheus, Alertmanager, Grafana, node-exporter; the standard bootstrap.
Per-target scrape. ServiceMonitor CRDs select pods to scrape; declarative, version-controlled, recreatable.
Grafana dashboards plus retention. Per-team dashboards from day one; retention sized to budget; tune up later when you know what you keep.

The approach

The practical approach: helm install with sensible defaults, ServiceMonitors per workload, standard tags everywhere, document the stack as you build it. The team’s discipline produces useful metrics observability, not just Prometheus running.

helm install kube-prometheus-stack. Standard chart; ships everything; tune values.yaml later, ship now.
ServiceMonitor per workload. Per-target scrape declared as CRD; the alternative is hand-edited prometheus.yml.
Standard tags. service, env, version on every metric; the foundation of correlation across services.
Per-team dashboards. Each team owns their Grafana folder; cross-team views compose from team panels.
Document the stack. Per-cluster configuration committed to the repo; supports investigation and rebuild.

Why this compounds

Prometheus bootstrap discipline compounds across services. Each instrumented service grows the team’s observability surface; cost-per-question falls as the foundation matures.

Faster investigation. Per-service metrics produce fast root cause; MTTR drops because the data is already there.
Better understanding. Per-service metrics teach the team how the service actually behaves under load.
Better cost efficiency. Open-source stack reduces vendor cost; the savings compound over years of growth.
Institutional knowledge. Each service instrumented teaches observability patterns; the team’s muscle grows.

Prometheus bootstrap discipline is an infrastructure investment that pays off across years. Nova AI Ops integrates with metrics telemetry, surfaces patterns, and supports the team’s observability discipline.