Loki vs Elasticsearch

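A comparison of Loki and Elasticsearch as log backends: where each is strong, how to choose between them, and what tends to go wrong.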

Loki strengths

Loki's strengths are cost, operational simplicity, and Grafana fit. The label-index model is the core trade: small index, cheap object storage, fast label queries, slower content scans.

Cost. Roughly one to five dollars per GB per month, versus twenty to fifty for Elasticsearch; exact figures depend on retention and deployment. Loki indexes only labels; the raw log lines sit in cheap object storage such as S3 or GCS.
Operational simplicity. No shard tuning, no node sizing, no snapshot scheduling. The stateless query layer scales horizontally; storage grows with the underlying object store rather than with the cluster.
Tight Grafana integration. Loki and Grafana come from the same vendor. LogQL is modeled on PromQL, so it feels native in Grafana-stack shops where dashboards already pivot on labels.
Bounded label set. Keep a curated label list for each cluster. That discipline preserves the cost model; high-cardinality labels are the fastest way to lose it.
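
The cost gap is easier to judge with numbers attached. The sketch below applies the per-GB ranges quoted above to a retained volume; the 2,000 GB workload and the function name are hypothetical, and the prices are the rough planning figures from this section, not vendor quotes.

```python
# Per-GB monthly price ranges quoted in the text (USD).
# Rough planning figures, not vendor pricing.
LOKI_USD_PER_GB = (1.0, 5.0)
ELASTICSEARCH_USD_PER_GB = (20.0, 50.0)


def monthly_cost_range(stored_gb: float, usd_per_gb: tuple[float, float]) -> tuple[float, float]:
    """Return the (low, high) monthly cost for a retained log volume."""
    low, high = usd_per_gb
    return (stored_gb * low, stored_gb * high)


# Hypothetical workload: 2,000 GB of retained logs.
stored_gb = 2000
loki = monthly_cost_range(stored_gb, LOKI_USD_PER_GB)
elastic = monthly_cost_range(stored_gb, ELASTICSEARCH_USD_PER_GB)

print(f"Loki:          ${loki[0]:,.0f} to ${loki[1]:,.0f} per month")
print(f"Elasticsearch: ${elastic[0]:,.0f} to ${elastic[1]:,.0f} per month")
```

Substituting your own retained volume and measured per-GB cost gives a first-order comparison; it ignores query, egress, and staffing costs.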

Elasticsearch strengths

Elasticsearch's strengths are full-text search, ecosystem maturity, and aggregation. Worth the cost when you actually use them; expensive when you do not.

Full-text search. Free-form queries across all log fields. Loki matches only labels efficiently; a content search on Loki scans the log lines in the selected streams, while Elasticsearch answers it from an inverted index without scanning every document.
Mature ecosystem. Kibana, Beats, Logstash, and machine-learning features come with the stack. A long history, a broad community, and a deep integration library cover the long tail of input formats.
Aggregation power. Complex aggregations across structured fields in a single query. Useful for analytics-style queries on log data where the question is "how many" rather than "show me lines matching".
ILM policy. Configure index lifecycle management (ILM) on every cluster. That discipline catches storage explosion before it becomes a budget conversation.
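
To make the ILM item concrete, here is a minimal sketch that builds a policy body with a hot phase that rolls indices over and a delete phase that removes them. The function name and the thresholds (7d, 50gb, 30d) are illustrative assumptions, not recommendations; the resulting JSON is the kind of body sent to the cluster's `_ilm/policy/<name>` endpoint.

```python
import json


def build_ilm_policy(rollover_age: str = "7d",
                     rollover_size: str = "50gb",
                     delete_after: str = "30d") -> dict:
    """Build an ILM policy body: roll over in the hot phase, then delete.

    All thresholds are placeholder values chosen for illustration.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Start a new index once the current one is old or large.
                        "rollover": {
                            "max_age": rollover_age,
                            "max_primary_shard_size": rollover_size,
                        }
                    }
                },
                "delete": {
                    # Age is measured from rollover for rolled-over indices.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


print(json.dumps(build_ilm_policy(), indent=2))
```

A compliance-bound cluster would likely pass a longer `delete_after` and add intermediate phases; this sketch only shows the minimum that stops old indices from accumulating.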

How to decide

The decision is driven by the shape of the workload. Grafana-stack, full-text-heavy, and compliance-bound shops each point to a different answer; the wrong pick produces years of friction.

Kubernetes-heavy, Grafana-stack, cost-sensitive. Pick Loki. The ecosystem is converging here, and the cost model rewards label discipline.
Heavy full-text search. Pick Elasticsearch. Do not migrate to Loki without a clear pain to justify it; full-text on Loki is slow.
Compliance or Kibana-specific needs. Pick Elasticsearch. Some industry-specific tooling and audit workflows assume Elastic and cost weeks to retrofit.
Team-skill match. Weigh the organization's existing operational expertise. This catches the wrong-tool pick when the spec sheet says one thing and the team's reflexes say another.
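
The rules above can be sketched as a small decision helper. Everything here is hypothetical: the `LoggingProfile` fields, the `recommend` function, and especially the precedence (full-text and compliance needs checked first, team expertise used only as a fallback) are one reading of the list, not a definitive procedure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoggingProfile:
    """Hypothetical summary of an organization's logging requirements."""
    kubernetes_heavy: bool = False
    grafana_stack: bool = False
    cost_sensitive: bool = False
    heavy_full_text: bool = False
    compliance_or_kibana_needs: bool = False
    team_expertise: Optional[str] = None  # "loki", "elasticsearch", or None


def recommend(profile: LoggingProfile) -> str:
    """Return "loki", "elasticsearch", or "undecided".

    Assumed precedence: hard requirements (full-text, compliance) first,
    then the Loki-shaped workload, then existing team expertise.
    """
    if profile.heavy_full_text or profile.compliance_or_kibana_needs:
        return "elasticsearch"
    if profile.kubernetes_heavy and profile.grafana_stack and profile.cost_sensitive:
        return "loki"
    if profile.team_expertise in ("loki", "elasticsearch"):
        return profile.team_expertise
    return "undecided"


shop = LoggingProfile(kubernetes_heavy=True, grafana_stack=True, cost_sensitive=True)
print(recommend(shop))
```

The value of writing it down is less the function than the forced ordering: a team has to state which requirement wins when two of them conflict.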

Hybrid approaches

Hybrid is rarely worth it. Most teams should pick one and stick with it; the operational tax of two log stacks usually exceeds the benefits unless the requirements genuinely demand both.

Loki hot, Elasticsearch search. Recent logs go to Loki and full-text search runs on Elasticsearch. The operational burden is heavier: both stacks need backups, upgrades, and on-call coverage.
Operational complexity. Running two tools means two sets of operations. Worth it only when specific requirements (compliance retention plus dev-ergonomic search) genuinely cannot collapse to one tool.
Migration is real work. Query logic, dashboards, and alerting all have to be rewritten. Do not switch without strong justification; the migration tail is months, not weeks.
Named owner. Assign a responsible team to each stack. Operational reviews then have a target instead of splitting blame across the platform org.

Common pitfalls

The pitfalls are predictable: high-cardinality labels in Loki, Elasticsearch without ILM, and cargo-cult migration. Each one shows up reliably in retros from teams that skipped the planning step.

Loki high-cardinality labels. Enforce a no-high-cardinality rule on every cluster. The cost model is defeated the moment a label like user-id or trace-id slips in; such fields belong in the log content, not in labels.
Elasticsearch without ILM. Set an index-lifecycle policy on every cluster. Without one, old indices accumulate, storage explodes, and the cluster gets too expensive to upgrade safely.
Migrating without understanding. The team needs a deep understanding of the tool before committing. Cargo-culting Loki because Grafana said so produces unhappy teams when the queries the SOC actually runs are full-text.
Cost monitor. Keep a storage and ingest cost gauge for each stack. It catches drift before the next budget review surfaces a six-figure surprise.
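
The high-cardinality pitfall is multiplicative, which a quick estimate makes visible. In Loki each distinct combination of label values is its own stream, so the product of per-label value counts bounds the stream count from above. The label counts, the `limit` threshold, and both function names below are made up for illustration.

```python
from math import prod


def max_streams(label_values: dict[str, int]) -> int:
    """Upper bound on streams: product of distinct-value counts per label."""
    return prod(label_values.values())


def risky_labels(label_values: dict[str, int], limit: int = 1000) -> list[str]:
    """Names of labels whose distinct-value count exceeds a chosen limit.

    The default limit is an arbitrary placeholder, not a Loki setting.
    """
    return sorted(name for name, count in label_values.items() if count > limit)


# Hypothetical curated label set: distinct values observed per label.
curated = {"cluster": 3, "namespace": 40, "app": 120}
# The same set after a user_id label slips in.
with_user_id = {**curated, "user_id": 50_000}

print(max_streams(curated))
print(max_streams(with_user_id))
print(risky_labels(with_user_id))
```

Real stream counts are lower than this bound because not every combination occurs, but the comparison shows why one unbounded label outweighs every curated label combined.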