Elasticsearch Tuning

Shards; replicas; refresh.

Overview

Elasticsearch tuning matches shard count, replica count, refresh interval, JVM heap, and index lifecycle settings to the actual workload rather than to the out-of-the-box defaults. The defaults are conservative for safety; a cluster tuned to its workload can run 2-5x more efficiently, though the size of the gain depends on the workload. Most production Elasticsearch incidents trace back to one of four causes: too many small shards, a refresh interval that is too aggressive, a JVM heap sized past the compressed-object-pointer boundary (roughly 32 GB, which is why 30 GB is the usual cap), or missing ILM policies, which let hot data stay hot forever.

The approach

The practical approach has five parts. Size shards to 30-50 GB each: smaller shards add per-shard overhead, while larger ones distribute unevenly across nodes and take longer to recover. Keep at least one replica for resilience and read throughput. Set the refresh interval to 30s for workloads that do not need near-real-time search (the default of 1s is aggressive). Set the JVM heap to half the node's memory, capped at 30 GB so it stays below the compressed-object-pointer boundary. Finally, define ILM policies that move data through the hot, warm, cold, and frozen tiers as it ages.
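
The sizing rules above reduce to a little arithmetic. The following is a minimal sketch, not a definitive tool. The helper names (primary_shard_count, heap_size_gb, index_settings) and the 40 GB target, chosen as the midpoint of the 30-50 GB range, are assumptions made for illustration. The setting keys number_of_shards, number_of_replicas, and refresh_interval are standard Elasticsearch index settings.

```python
import json
import math

TARGET_SHARD_GB = 40  # assumption: midpoint of the 30-50 GB range in the text
HEAP_CAP_GB = 30      # cap from the text, below the compressed-oops boundary


def primary_shard_count(index_size_gb: float,
                        target_shard_gb: float = TARGET_SHARD_GB) -> int:
    """Primary shards needed so each shard lands near the target size."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))


def heap_size_gb(node_memory_gb: float, cap_gb: int = HEAP_CAP_GB) -> int:
    """Half of node memory, rounded down, never above the cap."""
    return max(1, min(cap_gb, int(node_memory_gb // 2)))


def index_settings(index_size_gb: float,
                   replicas: int = 1,
                   refresh_interval: str = "30s") -> dict:
    """Request body for creating an index with the tuned settings."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shard_count(index_size_gb),
                "number_of_replicas": replicas,
                "refresh_interval": refresh_interval,
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical index expected to hold about 600 GB of primary data.
    print(json.dumps(index_settings(600), indent=2))
    # Matching JVM flags for a hypothetical node with 128 GB of memory.
    heap = heap_size_gb(128)
    print(f"-Xms{heap}g -Xmx{heap}g")
```

The shard count has to be chosen when the index is created, because number_of_shards cannot be changed in place afterwards. The replica count and refresh interval can be adjusted on a live index.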

Why this compounds

Elasticsearch tuning compounds across the cluster lifetime. Each tuned setting keeps paying off for as long as the index exists; each ILM policy keeps the cluster manageable as data accumulates; and the team builds search-platform experience that carries over to every new index.

Elasticsearch tuning is an operational discipline that pays off across years. Nova AI Ops integrates with search telemetry, surfaces tuning patterns, and supports the team’s search-platform discipline.