Set Up Elasticsearch
Full-text logs.
Overview
Standing up Elasticsearch (or its open-source fork, OpenSearch) plus Kibana gives a team full-text log search, structured aggregations, and the dashboarding to actually use both. The work that matters is not node count; it is index lifecycle, mapping discipline, and choosing managed versus self-hosted before the first GB lands.
- Full-text log search. Free-text queries with relevance ranking across millions of lines. The investigation tool ops engineers reach for first.
- Aggregations on structured fields. Sums, counts, and percentiles bucketed by service, status code, or region. The summarisation layer Kibana visualisations sit on.
- Kibana for dashboards and Discover. Saved searches, dashboards, alerting. The UI most engineers learn through.
- Index lifecycle plus cluster topology. Hot/warm/cold/delete tiers from day one; deliberate master/data/ingest role split as the cluster grows.
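To make the first two capabilities concrete, here is a minimal sketch of a single `_search` request body that combines a relevance-ranked full-text match with a bucketed aggregation. It only builds and prints the JSON, so it runs without a cluster. The field names (`message`, `@timestamp`, `service.name`, `duration_ms`) and the helper name `build_log_search` are assumptions for illustration, not part of any standard schema.

```python
import json


def build_log_search(text, window="1h", top_services=10):
    """Return a _search request body: full-text match plus aggregations.

    Field names (message, @timestamp, service.name, duration_ms) are
    hypothetical; substitute the fields defined in your own mapping.
    """
    return {
        "size": 20,  # return the 20 best-scoring log lines
        "query": {
            "bool": {
                # Analyzed full-text match; contributes to relevance ranking.
                "must": [{"match": {"message": text}}],
                # Time filter; narrows the result set without affecting score.
                "filter": [{"range": {"@timestamp": {"gte": f"now-{window}"}}}],
            }
        },
        "aggs": {
            # Bucket matching lines by service (assumed to be a keyword field).
            "by_service": {
                "terms": {"field": "service.name", "size": top_services},
                "aggs": {
                    # Latency percentiles inside each service bucket.
                    "latency": {
                        "percentiles": {
                            "field": "duration_ms",
                            "percents": [50, 95, 99],
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_log_search("connection timeout"), indent=2))
```

The same body can be pasted into Kibana Dev Tools or sent to the `_search` endpoint of an index pattern; the `terms` aggregation only works as written if the service field is mapped as a keyword rather than analyzed text.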
The approach
Four habits make Elasticsearch a reliable platform rather than a recurring 3am page: managed hosting when possible, ILM configured before ingest starts, index templates that enforce mapping consistency, and cluster planning backed by health monitoring.
- Managed when possible. Elastic Cloud or Amazon OpenSearch Service. The operational tax of self-hosting is real; pay it only for a clear reason.
- ILM from day one. Hot, warm, cold, delete tiers configured before the first index ships. Retrofitting ILM under load is painful.
- Index templates and consistent mappings. Standard mappings per data source so field types stay stable. Mapping explosions, where unchecked dynamic fields bloat the mapping, are a recurring source of outages.
- Cluster planning plus health monitoring. Master/data/ingest role split for scale; cluster health, JVM, and query latency on the standing dashboard.
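The ILM and index-template habits can be sketched as two request bodies. This is an illustration under stated assumptions, not a recommended production policy: the phase ages, rollover thresholds, the `applogs-*` pattern, the policy name, and the mapped fields are all hypothetical. The bodies follow the Elasticsearch ILM and composable index template formats; OpenSearch uses Index State Management, whose policy format differs.

```python
import json


def build_ilm_policy(warm_after="7d", cold_after="30d", delete_after="90d"):
    """Return an ILM policy body with hot, warm, cold, and delete phases.

    All ages and sizes are placeholder values; tune them to your own
    retention and access patterns.
    """
    return {
        "policy": {
            "phases": {
                # Hot: roll over to a new backing index daily or at 50 GB.
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_age": "1d",
                            "max_primary_shard_size": "50gb",
                        }
                    }
                },
                # Warm: merge segments once the index is no longer written to.
                "warm": {
                    "min_age": warm_after,
                    "actions": {"forcemerge": {"max_num_segments": 1}},
                },
                # Cold: lowest recovery priority after a node restart.
                "cold": {
                    "min_age": cold_after,
                    "actions": {"set_priority": {"priority": 0}},
                },
                # Delete: drop the index when retention expires.
                "delete": {"min_age": delete_after, "actions": {"delete": {}}},
            }
        }
    }


def build_index_template(policy_name, pattern="applogs-*"):
    """Return a composable index template that pins mappings and attaches ILM."""
    return {
        "index_patterns": [pattern],
        "data_stream": {},  # rollover then needs no manually managed alias
        "priority": 200,
        "template": {
            "settings": {"index.lifecycle.name": policy_name},
            "mappings": {
                # Unmapped fields stay in _source but are not indexed, which
                # guards against mapping explosions from stray fields.
                "dynamic": False,
                "properties": {
                    "@timestamp": {"type": "date"},
                    "message": {"type": "text"},
                    "service": {"properties": {"name": {"type": "keyword"}}},
                    "duration_ms": {"type": "long"},
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_ilm_policy(), indent=2))
    print(json.dumps(build_index_template("applogs-policy"), indent=2))
```

On Elasticsearch the first body would go to `PUT _ilm/policy/<name>` and the second to `PUT _index_template/<name>`, both before the first document is ingested. Setting `dynamic` to false is one choice among several; `"strict"` rejects documents with unknown fields instead of silently skipping them.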
Why this compounds
Each indexed source grows the team's investigation surface. Cross-service patterns become visible; mean time to root cause drops; the platform becomes a primary analysis tool rather than just a search box.
- Faster investigation. Full-text search across the whole estate cuts MTTR on log-heavy incidents.
- Cross-service visibility. Aggregations reveal patterns no single service dashboard could show.
- Retention matched to access. ILM keeps hot data fast and cold data affordable. Storage cost stays predictable.
- Year-one investment, year-two habit. The first install is heavy. By year two, onboarding a new log source is a 30-minute task.