Envoy Config Debugging
Envoy debugging via admin endpoint.
Admin
Envoy config debugging means inspecting Envoy's runtime state when service mesh issues arise. Envoy exposes a rich admin interface, and knowing its endpoints speeds up investigation, especially when traffic goes wrong for no obvious reason.
What the admin interface provides:
- curl http://localhost:9901/clusters shows cluster state: The /clusters endpoint lists the upstream clusters Envoy knows about, along with each host's health and current state. It is the first stop in a mesh investigation.
- /config_dump for current config: The /config_dump endpoint returns the entire configuration Envoy is running with, including listeners, routes, and clusters. The team can verify what Envoy is actually configured to do.
- /stats for metrics: The /stats endpoint exposes Envoy's internal metrics, such as connection counts, request counters, and latency histograms. Combined with /clusters, the data shows both the state of each upstream and the traffic it has handled.
- /listeners for inbound config: The /listeners endpoint shows the addresses Envoy is listening on. The team can verify that inbound listeners match expectations.
- Authenticated access: The admin interface is sensitive and has no authentication of its own, so production deployments should keep it bound to localhost or an internal address and put access controls in front of it. Investigation then requires authorization.
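As a rough illustration of the /clusters check, the following Python sketch parses the endpoint's plain-text output and lists hosts whose health flags are anything other than healthy. The sample text, cluster names, and addresses are invented for illustration, and the `cluster::host::stat::value` line layout is an assumption based on typical Envoy output that may differ between versions.

```python
def unhealthy_hosts(clusters_text):
    """Return (cluster, host, flags) tuples for hosts not reported as healthy."""
    problems = []
    for line in clusters_text.splitlines():
        # The cluster name comes first; the rest is host, stat name, and value.
        cluster, sep, rest = line.strip().partition("::")
        if not sep:
            continue
        # Split from the right so a host address containing "::" stays intact.
        parts = rest.rsplit("::", 2)
        if len(parts) != 3:
            continue  # cluster-level lines such as "added_via_api::true"
        host, stat, value = parts
        if stat == "health_flags" and value != "healthy":
            problems.append((cluster, host, value))
    return problems


# Hypothetical sample, shaped like /clusters text output.
SAMPLE = """\
service_a::default_priority::max_connections::1024
service_a::10.0.0.5:8080::cx_active::3
service_a::10.0.0.5:8080::health_flags::healthy
service_a::10.0.0.6:8080::health_flags::/failed_active_hc
service_b::added_via_api::true
"""

for cluster, host, flags in unhealthy_hosts(SAMPLE):
    print(cluster, host, flags)
```

In practice the text would come from the curl command shown in the list rather than from an embedded sample.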
The admin interface is the primary debug tool. Without it, mesh investigation means reading control plane logs and inferring what each proxy is doing; with it, the team queries Envoy directly.
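To make the /stats step concrete, here is a minimal Python sketch that scans the endpoint's text output for counters matching a pattern and keeps only the nonzero ones. The stat names and values in the sample are invented, and the `name: value` layout is assumed from typical output.

```python
import re


def nonzero_counters(stats_text, pattern):
    """Return {name: value} for integer stats whose name matches pattern and value > 0."""
    matcher = re.compile(pattern)
    found = {}
    for line in stats_text.splitlines():
        name, sep, raw = line.partition(": ")
        if not sep or not matcher.search(name):
            continue
        try:
            value = int(raw.strip())
        except ValueError:
            continue  # histogram summaries and text values are not plain integers
        if value > 0:
            found[name] = value
    return found


# Hypothetical sample, shaped like /stats text output.
SAMPLE_STATS = """\
cluster.service_a.upstream_rq_2xx: 1840
cluster.service_a.upstream_rq_5xx: 12
cluster.service_a.upstream_cx_connect_fail: 0
cluster.service_b.upstream_rq_5xx: 0
"""

errors = nonzero_counters(SAMPLE_STATS, r"upstream_rq_5xx|upstream_cx_connect_fail")
print(errors)
```

Filtering for nonzero error counters first keeps the output short enough to read during an incident; the same pattern can be widened once a suspect cluster is identified.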
Logs
Envoy logs are configurable. Per-component log levels let the team get verbose output for specific subsystems without flooding the logs with output from every subsystem.
- envoy --component-log-level connection:debug: The flag sets the connection component to debug level while the other components stay at their configured level.
- Increases verbosity: Debug-level logs show internal decisions such as connection establishment, TLS handshake details, upstream selection, and retries. The detail supports investigation.
- Per-component: Different components handle different concerns. connection, http, router, and filter are separate components, so the team enables debug only on the relevant one.
- Surgical: Investigation focuses on the suspect component, so logs stay readable and the signal-to-noise ratio is preserved.
- Don't leave on: Debug logging produces high volume. After the investigation, return to the default levels so production logs stay manageable.
Per-component logging is what makes verbose output usable. Without it, debug logs are overwhelming.
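The per-component settings can be assembled programmatically. The Python sketch below builds the value for the --component-log-level flag and, as an addition not covered in the text above, a URL for the admin /logging endpoint, which accepts a POST to change a logger's level on a running proxy. The helper names are hypothetical, and the admin address is the localhost example used earlier.

```python
from urllib.parse import urlencode


def component_log_level_arg(levels):
    """Build the --component-log-level value, e.g. 'connection:debug,router:debug'."""
    return ",".join(f"{name}:{level}" for name, level in levels.items())


def logging_url(admin_base, component, level):
    """Build an admin /logging URL for one component; send it as a POST request."""
    return f"{admin_base}/logging?{urlencode({component: level})}"


print(component_log_level_arg({"connection": "debug", "router": "debug"}))
print(logging_url("http://localhost:9901", "connection", "debug"))
```

Once the investigation is over, the same helper can produce the URL that restores the previous level, assuming you know what the default was for your deployment.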
When
Reach for Envoy debug skills when service mesh issues arise. Slow inter-service traffic, mysterious failures, and unexpected routing are all good triggers.
- Service mesh issues: The mesh is supposed to handle service-to-service traffic. When it does not, Envoy is the component carrying that traffic, so investigating it directly speeds up resolution.
- Verify Envoy is doing what you think: The mesh's control plane sends configuration to each Envoy, but what arrives may not match expectations. Checking the proxy directly catches the discrepancy.
- Faster than reading mesh control plane: The control plane has its own logs and dashboards, but sometimes the issue sits between control plane and data plane. Envoy's admin interface cuts through the layers.
- Combine with traces: When traces show a service-to-service call failing or running slowly, Envoy's admin interface shows the state of the clusters and listeners that handled it. Together they give a more complete diagnosis.
- Document common patterns: The team's runbooks should include common Envoy investigation patterns so new engineers can follow them and investigations get faster.
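One runbook-style check that follows from the points above is comparing the clusters you expect against those present in /config_dump. The Python sketch below assumes the JSON layout of a typical dump (a top-level `configs` list containing a `ClustersConfigDump` section); the sample document and cluster names are invented, and field names may vary between Envoy versions.

```python
import json


def cluster_names(config_dump):
    """Collect cluster names from a parsed /config_dump document."""
    names = set()
    for section in config_dump.get("configs", []):
        if not section.get("@type", "").endswith("ClustersConfigDump"):
            continue
        for key in ("static_clusters", "dynamic_active_clusters"):
            for entry in section.get(key, []):
                name = entry.get("cluster", {}).get("name")
                if name:
                    names.add(name)
    return names


def diff_clusters(expected, config_dump):
    """Return (missing, unexpected) cluster names, each sorted."""
    actual = cluster_names(config_dump)
    return sorted(set(expected) - actual), sorted(actual - set(expected))


# Hypothetical sample, shaped like a trimmed /config_dump response.
SAMPLE_DUMP = json.loads("""
{"configs": [
  {"@type": "type.googleapis.com/envoy.admin.v3.ClustersConfigDump",
   "static_clusters": [{"cluster": {"name": "xds_cluster"}}],
   "dynamic_active_clusters": [{"cluster": {"name": "service_a"}}]}
]}
""")

missing, unexpected = diff_clusters(["service_a", "service_b"], SAMPLE_DUMP)
print("missing:", missing)
print("unexpected:", unexpected)
```

A missing cluster suggests the control plane never delivered that configuration, which points the investigation at the gap between control plane and data plane.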
Envoy config debugging is one of those mesh operational skills that pays off in faster incident response. Nova AI Ops integrates with mesh telemetry, surfaces traffic patterns, and complements direct Envoy access with cluster-wide visibility.