Symptoms of a Saturated OTel Collector

Saturated collectors drop telemetry silently. This note covers the symptoms, the metrics to watch, and the mitigations.

Symptoms

Saturated OpenTelemetry collectors fail in three observable ways: receivers drop data at the source, exporter retries climb as the backend falls behind, and memory pressure builds until the process is OOM-killed. The queue-depth gauge catches saturation before drops start, because the exporter queue fills before anything is discarded.
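As a rough illustration, the three failure modes and the queue-depth early warning can be checked against a single snapshot of collector readings. This is a minimal sketch, not a real collector API: the Snapshot fields, the symptom labels, and the 0.8 and 0.85 warning ratios are all hypothetical and would need to be mapped onto whatever your collector version actually exposes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    """One reading of collector health. All field names are hypothetical."""
    refused_per_s: float       # items the receiver refused (drops at the source)
    send_failed_per_s: float   # items the exporter failed to send
    queue_size: int            # current exporter queue depth
    queue_capacity: int        # configured exporter queue capacity
    memory_used_bytes: int
    memory_limit_bytes: int


def symptoms(s: Snapshot, queue_warn: float = 0.8, memory_warn: float = 0.85) -> List[str]:
    """Return the saturation symptoms present in one snapshot.

    The warning ratios are assumed defaults, not recommended values.
    """
    found = []
    # Queue depth is checked first because it is the early warning.
    if s.queue_capacity > 0 and s.queue_size / s.queue_capacity >= queue_warn:
        found.append("queue_filling")
    if s.refused_per_s > 0:
        found.append("receiver_drops")
    if s.send_failed_per_s > 0:
        found.append("exporter_failures")
    if s.memory_limit_bytes > 0 and s.memory_used_bytes / s.memory_limit_bytes >= memory_warn:
        found.append("memory_pressure")
    return found
```

A snapshot with a queue at 85% of capacity and no drops yet reports only "queue_filling", which is the window in which mitigation is still cheap.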

Metrics to alert on

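The alert set is small and specific: sustained drops, a climbing exporter failure rate, and memory and CPU thresholds. Sustained drops mean data loss; a climbing failure rate points to a backend problem or to saturation. Memory and CPU alerts are set to fire before the collector runs out of memory.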
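The words "sustained" and "climbing" only become alertable once they have a concrete definition. One possible reading, sketched here under assumptions: "sustained" means the last few samples all exceed a threshold, and "climbing" means the last few steps are strictly increasing. The function names and window sizes are hypothetical; in practice the same logic would usually live in the alerting system's rule language.

```python
from typing import Sequence


def sustained(samples: Sequence[float], threshold: float, min_consecutive: int) -> bool:
    """True if the most recent min_consecutive samples all exceed threshold."""
    if min_consecutive <= 0 or len(samples) < min_consecutive:
        return False
    return all(v > threshold for v in samples[-min_consecutive:])


def climbing(samples: Sequence[float], min_steps: int) -> bool:
    """True if the most recent min_steps sample-to-sample steps all increase."""
    if min_steps <= 0 or len(samples) < min_steps + 1:
        return False
    tail = samples[-(min_steps + 1):]
    return all(later > earlier for earlier, later in zip(tail, tail[1:]))
```

For example, sustained(drops_per_s, 0, 3) would page on drops seen in three consecutive scrapes while ignoring a single blip, and climbing(failures_per_s, 3) would flag a failure rate that rose on each of the last three scrapes.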

Mitigations

Three mitigation modes cover most saturation events: scale out, tune batching, and drop low-value telemetry. Each has a named command in the runbook.
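Of the three modes, dropping low-value telemetry is the one that most needs an explicit policy, since "low-value" has to be defined somewhere. The sketch below assumes one hypothetical policy: while the collector is under pressure, log records below a severity floor are shed. The record shape, the severity names, and the WARN default are illustrative assumptions, not the runbook's actual rule.

```python
from typing import Dict, Iterable, List

# Hypothetical severity ranking used only for this illustration.
SEVERITY_ORDER = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3}


def shed_low_value(records: Iterable[Dict], under_pressure: bool,
                   min_severity: str = "WARN") -> List[Dict]:
    """Keep everything when healthy; under pressure, keep only records
    at or above min_severity. Records with an unknown or missing
    severity are kept, so the filter never discards what it cannot rank."""
    if not under_pressure:
        return list(records)
    floor = SEVERITY_ORDER[min_severity]
    return [
        r for r in records
        if SEVERITY_ORDER.get(r.get("severity"), floor) >= floor
    ]
```

Making the drop explicit and conditional keeps the loss deliberate and measurable, which is the opposite of the silent drops a saturated collector produces on its own.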

Design for saturation

Saturation handled at design time avoids the 3am page. Right-size collectors, configure auto-scaling, and prefer backpressure over silent drops.
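Right-sizing comes down to simple arithmetic: divide peak ingest by the throughput one replica can sustain at a target utilization, and round up. The following is a back-of-the-envelope sketch; the function name, the 0.7 utilization target, and the two-replica floor are assumptions for illustration, and the per-replica throughput has to come from your own load tests.

```python
import math


def replicas_needed(peak_items_per_s: float,
                    per_replica_items_per_s: float,
                    target_utilization: float = 0.7,
                    min_replicas: int = 2) -> int:
    """Estimate the collector replica count for a given peak load.

    target_utilization leaves headroom so that a burst or the loss of
    one replica does not immediately saturate the rest.
    """
    if per_replica_items_per_s <= 0:
        raise ValueError("per_replica_items_per_s must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    usable = per_replica_items_per_s * target_utilization
    return max(min_replicas, math.ceil(peak_items_per_s / usable))
```

With a peak of 100,000 items per second and replicas that each sustain 20,000, the estimate is 8 replicas at 70% utilization rather than the 5 that a naive division suggests. The same figure is a reasonable starting point for an auto-scaler's upper bound.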