Observability Practical By Samson Tanimawo, PhD Published Apr 28, 2026 4 min read

Symptoms of a Saturated OTel Collector

Saturated collectors drop telemetry silently. The symptoms, the metrics to watch, and the mitigations.

Symptoms

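Dropped batches. Telemetry is lost inside the collector before it reaches the backend; the otelcol_processor_dropped_spans rate is above zero. Data refused at the receiver shows up separately in otelcol_receiver_refused_spans.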

Exporter retry rates climbing. The backend cannot keep up, so data waits in the exporter's sending queue and is dropped once the queue fills.

Memory utilisation at the cap. An OOM kill of the collector is imminent.

Metrics to alert on

otelcol_processor_dropped_spans rate > 0 sustained. Any sustained drop is data loss.

otelcol_exporter_send_failed_spans_total rate increasing. Points to a backend problem or a saturated exporter.

Memory and CPU utilisation. Alert as usage approaches the configured limits, so you can act before an OOM kill.
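The "rate > 0 sustained" condition above can be sketched in a few lines. This is a minimal illustration, not the collector's or any alerting tool's implementation: it assumes the collector's own metrics have already been scraped as Prometheus text, and the helper names (sum_counter, rate_per_second, sustained_drop) and the three-scrape window are hypothetical choices.

```python
import re

# Matches one sample line of the Prometheus text format, e.g.
#   otelcol_processor_dropped_spans{processor="batch"} 42
# Simplification: assumes label values do not contain "}".
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)')


def sum_counter(exposition_text, metric_name):
    """Sum every series of one counter across all of its label sets."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match and match.group(1) == metric_name:
            total += float(match.group(2))
    return total


def rate_per_second(previous, current, interval_seconds):
    """Per-second increase between two scrapes of a counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # A lower value than before means the counter restarted from zero.
    delta = current - previous if current >= previous else current
    return delta / interval_seconds


def sustained_drop(rates, min_consecutive=3):
    """True when the last min_consecutive rates are all above zero."""
    recent = rates[-min_consecutive:]
    return len(recent) == min_consecutive and all(r > 0 for r in recent)
```

Requiring several consecutive non-zero rates keeps a single blip from paging anyone, while still treating any sustained drop as data loss.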

Mitigations

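Scale horizontally: add collector instances behind a load balancer. Use round-robin distribution for stateless pipelines, or sticky distribution when related telemetry must reach the same instance.

Tune batching: larger batches, fewer exports. This reduces per-batch overhead, at the cost of more memory held per batch.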

Drop low-value telemetry first. Lower the head sampling rate temporarily so fewer traces are kept; restore it after recovery.
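Temporarily keeping fewer traces at the head can be sketched as a deterministic decision on the trace ID. This is an illustrative sketch in the spirit of ratio-based samplers, not the exact algorithm of any OpenTelemetry SDK; keep_trace, effective_ratio, and the example ratios are hypothetical.

```python
def keep_trace(trace_id_hex, sample_ratio):
    """Decide from the trace ID alone, so every service makes the same choice."""
    if not 0.0 <= sample_ratio <= 1.0:
        raise ValueError("sample_ratio must be between 0.0 and 1.0")
    # Treat the low 8 bytes of the 16-byte trace ID as a uniform 64-bit number.
    low64 = int(trace_id_hex[-16:], 16)
    return low64 < int(sample_ratio * (1 << 64))


def effective_ratio(normal_ratio, degraded_ratio, collector_saturated):
    """Pick the reduced ratio while the collector is saturated."""
    return degraded_ratio if collector_saturated else normal_ratio


# Example: keep 50% of traces normally, 5% while the collector is saturated.
ratio = effective_ratio(0.5, 0.05, collector_saturated=True)
decision = keep_trace("4bf92f3577b34da6a3ce929d0e0e4736", ratio)
```

Because the decision is a threshold on the trace ID, every trace kept at the reduced ratio would also have been kept at the normal one, so the degraded data is a strict subset rather than a different sample.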

Design for saturation

Right-size collectors based on measured traffic. Default sizing is usually too small for production.

Auto-scale on memory and CPU thresholds. Don't rely on a fixed-size deployment to absorb peak traffic.

Apply backpressure to producers when the collector is saturated. It is better to slow producers than to drop data.
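The backpressure idea can be illustrated with a bounded buffer on the producer side that reports when it is full instead of silently discarding. This is a simplified sketch using only the Python standard library; BoundedTelemetryBuffer and its methods are hypothetical names, not part of any OpenTelemetry SDK.

```python
import queue


class BoundedTelemetryBuffer:
    """Fixed-capacity buffer that signals "slow down" rather than dropping."""

    def __init__(self, max_items):
        self._queue = queue.Queue(maxsize=max_items)

    def offer(self, item, timeout_seconds=0.0):
        """Return True if accepted; False tells the producer to back off."""
        try:
            if timeout_seconds > 0:
                # Block the producer briefly: this wait is the backpressure.
                self._queue.put(item, timeout=timeout_seconds)
            else:
                self._queue.put_nowait(item)
            return True
        except queue.Full:
            return False

    def drain(self, max_batch):
        """Remove up to max_batch items, oldest first, for export."""
        batch = []
        while len(batch) < max_batch:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch
```

A producer that receives False can pause, retry with a delay, or shed its own low-value telemetry, which keeps the decision about what to lose explicit rather than leaving it to an overloaded collector.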