Observability · Intermediate · By Samson Tanimawo, PhD · Published Dec 10, 2026 · 10 min read

Sampling Strategies for Distributed Tracing: Head, Tail, and Adaptive

Tracing every request is expensive; tracing none is useless. The strategy you pick decides what you can and cannot diagnose.

Why sampling is the cost knob

A modern service emits 1-3 spans per request. At 10,000 requests per second, that is 600k-1.8M spans per minute, a storage and network volume you cannot afford to keep in full. Sampling is the lever.
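The arithmetic behind that range, as a back-of-envelope check:

```python
# Span volume at a given request rate (upper end of the 1-3 range).
requests_per_second = 10_000
spans_per_request = 3
spans_per_minute = requests_per_second * spans_per_request * 60
print(f"{spans_per_minute:,} spans/minute")  # 1,800,000 spans/minute
```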

The wrong strategy makes the problem look solved while quietly throwing away the spans you most need.

Head-based sampling

Head-based sampling decides at the start of the trace, when the root span is created: flip a deterministic coin, e.g. keep 1% of traces, and propagate the decision downstream so every service agrees.

Pros: simple; cheap; no buffering, because unsampled traces are dropped at the source.

Cons: the decision is made before you know whether the trace is interesting, so errors and slow outliers are discarded at the same rate as everything else.

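Head-based sampling can be sketched as a pure function of the trace ID, which is how every service that sees the same trace reaches the same keep/drop decision without coordinating (a minimal sketch; `head_sample` is a hypothetical helper, not an OTel API):

```python
import secrets

def head_sample(trace_id: int, rate: float) -> bool:
    """Keep the trace iff its 128-bit ID falls below rate * ID space.

    Deterministic: the same trace ID always yields the same answer,
    so the decision made at the root span holds everywhere.
    """
    return trace_id < int(rate * (1 << 128))

# The decision happens before anything about the trace (errors,
# latency) is known -- that is the core limitation.
trace_id = secrets.randbits(128)
keep = head_sample(trace_id, 0.01)  # ~1% of traces kept
```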
Tail-based sampling

Tail-based sampling waits until the trace finishes, then decides with the whole trace in hand: ‘keep 100% of traces with errors, 100% of traces above p99 latency, 1% of normal traces.’

Pros: keeps what you actually need; storage matches signal.

Cons: requires the OTel Collector or equivalent to buffer all spans for the trace duration; complex configuration; memory pressure on the collector.

Adaptive sampling

Adaptive sampling adjusts the rate dynamically based on traffic. Low traffic → sample more (so you have data); high traffic → sample less (so cost stays bounded).

Pros: bill stays predictable; coverage stays useful.

Cons: harder to reason about ‘what would I see for this trace?’ Comparison across time becomes muddier.
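One common shape for the adjustment is a rate controller recomputed each interval: sample everything while under a fixed span budget, and scale the probability down proportionally above it (a sketch with a hypothetical helper name, not a library API):

```python
def adaptive_rate(observed_per_sec: float, budget_per_sec: float) -> float:
    """Sampling probability that keeps kept-span throughput on budget."""
    if observed_per_sec <= budget_per_sec:
        return 1.0                            # low traffic: sample everything
    return budget_per_sec / observed_per_sec  # high traffic: scale down

adaptive_rate(500, 1_000)      # under budget -> 1.0, keep it all
adaptive_rate(100_000, 1_000)  # 100x over budget -> 0.01
```

The ‘muddier comparison’ con falls straight out of this: a trace observed at noon and one at midnight were kept with different probabilities, so raw trace counts are no longer comparable without recording the rate alongside each trace.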

The pragmatic combo: tail-based sampling with an adaptive baseline rate. It keeps the interesting traces, bounds cost, and absorbs traffic spikes.

Antipatterns

Head-sampling at a rate so low that errors almost never survive: a 0.1% head sample of a 0.01% error rate leaves you debugging from logs.

Per-service sampling decisions that are not propagated with the trace context: each hop flips its own coin, and the backend receives broken, partial traces it cannot stitch together.

Sampling per span instead of per trace, which defeats the point of distributed tracing for the same reason.

What to do this week

Three moves. (1) If you are head-sampling, set up the OTel Collector with tail-based for one service. (2) Add ‘keep all errors’ and ‘keep all p95+ latency’ rules. (3) Watch the storage bill for two weeks; tune the baseline rate to keep it bounded.
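Moves (1) and (2) map onto the `tail_sampling` processor in the opentelemetry-collector-contrib distribution. A minimal sketch; the threshold values are illustrative and the schema should be checked against the processor's current README:

```yaml
processors:
  tail_sampling:
    decision_wait: 10s          # buffer spans this long before deciding
    policies:
      - name: keep-errors
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: keep-slow
        type: latency
        latency: {threshold_ms: 500}   # stand-in for your p95
      - name: baseline
        type: probabilistic
        probabilistic: {sampling_percentage: 1}
```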