p99 and Tail Latency: The Number You Cannot Ignore

Average latency is comforting and wrong. p99 is uncomfortable and right.

Why average lies

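The mean compresses the whole latency distribution into one number, so the slowest requests vanish into it. p99 is the latency that 99% of requests beat, which means 1 request in 100 is slower than it. Two services with the same mean can deliver wildly different user experiences depending on the shape of the tail.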
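To make the point concrete, here is a minimal sketch with made-up numbers. The two sample sets and the percentile helper are illustrative assumptions, not measurements from a real system; the helper uses the nearest-rank definition, which is one of several common ways to compute a percentile.

```python
import math
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds, 100 requests per service.
service_a = [100] * 100              # every request takes 100 ms
service_b = [50] * 95 + [1050] * 5   # mostly fast, with a slow 5% tail

mean_a, mean_b = mean(service_a), mean(service_b)
p99_a, p99_b = percentile(service_a, 99), percentile(service_b, 99)

print(f"service_a: mean={mean_a} ms, p99={p99_a} ms")
print(f"service_b: mean={mean_b} ms, p99={p99_b} ms")
```

Both services report the same mean, yet the second one makes 1 request in 20 wait over a second. Only the percentile shows it.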

Four causes of tail growth

Per-cause mitigation

Each cause has a different remediation. Identify which one is dominant before reaching for a fix; mitigations rarely transfer across causes.

Tail-aware monitoring

You cannot fix what you cannot see. Tail-aware monitoring records distributions, not summaries, and alerts on percentiles directly.
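One way to record a distribution instead of a summary is a fixed-bucket histogram. The sketch below is a simplified illustration under stated assumptions: the class name, the bucket bounds, and the threshold check are all hypothetical, and a production system would normally use the histogram type its metrics library provides.

```python
import bisect


class LatencyHistogram:
    """Fixed-bucket latency histogram (hypothetical helper, not a library API)."""

    def __init__(self, upper_bounds_ms):
        self.bounds = sorted(upper_bounds_ms)
        # One extra slot counts samples above the largest bound.
        self.counts = [0] * (len(self.bounds) + 1)
        self.total = 0

    def record(self, latency_ms):
        # bisect_left finds the first bucket whose upper bound is >= latency_ms.
        self.counts[bisect.bisect_left(self.bounds, latency_ms)] += 1
        self.total += 1

    def percentile_upper_bound(self, p):
        """Upper bound of the bucket that contains the p-th percentile."""
        if self.total == 0:
            return None
        target = p * self.total / 100
        seen = 0
        for i, count in enumerate(self.counts):
            seen += count
            if seen >= target:
                return self.bounds[i] if i < len(self.bounds) else float("inf")
        return float("inf")


def p99_breached(hist, threshold_ms):
    """Alert condition expressed directly on the percentile, not on the mean."""
    p99 = hist.percentile_upper_bound(99)
    return p99 is not None and p99 > threshold_ms
```

The estimate is only as precise as the bucket bounds, so the bounds should be dense around the threshold you intend to alert on.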

Antipatterns

What to do this week

Three moves. (1) Pick your slowest production endpoint and identify the dominant cause of its tail. (2) Measure p99 before and after the fix. (3) Document the win and ship the runbook so the team can reproduce it.
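For step (2), a before/after comparison can be as small as the sketch below. The function names and the result keys are hypothetical; it assumes you have exported raw latency samples in milliseconds for both periods, with at least two samples in each. It uses `statistics.quantiles` from the Python standard library, which interpolates between samples, so its p99 can differ slightly from a nearest-rank calculation.

```python
from statistics import quantiles


def p99(samples_ms):
    # quantiles(n=100) returns 99 cut points; the last one is the 99th percentile.
    return quantiles(samples_ms, n=100)[98]


def compare_p99(before_ms, after_ms):
    """Summarize the p99 change between two sets of latency samples."""
    before, after = p99(before_ms), p99(after_ms)
    return {
        "before_p99_ms": before,
        "after_p99_ms": after,
        "improvement_pct": (before - after) / before * 100,
    }
```

Compare windows of similar length and traffic mix; otherwise a change in load can look like a change in the tail.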