The number of items waiting in a queue; a leading indicator of saturation that typically rises before latency does.
Queue depth is the count of items currently waiting to be processed in a queue. Examples include consumer lag on a Kafka topic, the approximate number of messages in an SQS queue, the length of a database connection pool's acquire queue, and the HTTP request backlog inside a server. Queue depth typically rises before latency does: a growing queue means the producer is outpacing the consumer, and each new item waits behind a longer backlog, which eventually shows up as user-visible slowness. Alerting on queue depth can give you minutes of headroom that latency-only alerting doesn't.
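One way to turn a rising depth into an early alert is to project when the queue would reach a limit if the current growth continued. The following is a minimal sketch, not tied to any particular queue or monitoring system. The function names, the evenly spaced samples, and the max_depth and horizon_s parameters are all assumptions made for illustration.

```python
from typing import Sequence


def depth_growth_rate(depths: Sequence[float], interval_s: float) -> float:
    """Average change in queue depth per second across evenly spaced samples."""
    if len(depths) < 2:
        return 0.0
    elapsed_s = interval_s * (len(depths) - 1)
    return (depths[-1] - depths[0]) / elapsed_s


def should_alert(
    depths: Sequence[float],
    interval_s: float,
    max_depth: float,   # hypothetical limit, e.g. the depth at which latency becomes unacceptable
    horizon_s: float,   # hypothetical lead time needed to scale or shed load
) -> bool:
    """Return True if depth is at the limit or projected to reach it within horizon_s."""
    if not depths:
        return False
    current = depths[-1]
    if current >= max_depth:
        return True
    rate = depth_growth_rate(depths, interval_s)
    if rate <= 0:
        # Queue is flat or draining: the consumer is keeping up.
        return False
    seconds_to_limit = (max_depth - current) / rate
    return seconds_to_limit <= horizon_s


if __name__ == "__main__":
    # Depth sampled once a minute; growing by 60 items per minute.
    samples = [100, 160, 220, 280]
    print(should_alert(samples, interval_s=60, max_depth=1000, horizon_s=900))
```

A linear projection like this is deliberately crude. In practice the samples would come from whatever depth metric the queue exposes, and the limit and horizon would be tuned to how long scaling or load shedding takes to take effect.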
Most saturation incidents follow a recognizable shape: queue depth grows slowly for several minutes, then latency climbs, then errors start. Teams that alert on queue depth catch the incident in the first phase and scale or shed load before customers notice. Teams that alert only on latency catch it in the second phase at the earliest, when customers are already affected, and spend much of their recovery time draining the backlog.
See the part of the platform that handles queue depth in production.