Connection Bottlenecks
Pool exhaustion.
Identifying connection bottlenecks
Connection bottlenecks show up as application latency, not service latency. The discipline is recognising the symptoms before reaching for the wrong fix; "the app is slow" can mean the app is queueing for connections, not that the downstream is slow.
- Latency spike without service issue. Unexplained application latency rise; pool waits show up at the app layer, not the service layer.
- Pool wait time metric above zero. Application queueing for connections; the metric tells you the pool is the bottleneck before logs do.
- Connection refused errors. Either the downstream is refusing new connections because it has hit its maximum, or short-lived sockets are churning on the client; which of the two it is decides the fix.
- Saturation alarm per app. Pool-utilisation alarm catches it before users do; without the alarm, the page comes from the customer.
Connection pool sizing
Pool sizing is its own discipline. Defaults are usually wrong, idle-in-transaction is a trap, and the math depends on worker model and downstream capacity together.
- Sizing formula. Per-process pool size is the downstream's total connection budget divided by the number of app processes; allocate conservatively and monitor how close you run to the ceiling.
- Defaults are usually wrong. Postgres' default max_connections of 100 is too low for many production workloads; put pgbouncer in front to multiplex, and right-size based on real concurrency.
- Idle in transaction. Long-running transactions hold connections; pool exhaustion follows. Watch pg_stat_activity for the pattern.
- Timeout config per pool. Acquire and statement timeouts; without them, stuck queries take the pool down with them.
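The sizing formula above can be sketched as plain arithmetic. This is a minimal illustration, assuming the downstream has a fixed connection ceiling, that a few connections are held back for admin or migration use, and that every app process runs its own pool. The function name and the reserved parameter are hypothetical, not taken from any library.

```python
def per_process_pool_size(downstream_max: int, reserved: int, app_processes: int) -> int:
    """Largest pool each app process can hold without exceeding the downstream ceiling."""
    if app_processes <= 0:
        raise ValueError("app_processes must be positive")
    budget = downstream_max - reserved
    if budget < app_processes:
        raise ValueError("not enough downstream connections for one per process")
    # Floor division: rounding up would let the fleet exceed the ceiling.
    return budget // app_processes


# Example: a 100-connection ceiling, 10 reserved, 12 app processes.
size = per_process_pool_size(downstream_max=100, reserved=10, app_processes=12)
print(size)  # 7, so the fleet holds at most 84 connections
```

Rounding down leaves a little of the budget unused, which is the conservative side to err on when the app tier autoscales.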
TCP-level bottlenecks
Below the application sits TCP. File descriptors, time-wait state, and ephemeral ports all bottleneck silently and produce confusing symptoms when they hit their limits.
- OS file-descriptor limits. Default ulimit -n of 1024 is too low for production; raise to 65536 or higher on busy hosts.
- Time-Wait accumulation. Short-lived socket churn fills the time-wait pool; tune tcp_tw_reuse carefully if the workload pattern matches.
- Ephemeral port range. 32768-60999 default caps simultaneous connections to a single destination; raise the range or use connection pooling.
- somaxconn. Listen-queue depth; the default of 128 on Linux kernels before 5.4 (4096 since) produces silent accept-backlog drops on busy servers.
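A quick way to see where a host stands against the limits above is to read them from inside a process. The following is a sketch, assuming a Unix host for the resource module and Linux for the /proc/sys paths; the helper names are illustrative, and on other systems the sysctl reads simply return None.

```python
import resource
from pathlib import Path
from typing import Optional, Tuple


def fd_limits() -> Tuple[int, int]:
    """Return the (soft, hard) open-file limits for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def port_range_size(raw: str) -> int:
    """Count the ports in a 'low high' string as stored in ip_local_port_range."""
    low, high = (int(part) for part in raw.split())
    return high - low + 1


def read_sysctl(path: str) -> Optional[str]:
    """Read a /proc/sys value; return None where the file is not available."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"open files: soft={soft} hard={hard}")
    raw = read_sysctl("/proc/sys/net/ipv4/ip_local_port_range")
    if raw is not None:
        print(f"ephemeral ports: {port_range_size(raw)}")
    print(f"somaxconn: {read_sysctl('/proc/sys/net/core/somaxconn')}")
```

With the 32768-60999 range, port_range_size returns 28232, which is the practical ceiling on simultaneous outbound connections from one source address to one destination address and port.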
Multiplexing patterns
Multiplexing replaces N connections with 1 by sharing the underlying socket. HTTP/2, gRPC, and database poolers all use the pattern; reaching for it before raising connection limits is usually the right call.
- HTTP/2 multiplexing. Many concurrent requests over a single connection; saves connection overhead at scale.
- gRPC over HTTP/2. Connection count collapses from hundreds to tens; per-service connection budget shrinks dramatically.
- Database pooler. pgbouncer or ProxySQL multiplexes client connections to fewer database connections; standard pattern for database-heavy apps.
- Transaction-mode choice per pool. Transaction versus session mode; transaction mode multiplexes harder but breaks session-scoped features.
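To make the transaction-mode idea concrete, here is a toy model and not a real pooler: many clients share a small fixed set of backend slots, and a slot is held only while a transaction runs. The class and function names are hypothetical, and asyncio.sleep stands in for query time; it is meant only to show why client count and backend connection count can diverge.

```python
import asyncio


class BackendPool:
    """Toy pooler: a fixed number of backend slots shared by many clients."""

    def __init__(self, backends: int) -> None:
        self._slots = asyncio.Semaphore(backends)
        self.in_use = 0
        self.peak = 0

    async def transaction(self, work_seconds: float) -> None:
        # A backend slot is held only for the duration of one transaction.
        async with self._slots:
            self.in_use += 1
            self.peak = max(self.peak, self.in_use)
            try:
                await asyncio.sleep(work_seconds)  # stands in for the queries
            finally:
                self.in_use -= 1


async def client(pool: BackendPool, transactions: int) -> None:
    for _ in range(transactions):
        await pool.transaction(0.001)
        await asyncio.sleep(0.002)  # idle think time; no backend is held


async def run(clients: int, backends: int) -> int:
    pool = BackendPool(backends)
    await asyncio.gather(*(client(pool, 3) for _ in range(clients)))
    return pool.peak


if __name__ == "__main__":
    peak = asyncio.run(run(clients=100, backends=5))
    print(f"100 clients served with at most {peak} backend connections")
```

In session mode the slot would be held for the whole lifetime of client(), including the idle time, so the same 100 clients would need 100 backends. That is the trade: transaction mode shares harder, but anything scoped to a session cannot survive the hand-off between transactions.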
Monitoring connection state
Monitoring closes the loop. Pool utilisation, acquire latency, per-destination count, and eviction count together tell you the connection layer's actual health.
- Pool utilisation. In-use versus pool-size ratio per app; above 80 percent sustained means the pool is undersized.
- Acquire latency. Connection-wait timer per app; time the application spends waiting before getting a connection.
- Per-destination connection count. Connection share per downstream; detects when one dependency is hogging connections.
- Eviction count per pool. Kicked-out connections per pool; the metric catches connection-leak patterns before they exhaust the pool.
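The first two metrics above can be captured at the point where the application borrows a connection. The following is a minimal sketch, assuming a fixed-size pool built on the standard library queue; InstrumentedPool and its attributes are illustrative names, and a real pool library would normally expose equivalent counters itself.

```python
import queue
import threading
import time
from contextlib import contextmanager


class InstrumentedPool:
    """Fixed-size pool that records utilisation and acquire latency."""

    def __init__(self, connections, acquire_timeout: float = 1.0) -> None:
        self._idle = queue.Queue()
        for conn in connections:
            self._idle.put(conn)
        self.size = self._idle.qsize()
        if self.size == 0:
            raise ValueError("pool needs at least one connection")
        self.acquire_timeout = acquire_timeout
        self.in_use = 0
        self.wait_seconds = []  # one sample per successful acquire
        self._lock = threading.Lock()

    def utilisation(self) -> float:
        """In-use versus pool-size ratio, between 0.0 and 1.0."""
        return self.in_use / self.size

    @contextmanager
    def connection(self):
        start = time.monotonic()
        # Raises queue.Empty if nothing frees up within the acquire timeout.
        conn = self._idle.get(timeout=self.acquire_timeout)
        with self._lock:
            self.wait_seconds.append(time.monotonic() - start)
            self.in_use += 1
        try:
            yield conn
        finally:
            with self._lock:
                self.in_use -= 1
            self._idle.put(conn)


pool = InstrumentedPool(["conn-a", "conn-b"])
with pool.connection():
    print(pool.utilisation())  # 0.5
```

Exporting utilisation() as a gauge and wait_seconds as a histogram gives the two signals that show pool pressure before errors do; the acquire timeout turns an indefinite queue into a visible failure.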