TCP vs UDP: When Each Wins, in Plain Terms
TCP and UDP are not interchangeable. The choice is workload-shaped; the consequences are operational.
What each guarantees
TCP and UDP make opposite trade-offs. Pick by what your workload actually needs the network to do, not by habit.
- TCP delivery. Ordered, reliable, retransmitted on loss; the kernel handles it.
- TCP control. Congestion control adapts to network conditions; fairness with other flows.
- UDP delivery. Best-effort fire-and-forget; if you need ordering or reliability, the application implements it.
- UDP cost. Lower per-packet overhead, no connection state; the price is that loss is the application's problem.
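To make the contrast concrete, here is a minimal sketch over loopback, assuming Python's standard socket module. The function names udp_roundtrip and tcp_roundtrip are illustrative, not from any library. Loopback rarely drops packets, so this shows the difference in setup and state, not loss behavior.

```python
import socket
import threading

def udp_roundtrip(payload):
    """Send one datagram over loopback: no handshake, no connection state."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        rx.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        rx.settimeout(2.0)          # UDP never reports loss; the app must time out
        tx.sendto(payload, rx.getsockname())
        data, _addr = rx.recvfrom(65535)
        return data

def tcp_roundtrip(chunks):
    """Send chunks over a loopback TCP connection; the kernel orders and retransmits."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 0))
        srv.listen(1)
        received = bytearray()

        def serve():
            conn, _ = srv.accept()          # completes the three-way handshake
            with conn:
                while True:
                    buf = conn.recv(4096)
                    if not buf:             # empty read: the peer closed the stream
                        break
                    received.extend(buf)

        worker = threading.Thread(target=serve)
        worker.start()
        with socket.create_connection(srv.getsockname()) as cli:
            for chunk in chunks:
                cli.sendall(chunk)
        worker.join(timeout=5)
        return bytes(received)
```

The TCP side returns one ordered byte stream with no message boundaries. The UDP side returns exactly the datagram that was sent, or times out if it never arrives.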
When TCP wins
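- Correctness-critical data. Anything where a lost or reordered byte produces a wrong outcome.
- Most application traffic. APIs and database connections expect an ordered, reliable byte stream.
- HTTP/1.1 and HTTP/2. Both run over TCP.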
When UDP wins
UDP wins when retransmission is worse than loss, when state is small, or when the application can do reliability better than the kernel.
- Real-time media. Voice and video; occasional packet loss is tolerable, retransmitted late audio is worse than no audio.
- Game state. Latest position matters more than every position; old packets are garbage on arrival.
- DNS. Small exchanges that usually fit in one packet each way; UDP skips the TCP handshake, and DNS falls back to TCP when a response is too large.
- Custom reliability. Protocols like QUIC implement reliability in user space, where it can be tuned to the application and updated without waiting on the kernel TCP stack.
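The game-state case can be sketched as a receiver that keeps only the newest update and discards anything older. This is a minimal illustration: StateUpdate and LatestStateReceiver are hypothetical names, and sequence-number wraparound is deliberately not handled.

```python
from dataclasses import dataclass

@dataclass
class StateUpdate:
    seq: int      # sender-assigned sequence number (assumed to increase monotonically)
    x: float
    y: float

class LatestStateReceiver:
    """Keeps the newest update; late or duplicate datagrams are discarded."""

    def __init__(self):
        self.latest = None
        self.dropped = 0

    def on_datagram(self, update):
        """Return True if the update was applied, False if it was stale."""
        if self.latest is not None and update.seq <= self.latest.seq:
            self.dropped += 1       # old packet: garbage on arrival
            return False
        self.latest = update
        return True

receiver = LatestStateReceiver()
for seq in (1, 3, 2, 4):            # update 2 arrives after update 3
    receiver.on_datagram(StateUpdate(seq, x=float(seq), y=0.0))
```

Nothing is retransmitted here. A lost update is simply superseded by the next one, which is the trade-off that makes UDP a good fit.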
HTTP/3 case study
HTTP/3 on QUIC is the canonical example of UDP done right. The application owns reliability and gains capabilities TCP cannot offer.
- Built on QUIC. Custom reliability and congestion control implemented in user space over UDP.
- No head-of-line blocking. Independent streams within a connection do not stall each other on packet loss.
- Lossy network advantage. Mobile and Wi-Fi networks lose packets often; that is where HTTP/3's advantage over HTTP/2 tends to be largest.
- Connection migration. Connection IDs survive IP changes; mobile clients keep the connection across network switches.
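Head-of-line blocking is easy to see in a toy model. The sketch below is a simplified simulation, not QUIC itself. Packets are (stream, data) pairs in send order, and lost holds the indexes that have not arrived yet. Both function names are hypothetical.

```python
def deliverable_single_stream(packets, lost):
    """One ordered byte stream (TCP-like): delivery stops at the first gap."""
    out = []
    for index, (stream, data) in enumerate(packets):
        if index in lost:
            break                   # everything behind the gap waits for the retransmit
        out.append((stream, data))
    return out

def deliverable_per_stream(packets, lost):
    """Independent ordered streams (QUIC-like): a gap blocks only its own stream."""
    blocked = set()
    out = []
    for index, (stream, data) in enumerate(packets):
        if index in lost:
            blocked.add(stream)     # this stream must wait for its missing data
        elif stream not in blocked:
            out.append((stream, data))
    return out

packets = [("A", "a0"), ("B", "b0"), ("A", "a1"), ("B", "b1")]
lost = {1}                          # the packet carrying b0 is lost

single = deliverable_single_stream(packets, lost)
multi = deliverable_per_stream(packets, lost)
```

With one lost packet on stream B, the single ordered stream can hand only a0 to the application, while the per-stream model still delivers a0 and a1.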
Antipatterns
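- UDP for ‘speed’ without handling loss. Dropped, duplicated, and reordered packets surface as bugs.
- TCP for fire-and-forget telemetry. Handshakes and retransmission are overkill for data you would not miss.
- Mixing both for the same flow. Two sets of delivery and ordering semantics on one flow cause confusion.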
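For the first antipattern, handling loss at minimum means acknowledging, retrying, and giving up explicitly. The sketch below shows a stop-and-wait sender over a simulated path. LossyChannel is a hypothetical stand-in for a UDP socket with a scripted drop pattern. Lost ACKs, duplicates, and real timeouts are not modelled.

```python
class LossyChannel:
    """Hypothetical stand-in for a UDP path: drops the sends whose index is in drop_on."""

    def __init__(self, drop_on):
        self.drop_on = set(drop_on)
        self.sends = 0
        self.delivered = []

    def send(self, seq, payload):
        """Return True if an ACK came back, False if the datagram was lost."""
        attempt = self.sends
        self.sends += 1
        if attempt in self.drop_on:
            return False            # lost: the sender hears nothing
        self.delivered.append((seq, payload))
        return True

def send_reliably(channel, messages, max_retries=3):
    """Stop-and-wait: resend each message until it is ACKed or retries run out."""
    for seq, payload in enumerate(messages):
        for _ in range(1 + max_retries):
            if channel.send(seq, payload):
                break
        else:
            # Never ACKed: surface the failure instead of silently dropping data
            raise TimeoutError(f"message {seq} was not acknowledged")

channel = LossyChannel(drop_on={0, 2})
send_reliably(channel, ["a", "b"])
```

Each message here needs two sends because its first attempt is dropped. A real implementation would also need timers, duplicate suppression, and congestion control, and that is the work TCP or QUIC already does.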
What to do this week
Three moves. (1) Check your highest-risk network path against the criteria above: is loss a wrong outcome, or is retransmission worse than loss? (2) Measure the failure rate on that path before and after any change. (3) Document the change so the next incident responder inherits the knowledge.