Kafka Throughput Tuning
Producer/consumer settings.
Overview
Kafka throughput tuning matches producer, consumer, and broker settings to the workload. Default settings are conservative; workload-specific tuning is what unlocks the throughput Kafka can actually deliver.
- Producer/consumer settings. batch.size, linger.ms, fetch.min.bytes, max.poll.records; the four levers.
- Producer batching. Larger batches produce higher throughput; the trade is latency for throughput.
- Consumer fetch sizing. Larger fetches reduce per-fetch overhead; size them to match the consumer's processing capacity.
- Broker partitions plus compression. More partitions support more parallel consumers; per-topic compression cuts wire bytes.
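The batching and partition points above can be made concrete with some rough arithmetic. The sketch below is a simplification under stated assumptions: it treats every batch as one full request to a single partition, and the record rate, record size, and group size are illustrative numbers, not measurements.

```python
import math


def requests_per_second(records_per_sec: int, record_bytes: int, batch_bytes: int) -> int:
    """Requests needed per second if every batch fills to batch_bytes (assumption)."""
    records_per_batch = max(1, batch_bytes // record_bytes)
    return math.ceil(records_per_sec / records_per_batch)


def active_consumers(consumers_in_group: int, partitions: int) -> int:
    """Within one consumer group, each partition is read by at most one consumer."""
    return min(consumers_in_group, partitions)


# Illustrative workload: 50,000 records/s of 512-byte records.
small = requests_per_second(50_000, 512, 512)        # one record per request
large = requests_per_second(50_000, 512, 32 * 1024)  # 64 records per request
print(small, large)             # 50000 782
print(active_consumers(12, 8))  # 8: four of the twelve consumers sit idle
```

Fewer, fuller requests is where the throughput gain comes from; the cost is that records wait for a batch to fill. The second function shows why the partition count caps consumer parallelism.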
The approach
The practical approach: profile first to find the bottleneck, batch aggressively on the producer, size fetches on the consumer, compress per topic. The goal is tuning matched to each workload, not blanket settings applied everywhere.
- Profile first. Identify producer or consumer bottleneck; the data tells you which lever to pull.
- Producer batching. linger.ms=10-100, batch.size=32KB+; trades 10-100 ms of latency for a multiple of the throughput.
- Consumer fetch sizing. fetch.min.bytes, max.poll.records; match consumer capacity to the upstream rate.
- Compression. lz4 or zstd; cuts wire bytes by 60-80% on text payloads at low CPU cost.
- Document the tuning. Per-topic rationale committed to the repo; supports operational reviews and re-tuning.
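One possible way to keep these settings and their rationale together in the repo is sketched below. The property names are the Kafka client settings named in the list; the helper functions, the topic name orders.events, the rationale text, and the chosen values are hypothetical and only illustrate the suggested ranges.

```python
ALLOWED_CODECS = {"lz4", "zstd"}  # the two codecs suggested in the list above


def producer_overrides(linger_ms: int, batch_kb: int, codec: str) -> dict:
    """Build producer settings, rejecting values outside the suggested ranges."""
    if not 10 <= linger_ms <= 100:
        raise ValueError("linger.ms is outside the suggested 10-100 ms range")
    if batch_kb < 32:
        raise ValueError("batch.size is below the suggested 32 KB floor")
    if codec not in ALLOWED_CODECS:
        raise ValueError(f"codec must be one of {sorted(ALLOWED_CODECS)}")
    return {
        "linger.ms": linger_ms,
        "batch.size": batch_kb * 1024,  # batch.size is expressed in bytes
        "compression.type": codec,
    }


def consumer_overrides(fetch_min_bytes: int, max_poll_records: int) -> dict:
    """Build consumer settings; pick values from what profiling shows."""
    return {"fetch.min.bytes": fetch_min_bytes, "max.poll.records": max_poll_records}


def render_properties(topic: str, rationale: str, settings: dict) -> str:
    """Render settings as properties text with the rationale as a comment header."""
    lines = [f"# topic: {topic}", f"# rationale: {rationale}"]
    lines += [f"{key}={value}" for key, value in sorted(settings.items())]
    return "\n".join(lines) + "\n"


# Hypothetical topic and rationale, for illustration only.
producer = producer_overrides(linger_ms=20, batch_kb=64, codec="lz4")
print(render_properties("orders.events", "profiling showed a producer bottleneck", producer))
```

Committing the rendered text next to the service code gives reviewers the reason for each value, which is what makes later re-tuning cheaper.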
Why this compounds
Kafka tuning discipline compounds across topics. Each tuned topic delivers an ongoing throughput gain, the team’s streaming expertise accrues, and new topics inherit that experience.
- Better throughput. The right tuning supports the load; the cluster runs at its design capacity, not at its defaults.
- Better cost efficiency. Higher throughput per broker means fewer brokers; the bill drops as tuning matures.
- Better resilience. Compression reduces bandwidth; the cluster handles spikes the untuned version would not.
- Institutional knowledge. Each tuning teaches Kafka patterns; the team’s event-driven engineering muscle grows.
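The cost-efficiency point is simple arithmetic, sketched below. Every number here is an illustrative assumption rather than a benchmark, and the model ignores replication traffic, storage, and partition placement; it only shows how higher per-broker throughput reduces the broker count.

```python
import math


def brokers_needed(target_mb_s: float, per_broker_mb_s: float,
                   target_utilization: float = 0.7) -> int:
    """Brokers required to carry target_mb_s while leaving headroom for spikes."""
    usable_mb_s = per_broker_mb_s * target_utilization
    return math.ceil(target_mb_s / usable_mb_s)


# Hypothetical cluster: 600 MB/s of ingress, before and after tuning.
untuned = brokers_needed(600, per_broker_mb_s=40)
tuned = brokers_needed(600, per_broker_mb_s=120)
print(untuned, tuned)  # 22 8
```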
Kafka tuning is an operational discipline that pays off over years. Nova AI Ops integrates with streaming telemetry, surfaces patterns, and supports the team’s event-driven discipline.