Kubernetes
Practical
By Samson Tanimawo, PhD
Published Apr 17, 2026
4 min read
HPA Tuning for Real Workloads
Default HPA settings are conservative. The tuning that catches bursts.
Live workflow · 3 working · 1 queuedLive
Signal · gather Working
Decide · pick action Working
Apply · with verify Working
Learn · update playbook Queued
Metrics
CPU is default. Custom metrics for actual load (RPS, queue depth).
Latency-based HPA for user-facing services.
Thresholds
Scale-up at 60-70% utilization. Scale-down delay 5 min minimum.
Conservative on scale-down; eager on scale-up.
Avoid
Aggressive scale-down. Flapping wastes capacity.
Stabilization windows: tune by service.