QPS (Queries Per Second)

The throughput rate of requests against a service, the headline traffic metric in capacity planning and rate-limit design.

Definition

QPS (Queries Per Second), sometimes called RPS (Requests Per Second) or TPS (Transactions Per Second), is the rate at which a service handles incoming work. It's the headline traffic dimension in load testing, capacity planning, and rate-limit design. A rating of 10K QPS at a p99 latency of 100 ms means the service can sustain that rate without breaching its latency SLO. If latency starts to climb at 12K QPS, that rate marks the saturation point.
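
As a rough illustration of the definition, the sketch below computes an average QPS over a measurement window and checks a batch of latency samples against a p99 SLO. The function names (average_qps, percentile, meets_slo) and the nearest-rank percentile method are assumptions made for this example, not part of any particular monitoring tool.

```python
import math


def average_qps(request_count, window_seconds):
    """Average requests handled per second over a measurement window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return request_count / window_seconds


def percentile(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_slo(latencies_ms, slo_ms, pct=99):
    """True if the chosen latency percentile is within the SLO."""
    return percentile(latencies_ms, pct) <= slo_ms
```

For example, 600,000 requests handled in a 60-second window average out to 10,000 QPS. That figure only counts as a rating if meets_slo also holds for the latencies recorded in the same window.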

Why it matters

QPS is meaningful only at a specified latency SLO: 'We can do 100K QPS' says nothing without 'at p99 200ms.' Most capacity-planning errors come from mistaking the burst maximum (the rate the service touched once for ten seconds) for the sustainable maximum (the rate it can hold without breaking the SLO). Load test for the sustainable maximum, then provision against it with margin.
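
One way to turn this into numbers is sketched below: take the p99 latency measured at each sustained load-test step, pick the highest rate that stayed within the SLO, and size the fleet with headroom. The step data, the 30% default headroom, and the function names are hypothetical values chosen for illustration; real margins depend on the service.

```python
import math


def sustainable_max_qps(steps, slo_ms):
    """Highest tested rate that met the SLO, along with every lower tested rate.

    steps: iterable of (offered_qps, p99_ms) pairs, one per sustained stage.
    Returns None if even the lowest stage broke the SLO.
    """
    best = None
    for offered_qps, p99_ms in sorted(steps):
        if p99_ms > slo_ms:
            break  # past saturation; higher rates are not sustainable
        best = offered_qps
    return best


def provisioned_instances(peak_qps, per_instance_qps, headroom=0.3):
    """Instances needed when each runs at (1 - headroom) of its sustainable max."""
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_qps = per_instance_qps * (1 - headroom)
    return math.ceil(peak_qps / usable_qps)


# Hypothetical load-test results: (offered QPS, measured p99 in ms).
steps = [(8_000, 60), (10_000, 95), (12_000, 180), (14_000, 400)]
per_instance = sustainable_max_qps(steps, slo_ms=100)
```

With these made-up results the sustainable maximum is 10,000 QPS per instance, even though the service briefly handled 14,000. Planning against the 14,000 figure is the burst-maximum mistake described above.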

How Nova handles it

See the part of the platform that handles QPS (Queries Per Second) in production.

Nova capacity analysis