Database Load Testing: Realistic Patterns
Load tests that don’t match production traffic are theatre. Realistic test design matters more than the tool.
Why synthetic tests miss
Naive load tests pass and naive load tests are wrong. Production traffic has shape that uniform synthetic loops never reproduce.
- Uniform queries. Synthetic loops hit a small set of identical queries; production has hundreds of shapes.
- Data skew. Production has hot keys and cold keys; uniform random data hides hot-key contention entirely.
- Transactional shape. Production has multi-statement transactions; tests that skip them miss lock contention.
- Verdict. Without realistic shape, tests pass; production fails on the first traffic spike.
Four-component realistic test
- 1. Mixed query shapes (read/write ratio).
- 2. Realistic data distribution (skew matches production).
- 3. Concurrent transactions (with retries).
- 4. Failure injection (network blips; one node down).
Production-like data shape
The data shape decides whether the test reproduces production failure modes. Volume without shape is noise.
- Anonymised production dump. Real distributions, real index hot spots; the gold standard for test data.
- Synthetic with skew. Generate data with Pareto or Zipf distributions to mimic real-world key access patterns.
- Volume alone fails. 10x more rows with uniform distribution misses every real production bottleneck.
- Refresh cadence. Anonymised dump regenerated quarterly so distributions stay current.
Test cadence
Load testing only counts when it runs predictably. Two cadences cover the cost-effective ground.
- Quarterly full test. Full load profile against the next-major release; surfaces architectural regressions.
- Pre-deploy smoke. Smaller load test on major changes; catches the worst regressions before prod.
- Continuous baseline. Lightweight nightly run trends performance over time; alerts on trend break.
- Failure-mode tests. Quarterly chaos plus load combined; the failure shape under load is what matters in real incidents.
Antipatterns
- Uniform synthetic data. Misses hot-key issues.
- Read-only tests. Misses contention.
- Test in dev DB. Wrong scale; wrong data.
What to do this week
Three moves. (1) Apply this pattern to your most-loaded table. (2) Measure query latency / write throughput before/after. (3) Document the win and the constraint so the next refactor inherits the knowledge.