Database Cost Engineering: Rules of Thumb
Database costs sneak up. Five rules of thumb cap them without sacrificing performance.
Why DB costs sneak up
Database costs grow silently across multiple dimensions. Each one looks small in isolation; together they compound into a cost story nobody planned.
- Storage growth. Tables grow with usage; indexes grow with tables; backups grow with both.
- Instance class drift. Provisioned size creeps up over years; rarely reviewed once stable.
- Backup accumulation. Retention policy unset or too generous; old snapshots accrue indefinitely.
- Replica pile-on. Each new requirement adds a replica; nobody decommissions the old ones.
Five rules of thumb
1. Right-size instance class quarterly.
2. Tier old data to cheaper storage.
3. Prune unused indexes.
4. Cap backup retention.
5. Audit replicas for actual usage.
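As an illustration of rule 4, here is a minimal sketch of a retention cap. It assumes snapshot metadata is available as a list of dicts with hypothetical `id` and `created_at` fields; the function name `snapshots_to_delete` and the 35-day window are illustrative, not taken from any particular cloud API.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_delete(snapshots, retention_days, now=None):
    """Return the IDs of snapshots created before the retention cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [s["id"] for s in snapshots if s["created_at"] < cutoff]


# Hypothetical snapshot inventory; real data would come from your provider.
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
snapshots = [
    {"id": "snap-a", "created_at": datetime(2024, 1, 15, tzinfo=timezone.utc)},
    {"id": "snap-b", "created_at": datetime(2024, 6, 1, tzinfo=timezone.utc)},
    {"id": "snap-c", "created_at": datetime(2024, 6, 29, tzinfo=timezone.utc)},
]

expired = snapshots_to_delete(snapshots, retention_days=35, now=now)
print(expired)  # ['snap-a']
```

The sketch only lists candidates. Deleting them is deliberately left as a separate, reviewed step.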
Per-rule savings
Each rule produces measurable savings. Run them in order of impact; the first two recover most of the available savings.
- Right-size. 20-40% saving on instance cost; the highest-impact lever for most teams.
- Tier old data. 50-80% saving on archived data; cold storage costs a fraction of the hot-storage price.
- Index prune. 5-15% storage saving plus write throughput improvement; a double win.
- Backup retention + replica audit. 30-50% saving on backup spend and 10-30% on replica spend; smaller but recurring.
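To see how these ranges might add up, the sketch below applies them to a made-up monthly bill. The spend categories and dollar amounts are hypothetical; only the percentage ranges come from the list above. Storage is split into an archivable part and a remaining hot part so the tiering and index-pruning ranges are not counted twice.

```python
# Saving ranges from the list above, as (low, high) whole percentages.
SAVING_RANGES = {
    "instance": (20, 40),            # right-sizing
    "archivable_storage": (50, 80),  # tiering old data
    "hot_storage": (5, 15),          # index pruning
    "backup": (30, 50),              # retention cap
    "replica": (10, 30),             # replica audit
}


def estimate_savings(monthly_spend):
    """Return (low, high) estimated monthly savings for a spend breakdown."""
    low = sum(monthly_spend.get(k, 0) * lo / 100 for k, (lo, _) in SAVING_RANGES.items())
    high = sum(monthly_spend.get(k, 0) * hi / 100 for k, (_, hi) in SAVING_RANGES.items())
    return low, high


# Hypothetical monthly bill in dollars (total: 2000).
bill = {
    "instance": 1000,
    "archivable_storage": 200,
    "hot_storage": 300,
    "backup": 100,
    "replica": 400,
}

low, high = estimate_savings(bill)
print(low, high)  # 385.0 775.0
```

In this invented breakdown, right-sizing and tiering account for 300 of the 385 low-end estimate, which is consistent with running those two first.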
Order of operations
Order matters. Right-size first because it dominates everything; archive second; index pruning third; backups and replicas last.
- Right-size first. Highest impact; the easy win that funds the next conversations.
- Archive second. Old data tiered down; the bill drops without app changes.
- Index third. Pruning unused indexes shrinks storage and speeds writes.
- Quarterly cadence. Repeat each rule every quarter; the cumulative effect over the years is substantial.
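For the index step, a shortlist of pruning candidates could be built along these lines. This is a sketch under assumptions: the input rows are modeled on PostgreSQL's `pg_stat_user_indexes` view (`indexrelname`, `idx_scan`), while `size_bytes` and `is_unique` are hypothetical fields you would have to join in yourself. Other databases expose usage statistics differently.

```python
def unused_indexes(index_stats, min_scans=1):
    """Return non-unique indexes with fewer than min_scans scans, largest first.

    Unique indexes are skipped because they enforce constraints even when
    no query ever scans them.
    """
    candidates = [
        row for row in index_stats
        if row["idx_scan"] < min_scans and not row.get("is_unique", False)
    ]
    return sorted(candidates, key=lambda row: row["size_bytes"], reverse=True)


# Hypothetical statistics rows for one table.
stats = [
    {"indexrelname": "orders_pkey", "idx_scan": 0, "size_bytes": 900, "is_unique": True},
    {"indexrelname": "orders_legacy_status_idx", "idx_scan": 0, "size_bytes": 500, "is_unique": False},
    {"indexrelname": "orders_customer_idx", "idx_scan": 1200, "size_bytes": 700, "is_unique": False},
    {"indexrelname": "orders_old_note_idx", "idx_scan": 0, "size_bytes": 800, "is_unique": False},
]

names = [row["indexrelname"] for row in unused_indexes(stats)]
print(names)  # ['orders_old_note_idx', 'orders_legacy_status_idx']
```

Scan counters only cover the period since statistics were last reset, so an index used by a rare monthly job can look unused. Treat the output as a list to review, not a list to drop.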
Antipatterns
- Database growth as inevitable. Mostly preventable.
- One-time cost cut. Drift returns.
- Cost vs performance as binary. Most fixes do both.
What to do this week
Three moves. (1) Apply the rules above to your most heavily loaded table. (2) Measure query latency and write throughput before and after. (3) Document the win and any constraint you hit so the next refactor inherits the knowledge.
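The before/after comparison in move (2) can be as simple as the sketch below. It assumes you have already collected a few samples of each metric; the helper name `percent_change` and the sample values are illustrative. The median is used so a single outlier does not dominate the result.

```python
from statistics import median


def percent_change(before, after):
    """Percentage change in the median from the before samples to the after samples."""
    b, a = median(before), median(after)
    return (a - b) * 100 / b


# Hypothetical query latency samples in milliseconds.
latency_before = [120, 130, 125]
latency_after = [100, 95, 105]

change = percent_change(latency_before, latency_after)
print(change)  # -20.0
```

A negative value is an improvement for latency; for write throughput, the same helper applies and a positive value is the improvement.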