Postgres Vacuum and Bloat
Bloat causes slow queries.
Overview
Postgres’ MVCC design leaves dead tuples behind on every UPDATE and DELETE. Without VACUUM running often enough, those dead tuples accumulate as bloat, slowing queries and inflating storage. Autovacuum handles most of it on healthy workloads, but bulk operations and high-churn tables need explicit attention. The discipline is monitoring bloat per table and tuning autovacuum to match the workload rather than relying on defaults.
- Bloat causes slow queries. Dead tuples force scans to read more pages to return the same live rows, so p99 query latency tends to rise with the bloat ratio.
- Per-table bloat monitoring. Dead-tuple percentage tracked as a standing metric. Hot spots surface before they become incidents.
- Autovacuum tuning. Cluster-wide defaults plus per-table storage parameters matched to workload. Defaults work for most tables; high-churn tables usually need lower thresholds.
- Manual VACUUM plus VACUUM FULL caution. Manual VACUUM after bulk operations; VACUUM FULL reserved for emergencies because it rewrites the whole table under an ACCESS EXCLUSIVE lock, blocking reads and writes until it finishes.
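As a rough illustration of per-table monitoring, the sketch below computes a dead-tuple percentage from the counters exposed by the standard pg_stat_user_tables view. The function names and the 10% warning level are assumptions made for this example, not recommendations from the text. The counters are estimates, so the result is a proxy for bloat rather than an exact measurement.

```python
# Query against the standard pg_stat_user_tables statistics view.
# n_live_tup and n_dead_tup are estimates maintained by Postgres.
DEAD_TUPLE_QUERY = """
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
"""


def dead_tuple_pct(n_live_tup: int, n_dead_tup: int) -> float:
    """Dead tuples as a percentage of all tuples in the table."""
    total = n_live_tup + n_dead_tup
    if total == 0:
        return 0.0  # empty table: nothing to report
    return 100.0 * n_dead_tup / total


def hot_tables(rows, warn_pct: float = 10.0):
    """Return (relname, pct) pairs at or above warn_pct, worst first.

    rows: iterable of (relname, n_live_tup, n_dead_tup) tuples, such as
    the result of DEAD_TUPLE_QUERY. warn_pct is a hypothetical threshold.
    """
    scored = [(name, dead_tuple_pct(live, dead)) for name, live, dead in rows]
    flagged = [(name, pct) for name, pct in scored if pct >= warn_pct]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Feeding the rows returned by DEAD_TUPLE_QUERY into hot_tables on a schedule is one way to populate the standing metric; any database driver can supply the rows.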
The approach
Three habits keep Postgres healthy at scale: per-table bloat monitoring as a standing dashboard, autovacuum settings tuned per table on top of sensible cluster-wide defaults, and explicit manual VACUUM after bulk operations.
- Per-table bloat monitoring. Dead-tuple percentage on a dashboard. Hot tables identified early.
- Autovacuum tuning. Per-table settings tuned to actual workload: autovacuum_vacuum_cost_delay (the autovacuum counterpart of vacuum_cost_delay), autovacuum_vacuum_threshold, autovacuum_vacuum_scale_factor.
- Manual VACUUM after bulk. Bulk INSERT, UPDATE, or DELETE jobs followed by an explicit VACUUM, usually with ANALYZE so planner statistics are refreshed too. Autovacuum often cannot keep up alone after a large batch.
- VACUUM FULL caution plus documented policy. Lock-acquiring VACUUM FULL only when necessary; the per-table vacuum strategy documented in the runbook.
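To make the tuning knobs concrete: autovacuum vacuums a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * (number of tuples). The sketch below applies that formula with the stock defaults (50 and 0.2) and builds the SQL for a per-table override and a post-bulk vacuum. The helper names and the table name `events` are hypothetical, and the helpers interpolate the table name without quoting, so they are only suitable for trusted identifiers.

```python
def autovacuum_trigger(reltuples: float,
                       threshold: int = 50,
                       scale_factor: float = 0.2) -> float:
    """Dead-tuple count at which autovacuum will vacuum the table.

    Defaults mirror the stock values of autovacuum_vacuum_threshold
    and autovacuum_vacuum_scale_factor.
    """
    return threshold + scale_factor * reltuples


def per_table_override_sql(table: str, scale_factor: float, threshold: int) -> str:
    """ALTER TABLE statement setting per-table autovacuum storage parameters."""
    return (
        f"ALTER TABLE {table} SET ("
        f"autovacuum_vacuum_scale_factor = {scale_factor}, "
        f"autovacuum_vacuum_threshold = {threshold});"
    )


def post_bulk_vacuum_sql(table: str) -> str:
    """Plain (non-FULL) VACUUM with ANALYZE, for use after a bulk job."""
    return f"VACUUM (ANALYZE) {table};"


if __name__ == "__main__":
    # With defaults, a 1M-row table waits for about 200k dead tuples.
    print(autovacuum_trigger(1_000_000))
    # A tighter, illustrative setting for a hypothetical high-churn table.
    print(autovacuum_trigger(1_000_000, threshold=1000, scale_factor=0.01))
    print(per_table_override_sql("events", 0.01, 1000))
    print(post_bulk_vacuum_sql("events"))
```

The point of the arithmetic is that the default scale factor lets dead tuples grow in proportion to table size, which is why large high-churn tables are the usual candidates for an override. VACUUM cannot run inside a transaction block, so a client issuing the last statement needs autocommit enabled.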
Why this compounds
Each managed table preserves query performance and prevents the silent transaction-wraparound failure mode. The team’s Postgres operational fluency deepens; new tables ship with vacuum policy on day one.
- Query performance preserved. Less bloat means faster queries. p99 stays in spec.
- Stability protected. Wraparound avoided; freezing kept current. The catastrophic failure mode stays out of reach as long as vacuum keeps pace.
- Operational fit. Right vacuum matched to workload. Disk usage stays predictable.
- Year-one investment, year-two habit. The first tuning is a heavy lift. By the third high-churn table, the policy is settled.
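The wraparound protection mentioned above can be tracked the same way as bloat. A minimal sketch, assuming the stock autovacuum_freeze_max_age of 200 million: it reads each table's transaction-ID age with age(relfrozenxid) from pg_class and expresses it as a percentage of the point at which Postgres forces an anti-wraparound vacuum. The function name is hypothetical.

```python
# age(relfrozenxid) reports how many transactions old a table's
# oldest unfrozen row may be; relkind = 'r' limits this to ordinary tables.
XID_AGE_QUERY = """
SELECT c.relname, age(c.relfrozenxid) AS xid_age
FROM pg_class c
WHERE c.relkind = 'r'
ORDER BY xid_age DESC
"""


def freeze_age_used_pct(xid_age: int, freeze_max_age: int = 200_000_000) -> float:
    """Share of autovacuum_freeze_max_age already consumed, in percent.

    At 100% Postgres launches a forced anti-wraparound autovacuum on the
    table. The default here mirrors the stock setting; pass the cluster's
    actual value if it has been changed.
    """
    return 100.0 * xid_age / freeze_max_age
```

A table that sits near 100% for long periods suggests routine vacuum is not freezing rows fast enough, which is worth investigating before the forced vacuum lands during peak load.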