Auto-vacuum Tuning
Postgres-specific.
Why auto-vacuum tuning matters
Postgres needs to reclaim dead tuples to keep tables and indexes healthy. The defaults are conservative on purpose; production write-heavy workloads usually need more aggressive tuning to avoid the bloat that produces slow queries, oversized indexes, and eventual full-table rewrites.
- Dead tuples and bloat. Per-table dead-tuple accumulation; bloat causes slow queries, oversized indexes, and eventually expensive full-table rewrites.
- Defaults are conservative. General-purpose defaults work for low-write databases; production write-heavy workloads need aggressive tuning.
- Bloat math. 30-50 percent bloat on untuned write-heavy tables; slower queries, higher I/O cost, larger backups.
- Bloat dashboard per cluster. Trend chart per cluster catches drift before queries start timing out.
The key parameters
Three parameters are the main levers: scale-factor, threshold, and max-workers. Each tunes a different aspect of when and how aggressively autovacuum runs.
- autovacuum_vacuum_scale_factor. Default 0.2 means vacuum runs after 20% of rows change; 0.05 for write-heavy tables runs four times as often.
- autovacuum_vacuum_threshold. Default 50; trigger fires at
threshold + scale_factor * table_rows; useful for small tables where percentage-based triggers fire too rarely. - autovacuum_max_workers. Default 3; raise to 6-10 for clusters with many active tables, watching I/O bandwidth so vacuum does not starve regular traffic.
- Cost-limit tuning per cluster.
autovacuum_vacuum_cost_limitraises throughput on capable storage; default is conservative for shared spinning disks that no longer exist.
Per-table tuning
Per-table tuning is the surgical layer. Hot tables get aggressive overrides; cold tables stay on defaults; one autovacuum policy across thousands of tables wastes either I/O or freshness.
- ALTER TABLE override.
ALTER TABLE foo SET (autovacuum_vacuum_scale_factor = 0.05); per-table parameter overrides the global default. - Hot versus cold tables. Activity-driven choice; hot tables earn aggressive tuning, cold tables stay on defaults.
- Identify candidates.
SELECT * FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 20; the top 20 names the tuning targets. - Named owner per table. Responsible team per heavily-tuned table; non-default settings need someone to maintain them.
Monitoring vacuum activity
Monitoring closes the loop. Vacuum stats, bloat estimation, and in-flight progress together tell you whether tuning is keeping up with the write rate.
- pg_stat_user_tables.
last_vacuum,last_autovacuum,n_dead_tupper table; alert when last_autovacuum age exceeds threshold. - Bloat estimation queries. pgstattuple or community-query bloat report scheduled monthly; trend over time matters more than any single snapshot.
- Long-running vacuum.
pg_stat_progress_vacuumview; investigate any vacuum running beyond 30 minutes for lock contention or undersized cost limit. - Vacuum-skip alarm per cluster. Canceled-autovacuum alert catches lock contention that prevents vacuum from completing.
Emergency interventions
Emergency interventions are the last resort when tuning has fallen behind reality. VACUUM FREEZE, VACUUM FULL, and pg_repack each suit different blast radii.
- VACUUM FREEZE. Manual trigger per table; blocks briefly but completes the work that autovacuum has been falling behind on.
- VACUUM FULL. Table rewrite with locking and full space reclaim; maintenance window only, not production hours.
- pg_repack. Online table rewrite without long locks; required for write-heavy production tables that need VACUUM FULL effects without the downtime.
- Documented runbook per emergency. Playbook per intervention type so the on-call IC does not need to invent the procedure during the incident.