Read Amplification

Index design impact.

Overview

Read amplification is the gap between logical and physical reads: one logical read can produce many physical reads, and physical reads are what drive latency and IOPS cost. A query that reads "one row" might traverse the index, fetch from the heap, step over dead tuples (Postgres), or check multiple SST files (LSM-tree storage). The discipline lies in measuring per-query physical reads, designing indexes that minimise heap fetches, and using covering indexes for read-heavy paths.
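To make the idea concrete, the following is a minimal sketch of the arithmetic, not a measurement tool. The PhysicalReads type, its fields, and the page counts are all hypothetical values chosen for illustration; real numbers would come from the engine's own statistics.

```python
from dataclasses import dataclass


@dataclass
class PhysicalReads:
    """Pages touched to answer one logical read (illustrative fields only)."""
    index_pages: int           # B-tree pages walked from root to leaf
    heap_pages: int            # table pages fetched for live rows
    dead_tuple_pages: int = 0  # extra pages visited only to skip dead tuples

    def total(self) -> int:
        return self.index_pages + self.heap_pages + self.dead_tuple_pages


def amplification(reads: PhysicalReads, logical_reads: int = 1) -> float:
    """Physical page reads per logical read; 1.0 would mean no amplification."""
    if logical_reads <= 0:
        raise ValueError("logical_reads must be positive")
    return reads.total() / logical_reads


# Hypothetical point lookup: index walk, one heap fetch, two pages of dead tuples.
point_lookup = PhysicalReads(index_pages=3, heap_pages=1, dead_tuple_pages=2)
# The same lookup answered from a covering index, with no heap access.
covered_lookup = PhysicalReads(index_pages=3, heap_pages=0)

print(amplification(point_lookup))    # 6.0
print(amplification(covered_lookup))  # 3.0
```

In this made-up case the covering index halves the physical reads for the same logical read, which is the effect the rest of this section aims to measure and reproduce.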

The approach

The practical approach is to measure per-query physical reads (pg_stat_statements and EXPLAIN (ANALYZE, BUFFERS) in Postgres, or the equivalent in other engines), design covering indexes for read-heavy paths so queries can be answered from the index alone, monitor the buffer cache hit rate as a leading signal of amplification pressure, run VACUUM aggressively on Postgres tables where dead tuples accumulate (this also keeps the visibility map current, which index-only scans rely on to skip the heap), and document the per-table read strategy so the design is reviewable.
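As a rough sketch of the measurement step, the snippet below pulls buffer counts and heap fetches out of the text that Postgres prints for EXPLAIN (ANALYZE, BUFFERS). The summarize_plan helper, the sample plan, and the orders_customer_idx index name are assumptions made for illustration. The parsing relies on the first "Buffers:" line belonging to the top plan node, whose counts include its children; plans whose top node reports no buffers would need more careful handling.

```python
import re

# Matches lines such as "Buffers: shared hit=3 read=2"; either count may be absent.
BUFFERS_RE = re.compile(r"Buffers: shared(?: hit=(\d+))?(?: read=(\d+))?")
HEAP_FETCHES_RE = re.compile(r"Heap Fetches: (\d+)")


def summarize_plan(plan_text: str) -> dict:
    """Summarise shared-buffer usage and heap fetches from EXPLAIN text output."""
    hit = read = 0
    match = BUFFERS_RE.search(plan_text)  # first match: top plan node
    if match:
        hit = int(match.group(1) or 0)
        read = int(match.group(2) or 0)
    # Heap fetches are reported per index-only scan node, so add them up.
    heap_fetches = sum(int(n) for n in HEAP_FETCHES_RE.findall(plan_text))
    total = hit + read
    return {
        "shared_hit": hit,        # pages found in the buffer cache
        "shared_read": read,      # pages that had to be read in
        "heap_fetches": heap_fetches,
        "hit_ratio": hit / total if total else None,
    }


# Illustrative output of: EXPLAIN (ANALYZE, BUFFERS) SELECT ... (hypothetical table and index)
SAMPLE_PLAN = """\
Index Only Scan using orders_customer_idx on orders (actual time=0.031..0.033 rows=1 loops=1)
  Index Cond: (customer_id = 42)
  Heap Fetches: 1
  Buffers: shared hit=3 read=2
"""

summary = summarize_plan(SAMPLE_PLAN)
print(summary)
```

A non-zero heap_fetches value on an index-only scan suggests the visibility map is stale for the pages involved, which is one reason the VACUUM cadence matters for covering indexes.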

Why this compounds

Read amplification discipline compounds across queries and tables. Each measured query reveals real physical IO; each covering index reduces ongoing IOPS for the queries it serves; and the team builds intuition for which queries amplify and which are served from the index alone. Without the discipline, slow-query investigations focus on logical query shape and miss the physical-IO patterns that actually drive cost and latency.
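One way to build that intuition is to rank statements by buffer blocks touched per call. The sketch below assumes rows have already been fetched from the pg_stat_statements view (the calls, shared_blks_hit, and shared_blks_read columns are real; the StatementStats type, the helper functions, and the sample rows are hypothetical).

```python
from typing import Iterable, List, NamedTuple

# Assumed source query, with the pg_stat_statements extension enabled:
#   SELECT query, calls, shared_blks_hit, shared_blks_read FROM pg_stat_statements;


class StatementStats(NamedTuple):
    query: str
    calls: int
    shared_blks_hit: int
    shared_blks_read: int


def blocks_per_call(s: StatementStats) -> float:
    """Average shared-buffer blocks touched per execution of the statement."""
    if s.calls == 0:
        return 0.0
    return (s.shared_blks_hit + s.shared_blks_read) / s.calls


def rank_by_amplification(stats: Iterable[StatementStats], top_n: int = 5) -> List[StatementStats]:
    """Return the statements touching the most blocks per call, worst first."""
    return sorted(stats, key=blocks_per_call, reverse=True)[:top_n]


# Made-up rows for illustration only.
sample = [
    StatementStats("SELECT * FROM orders WHERE customer_id = $1", 1000, 40000, 10000),
    StatementStats("SELECT id FROM users WHERE email = $1", 5000, 15000, 0),
    StatementStats("SELECT * FROM events WHERE ts > $1", 10, 2000, 8000),
]

for s in rank_by_amplification(sample):
    print(f"{blocks_per_call(s):8.1f} blocks/call  {s.query}")
```

Ranking per call rather than by total blocks keeps rarely run but heavily amplifying statements visible, which totals alone tend to hide behind high-frequency queries.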

Read amplification discipline pays off over years. Nova AI Ops integrates with database telemetry, surfaces amplification patterns, and supports the team’s database engineering practice.