Partitioning data across multiple database instances, the way to scale writes past what a single instance can handle.
Sharding is the practice of splitting a logical dataset across multiple physical database instances (shards) by some partitioning key, typically user ID, tenant ID, or a hash. Each shard owns a subset of the data; queries are routed to the shard that holds the relevant key; writes happen on one shard at a time. Sharding scales writes (each shard handles a fraction of total writes) where read replicas only scale reads. The cost is operational: rebalancing, cross-shard queries, and the rare-but-painful resharding migration.
Read replicas are easy. Write scaling is hard, and sharding is the canonical answer past a few thousand writes per second on a single primary. Picking the right partition key matters disproportionately: a key that distributes evenly avoids hot shards (where one shard takes all the load), and a key that aligns with your access patterns avoids cross-shard joins (which are slow at best, broken at worst).
See the part of the platform that handles sharding in production.