Normalization vs Denormalization
Trade-offs.
Normalisation strengths
Normalisation is the default for transactional systems. One fact, one row, one place to update; the schema enforces consistency that application code cannot.
- Single source of truth. Each fact lives in one canonical row; update one place, every read sees the new value.
- Storage efficiency. No duplicated data; smaller database, cheaper storage, faster backups.
- Transactional consistency. Schema-level business rules; database constraints catch errors application code missed.
- Foreign-key contract. Per-FK referential integrity; orphans become impossible at the storage layer.
Denormalisation strengths
Denormalisation is what makes reads fast. Pre-join the data once at write time; trade write complexity for read simplicity. The right choice for read-dominated workloads.
- Read performance. Pre-joined rows; queries do not need joins; reads are simple and fast.
- Predictable query plans. Simpler queries produce simpler plans; less optimiser variance, fewer surprises in production.
- Analytics workloads. Per-warehouse pre-joined fact tables; read patterns dominate writes by orders of magnitude.
- Materialised views. Per-hot-query materialisation; the user-facing read path stays fast even as the source schema grows.
When to choose which
Default to normalised; denormalise the hot read paths once measured. Pure denormalised is reserved for analytics warehouses where reads dominate writes by 1000x.
- Default to normalised. OLTP starts normalised; matches transaction reality; the safe baseline.
- Denormalise specific paths. Per-hot-read materialised view or summary table; targeted, not blanket.
- Pure denormalised for analytics. Star or snowflake schema in the warehouse; reads dominate; writes are batch.
- Per-table explicit choice. Document the rationale per table; supports operational reviews and onboarding.
Hybrid patterns
Hybrid patterns get the consistency of normalisation and the read speed of denormalisation. ETL or CDC maintains the read-optimised view; eventual consistency is the trade.
- Normalised primary, denormalised replica. Writes hit the normalised primary; reads hit the denormalised replica; the classic split.
- CQRS. Per-domain separate write and read models; modern microservices pattern; matches event-driven architectures.
- ETL maintains the view. Per-refresh ETL pipeline; eventual consistency at the refresh interval; predictable lag.
- CDC stream. Per-table change data capture into the read model; the modern, low-latency form of the same idea.