Cassandra vs MongoDB
Decision criteria.
Overview
Cassandra and MongoDB are both NoSQL databases, but they solve different problems. Cassandra is a wide-column store optimised for write-heavy distributed workloads (IoT, time-series, audit logs); MongoDB is a document store optimised for flexible schemas and rich queries (catalogue, content, application data). Picking the right one at design time avoids a migration that can cost an engineer-quarter at year three.
- Cassandra: write-heavy, time-series. High write throughput across many nodes. Designed for IoT, time-series, audit logs at very large scale.
- MongoDB: document-heavy, flexible. JSON documents with rich queries. Designed for catalogue, content, and application data with variable shape.
- Cassandra: near-linear scale plus tunable consistency. Adding nodes grows throughput roughly in proportion; a per-query consistency level (for example ONE, QUORUM, ALL) lets each operation trade latency and availability against consistency.
- MongoDB: rich indexing. Secondary indexes, aggregation pipelines, full-text search. A sensible default for query-heavy workloads.
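The tunable-consistency point can be made concrete with quorum arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the usual rule that a read is guaranteed to overlap the latest acknowledged write when read replicas plus write replicas exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas, e.g. 2 of 3."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return read_acks + write_acks > replication_factor


rf = 3
q = quorum(rf)
print(read_overlaps_write(rf, q, q))  # QUORUM writes with QUORUM reads
print(read_overlaps_write(rf, 1, 1))  # ONE writes with ONE reads
```

With a replication factor of 3, QUORUM on both sides overlaps (2 + 2 > 3), while ONE on both sides does not (1 + 1 ≤ 3), which is the latency-versus-consistency trade being tuned per query.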
The approach
Choose by workload, prototype with real data, and plan the data model up front, because both stores reward thinking about access patterns at design time. Cassandra rewards getting the partition key right; MongoDB rewards getting the secondary indexes right. Document the rationale for each database.
- Cassandra for write-heavy time-series. Writes dominate, queries are predictable. Default for time-series and audit workloads.
- MongoDB for variable-schema documents. Documents vary across users, queries are rich. Default for application data with evolving shape.
- Partition key plus index planning. Cassandra's partition key decides where data lives and which queries are efficient; MongoDB's secondary indexes decide query performance. Get them right at design time.
- Documented choice per database. Record the rationale for each database so that later investigations have a trail to follow.
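The partition-key and index-planning point can be sketched in a few lines. This is a simplified illustration under stated assumptions, not either database's API: the function names, the `sensor-7` device, and the `tenant_id` / `status` / `created_at` fields are all hypothetical. It assumes a Cassandra table partitioned by device and UTC day so that partitions stay bounded, and a MongoDB compound index that serves a query best when the query's equality filters form a leading prefix of the index.

```python
from __future__ import annotations

from datetime import datetime, timezone


def cassandra_partition_key(device_id: str, ts: datetime) -> tuple[str, str]:
    """Bucket a device's readings by UTC day so no partition grows without bound."""
    day = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (device_id, day)


def served_by_index_prefix(index_fields: list[str], equality_fields: set[str]) -> bool:
    """True if the query's equality filters are exactly a leading prefix of the index."""
    n = len(equality_fields)
    return n > 0 and set(index_fields[:n]) == equality_fields


# Readings either side of midnight UTC land in different partitions.
late = datetime(2024, 5, 1, 23, 59, tzinfo=timezone.utc)
early = datetime(2024, 5, 2, 0, 0, tzinfo=timezone.utc)
print(cassandra_partition_key("sensor-7", late))
print(cassandra_partition_key("sensor-7", early))

# Hypothetical compound index on (tenant_id, status, created_at).
index = ["tenant_id", "status", "created_at"]
print(served_by_index_prefix(index, {"tenant_id", "status"}))  # leading prefix
print(served_by_index_prefix(index, {"status"}))               # skips tenant_id
```

Running checks like these against the real query list during prototyping is a cheap way to find an access pattern that the chosen partition key or index does not serve before any data is loaded.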
Why this compounds
The right database choice compounds across years. A wrong choice keeps incurring performance or migration penalties; a right choice incurs neither. Cross-database tooling (operational runbooks, capacity planning) gets built once per engine and reused. By year two the choice for each new workload is close to automatic.
- Better performance. Right database for the workload. Queries stay fast as data grows.
- Better operational fit. Cassandra needs routine anti-entropy repairs across a masterless ring; MongoDB runs replica sets with primary election. Match the operational model to team expertise.
- Reduced migration cost. The right choice up front avoids the engineer-quarter migration at year three, so the data layer stays stable.
- Year-one investment, year-two habit. First year builds the patterns; subsequent decisions are mechanical.