Data Platform Cost Optimization
Managing compute and storage cost on Snowflake, BigQuery, and Databricks.
Compute cost dimensions
Data-platform compute pricing falls into two shapes: per-query (BigQuery on-demand, Athena) and per-cluster (Snowflake, Databricks). The same workload can cost very different amounts under the two models.
- Per-query cost. BigQuery (on-demand pricing) and Athena charge per byte scanned. Cheap for small queries against partitioned tables; expensive for full-table scans.
- Per-cluster cost. Snowflake and Databricks charge for the time compute is running, whether or not it is doing work. Cheap when right-sized and auto-suspended; expensive when oversized or always-on.
- Auto-suspend. Set the idle auto-suspend timeout to 5 to 10 minutes by default. Always-on warehouses are the most common cost leak.
- Per-workload class. Heavy ETL gets a dedicated warehouse; ad-hoc queries share a smaller one. Mixed workloads on one warehouse hide cost attribution.
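The two pricing shapes can be compared with a few lines of arithmetic. The sketch below is a simplified model, not any vendor's billing logic: the prices, the `per_query_cost` and `billed_hours` helpers, and the workload numbers are all hypothetical, and it ignores minimum billing increments and resume time.

```python
from __future__ import annotations

TB = 10**12  # decimal terabyte; an assumption, some vendors bill per TiB


def per_query_cost(bytes_scanned: int, usd_per_tb: float) -> float:
    """Cost of one query under a per-byte-scanned model."""
    return bytes_scanned / TB * usd_per_tb


def billed_hours(busy_hours: float, idle_gaps_hours: list[float],
                 suspend_after_minutes: float | None = None) -> float:
    """Hours billed under a per-cluster model.

    With suspend_after_minutes=None the warehouse is always-on and every
    idle gap is billed in full. With auto-suspend, only the first
    suspend_after_minutes of each idle gap are billed.
    """
    if suspend_after_minutes is None:
        return busy_hours + sum(idle_gaps_hours)
    timeout_hours = suspend_after_minutes / 60
    return busy_hours + sum(min(gap, timeout_hours) for gap in idle_gaps_hours)


# Hypothetical prices, chosen only to keep the arithmetic easy to follow.
USD_PER_TB_SCANNED = 5.0
USD_PER_WAREHOUSE_HOUR = 4.0

# Per-query model: a 50 GB pruned query versus a 20 TB full-table scan.
small_query = per_query_cost(50 * 10**9, USD_PER_TB_SCANNED)
full_scan = per_query_cost(20 * TB, USD_PER_TB_SCANNED)

# Per-cluster model: one day with 2 busy hours and idle gaps of 1, 3 and 18 hours.
gaps = [1.0, 3.0, 18.0]
always_on = billed_hours(2.0, gaps) * USD_PER_WAREHOUSE_HOUR
suspended = billed_hours(2.0, gaps, suspend_after_minutes=10) * USD_PER_WAREHOUSE_HOUR

print(f"per-query:   small={small_query:.2f} USD, full scan={full_scan:.2f} USD")
print(f"per-cluster: always-on={always_on:.2f} USD, auto-suspend={suspended:.2f} USD")
```

In this toy day the always-on warehouse is billed for all 24 hours, while the auto-suspended one is billed for the 2 busy hours plus 10 minutes per idle gap. The gap between the two numbers is the cost leak the auto-suspend bullet describes.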
Storage cost dimensions
Storage costs are smaller than compute but they accumulate quietly. The cost leaks usually live in snapshot retention and forgotten warehouse internal storage.
- Object storage. S3, GCS, Azure Blob at per-GB-month rates. Predictable; rarely the actual problem.
- Snapshot storage. Snapshots accumulate when there is no explicit retention policy. Bound the cost with a written policy that states how long each kind of snapshot is kept.
- Warehouse internal storage. Snowflake or BigQuery internal storage is often no more expensive than raw object storage at scale, helped by compression, but measure it rather than assume.
- Lifecycle policies. S3 IA and Glacier transitions for cold data. A reasonable starting point is to keep data hot for the first 90 days, then transition it.
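A lifecycle policy and a snapshot retention policy can both be written down as S3 lifecycle rules. The sketch below builds the rule dictionaries in plain Python. The `lifecycle_rule` helper, the prefixes, and the 365-day and 30-day values are hypothetical; only the 90-day hot period comes from the text above. The dictionary layout follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`, to the best of my knowledge; check it against the current boto3 documentation before applying it to a real bucket.

```python
from __future__ import annotations

import json


def lifecycle_rule(rule_id: str, prefix: str,
                   ia_after_days: int | None = None,
                   glacier_after_days: int | None = None,
                   expire_after_days: int | None = None) -> dict:
    """Build one S3 lifecycle rule for objects under `prefix`."""
    if (ia_after_days is not None and glacier_after_days is not None
            and glacier_after_days <= ia_after_days):
        raise ValueError("glacier_after_days must be later than ia_after_days")

    rule: dict = {"ID": rule_id, "Filter": {"Prefix": prefix}, "Status": "Enabled"}

    transitions = []
    if ia_after_days is not None:
        transitions.append({"Days": ia_after_days, "StorageClass": "STANDARD_IA"})
    if glacier_after_days is not None:
        transitions.append({"Days": glacier_after_days, "StorageClass": "GLACIER"})
    if transitions:
        rule["Transitions"] = transitions

    if expire_after_days is not None:
        # Expiration deletes the objects, which is what bounds snapshot cost.
        rule["Expiration"] = {"Days": expire_after_days}
    return rule


config = {
    "Rules": [
        # Cold data: hot for 90 days, then Infrequent Access, then Glacier.
        lifecycle_rule("cold-events", "raw/events/",
                       ia_after_days=90, glacier_after_days=365),
        # Snapshots: a hypothetical 30-day retention policy.
        lifecycle_rule("snapshot-retention", "snapshots/", expire_after_days=30),
    ]
}

print(json.dumps(config, indent=2))
```

Keeping the rules in code or version-controlled configuration is one way to make the retention policy "written": the numbers are reviewable and changes leave a history.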
Optimisation patterns
Three patterns produce most of the realised cost savings: query optimisation, workload separation, and warehouse right-sizing. Clustering and partitioning, listed last, is the table-level groundwork that makes query optimisation effective. Each compounds across the platform.
- Query optimisation. Partition pruning, columnar reads, materialised views. Cost reductions of 5x to 10x are common on the long tail of queries.
- Workload separation. Heavy ETL on dedicated warehouses; ad-hoc queries on smaller ones. Mixing them hides the cost driver.
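- Warehouse right-sizing. Bigger is not always faster: a larger warehouse costs more per hour, so it only pays off when queries finish proportionally sooner, which depends on query shape. Re-evaluate sizing quarterly.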
- Clustering and partitioning. Per-table clustering keys cut bytes scanned dramatically on common access patterns.
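The effect of partition pruning and columnar reads on bytes scanned can be illustrated with a toy model. The sketch below assumes a hypothetical table with 365 daily partitions of 10 GB each and 10 equally sized columns. The `bytes_scanned` helper is illustrative only; real engines report scanned bytes in their own query metadata.

```python
from __future__ import annotations

from datetime import date, timedelta

GB = 10**9


def bytes_scanned(partitions: dict[date, int],
                  start: date | None = None,
                  end: date | None = None,
                  columns_read: int = 1,
                  columns_total: int = 1) -> int:
    """Estimate bytes scanned for a query over date-partitioned, columnar data.

    Partition pruning: only partitions inside [start, end] are read.
    Columnar read: only the requested share of columns is read, assuming
    all columns are the same size (a simplification).
    """
    in_range = sum(
        size for day, size in partitions.items()
        if (start is None or day >= start) and (end is None or day <= end)
    )
    return in_range * columns_read // columns_total


# Hypothetical table: 365 daily partitions of 10 GB each.
first_day = date(2024, 1, 1)
table = {first_day + timedelta(days=i): 10 * GB for i in range(365)}
last_day = max(table)

# No date filter, all columns: every partition is read in full.
full_scan = bytes_scanned(table)

# Last 7 days only, reading 2 of 10 columns.
pruned = bytes_scanned(table,
                       start=last_day - timedelta(days=6), end=last_day,
                       columns_read=2, columns_total=10)

print(full_scan // GB, "GB scanned without pruning")
print(pruned // GB, "GB scanned with a 7-day filter and 2 of 10 columns")
```

On a per-byte-scanned platform the bill scales directly with these numbers; on a per-cluster platform fewer bytes usually means shorter runtime, which is what lets a smaller warehouse do the same work.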
Monitoring data platform cost
Monitoring is where the discipline lives. Without per-team attribution and per-query visibility, optimisation work targets the wrong things.
- Per-team chargeback. Tag queries and warehouses by team. Allocate cost so investment conversations have data, not anecdotes.
- Per-query cost visibility. Send a Slack notification when a query crosses a cost threshold, so engineers get feedback on the queries they wrote.
- Quarterly cost review. Walk the top-consumer list each quarter. Optimisation effort goes to the actual cost drivers.
- Per-warehouse budget alarm. Alert when monthly spend crosses a threshold. Cost drift surfaces before the bill arrives.
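The chargeback, per-query threshold, and budget alarm checks above all reduce to simple aggregation over tagged query records. The sketch below assumes a hypothetical `QueryRecord` shape with a team tag, a warehouse name, and a cost already computed per query. How those records are exported from the platform, and how alerts are delivered to Slack, is left out.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    """One executed query with its attribution tags (hypothetical shape)."""
    query_id: str
    team: str
    warehouse: str
    cost_usd: float


def spend_by(records: list[QueryRecord], attr: str) -> dict[str, float]:
    """Total cost grouped by an attribute such as 'team' or 'warehouse'."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[getattr(record, attr)] += record.cost_usd
    return dict(totals)


def over_threshold(records: list[QueryRecord], threshold_usd: float) -> list[QueryRecord]:
    """Queries expensive enough to notify their author about."""
    return [r for r in records if r.cost_usd > threshold_usd]


def budget_alarms(spend: dict[str, float], budgets: dict[str, float]) -> list[str]:
    """Names whose spend exceeds their budget; names without a budget never alarm."""
    return sorted(name for name, total in spend.items()
                  if total > budgets.get(name, float("inf")))


# Hypothetical month-to-date records.
records = [
    QueryRecord("q1", "analytics", "adhoc_wh", 2.5),
    QueryRecord("q2", "analytics", "adhoc_wh", 40.0),
    QueryRecord("q3", "data-eng", "etl_wh", 120.0),
    QueryRecord("q4", "data-eng", "etl_wh", 12.5),
]

team_spend = spend_by(records, "team")            # per-team chargeback
warehouse_spend = spend_by(records, "warehouse")  # input to budget alarms
noisy = over_threshold(records, threshold_usd=25.0)
alarms = budget_alarms(warehouse_spend, {"adhoc_wh": 30.0, "etl_wh": 500.0})

print("team spend:", team_spend)
print("queries over threshold:", [r.query_id for r in noisy])
print("warehouses over budget:", alarms)
```

Sorting `team_spend` or `warehouse_spend` by value gives the top-consumer list for the quarterly review, so the same records serve all four practices.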