By Samson Tanimawo, PhD. Published Jan 19, 2026.

SLOs on Data Pipelines

Pipelines need different SLOs than APIs.

Three pipeline SLO dimensions

Freshness. How old is the data the pipeline produced? This is critical for downstream consumers that depend on recent data.

Completeness. Did the pipeline process all expected records? Drops indicate upstream issues or transformation bugs.

Correctness. Measured by sampling: pick a small subset of records and verify that the outputs match the expected values. This is the hardest dimension to measure but the most important.
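A sample-based correctness check can be sketched roughly as follows. This is a minimal illustration, not a prescribed method: the function name `sample_correctness`, the `expected_fn` callback (an independent way of computing the right answer for a record key), and the default sample size are all assumptions made for the example.

```python
import random


def sample_correctness(outputs, expected_fn, sample_size=100, seed=0):
    """Return the fraction of sampled records whose output matches expectation.

    outputs: dict mapping a record key to the value the pipeline produced.
    expected_fn: hypothetical callback that independently computes the
        expected value for a key (for example, by re-reading the source).
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    keys = sorted(outputs)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    if not sample:
        return 1.0  # nothing to check, so nothing is known to be wrong
    matches = sum(1 for key in sample if outputs[key] == expected_fn(key))
    return matches / len(sample)
```

The returned ratio can then be compared against a correctness target in the same way as the other two dimensions.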

Express the freshness SLO in time: for example, 95% of partitions arrive within 30 minutes of the source data. A target like this is specific and comparable across pipelines.
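As a rough sketch of how a time-based freshness target could be evaluated, the snippet below computes the share of partitions that arrived within the threshold. The input shape (pairs of source and arrival timestamps) and the function names are assumptions for illustration; the 95% and 30-minute defaults come from the example target above.

```python
from datetime import datetime, timedelta


def freshness_compliance(partitions, threshold=timedelta(minutes=30)):
    """Fraction of partitions that arrived within `threshold` of their source time.

    partitions: iterable of (source_time, arrival_time) datetime pairs.
    """
    delays = [arrival - source for source, arrival in partitions]
    if not delays:
        return 1.0  # no partitions in the window, so no violations
    on_time = sum(1 for delay in delays if delay <= threshold)
    return on_time / len(delays)


def meets_freshness_slo(partitions, target=0.95, threshold=timedelta(minutes=30)):
    """True when the on-time fraction is at or above the SLO target."""
    return freshness_compliance(partitions, threshold) >= target
```

Because the result is a plain ratio over a window, the same function works for any pipeline once its threshold and target are chosen.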

Track lag per pipeline continuously, and alert when sustained lag exceeds the threshold.
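One simple way to distinguish sustained lag from a brief spike is to require several consecutive samples above the threshold before alerting. The sketch below assumes lag is sampled at a fixed interval and stored oldest-first; the function name and parameters are hypothetical.

```python
def sustained_breach(lag_samples, threshold_s, min_consecutive):
    """True if the most recent `min_consecutive` lag samples all exceed the threshold.

    lag_samples: list of lag measurements in seconds, oldest first.
    """
    if len(lag_samples) < min_consecutive:
        return False  # not enough history to call the breach sustained
    recent = lag_samples[-min_consecutive:]
    return all(lag > threshold_s for lag in recent)
```

With a one-minute sampling interval, `min_consecutive=3` would mean the lag has been over the threshold for roughly three minutes before anyone is paged.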

Match the freshness SLO to downstream needs: under 5 minutes for real-time dashboards, under 1 hour for daily reports, and same day for nightly batches.

For completeness, define the expected record count per partition or run. Compare actual to expected and alert on any shortfall.
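The actual-versus-expected comparison might look like the sketch below. The `counts` structure, the function names, and the 99.9% default are illustrative assumptions; in practice the expected count would come from the source system or a historical baseline.

```python
def completeness_ratio(actual, expected):
    """Share of expected records that were actually processed, capped at 1.0."""
    if expected == 0:
        return 1.0  # nothing was expected, so nothing is missing
    return min(actual / expected, 1.0)


def shortfalls(counts, min_ratio=0.999):
    """Return the partitions whose completeness ratio is below `min_ratio`.

    counts: dict mapping a partition name to an (actual, expected) pair.
    """
    below = {}
    for partition, (actual, expected) in counts.items():
        ratio = completeness_ratio(actual, expected)
        if ratio < min_ratio:
            below[partition] = ratio
    return below
```

Anything returned by `shortfalls` is a candidate for an alert, with the ratio giving a quick sense of how much data is missing.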

Causes of incompleteness: upstream missing data, schema validation drops, transformation errors, dead-letter queue arrivals.

Audit dead-letter rates monthly. Silent drops, where a try/except block logs the error and moves on, are the worst pattern; fix them at the source.
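To illustrate the alternative to a silent drop, the sketch below routes every failed record to a dead-letter list so it can be counted and audited. The in-memory list stands in for a real dead-letter queue, and the function names and the choice of caught exceptions are assumptions for the example.

```python
def process_batch(records, transform):
    """Apply `transform` to each record, keeping failures instead of dropping them.

    Returns (outputs, dead_letters). Each dead letter keeps the original
    record and the error, so the failure can be audited and replayed.
    """
    outputs, dead_letters = [], []
    for record in records:
        try:
            outputs.append(transform(record))
        except (ValueError, KeyError) as exc:
            # Do not just log and continue: keep the record for later audit.
            dead_letters.append({"record": record, "error": repr(exc)})
    return outputs, dead_letters


def dead_letter_rate(outputs, dead_letters):
    """Fraction of processed records that ended up in the dead-letter list."""
    total = len(outputs) + len(dead_letters)
    return len(dead_letters) / total if total else 0.0
```

Since every input record ends up in exactly one of the two lists, the dead-letter rate feeds directly into the completeness numbers.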

Give each pipeline a dashboard that shows all three dimensions plus the error-budget burn rate.
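Burn rate is commonly defined as the observed failure rate divided by the failure rate the SLO allows. A minimal sketch under that definition follows; the function name and the idea of counting "bad events" (late partitions, missing records, or failed samples) are assumptions for illustration.

```python
def burn_rate(bad_events, total_events, slo_target):
    """Observed failure rate divided by the failure rate the SLO allows.

    A value of 1.0 means the error budget is being spent exactly on
    schedule; values above 1.0 mean it will run out before the window ends.
    slo_target must be below 1.0, otherwise there is no budget to burn.
    """
    if total_events == 0:
        return 0.0
    error_budget = 1.0 - slo_target
    return (bad_events / total_events) / error_budget
```

For example, with a 95% freshness target, 10 late partitions out of 100 gives a burn rate of about 2: the budget is being consumed at roughly twice the sustainable pace.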

Couple the lag SLO to autoscaling where possible, so that a lag breach triggers consumer scale-up.
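One possible scale-up rule is to grow the consumer count in proportion to how far lag exceeds the SLO, up to a cap. This is only a sketch of the decision logic: the proportional rule, the function name, and the `max_consumers` cap are assumptions, and the actual scaling call depends on the platform in use.

```python
import math


def desired_consumers(current, lag_s, lag_slo_s, max_consumers):
    """Suggest a consumer count when lag breaches the SLO.

    Within the SLO the count is left unchanged. On a breach the count grows
    in proportion to the overshoot, by at least one, and never above the cap.
    """
    if lag_s <= lag_slo_s:
        return current
    scaled = math.ceil(current * lag_s / lag_slo_s)
    return min(max(scaled, current + 1), max_consumers)
```

This sketch only scales up; scaling back down is usually handled separately and more conservatively to avoid flapping.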

Review the SLOs quarterly. Pipeline workloads grow, and targets that fit last year may be too tight today.