Defining SLIs for Data Pipelines

Pipeline SLIs differ from request-response SLIs. This article covers the three dimensions that matter, the metric definitions, and the alerting that catches drift.

SLIs (service-level indicators) for data pipelines differ from SLIs for online services. Pipelines are not request-driven, and their outputs are datasets, not responses. The relevant dimensions are freshness, completeness, and correctness. Each dimension yields an SLI that the pipeline team can target and report against.

Freshness

Freshness measures how current the pipeline's output is: the time elapsed since the output dataset was last successfully updated. It is the most user-visible pipeline SLI, because downstream consumers work with the output directly and notice when it is stale.
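One way to turn freshness into a number is sketched below. It assumes freshness is measured as the lag between "now" and the output dataset's last successful update, and that the SLI is the fraction of periodic checks in which that lag stayed within a threshold. The function names, the one-hour threshold, and the sample lags are illustrative assumptions, not part of any particular orchestrator's API.

```python
from datetime import datetime, timedelta, timezone
from typing import Sequence


def freshness_lag(last_updated: datetime, now: datetime) -> timedelta:
    """Age of the output dataset: time since its last successful update."""
    return now - last_updated


def freshness_sli(lags: Sequence[timedelta], threshold: timedelta) -> float:
    """Fraction of freshness checks whose lag was within the threshold."""
    if not lags:
        raise ValueError("no freshness checks recorded")
    good = sum(1 for lag in lags if lag <= threshold)
    return good / len(lags)


# Hypothetical check results: the lag observed at four successive checks.
observed_lags = [
    timedelta(minutes=20),
    timedelta(minutes=45),
    timedelta(minutes=90),  # this check found the dataset stale
    timedelta(minutes=30),
]

# Assumed threshold: the output should never be more than one hour old.
sli = freshness_sli(observed_lags, threshold=timedelta(hours=1))
```

Here three of the four checks are within the threshold, so sli is 0.75. Counting good checks rather than averaging the lag keeps the SLI in the same "good events over total events" shape used for online-service SLOs.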

Completeness

Completeness measures whether the pipeline processed all the data it was supposed to. A pipeline that runs successfully but processes only 80% of the expected records has a completeness problem. The completeness SLI makes this visible by comparing the records processed against the records expected.

Completeness is the data-integrity SLI. Without it, partial pipeline failures masquerade as success.
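A minimal sketch of a completeness check follows, assuming the expected record count is available from somewhere upstream (for example, a source-side row count). The function names and the 99.9% target are hypothetical, chosen only to illustrate the idea.

```python
def completeness_sli(processed: int, expected: int) -> float:
    """Ratio of records processed to records expected, capped at 1.0."""
    if processed < 0 or expected < 0:
        raise ValueError("record counts must be non-negative")
    if expected == 0:
        # Nothing was expected, so nothing is missing.
        return 1.0
    # Capping is a design choice: processing more than expected (for
    # example, duplicates) is a different problem and needs its own check.
    return min(processed / expected, 1.0)


def run_is_complete(processed: int, expected: int, target: float = 0.999) -> bool:
    """True if this run met the (assumed) completeness target."""
    return completeness_sli(processed, expected) >= target


# The case from the text: the job exits successfully but handled
# only 80% of the expected records.
ratio = completeness_sli(processed=800, expected=1000)
ok = run_is_complete(processed=800, expected=1000)
```

In this example ratio is 0.8 and ok is False, so the run is flagged even though the job itself reported success.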

Correctness

Correctness is the hardest pipeline SLI to measure. It requires comparing the pipeline's output to ground truth, and ground truth is not always available. Sample-based verification is the practical compromise: check a sample of output records against a trusted source instead of verifying every record.
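Sample-based verification can be sketched as follows. The sketch assumes the output is addressable by key and that a trusted value can be looked up per key (here, a hypothetical lookup_truth callable standing in for a source-of-record query). The names, the sample size, and the data are all illustrative.

```python
import random
from typing import Callable, Mapping, Optional


def sampled_correctness(
    output: Mapping[str, object],
    lookup_truth: Callable[[str], object],
    sample_size: int,
    seed: Optional[int] = None,
) -> float:
    """Fraction of randomly sampled output records that match ground truth."""
    keys = sorted(output)  # stable order so a fixed seed gives a fixed sample
    k = min(sample_size, len(keys))
    if k <= 0:
        raise ValueError("nothing to sample")
    rng = random.Random(seed)
    sample = rng.sample(keys, k)  # sampling without replacement
    correct = sum(1 for key in sample if output[key] == lookup_truth(key))
    return correct / k


# Hypothetical pipeline output and trusted source; record "d" is wrong.
pipeline_output = {"a": 1, "b": 2, "c": 3, "d": 40}
source_of_record = {"a": 1, "b": 2, "c": 3, "d": 4}

# A sample size of 4 covers every record in this tiny example.
score = sampled_correctness(pipeline_output, source_of_record.__getitem__, sample_size=4)
```

With all four records sampled, score is 0.75. On a real dataset the sample is a small fraction of the output, so the result is an estimate whose precision depends on the sample size.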

Defining SLIs for pipelines is a discipline that brings the rigor of online-service SLOs to data pipelines. Nova AI Ops integrates with pipeline orchestrators and data quality tools, surfaces freshness, completeness, and correctness trends, and produces the per-pipeline SLO report that the data engineering team uses to drive quality.