CI/CD for Machine Learning: How MLOps Differs
MLOps is CI/CD with extra stages: data validation, reproducible training, model evaluation, drift monitoring. Same discipline; broader scope.
Why ML CI/CD differs
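An ML pipeline passes data through training, outputs a model, evaluates that model against benchmarks, and then deploys it.
Each stage has failure modes that traditional CI/CD does not handle: the code can be unchanged and every unit test can pass while the input data has shifted or the new model scores worse than the one already in production.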
Four extra stages
- 1. Data validation. Schema, distribution, freshness.
- 2. Model training. Reproducible; tracked experiments.
- 3. Model evaluation. Against benchmark + production-like sets.
- 4. Deployment + drift monitoring. Watch for performance degradation.
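The first and third stages can be expressed as plain pass/fail gates that a CI job runs before training and before deployment. The following is a minimal sketch using only the Python standard library, not the API of any tool named in this article. The schema, the column names (`user_id`, `amount`), the thresholds, and the function names `validate_batch` and `evaluation_gate` are all hypothetical and chosen for illustration.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean

# Hypothetical schema: column name -> expected Python type.
EXPECTED_SCHEMA = {"user_id": int, "amount": float}


def validate_batch(rows, newest_timestamp, reference_mean, now=None,
                   max_age=timedelta(hours=24), max_mean_shift=0.25):
    """Return a list of failure messages; an empty list means the batch passes."""
    now = now or datetime.now(timezone.utc)
    failures = []

    # Schema: every row must carry every expected column with the expected type.
    for i, row in enumerate(rows):
        for column, expected_type in EXPECTED_SCHEMA.items():
            if not isinstance(row.get(column), expected_type):
                failures.append(f"row {i}: {column} is not {expected_type.__name__}")

    # Freshness: the newest record must be recent enough.
    if now - newest_timestamp > max_age:
        failures.append("data is stale")

    # Distribution: a crude check on the relative shift of one column's mean.
    amounts = [row["amount"] for row in rows if isinstance(row.get("amount"), float)]
    if amounts and reference_mean:
        shift = abs(mean(amounts) - reference_mean) / abs(reference_mean)
        if shift > max_mean_shift:
            failures.append(f"amount mean shifted by {shift:.0%}")

    return failures


def evaluation_gate(candidate_metrics, baseline_metrics, tolerance=0.01):
    """Pass only if no baseline metric drops by more than `tolerance`.

    Assumes higher is better for every metric; a metric missing from the
    candidate counts as a failure.
    """
    return all(
        candidate_metrics.get(name, float("-inf")) >= baseline - tolerance
        for name, baseline in baseline_metrics.items()
    )
```

In a pipeline, a non-empty result from `validate_batch` or a `False` from `evaluation_gate` would fail the job, the same way a failing unit test does in traditional CI.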
Tooling per stage
Validation: Great Expectations, Pandera.
Training: MLflow, Weights & Biases, Kubeflow.
Eval: custom + standard benchmarks.
Drift: Evidently, Arize, WhyLabs.
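To make concrete what a drift check computes, here is a sketch of one common drift statistic, the population stability index (PSI), written with the standard library only. It does not reproduce the API of Evidently, Arize, or WhyLabs; the function name, the bin count, and the clamping of out-of-range values into the edge bins are assumptions made for this example.

```python
import math


def population_stability_index(reference, current, bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one numeric feature.

    Bins are equal-width over the reference range. Empty bins are floored at
    `eps` so the logarithm stays defined.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")

    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the reference range land in the first or last bin.
            index = int((v - lo) / width)
            counts[max(0, min(index, bins - 1))] += 1
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))
```

A commonly quoted rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift, but those cutoffs are conventions, not guarantees; a monitoring job would normally tune them per feature.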
Team structure
Cross-functional team: data engineers, ML engineers, platform engineers. Each owns a stage.
Without cross-functional ownership, models stall in the handoff between the data team and the platform team.
Antipatterns
- ML deployment via unmodified standard CI. Skips data validation and model evaluation.
- No drift monitoring. Models silently degrade.
- One person owns the full pipeline. Bus factor 1.
What to do this week
Three moves. (1) Add the extra stages to one pipeline first. (2) Measure deploy frequency and MTTR before and after. (3) Document the outcome so the next team starts from measured results.
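Move (2) needs a consistent definition of both metrics before the change, or the before/after comparison means little. One possible way to compute them from timestamps is sketched below. The function names and the input shapes (a list of deploy times, a list of `(started, restored)` incident pairs) are assumptions; real data would come from your deploy log and incident tracker.

```python
from datetime import datetime, timedelta


def deploys_per_week(deploy_times, window_start, window_end):
    """Deploys inside [window_start, window_end) divided by the window length in weeks."""
    weeks = (window_end - window_start) / timedelta(weeks=1)
    if weeks <= 0:
        raise ValueError("window_end must be after window_start")
    in_window = [t for t in deploy_times if window_start <= t < window_end]
    return len(in_window) / weeks


def mean_time_to_restore(incidents):
    """Average of (restored - started) over incidents; None if there were none."""
    if not incidents:
        return None
    durations = [restored - started for started, restored in incidents]
    return sum(durations, timedelta()) / len(durations)
```

Running both functions over the same window length before and after the change keeps the two measurements comparable.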