AI & ML · Intermediate · By Samson Tanimawo, PhD · Published Sep 16, 2025 · 8 min read

MLOps: 12 Things You'll Wish You Built Earlier

Every team that ships ML in production hits the same set of operational gaps within 12 months. Building these now saves your future self a quarter of fire-fighting.

The 12 things

  1. Experiment tracking: every training run logged with hyperparameters, metrics, and artifacts.
  2. Data versioning: every dataset has a version pinned to training runs.
  3. Model registry: every model has a version, lineage, and stage (dev / staging / prod).
  4. Reproducible training: any past run can be rerun bit-exact within a week.
  5. Eval harness: standardised eval scripts run against every candidate model.
  6. Pre-deploy gates: model cannot promote without passing eval thresholds.
  7. Canary deployment: new model serves a fraction of traffic first.
  8. Production monitoring: live metrics on prediction quality, not just latency.
  9. Drift detection: alarms when input distribution shifts from training distribution.
  10. Rollback path: previous model version one click away.
  11. Audit trail: who deployed what when, with what reasoning.
  12. Cost dashboard: $ per prediction visible to the team that owns the model.
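Of these, drift detection (item 9) is often the least familiar. One common approach is the population stability index (PSI), which compares how live inputs bin up against the training distribution. A minimal sketch in plain Python; the 10-bin layout, the 1e-4 floor, and the usual "PSI above 0.25 means significant shift" rule of thumb are conventions, not fixed values:

```python
import math
from collections import Counter

def psi(expected, actual, bins=10):
    """Population Stability Index between two numeric samples.

    PSI = sum_i (a_i - e_i) * ln(a_i / e_i), where e_i and a_i are the
    fractions of the expected (training) and actual (live) values that
    fall into bin i. Bins are sized from the expected sample's range.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def fractions(values):
        counts = Counter(
            min(max(int((v - lo) / width), 0), bins - 1) for v in values
        )
        n = len(values)
        # A small floor keeps the log defined when a bin is empty.
        return [max(counts[i] / n, 1e-4) for i in range(bins)]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Wire the alarm to whatever threshold your domain tolerates; 0.1 is often read as "moderate shift, investigate" and 0.25 as "significant shift, retrain".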

In what order

Most teams build them in roughly this priority order:

  1. Experiment tracking + model registry. Without these, nothing else works.
  2. Eval harness + pre-deploy gates. Stops obvious regressions.
  3. Production monitoring + canary deployment. Catches real-world failures.
  4. Rollback + audit trail. For when something goes wrong.
  5. Drift detection + data versioning. For long-running stability.
  6. Cost dashboard + reproducibility. For mature optimisation.

Build incrementally. Each step pays for itself before the next is needed.
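The registry in tier 1 does not need to be elaborate to be useful. An in-memory sketch of the minimum it must track (name, version, lineage back to a run, and stage); the `Registry` class and its methods here are hypothetical, and real deployments persist this in a database or a tool like MLflow:

```python
from dataclasses import dataclass, field

STAGES = ("dev", "staging", "prod")

@dataclass
class Registry:
    # name -> list of version entries, oldest first
    models: dict = field(default_factory=dict)

    def register(self, name, run_id):
        """Record a new version, linked to the training run that made it."""
        versions = self.models.setdefault(name, [])
        entry = {"version": len(versions) + 1, "run_id": run_id, "stage": "dev"}
        versions.append(entry)
        return entry["version"]

    def promote(self, name, version, stage):
        """Move a version between dev / staging / prod."""
        assert stage in STAGES, f"unknown stage: {stage}"
        self.models[name][version - 1]["stage"] = stage

    def current(self, name, stage="prod"):
        """Latest version in the given stage, or None."""
        for entry in reversed(self.models[name]):
            if entry["stage"] == stage:
                return entry
        return None
```

The `run_id` field is the lineage hook: it should point back into your experiment tracker, so any serving model can be traced to the run that produced it.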

The lite version (small team)

If you have one ML engineer and a quarter, here’s a minimum viable MLOps:

  1. Experiment tracking + model registry
  2. Eval harness + pre-deploy gates
  3. Production monitoring + canary deployment

That covers 6 of the 12. It’s a week of work, and it shrinks your incident surface area dramatically.
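For the lite version, experiment tracking needs no infrastructure on day one. A toy stand-in that captures the same shape of record a real tracker keeps (the `log_run` helper is hypothetical, not part of any library): append one JSON line per training run.

```python
import json
import time
import uuid
from pathlib import Path

def log_run(params, metrics, artifacts=(), path="runs.jsonl"):
    """Append one training run's record to a JSON-lines log.

    A minimal version of item 1: hyperparameters, metrics, and artifact
    paths, keyed by a unique run id and a timestamp.
    """
    record = {
        "run_id": uuid.uuid4().hex,
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
        "artifacts": list(artifacts),
    }
    with Path(path).open("a") as f:
        f.write(json.dumps(record) + "\n")
    return record["run_id"]
```

Once this file exists, "which hyperparameters produced our best model?" becomes a one-line grep instead of an archaeology project; graduating to a real tracker later is a data migration, not a process change.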

Anti-patterns

Three patterns to avoid:

What to build first this week

If you’ve done none of this: experiment tracking. MLflow takes an afternoon to set up. Suddenly every run is logged. The next week, build the model registry on top.

If you have tracking but no eval: write the eval script. Make it a CI step on the model repo. Block merges that fail evals.
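A gate of that shape can be a few lines. In this sketch the threshold numbers and metric names are illustrative, and the eval script is assumed to emit a metrics JSON file; the script exits nonzero on failure so CI blocks the merge:

```python
import json
import sys

# Illustrative floors; set these from your own eval baselines.
THRESHOLDS = {"accuracy": 0.90, "f1": 0.85}

def gate(metrics, thresholds=THRESHOLDS):
    """Return {metric: (got, floor)} for every threshold the candidate misses."""
    return {
        name: (metrics.get(name, 0.0), floor)
        for name, floor in thresholds.items()
        if metrics.get(name, 0.0) < floor
    }

if __name__ == "__main__" and len(sys.argv) > 1:
    metrics = json.load(open(sys.argv[1]))  # produced by the eval script
    failures = gate(metrics)
    for name, (got, floor) in failures.items():
        print(f"FAIL {name}: {got:.3f} < {floor:.3f}")
    sys.exit(1 if failures else 0)
```

Keep the thresholds in the repo, next to the eval script, so changing a floor goes through code review like everything else.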

The rest grows from there. Each piece feeds the next.
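Of the remaining pieces, canary deployment (item 7) is mostly a routing decision. One common sketch (the function and model labels here are hypothetical): hash a stable request key rather than sampling randomly, so each user stays pinned to one model version across requests.

```python
import hashlib

def pick_model(user_id, canary_fraction=0.05):
    """Deterministically route a fraction of traffic to the candidate model.

    Hashing the user id gives a stable bucket in [0, 10000); users whose
    bucket falls under the canary fraction always see the new model.
    """
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 10_000
    return "candidate" if bucket < canary_fraction * 10_000 else "stable"
```

Pair this with your production monitoring: if the candidate's quality metrics hold at 5% of traffic, widen the fraction; if not, the rollback path (item 10) is just setting it to zero.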