AI & ML Intermediate By Samson Tanimawo, PhD Published Oct 7, 2025 7 min read

Feature Stores: What, Why, When

A feature store is a database optimised for serving the same features at training time and serving time. If those two paths diverge, your model breaks. The store is the fix.

The training/serving skew problem

Training data is computed offline, often in a data warehouse. Serving data is computed online, often in milliseconds, against live state. If the computation differs in any way (rounding, time zones, null handling, aggregation window), the model sees different distributions in training and serving. Accuracy collapses.

This is “training/serving skew,” arguably the single biggest source of silent ML failures in production.
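To make the failure mode concrete, here is a minimal sketch of skew caused by null handling alone. Both functions and the `orders` sample are hypothetical, invented for illustration; the point is only that two reasonable-looking implementations of "average order value" disagree on the same input.

```python
import math
from typing import Optional


def avg_order_value_training(amounts: list[Optional[float]]) -> float:
    """Hypothetical offline path: missing amounts are dropped before averaging."""
    present = [a for a in amounts if a is not None]
    return sum(present) / len(present) if present else 0.0


def avg_order_value_serving(amounts: list[Optional[float]]) -> float:
    """Hypothetical online path: missing amounts are treated as 0.0."""
    filled = [a if a is not None else 0.0 for a in amounts]
    return sum(filled) / len(filled) if filled else 0.0


orders = [20.0, None, 40.0]
train_value = avg_order_value_training(orders)  # (20 + 40) / 2 = 30.0
serve_value = avg_order_value_serving(orders)   # (20 + 0 + 40) / 3 = 20.0

# The model was trained on 30.0 but is served 20.0 for the same customer.
skewed = not math.isclose(train_value, serve_value)
```

Neither function raises an error, which is why this kind of divergence tends to stay silent until accuracy is measured.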

A feature store solves it by being the canonical computation path for every feature, used by both training pipelines and serving pipelines. Same code, same definition, same result.
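The "same code, same definition" idea can be sketched in plain Python, without any particular feature-store product. The feature `days_since_last_order`, the `-1.0` sentinel for missing history, and both wrapper functions are assumptions made for this example.

```python
from datetime import datetime, timezone
from typing import Optional


def days_since_last_order(last_order: Optional[datetime], as_of: datetime) -> float:
    """The single feature definition: timezone-aware inputs, -1.0 when there is no history."""
    if last_order is None:
        return -1.0
    return (as_of - last_order).total_seconds() / 86400.0


def build_training_rows(history: list[tuple[Optional[datetime], datetime]]) -> list[float]:
    """Offline path: apply the shared definition to each (last_order, label_time) pair."""
    return [days_since_last_order(last, as_of) for last, as_of in history]


def serve_feature(last_order: Optional[datetime], now: datetime) -> float:
    """Online path: apply the same shared definition to live state."""
    return days_since_last_order(last_order, now)


last = datetime(2025, 10, 1, 12, 0, tzinfo=timezone.utc)
as_of = datetime(2025, 10, 3, 12, 0, tzinfo=timezone.utc)

offline_values = build_training_rows([(last, as_of), (None, as_of)])
online_values = [serve_feature(last, as_of), serve_feature(None, as_of)]
```

Because both paths call one function, a change to null handling or time arithmetic reaches training and serving together, so the two cannot drift apart.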

Online vs offline

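Feature stores split into two storage paths:

  1. The offline store holds historical feature values, typically in a data warehouse or data lake, and is read in bulk to build training sets.
  2. The online store holds the latest value of each feature per entity, typically in a low-latency key-value store, and is read at serving time.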

The feature store handles synchronisation: when a feature is computed offline, its latest values are streamed to the online store, so serving reads values produced by the same definition.
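A rough sketch of that synchronisation, using an in-memory list as the offline store and a dict as the online store. Everything here (`FeatureRow`, `materialize_latest`, `point_in_time_value`) is a hypothetical name, and the sync is a batch copy rather than a stream, to keep the example short. The point-in-time lookup is an addition not described above: it shows how a training job might read the value that was current at a given moment instead of the latest one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeatureRow:
    entity_id: str
    event_ts: int   # Unix seconds at which the value became valid
    value: float


def materialize_latest(offline_rows: list[FeatureRow]) -> dict[str, float]:
    """Copy the most recent value per entity into a dict acting as the online store."""
    online: dict[str, float] = {}
    for row in sorted(offline_rows, key=lambda r: r.event_ts):
        online[row.entity_id] = row.value  # later rows overwrite earlier ones
    return online


def point_in_time_value(
    offline_rows: list[FeatureRow], entity_id: str, as_of_ts: int
) -> Optional[float]:
    """Training lookup: the latest value at or before as_of_ts, or None if there is none."""
    candidates = [
        r for r in offline_rows if r.entity_id == entity_id and r.event_ts <= as_of_ts
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.event_ts).value


rows = [
    FeatureRow("u1", 100, 1.0),
    FeatureRow("u1", 200, 2.0),
    FeatureRow("u2", 150, 5.0),
]
online_store = materialize_latest(rows)
```

Serving reads `online_store["u1"]` and gets the newest value, while a training example labelled at timestamp 150 would read the older value through `point_in_time_value(rows, "u1", 150)`.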

Popular options in 2025

For most teams, self-hosted Feast is the first stop. Migrate when team size or feature complexity demands it.

When you don’t need a feature store

Three signs you can skip it:

Most teams with fewer than 5 ML engineers don’t need one. Most teams with more than 15 do. In between, it is a judgement call.

How to start

  1. Identify your top 5 features by impact. Document their definitions.
  2. Migrate those into Feast first. Confirm offline and online values match.
  3. Update one model’s training and serving paths to use the feature store. Measure: did skew drop?
  4. If yes, expand. If no, keep iterating elsewhere: the feature store isn’t your bottleneck.
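The "confirm offline and online values match" check in step 2 can be as simple as the sketch below. It assumes you have already pulled the same feature for the same entities from both stores into two dicts; `parity_report`, the tolerance, and the sample values are hypothetical.

```python
import math


def parity_report(
    offline: dict[str, float], online: dict[str, float], rel_tol: float = 1e-9
) -> dict:
    """Compare offline and online values per entity and report the mismatch rate."""
    mismatches = []
    for entity_id, offline_value in offline.items():
        online_value = online.get(entity_id)
        # An entity missing from the online store counts as a mismatch.
        if online_value is None or not math.isclose(
            offline_value, online_value, rel_tol=rel_tol
        ):
            mismatches.append(entity_id)
    rate = len(mismatches) / len(offline) if offline else 0.0
    return {"checked": len(offline), "mismatches": mismatches, "mismatch_rate": rate}


offline_sample = {"u1": 2.0, "u2": 5.0, "u3": 0.3, "u4": 1.0}
online_sample = {"u1": 2.0, "u2": 5.5, "u3": 0.1 + 0.2}  # u4 was never synced

report = parity_report(offline_sample, online_sample)
```

Tracking `mismatch_rate` before and after the migration gives step 3 a number to compare, rather than an impression. The tolerance lets float noise such as `0.1 + 0.2` pass while real divergence is flagged.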

The mistake to avoid: a six-month feature-store project with no measurable improvement. Start small, measure, expand. The investment compounds when it works and is salvageable when it doesn’t.