By Samson Tanimawo, PhD. Published Feb 9, 2026.

Flaky Test Replay

Captured flake reproduced.

Capture

Flaky tests are hard to fix because the failure is gone by the time you go looking for it. You hit retry, the test passes, and you move on. The bug is still there; you have just lost the only useful evidence. The fix is to capture everything you would need to reproduce the failure at the moment it happens, not after.

What to capture, every failed run:

Bundle all of this into a single artifact named for the test ID and the run ID, attach it to the failed CI run, and retain it for at least 30 days. The capture itself should add no more than a couple of seconds to the test run. The cost is small. The value when you actually need it is enormous.
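As a rough illustration of the bundling step, here is a minimal Python sketch. The function name `capture_failure`, the `state.json` layout, and the specific fields recorded (random seed, interpreter version, platform, and environment variables matching a prefix) are assumptions made for this example; substitute whatever state your tests actually depend on. Retention and attaching the file to the CI run are left to your CI system.

```python
import json
import os
import platform
import sys
import tarfile
import tempfile
import time
from pathlib import Path


def capture_failure(test_id: str, run_id: str, seed: int, out_dir: Path,
                    env_prefixes: tuple = ("CI", "TEST_")) -> Path:
    """Write one replay bundle named for the test ID and the run ID.

    The fields below are illustrative assumptions, not a complete list.
    """
    state = {
        "test_id": test_id,
        "run_id": run_id,
        "seed": seed,
        "captured_at": time.time(),
        "python": sys.version,
        "platform": platform.platform(),
        # Only keep environment variables with the chosen prefixes,
        # so unrelated secrets are less likely to end up in the bundle.
        "env": {k: v for k, v in os.environ.items()
                if k.startswith(env_prefixes)},
    }

    out_dir.mkdir(parents=True, exist_ok=True)
    # Test IDs often contain path separators; make them filename-safe.
    safe_test_id = test_id.replace("/", "_").replace("::", "-")
    bundle = out_dir / f"{safe_test_id}__{run_id}.tar.gz"

    with tempfile.TemporaryDirectory() as tmp:
        state_file = Path(tmp) / "state.json"
        state_file.write_text(json.dumps(state, indent=2, sort_keys=True))
        with tarfile.open(bundle, "w:gz") as tar:
            tar.add(state_file, arcname="state.json")
    return bundle
```

A test-runner failure hook would call this once per failed test and upload the returned path as a build artifact. Writing a small JSON file and compressing it is cheap, which keeps the overhead well inside the couple-of-seconds budget.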

Replay

Once the capture exists, the next move is to run the test against the captured state on a developer machine and watch it fail in slow motion. This is where the bug actually gets understood, because now you have a reproducer instead of a mystery.
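A local replay could look something like the sketch below. It assumes the capture is a gzipped tarball containing a `state.json` with a `seed` and an `env` map; that layout, and the names `load_state`, `captured_environment`, and `replay`, are hypothetical choices for this example. Real replays usually also need to restore external state such as database fixtures or recorded network responses, which is out of scope here.

```python
import json
import os
import random
import tarfile
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Optional


def load_state(bundle: Path) -> dict:
    """Read state.json out of a captured bundle (assumed layout)."""
    with tarfile.open(bundle, "r:gz") as tar:
        member = tar.extractfile("state.json")
        if member is None:
            raise ValueError(f"{bundle} has no readable state.json")
        return json.load(member)


@contextmanager
def captured_environment(state: dict):
    """Apply the captured env vars and seed, then restore the env."""
    saved = dict(os.environ)
    os.environ.update(state.get("env", {}))
    random.seed(state["seed"])
    try:
        yield
    finally:
        os.environ.clear()
        os.environ.update(saved)


def replay(bundle: Path,
           test_fn: Callable[[], None]) -> Optional[AssertionError]:
    """Run test_fn under the captured state.

    Returns the AssertionError if the failure reproduces, else None.
    """
    state = load_state(bundle)
    with captured_environment(state):
        try:
            test_fn()
        except AssertionError as exc:
            return exc
    return None
```

Because the seed and environment are restored before every call, running `replay` twice on the same bundle should take the same code path, which is what lets you attach a debugger or add logging and still see the same failure.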

Replay turns flaky-test debugging from a guessing game into a mechanical process. The first time it pays off, the team becomes loud advocates for it.

Validate

The last step is the one most teams skip: prove the fix actually works against the original failure, not just against your hopeful new test case.
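One way to make that check mechanical is sketched below, under simplifying assumptions: the only captured state is a random seed, and the names `ValidationResult`, `fails`, and `validate_fix` are invented for this example. The idea is to first confirm that the original test still fails under the captured state, so you know the reproducer is real, and only then run the fixed version repeatedly under that same state.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class ValidationResult:
    reproduced_before_fix: bool  # did the original fail under the capture?
    failures_after_fix: int      # failures of the fixed test under the capture
    runs: int

    @property
    def fix_validated(self) -> bool:
        # A fix only counts if the original failure was actually reproduced.
        return self.reproduced_before_fix and self.failures_after_fix == 0


def fails(test_fn: Callable[[], None], seed: int) -> bool:
    """Run test_fn under the captured seed; True if it raises AssertionError."""
    random.seed(seed)
    try:
        test_fn()
    except AssertionError:
        return True
    return False


def validate_fix(original: Callable[[], None], fixed: Callable[[], None],
                 seed: int, runs: int = 20) -> ValidationResult:
    """Check the fix against the original captured failure."""
    reproduced = fails(original, seed)
    # Repeating matters when timing or other non-seeded state is involved.
    failures = sum(fails(fixed, seed) for _ in range(runs))
    return ValidationResult(reproduced, failures, runs)
```

If `reproduced_before_fix` comes back false, a green result for the fixed test proves nothing, because the captured state never triggered the bug in the first place.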

The capture, replay, validate loop is the difference between a team that lives with flaky tests forever and a team that retires them. Nova AI Ops captures process and external state on every CI failure, packages a replay artifact you can run locally, and tracks the cohort of validated fixes so you can see the flake rate dropping quarter over quarter instead of guessing whether it is improving.