CI/CD & GitOps Practical By Samson Tanimawo, PhD Published Nov 1, 2025 4 min read

Test Flakiness Budget

A flake budget caps how many flaky test runs a team tolerates and forces fixes once the cap is exceeded.

What a flake budget is

A flake budget is the maximum acceptable percentage of test runs that flake. When the flake rate goes above the budget, no new tests merge until cleanup brings it back under.

Typical budget: no more than 1% of CI runs experience a flake.
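A minimal sketch of the budget check, assuming you already count CI runs and the runs that hit at least one flake. The function names and the default of 0.01 (the 1% figure above) are illustrative, not part of any particular tool.

```python
def flake_rate(flaky_runs: int, total_runs: int) -> float:
    """Fraction of CI runs that experienced at least one flake."""
    if total_runs == 0:
        return 0.0  # no runs yet, so nothing has flaked
    return flaky_runs / total_runs


def over_budget(flaky_runs: int, total_runs: int, budget: float = 0.01) -> bool:
    """True when the observed flake rate exceeds the budget."""
    return flake_rate(flaky_runs, total_runs) > budget


# 15 flaky runs out of 1,000 is 1.5%, which is over a 1% budget.
print(over_budget(15, 1000))
```

A merge gate could call `over_budget` and block pull requests that add tests while it returns `True`.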

The budget forces ownership. The team that adds the flake also fixes it.

Measuring flakes

Re-run failed tests on the same SHA. If a test fails and then passes on retry with no code change, mark it as a flake.
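The retry rule can be sketched as a small classifier. This assumes each test's attempts on one SHA are available as an ordered list of pass/fail booleans; `Outcome` and `classify` are hypothetical names for illustration.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FLAKE = "flake"
    FAIL = "fail"


def classify(attempts: list[bool]) -> Outcome:
    """Classify one test from its ordered attempts on the same SHA.

    attempts[0] is the first run; later entries are retries.
    """
    if not attempts:
        raise ValueError("at least one attempt is required")
    if attempts[0]:
        return Outcome.PASS
    # Failed first, then passed on some retry with no code change: a flake.
    if any(attempts[1:]):
        return Outcome.FLAKE
    return Outcome.FAIL
```

A test that fails on every attempt stays a real failure, so retries do not hide genuine regressions.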

Track per-suite flake rate. Some suites (browser tests, integration) are inherently flakier.
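One way to track this, assuming each CI run is recorded as a `(suite_name, flaked)` pair. The record shape and suite names are assumptions, not the format of any specific tool.

```python
from collections import defaultdict


def per_suite_flake_rate(runs):
    """Return {suite: flake rate} from (suite_name, flaked) records."""
    totals = defaultdict(int)
    flakes = defaultdict(int)
    for suite, flaked in runs:
        totals[suite] += 1
        if flaked:
            flakes[suite] += 1
    return {suite: flakes[suite] / totals[suite] for suite in totals}


runs = [
    ("unit", False), ("unit", False), ("unit", False), ("unit", True),
    ("browser", True), ("browser", False),
]
print(per_suite_flake_rate(runs))
```

Keeping the rates separate per suite stops a noisy browser suite from masking a regression in a normally stable unit suite.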

Tools: BuildPulse, Trunk.io, GitHub's flaky test detection.

When you blow the budget

Halt new test additions. Existing tests keep running, but no new tests merge until the flake rate drops back under budget.

Quarantine the worst offenders. Move them to a non-blocking suite so they still run but cannot fail the build.
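Picking the worst offenders can be as simple as counting recorded flakes per test. A minimal sketch, assuming a flat list of test IDs with one entry per observed flake; `worst_offenders` and the test IDs are hypothetical.

```python
from collections import Counter


def worst_offenders(flake_events: list[str], top_n: int = 5) -> list[str]:
    """Return the test IDs with the most recorded flakes, flakiest first."""
    counts = Counter(flake_events)
    return [test_id for test_id, _ in counts.most_common(top_n)]


events = ["test_checkout", "test_login", "test_checkout",
          "test_search", "test_login", "test_checkout"]
print(worst_offenders(events, top_n=2))
```

In a pytest project, one common way to make the selected tests non-blocking is a custom marker (here a hypothetical `quarantine`) that the blocking job excludes with `-m "not quarantine"` and a separate non-blocking job runs with `-m quarantine`.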

Allocate engineering time. Flake fixes don't ship features; they need explicit prioritization.

Preventing new flakes

Code review checks: explicit synchronization, no `sleep()` in tests, deterministic test data.

Pre-merge: run new tests 10 times in CI before allowing the merge.
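A sketch of the repeat-run gate, assuming a new test can be invoked as a plain callable that raises on failure. `passes_repeatedly` is a hypothetical helper; real CI setups would more likely rely on the test runner's own repeat option or a plugin.

```python
from typing import Callable


def passes_repeatedly(test_fn: Callable[[], None], runs: int = 10) -> bool:
    """Run test_fn `runs` times; return False on the first failure."""
    for _ in range(runs):
        try:
            test_fn()
        except Exception:  # AssertionError is a subclass of Exception
            return False
    return True


def test_addition() -> None:
    assert 1 + 1 == 2


print(passes_repeatedly(test_addition))
```

Ten clean runs do not prove a test is stable, but they catch the flakes that show up most often before they reach the main branch.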

Assign a dedicated reviewer for test quality on tier-1 services.

How to set the budget

Start with the current flake rate as the baseline. Set the budget to half of that rate (budget = current rate * 0.5), to be reached within 6 months.
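As a worked example of the halving rule: a 4% baseline gives a 2% target. The sketch below also adds a linear month-by-month glide path toward that target. The glide path is an assumption for illustration; the rule itself only fixes the baseline and the six-month target.

```python
def target_budget(baseline_rate: float, reduction: float = 0.5) -> float:
    """Budget to reach at the end of the period: baseline * reduction."""
    return baseline_rate * reduction


def interim_budget(baseline_rate: float, month: int,
                   months: int = 6, reduction: float = 0.5) -> float:
    """Assumed linear glide path from the baseline down to the target."""
    target = target_budget(baseline_rate, reduction)
    month = min(max(month, 0), months)  # clamp to the planning window
    return baseline_rate - (baseline_rate - target) * month / months


baseline = 0.04  # hypothetical: 4% of CI runs flake today
for m in range(7):
    print(m, round(interim_budget(baseline, m), 4))
```

Stepping the budget down each month gives the team a moving line to stay under, rather than a single deadline at month six.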

Tighter for unit tests (under 0.1% flake rate is achievable). Looser for end-to-end tests (1-5% may be necessary).
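Per-tier budgets can be expressed as a lookup table. In this sketch the tier names, the suite names, and the 5% end-to-end value (picked from the 1-5% range above) are all assumptions.

```python
# Assumed budgets per test tier, as fractions of runs that may flake.
BUDGETS = {"unit": 0.001, "e2e": 0.05}


def suites_over_budget(rates: dict[str, float],
                       tiers: dict[str, str]) -> list[str]:
    """Return the suites whose flake rate exceeds their tier's budget."""
    return sorted(
        suite for suite, rate in rates.items()
        if rate > BUDGETS[tiers[suite]]
    )


rates = {"core_unit": 0.002, "checkout_e2e": 0.03}
tiers = {"core_unit": "unit", "checkout_e2e": "e2e"}
print(suites_over_budget(rates, tiers))
```

Here the unit suite is over budget at 0.2% while the end-to-end suite at 3% is not, which is the point of setting budgets per tier.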

Publish the budget and current rate weekly. Visibility drives the cleanup.