Pipeline Fail-Fast Patterns
Fail fast, signal early.
Why fail fast
A 30-minute pipeline that fails at minute 28 wastes 28 minutes.
Fail-fast: cheap checks first, expensive checks last. The pipeline aborts as soon as a check fails.
Reduces feedback time, reduces compute cost, reduces engineer wait time.
Stage ordering
Lint and format: 30 seconds.
Type check: 1-3 minutes.
Unit tests: 3-10 minutes.
Integration tests: 10-30 minutes.
Security scans, image builds, deploy: only if everything above passed.
Parallel where it helps
Lint, type check, and unit tests can run in parallel for different modules.
Don't parallelize at the cost of clarity. A 50-job matrix that nobody understands is worse than a 5-job sequential pipeline.
Cap parallelism. CI runners are not infinite; over-parallelizing hits queueing limits.
Fast feedback to author
On failure, notify the PR author immediately. Don't wait for the full pipeline to finish.
Slack notification with the failure stage and a link to the log line.
Annotate the PR with the failure: GitHub Checks, GitLab merge request comments.
How to roll this out
Audit current pipeline for stages that always fail late. Move fast checks earlier.
Set a target: 90% of failures detected in the first 5 minutes.
Track time-to-failure as a CI metric. Improvements compound across thousands of builds.