CI Cost Attribution
Attributing CI spend to the teams that generate it.
Why attribute CI cost
CI bills can reach $50k-500k per month at engineering scale. Without attribution, that cost is a single line item and nobody owns optimisation. Attribution shows who consumes what; often one team’s slow integration tests dominate the bill. Chargeback makes budget conversations concrete, because the team that complains about cost becomes the team that optimises.
- Bill scale. $50k-500k per month at engineering scale; the line item is real money.
- Single line item problem. Without attribution, nobody owns optimisation; the cost stays opaque.
- Attribution surfaces concentration. Often one team’s slow integration tests dominate the bill; visibility drives action.
- Chargeback enables ownership. Each team sees their spend; the complainer becomes the optimiser.
Tagging and tracking
The mechanism is per-run tagging plus cost computation. Tag every workflow run with team, repo, and environment. GitHub Actions and CircleCI support custom tags; self-hosted runners need explicit tagging. Per-run cost is minutes multiplied by the runner's per-minute rate: $0.005-0.02 per minute on spot self-hosted runners, $0.008-0.16 per minute on GitHub-hosted runners. Aggregate the results daily into a cost table that feeds per-team dashboards.
- Per-workflow tags. Team, repo, environment; GitHub Actions and CircleCI support custom tags; self-hosted runners need explicit tagging.
- Per-run cost. Minutes × per-minute runner rate; spot self-hosted $0.005-0.02/min, GitHub-hosted $0.008-0.16/min.
- Daily aggregation. Cost table built daily; per-team dashboard surfaces monthly spend, trend, top workflows.
- Quarterly review. Present per-team cost at quarterly engineering reviews so the data informs investment decisions.
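The tagging and aggregation steps above can be sketched in a few lines of Python. This is a minimal illustration, not a billing integration: the `Run` record, the runner labels, and the rates in `RATE_PER_MINUTE` are hypothetical placeholders (the rates are simply picked from within the ranges quoted above), and a real pipeline would read runs from the CI provider's usage export.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

# Hypothetical per-minute rates in USD, chosen from within the ranges quoted
# in the text. Replace them with the rates on your own bill.
RATE_PER_MINUTE = {
    "self-hosted-spot": 0.01,
    "hosted-linux-2core": 0.008,
}


@dataclass(frozen=True)
class Run:
    """One CI workflow run carrying its attribution tags (hypothetical schema)."""
    team: str
    repo: str
    environment: str
    runner: str      # must be a key of RATE_PER_MINUTE
    minutes: float   # billed minutes for the run
    day: date


def run_cost(run: Run) -> float:
    """Per-run cost: billed minutes times the runner's per-minute rate."""
    return run.minutes * RATE_PER_MINUTE[run.runner]


def daily_cost_by_team(runs):
    """Aggregate run costs into a {(day, team): cost} table."""
    totals = defaultdict(float)
    for run in runs:
        totals[(run.day, run.team)] += run_cost(run)
    return dict(totals)
```

Keying the table by `(day, team)` keeps the daily granularity, so monthly spend, trend, and top-workflow views can all be derived from the same table by grouping further.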
Optimisation patterns
Three patterns drive most CI savings. Caching (layer, dependency, test result) is the highest-leverage win, cutting time 50-80%. Skipping unchanged paths, via path filters or change-detection scripts, avoids running work that a change cannot affect. Right-sizing runners means profiling build CPU usage, because a 4-core runner often beats a 2-core runner on wall-clock time despite its higher per-minute cost.
- Caching is the highest-leverage win. Docker layer cache, dependency cache (npm, pip, cargo), test result cache; cuts CI time 50-80%.
- Skip unchanged paths. Path filters or change-detection scripts; if only docs changed, skip integration tests.
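- Right-size runners. A 4-core runner often finishes faster than a 2-core runner despite its higher per-minute cost; it is also cheaper per job only when the speedup exceeds the price ratio, so profile CPU usage first.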
- Per-pattern measurement. Measure the actual savings from each optimisation; measured results justify continued investment.
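The right-sizing trade-off comes down to simple arithmetic: job cost is wall-clock minutes times the per-minute rate, so a faster runner is cheaper only if it cuts time by more than its price premium. The sketch below works this through. The timings are invented for illustration, and the 4-core rate of $0.016/min is an assumption (double the $0.008/min lower bound quoted earlier).

```python
def job_cost(wall_clock_minutes: float, rate_per_minute: float) -> float:
    """Cost of one job: wall-clock minutes times the runner's per-minute rate."""
    return wall_clock_minutes * rate_per_minute


def cheaper_runner(candidates):
    """Return the name of the runner with the lowest per-job cost.

    candidates maps a runner name to (wall_clock_minutes, rate_per_minute),
    where the minutes come from profiling the same job on each runner.
    """
    return min(candidates, key=lambda name: job_cost(*candidates[name]))


# Illustrative, made-up measurements for one build job.
well_parallelised = {
    "2-core": (30, 0.008),  # 30 min at $0.008/min
    "4-core": (13, 0.016),  # 13 min at an assumed $0.016/min
}
poorly_parallelised = {
    "2-core": (30, 0.008),
    "4-core": (18, 0.016),  # faster, but not twice as fast
}
```

With these numbers, `cheaper_runner(well_parallelised)` picks the 4-core runner, while `cheaper_runner(poorly_parallelised)` picks the 2-core runner even though the 4-core one still finishes sooner. That is why profiling has to come before resizing.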
Budget enforcement
Budget enforcement makes attribution actionable. A per-team monthly cap, with a soft warning at 80% and a hard stop at 100% that requires a manager override, forces conversations early. A per-PR budget flags single PRs consuming $100 or more, which often indicates a missing cache or a runaway test loop. Quarterly recalibration requires cap changes to be justified against actual workflow growth.
- Per-team monthly cap. Soft warning at 80%, hard stop at 100% requires manager override; forces conversations early.
- Per-PR budget. PRs consuming $100+ are flagged; often a missing cache or runaway test loop.
- Quarterly recalibration. Teams that outgrow their caps justify the increase; teams under their caps can take on more workflows.
- Per-budget review owner. Each cap has a named owner who can defend changes to it, which keeps accountability clear.
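The cap and flag rules above can be expressed as a small policy check. This is a sketch under stated assumptions: the function names and the `BudgetStatus` enum are hypothetical, and treating an overridden hard stop as a warning rather than as normal spend is one possible design choice, not something the rules above prescribe.

```python
from enum import Enum


class BudgetStatus(Enum):
    OK = "ok"
    WARN = "warn"        # soft warning: notify the team, keep running
    BLOCKED = "blocked"  # hard stop: runs need a manager override

SOFT_WARNING_FRACTION = 0.80  # soft warning at 80% of the cap
PER_PR_FLAG_USD = 100.0       # flag PRs consuming $100 or more


def team_budget_status(spend: float, cap: float,
                       manager_override: bool = False) -> BudgetStatus:
    """Classify a team's month-to-date spend against its monthly cap."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    if spend >= cap:
        # Assumption: an override lets runs continue but keeps the warning visible.
        return BudgetStatus.WARN if manager_override else BudgetStatus.BLOCKED
    if spend / cap >= SOFT_WARNING_FRACTION:
        return BudgetStatus.WARN
    return BudgetStatus.OK


def pr_needs_review(pr_cost: float) -> bool:
    """Flag a single PR whose accumulated CI cost reaches the per-PR threshold."""
    return pr_cost >= PER_PR_FLAG_USD
```

For example, a team at $850 of a $1,000 cap gets `BudgetStatus.WARN`, and the same team at $1,000 gets `BudgetStatus.BLOCKED` until a manager passes the override.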
Operational discipline
Three disciplines compound CI cost ownership. Assign an owner to every workflow, because workflows without owners are technical debt. Report cost in PR comments, because visibility creates awareness. Track CI cost as a percentage of engineering payroll: 1-3% is healthy, and above 5% CI optimisation pays back fast.
- Owner per workflow. Workflows without owners are technical debt; quarterly review surfaces orphans.
- Cost in PR comments. CI tool reports per-PR cost back to the engineer; visibility drives optimisation.
- Cost as % of payroll. Healthy 1-3%; above 5%, CI optimisation pays back fast.
- Per-quarter discipline review. Measure adherence to each discipline every quarter; the results justify continued investment.
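The payroll ratio is a one-line calculation, sketched below with the thresholds stated above. The function names and labels are hypothetical. The text does not characterise ratios below 1% or between 3% and 5%, so the sketch deliberately leaves those bands unclassified rather than guessing.

```python
def ci_cost_share(monthly_ci_cost: float, monthly_payroll: float) -> float:
    """CI cost as a fraction of engineering payroll for the same period."""
    if monthly_payroll <= 0:
        raise ValueError("monthly_payroll must be positive")
    return monthly_ci_cost / monthly_payroll


def classify_share(share: float) -> str:
    """Map a CI-to-payroll fraction onto the bands named in the text."""
    if share > 0.05:
        return "optimise"   # above 5%: CI optimisation pays back fast
    if 0.01 <= share <= 0.03:
        return "healthy"    # 1-3% is the healthy band
    # Below 1% or between 3% and 5%: not characterised by the guidance above.
    return "unclassified"
```

As a usage example with made-up figures, $40,000 of CI spend against $2,000,000 of monthly payroll is a 2% share and classifies as healthy, while $120,000 against the same payroll is 6% and falls in the optimise band.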