Self-Hosted Runners vs Cloud Runners: Cost and Security
At small scale, cloud runners win on simplicity. At scale, self-hosted runners earn back their operational cost through compute savings alone.
Cloud runners pros/cons
Cloud runners (GitHub-hosted, GitLab SaaS, CircleCI) win on simplicity: pay per minute, zero ops, and no security boundary work. The trade-off is per-minute cost at scale.
- Zero ops. No nodes to provision, patch, or secure; the vendor handles it.
- Pay-per-minute. $0.008/min for GitHub-hosted; transparent and elastic.
- Heavy-use cost. Around $250/runner-month at sustained load (about 31,000 billed minutes at $0.008/min); the bill scales linearly with CI volume.
- Security boundary. Vendor handles isolation between jobs; no shared mutable state to leak.
Self-hosted pros/cons
- Compute-only pricing. You pay only for the underlying compute; you own ops.
- Cost. $30-100/runner-month for the EC2 or GKE node, and less per job if you pack work onto spare capacity.
Cost crossover
The cost comparison flips at a measurable threshold. Below it, cloud wins; above it, self-hosted wins on dollar math even after ops cost.
- Crossover point. Roughly 50,000 CI minutes per month for the team; the threshold is workload-dependent.
- Below crossover. Cloud is cheaper once you account for self-hosted operational cost.
- Above crossover. Self-hosted wins the dollar math; the savings compound with scale.
- Hybrid. Cloud for low-volume teams, self-hosted for the heavy hitters; per-team optimisation, not org-wide flag day.
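The crossover arithmetic above can be sketched in a few lines of Python. This is a simplified linear model, not a pricing calculator: the $0.008/min cloud rate and the $100 node cost come from the figures in this article, while `node_minutes` (CI minutes one node absorbs per month) and `ops_cost` (fixed monthly operational overhead) are illustrative assumptions chosen so the result lands near the 50,000-minute figure. Substitute your own numbers.

```python
from typing import Optional


def cloud_monthly_cost(minutes: float, rate_per_min: float = 0.008) -> float:
    """Cloud bill: purely linear in CI minutes."""
    return minutes * rate_per_min


def self_hosted_monthly_cost(
    minutes: float,
    node_cost: float = 100.0,      # $/node-month, upper end of the article's range
    node_minutes: float = 25_000,  # ASSUMPTION: CI minutes one node handles per month
    ops_cost: float = 200.0,       # ASSUMPTION: fixed monthly ops overhead in dollars
) -> float:
    """Self-hosted bill: fixed ops overhead plus amortised node cost per minute."""
    return ops_cost + minutes * (node_cost / node_minutes)


def crossover_minutes(
    rate_per_min: float = 0.008,
    node_cost: float = 100.0,
    node_minutes: float = 25_000,
    ops_cost: float = 200.0,
) -> Optional[float]:
    """Monthly minutes at which both bills are equal, or None if cloud is always cheaper."""
    self_rate = node_cost / node_minutes
    if self_rate >= rate_per_min:
        return None  # self-hosted never catches up on per-minute cost
    return ops_cost / (rate_per_min - self_rate)


if __name__ == "__main__":
    print(f"crossover at about {crossover_minutes():,.0f} minutes/month")
    for m in (10_000, 100_000):
        print(m, cloud_monthly_cost(m), self_hosted_monthly_cost(m))
```

Under these assumptions the break-even sits at roughly 50,000 minutes per month. The model treats node cost as smoothly amortised; in practice nodes come in whole units, so the real curve is stepped and the threshold shifts with utilisation.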
Security model
Self-hosted runners are a real security surface. The discipline is ephemerality, network isolation, and OIDC for cloud auth.
- Ephemeral runners. Each runner is destroyed after its job, so one bad job cannot poison subsequent ones.
- Private network. Runners in a private subnet; not internet-exposed; egress through controlled paths.
- OIDC for cloud auth. No long-lived AWS or GCP keys on runners; OIDC tokens scoped per workflow.
- Antipattern. Static, long-lived runners with secrets in environment variables are the typical vulnerability shape.
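To make "OIDC tokens scoped per workflow" concrete, here is a minimal sketch of the claim check a cloud-side trust policy performs before handing out short-lived credentials. It assumes the token's signature has already been verified against the issuer's published keys (not shown) and that the claims arrive as a plain dict with a single string `aud`. The issuer URL and the `repo:OWNER/REPO:ref:...` subject format follow GitHub's OIDC tokens; `TrustPolicy`, `is_trusted`, and the `acme/api` repository are hypothetical names for illustration.

```python
from dataclasses import dataclass

GITHUB_ISSUER = "https://token.actions.githubusercontent.com"


@dataclass(frozen=True)
class TrustPolicy:
    """Hypothetical policy: which audience and which exact subjects may assume a role."""
    audience: str
    allowed_subjects: frozenset


def is_trusted(claims: dict, policy: TrustPolicy) -> bool:
    """Accept only tokens from the expected issuer, audience, and an allow-listed subject.

    ASSUMPTION: signature and expiry were validated before this function is called.
    """
    return (
        claims.get("iss") == GITHUB_ISSUER
        and claims.get("aud") == policy.audience
        and claims.get("sub") in policy.allowed_subjects
    )


# Only the main branch of one repository may obtain deploy credentials.
deploy_policy = TrustPolicy(
    audience="sts.amazonaws.com",
    allowed_subjects=frozenset({"repo:acme/api:ref:refs/heads/main"}),
)

main_claims = {
    "iss": GITHUB_ISSUER,
    "aud": "sts.amazonaws.com",
    "sub": "repo:acme/api:ref:refs/heads/main",
}
feature_claims = {**main_claims, "sub": "repo:acme/api:ref:refs/heads/feature-x"}
```

Exact subject matching is deliberate here: wildcard patterns on `sub` are a common way for a trust policy to grant more than intended.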
Antipatterns
- Self-hosted without ephemerality. One bad job poisons every subsequent job.
- Cloud runners forever as you scale. Cost balloons.
- Self-hosted in default-allow networks. Lateral movement risk.
What to do this week
Three moves. (1) Apply this to one pipeline first. (2) Measure deploy frequency and MTTR before and after. (3) Document the outcome so the next team starts from data.