CI Runner Strategy
Hosted, self-hosted, or hybrid runners.
Hosted
The decision about where CI runs has cascading consequences for cost, security, performance, and operational overhead. Most teams default to whatever their CI provider offers and discover the trade-offs only when scale forces a reconsideration. The hosted option is the right starting point and the right long-term answer for most teams.
What hosted runners actually offer:
- GitHub-hosted, GitLab-hosted, CircleCI cloud: The CI provider operates the runner infrastructure. You configure workflows; they spin up ephemeral VMs or containers to execute them. The runners are fresh per job, automatically scaled, OS-patched, and tooling-updated.
- Lowest setup overhead: Adopting hosted runners is configuration work, not infrastructure work. A team can have CI running within an hour of signing up. There is no cluster to size, no VM template to maintain, no patching schedule to manage.
- Fixed cost per minute: Hosted runners bill at a predictable per-minute rate, so costs are easy to forecast, easy to attribute per pipeline, and easy to optimize. Teams know exactly what their CI bill is, broken down by repo, workflow, and time of day.
- Security boundary the provider owns: The provider is responsible for runner isolation, host patching, and network segmentation. This shifts the security burden but also the trust boundary; you are trusting the provider with your build secrets and your code. Most teams accept this trade.
- Limits on customization: Hosted runners cannot reach into your private network without a VPN or self-hosted proxies. They cannot use proprietary GPU SKUs the provider does not stock. They cannot mount specific filesystems or cache infrastructure you control. For most teams these limits do not matter; for some they are blockers.
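As a concrete sketch, adopting a hosted runner in GitHub Actions is nothing more than a workflow file checked into the repo; the job name and the `make test` target here are hypothetical:

```yaml
# .github/workflows/ci.yml -- runs entirely on provider-managed, ephemeral VMs
name: ci
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest   # GitHub-hosted runner; no infrastructure to operate
    steps:
      - uses: actions/checkout@v4
      - run: make test       # assumes the repo has a `make test` target
```

GitLab CI and CircleCI cloud have the same shape: the workflow file names an executor, and the provider supplies it.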
Start with hosted runners. The cost-benefit math favors them for teams with bounded build volumes and standard tech stacks, which describes most engineering organizations.
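The per-minute billing model is what makes hosted costs easy to forecast and attribute. A minimal sketch in Python; the rate and the per-workflow minute counts are illustrative assumptions, not quoted provider prices:

```python
# Hypothetical flat rate ($/min) for a standard hosted Linux runner.
RATE_PER_MIN = 0.008

# Assumed monthly build minutes per workflow, e.g. pulled from billing exports.
monthly_minutes = {
    "lint": 1200,
    "test": 6400,
    "build": 2400,
}

def monthly_cost(minutes_by_workflow, rate):
    """Attribute hosted CI spend per workflow and in total."""
    per_workflow = {wf: mins * rate for wf, mins in minutes_by_workflow.items()}
    return per_workflow, sum(per_workflow.values())

per_wf, total = monthly_cost(monthly_minutes, RATE_PER_MIN)
print(per_wf)   # spend attributed to each workflow
print(total)    # total spend: 10,000 min at $0.008/min, roughly $80
```

The same breakdown extends naturally to per-repo or per-time-of-day attribution by changing the grouping key.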
Self-hosted
Self-hosted runners are the answer when hosted does not work. The decision is rarely about pure cost; it is about specific constraints (network access, GPU types, regulatory requirements) that hosted cannot meet. The price for solving those is taking on operational responsibility for the runner infrastructure.
- Run on your VM or Kubernetes cluster: The CI provider's agent runs inside your infrastructure. The agent polls the provider for jobs to execute. When a job arrives, the agent runs it on your hardware, on your network, with your credentials.
- Cheaper at scale: For large build volumes (thousands of jobs per day, hundreds of thousands per month), self-hosted is significantly cheaper per minute than hosted. The break-even is typically around 10,000 minutes per month for general-purpose workloads, lower for specialized hardware.
- Operational overhead is real: Patching the runner host, managing the autoscaler, handling agent crashes, debugging network issues, rotating runner credentials. The team that adopts self-hosted gets a new platform component to operate, with the same reliability expectations as any other production system.
- Network access advantages: Self-hosted runners can reach private VPCs, on-prem services, and internal artifact registries that hosted runners cannot. For builds that need to talk to private resources, self-hosted is often the only option.
- Regulatory or data-residency requirements: Some industries (healthcare, defense, regulated finance) have requirements about where build artifacts and source code can be processed. Self-hosted in a controlled environment is sometimes the only legal option.
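The break-even arithmetic behind the "cheaper at scale" claim is simple to sketch. The figures below are illustrative assumptions, not quoted prices: a fixed monthly cost for running a self-hosted pool, weighed against a hosted per-minute rate:

```python
# All figures are illustrative assumptions, not quoted provider prices.
HOSTED_RATE = 0.008        # $/min for a hosted runner
SELF_HOSTED_FIXED = 80.0   # $/month to run and operate a self-hosted pool

def break_even_minutes(fixed_monthly_cost, hosted_rate_per_min):
    """Monthly build minutes above which self-hosted is cheaper than hosted.

    Ignores self-hosted variable costs (power, autoscaled VM time, the
    engineer-hours of operating the pool), all of which push the real
    break-even point higher than this naive estimate.
    """
    return fixed_monthly_cost / hosted_rate_per_min

print(break_even_minutes(SELF_HOSTED_FIXED, HOSTED_RATE))  # about 10,000 min/month
```

Under these assumed numbers the crossover lands near the 10,000 minutes per month cited above; specialized hardware (GPUs) carries a much higher hosted rate, which is why its break-even comes sooner.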
The decision to go self-hosted is usually forced by a constraint, not chosen optimistically. Teams that go self-hosted without a clear constraint typically end up with the operational cost without the corresponding benefit.
Hybrid
Most mature teams converge on a hybrid model: hosted runners for the default workload, self-hosted runners for the cases that need it. The split is by job type, not by team or repo, and it captures the best trade-off across the dimensions that matter.
- Hosted for default jobs: Lint, test, build, package. The vast majority of CI work fits within hosted runner capabilities. The team gets the operational simplicity for the bulk of their pipeline.
- Self-hosted for high-volume or specialized jobs: Long integration test suites that consume hours of CI time per run. Builds that need GPUs, ARM, or specific kernel versions. Jobs that need to talk to private network resources. Each gets its own self-hosted runner pool, scaled independently.
- Pool tagging: Workflows specify which runner pool they need via tags. The default tag points at hosted; specialized tags route to self-hosted pools. Engineers writing workflows do not have to know the infrastructure; they just need the right tag.
- Most teams converge here: The pure hosted model breaks down for teams with specific needs. The pure self-hosted model is operationally expensive without justification. The hybrid splits the cost: hosted handles the volume, self-hosted handles the specialized cases. This is where teams end up after enough time and scale.
- Migration is incremental: Moving to hybrid does not require rewriting workflows. It means setting up the self-hosted pools for the specific jobs that need them and tagging those workflows accordingly. The default workflows keep using hosted with no change.
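In GitHub Actions terms, the tag-based split is just a per-job `runs-on` difference; the job names, the `gpu` label, and the make targets here are hypothetical:

```yaml
jobs:
  test:
    runs-on: ubuntu-latest              # default pool: provider-hosted
    steps:
      - uses: actions/checkout@v4
      - run: make test
  gpu-integration:
    runs-on: [self-hosted, linux, gpu]  # labels route this job to a tagged self-hosted pool
    steps:
      - uses: actions/checkout@v4
      - run: make integration-gpu       # hypothetical target needing GPU hardware
```

Migrating a job between pools is a one-line change to its `runs-on` value, which is what makes the hybrid migration incremental.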
The hybrid CI runner strategy is the long-term architecture for most engineering teams beyond a certain scale. Nova AI Ops watches per-pool runner utilization, surfaces the cases where workloads are running on the wrong pool (specialized jobs on hosted, default jobs on self-hosted), and tracks the cost trajectory across both so the team can see when the trade-offs are shifting.