Blue-Green vs Canary vs Rolling: Deployment Strategies Compared

Three deployment strategies; three different bets on blast radius vs cost.

What each does

Each strategy ships the new version differently, and its rollback story differs accordingly.

Blue-green. Stand up V2 alongside V1; flip the load balancer atomically; rollback is the inverse flip.
Canary. Ramp V2 gradually (5%, 25%, 50%, 100%); watch metrics; abort if degraded.
Rolling. Replace pods incrementally; default in Kubernetes Deployments and most ASG configurations.
Shadow. Optional fourth pattern: V2 receives a mirrored copy of real traffic, but its responses are discarded and never reach users; useful for validation before any of the above.
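
To make the traffic-shifting idea concrete, here is a minimal sketch of weighted routing. It is an illustration only, not how any particular load balancer or service mesh implements it; the function name route, the v2_percent parameter, and the hash-bucket scheme are all assumptions made for this example.

```python
import hashlib


def route(request_id: str, v2_percent: int) -> str:
    """Return "v1" or "v2", sending roughly v2_percent% of request IDs to V2.

    Hypothetical helper: the name and the bucketing scheme are assumptions.
    """
    if not 0 <= v2_percent <= 100:
        raise ValueError("v2_percent must be between 0 and 100")
    # Hash the ID so the same request ID always lands in the same bucket.
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable value in [0, 100)
    return "v2" if bucket < v2_percent else "v1"


if __name__ == "__main__":
    # Canary: step the weight through the ramp from the text.
    for weight in (5, 25, 50, 100):
        sample = [route(f"req-{i}", weight) for i in range(1000)]
        print(weight, sample.count("v2"))
```

In this model, blue-green is the special case where v2_percent jumps from 0 to 100 in one step, and the inverse flip sets it back to 0. Because the bucket is stable, an ID that lands on V2 at a low weight stays on V2 as the ramp grows.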

Cost per strategy

Blue-green: 2x infrastructure during cutover.
Canary: 1.05-1.3x during ramp.
Rolling: 1x or slightly above; the cheapest.
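
The multipliers above can be worked through for a concrete fleet size. The sketch below is back-of-the-envelope arithmetic under stated assumptions: the canary runs on extra instances on top of a full V1 fleet, and rolling adds a fixed number of surge instances. The function name peak_instances and the parameters canary_percent and max_surge are hypothetical.

```python
def peak_instances(strategy: str, baseline: int,
                   canary_percent: int = 5, max_surge: int = 1) -> int:
    """Estimate the peak instance count during a deploy.

    Assumptions (not from any tool): the canary is added on top of a full
    V1 fleet, and rolling adds max_surge extra instances at a time.
    """
    if strategy == "blue-green":
        # Full V1 and full V2 fleets run side by side during cutover.
        return 2 * baseline
    if strategy == "canary":
        # Round the canary fleet up to a whole instance.
        extra = (baseline * canary_percent + 99) // 100
        return baseline + extra
    if strategy == "rolling":
        return baseline + max_surge
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    for name in ("blue-green", "canary", "rolling"):
        print(name, peak_instances(name, baseline=20))
```

For a 20-instance service this gives 40 instances for blue-green (2x), 21 for a 5% canary (1.05x), and 21 for rolling with one surge instance. Raising canary_percent to 30 gives 26 instances, the 1.3x upper end of the range.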

Rollback story

Rollback speed is what separates a good day from a bad one. Each strategy has a different rollback time profile.

Blue-green. Instant; flip the load balancer back to blue; existing connections may need draining.
Canary. Fast; set the V2 traffic weight back to zero; only the share of requests routed to V2 was ever exposed.
Rolling. Minutes; depends on rollout cadence; the rollback is itself a rolling update that replaces pods incrementally.
Auto-rollback. Argo Rollouts and Flagger watch metrics during canary and abort automatically; the human is the slow path.
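
The automated abort decision can be sketched as a simple loop. This is not the Argo Rollouts or Flagger API; it is a minimal model of the idea, and the names run_canary, error_rate_at, and max_error_rate, along with the 1% threshold, are assumptions for illustration.

```python
from typing import Callable, Sequence, Tuple


def run_canary(steps: Sequence[int],
               error_rate_at: Callable[[int], float],
               max_error_rate: float = 0.01) -> Tuple[str, int]:
    """Walk the ramp; abort and drop V2 to 0% if the error rate degrades.

    error_rate_at is a hypothetical metrics lookup: given the current V2
    traffic weight, it returns the observed error rate as a fraction.
    Returns (outcome, final V2 weight).
    """
    weight = 0
    for step in steps:
        weight = step  # shift this share of traffic to V2
        if error_rate_at(weight) > max_error_rate:
            return ("aborted", 0)  # rollback: all traffic returns to V1
    return ("promoted", weight)


if __name__ == "__main__":
    # Simulated metrics: V2 only degrades once it takes 25% or more.
    def degrades_under_load(weight: int) -> float:
        return 0.05 if weight >= 25 else 0.001

    print(run_canary([5, 25, 50, 100], lambda w: 0.001))
    print(run_canary([5, 25, 50, 100], degrades_under_load))
```

The point of the sketch is that the abort happens inside the loop with no human decision in it. A real controller would also wait and sample metrics over a window at each step, which this model leaves out.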

Picking correctly

Pick by blast radius and rollback need, not by trend. Most deploys want canary; routine work wants rolling; specific cases want blue-green.

Blue-green. Schema migrations, breaking API changes, expensive validation; pay 2x infra during cutover.
Canary. Most deploys with observable user impact; the production-grade default.
Rolling. Routine deploys with mature health-check coverage; the cheapest option that works.
Per-service. Different services can use different strategies; do not force one org-wide.
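
The selection rules above can be written down as a small decision function, which makes them easy to apply per service. This is one possible encoding, not a definitive policy: the function name pick_strategy, the four boolean inputs, and the order in which the rules are checked are all assumptions.

```python
from typing import Optional


def pick_strategy(*, risky_change: bool, user_visible: bool,
                  good_metrics: bool, mature_health_checks: bool) -> Optional[str]:
    """Map a deploy's traits to a strategy, following the list above.

    risky_change stands for schema migrations, breaking API changes, or
    expensive validation. The precedence order is an assumption.
    """
    if risky_change:
        return "blue-green"
    if user_visible and good_metrics:
        return "canary"
    if mature_health_checks:
        return "rolling"
    # Nothing fits: canary without metrics or rolling without health
    # checks would both be antipatterns, so return no recommendation.
    return None


if __name__ == "__main__":
    services = {
        "billing-api": dict(risky_change=True, user_visible=True,
                            good_metrics=True, mature_health_checks=True),
        "web-frontend": dict(risky_change=False, user_visible=True,
                             good_metrics=True, mature_health_checks=True),
        "batch-worker": dict(risky_change=False, user_visible=False,
                             good_metrics=False, mature_health_checks=True),
    }
    for name, traits in services.items():
        print(name, pick_strategy(**traits))
```

The service names in the usage example are made up. Returning None when neither good metrics nor mature health checks exist is a deliberate choice here: it signals that observability needs work before any of the strategies is a safe pick.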

Antipatterns

Blue-green for routine deploys. Pay for capacity you do not need.
Canary without good metrics. Cannot detect when to abort.
Rolling with bad health checks. Bad pods replace good ones silently.
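
The last antipattern is easy to see in a toy simulation. The sketch below models a rolling update that replaces one pod at a time and halts when a health check fails; it is not how Kubernetes implements rollouts, and the names rolling_update and is_healthy are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def rolling_update(pods: Sequence[str], new_version: str,
                   is_healthy: Callable[[str], bool]) -> Tuple[List[str], bool]:
    """Replace pods one at a time, halting at the first failed health check.

    Returns (resulting pod versions, whether the rollout completed).
    """
    result = list(pods)
    for i in range(len(result)):
        old = result[i]
        result[i] = new_version
        if not is_healthy(new_version):
            result[i] = old  # restore the previous pod and stop the rollout
            return result, False
    return result, True


if __name__ == "__main__":
    fleet = ["v1", "v1", "v1"]

    # A real check catches the broken build and the fleet stays on v1.
    print(rolling_update(fleet, "v2-broken", lambda v: "broken" not in v))

    # A check that always passes lets the broken build replace every pod.
    print(rolling_update(fleet, "v2-broken", lambda v: True))
```

With a check that actually exercises the new version, the rollout stops after the first pod and the fleet is unchanged. With a check that always passes, every good pod is replaced and the rollout reports success, which is the silent failure the antipattern describes.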

What to do this week

Three moves. (1) Apply the selection rules above to one pipeline first. (2) Measure deploy frequency and MTTR before and after. (3) Document the outcome so the next team starts from data.