Deployment Strategies: Canary vs Rolling
Two strategies, two different risk profiles.
Rolling and canary are two deployment strategies for Kubernetes. Rolling is the default, simple option; canary adds risk mitigation through gradual traffic shifting. The choice depends on the change's risk profile and the team's tooling.
Rolling
What rolling provides:
- Replace pods incrementally: The Deployment controller replaces pods one at a time or in small batches rather than all at once. New pods start and old pods terminate until every replica runs the new version.
- Default: Kubernetes Deployments use the RollingUpdate strategy unless configured otherwise, so the team gets rolling updates without extra configuration.
- Simple: Rolling does not require additional tooling. The Deployment object handles the rollout, and rollback goes through the Deployment's revision history (for example, kubectl rollout undo).
- Standard: Most production deployments use rolling. The pattern is well understood, the operational story is straightforward, and new team members learn it quickly.
- maxSurge and maxUnavailable: These two parameters control the rollout rate. maxSurge caps how many pods may run above the desired replica count during the rollout; maxUnavailable caps how many pods may be unavailable. Both default to 25%, which works for most workloads.
Rolling is the right default. The simplicity matches most deployment needs.
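To make the effect of maxSurge and maxUnavailable concrete, here is a minimal Python sketch of the arithmetic. The function names rolling_bounds and resolve are hypothetical, chosen for illustration. It assumes the documented Kubernetes behavior that a percentage maxSurge rounds up and a percentage maxUnavailable rounds down, and it simply rejects the case where both resolve to zero instead of modeling what the controller does there.

```python
def resolve(value, replicas, round_up):
    """Turn an absolute count or a percentage string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer arithmetic avoids floating-point rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rolling_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return the pod-count limits that hold during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        # Simplification for this sketch: a rollout could never make
        # progress with both limits at zero, so reject the input.
        raise ValueError("maxSurge and maxUnavailable cannot both be zero")
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }
```

With 10 replicas and the defaults, rolling_bounds(10) gives at most 13 pods in total and at least 8 available at any point in the rollout. Setting max_surge=1 and max_unavailable=0 trades speed for full capacity throughout.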
Canary
Canary deployments shift traffic gradually. A small percentage of traffic goes to the new version; metrics are observed; if healthy, traffic increases; if not, the deployment rolls back.
- Start at 5% of traffic: The canary starts with a small share of traffic. 5% is typical, and some teams start lower; the small initial exposure bounds the blast radius.
- Verify: Metrics are observed at each canary stage. Latency, error rate, and business metrics are all checked; healthy canaries proceed, and unhealthy ones roll back.
- Ramp: The traffic percentage increases in stages, for example 5% to 25% to 50% to 100%. Each stage is verified, and the rollout completes when 100% of traffic is on the new version.
- More controlled: Each ramp stage is a checkpoint. A failure is caught while only part of the traffic is exposed, and rolling back means shifting that traffic back to the old version.
- Needs Argo Rollouts or similar: A standard Deployment has no built-in percentage-based traffic shifting. Argo Rollouts, Flagger, or similar tools handle the traffic management and metric verification.
Canary is the safety-focused option. The additional tooling is justified when the change risk warrants it.
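The start, verify, and ramp stages above can be sketched as a short control loop. This is an illustration of the logic only, not how Argo Rollouts or Flagger are implemented. The names run_canary, set_weight, and is_healthy are hypothetical; in a real setup the weight change would go through the traffic-management layer, and the health check would query metrics after a soak period.

```python
def run_canary(set_weight, is_healthy, steps=(5, 25, 50, 100)):
    """Ramp traffic through the given steps, rolling back on failure.

    set_weight(percent) is a hypothetical callback that routes that share
    of traffic to the new version. is_healthy(percent) is a hypothetical
    callback that reports whether metrics look good at that stage.
    Returns True if the rollout reached the final step, False otherwise.
    """
    for weight in steps:
        set_weight(weight)
        # A real controller would wait here so metrics can accumulate.
        if not is_healthy(weight):
            # Roll back: send all traffic to the old version again.
            set_weight(0)
            return False
    return True
```

For example, with a health check that starts failing at 50%, the recorded weights would be 5, 25, 50, and then 0 for the rollback, and the function returns False.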
Decide
The choice depends on the change. Routine deployments use rolling; risky changes use canary. The team's standard is rolling with canary as an option for specific cases.
- Rolling for routine changes: Most deployments are routine. Bug fixes, feature additions, and minor changes all use rolling; the simpler pattern fits the lower risk.
- Canary for risky changes: High-risk changes use canary. Major refactors, performance-critical changes, and customer-facing breaking changes warrant the additional safety.
- Cost vs safety trade-off: Canary requires tooling and operational time. The cost is justified for high-risk changes; routine changes do not warrant it.
- Document the policy: Write down which changes use canary and which use rolling, so the decision is consistent across the team.
- Some teams use canary by default: Teams with strong observability and tooling investment may use canary for everything. The tooling cost is amortized across all deployments, and every change gets the same safety checks.
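One way to keep the documented policy consistent is to encode it next to the deploy tooling. The sketch below is a hypothetical example of that idea: the Change fields mirror the risk categories listed above, and the names Change and choose_strategy are assumptions for illustration, not part of any Kubernetes API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical description of a change, using the risk flags above."""
    description: str
    major_refactor: bool = False
    performance_critical: bool = False
    customer_facing_breaking: bool = False


def choose_strategy(change, canary_by_default=False):
    """Return "canary" for risky changes, otherwise "rolling".

    canary_by_default models teams that run every deploy as a canary.
    """
    risky = (
        change.major_refactor
        or change.performance_critical
        or change.customer_facing_breaking
    )
    return "canary" if (canary_by_default or risky) else "rolling"
```

A routine bug fix such as Change("fix typo in error message") maps to rolling, while Change("rewrite billing pipeline", major_refactor=True) maps to canary.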
Choosing between canary and rolling is one of the Kubernetes operational decisions that directly affects deploy safety. Nova AI Ops integrates with deployment platforms, surfaces deploy outcomes, and helps teams identify whether their strategy choices match the actual change risk.