Progressive Delivery Tools
Argo Rollouts, Flagger. Beyond Deployment.
Argo Rollouts
Argo Rollouts is the Kubernetes-native progressive delivery controller. It replaces the default Deployment resource with a Rollout that supports canary, blue-green, and weighted traffic shifting strategies as first-class concepts. For teams running Kubernetes and using Argo CD for GitOps, Rollouts is usually the path of least resistance.
What Argo Rollouts gives you:
- Canary deploys with traffic weighting: Define a series of steps: 5% of traffic for 5 minutes, then 25% for 10 minutes, then 50%, then 100%. Each step automatically gates on metric analysis. If error rate or latency degrades during a step, the rollout halts and rolls back.
- Blue-green with manual or automatic promotion: Deploy the new version alongside the old, route synthetic traffic to the new version for verification, then flip live traffic. The flip is atomic and the rollback is the same flip in reverse.
- Metric analysis as a gate: Argo Rollouts queries Prometheus, Datadog, New Relic, or any custom metric source during each canary step. The gate is a real number against a real threshold, not a manual approval. This is what turns canary from a procedure into a system.
- Kubernetes-native: The Rollout resource extends Kubernetes via a CRD, integrates with HPA, PDB, and ingress controllers, and works with the GitOps tooling teams already use. The integration cost is minimal for teams already on Kubernetes.
- Best for service-mesh-agnostic deployments: Argo Rollouts can use multiple traffic providers (NGINX Ingress, AWS ALB, Istio, SMI, Ambassador). If your traffic plane is heterogeneous or you do not have a mesh, Argo Rollouts is the safer bet because it supports more providers natively.
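A metric-gated canary like the one described above can be sketched as a Rollout paired with an AnalysisTemplate. This is an illustrative fragment, not a drop-in manifest: the service name `checkout`, the image, the Prometheus address, and the error-rate query are all assumptions you would replace with your own.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: checkout            # hypothetical service name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      containers:
        - name: checkout
          image: registry.example.com/checkout:v2   # assumed image
  strategy:
    canary:
      steps:                              # the weighted steps from the text
        - setWeight: 5
        - pause: {duration: 5m}
        - setWeight: 25
        - pause: {duration: 10m}
        - setWeight: 50
        - pause: {duration: 10m}
      analysis:                           # metric gate runs during the canary
        templates:
          - templateName: error-rate-check
---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-rate-check
spec:
  metrics:
    - name: error-rate
      interval: 1m
      failureLimit: 1                     # one failed measurement aborts the rollout
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090   # assumed Prometheus endpoint
          query: |
            sum(rate(http_requests_total{app="checkout",code=~"5.."}[5m]))
            / sum(rate(http_requests_total{app="checkout"}[5m]))
      successCondition: result[0] < 0.01  # advance only while 5xx rate stays under 1%
```

If any measurement breaches `successCondition` more than `failureLimit` times, the controller halts the rollout and scales the canary back down, which is the automatic rollback the text describes.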
The trade-off is that you adopt a new resource type and a new controller, and your existing Deployment manifests need conversion. The conversion is mechanical but it is real work for a large fleet.
Flagger
Flagger is the Flux-ecosystem alternative to Argo Rollouts, with an emphasis on integration with service meshes (Istio, Linkerd, App Mesh, Open Service Mesh). It does the same kind of progressive delivery but leans more heavily on the mesh for traffic shaping and observability.
- Mesh-friendly traffic shifting: When you already run a service mesh, Flagger uses the mesh's traffic-splitting primitives directly. Canary weights map to VirtualService rules in Istio, TrafficSplit in SMI, and so on. The integration is tighter than Argo's mesh support and the configuration is closer to mesh-idiomatic.
- Built-in metric providers: Flagger ships with Prometheus, Datadog, CloudWatch, Stackdriver, and others, plus custom providers via webhook. The metric query is part of the Canary CRD, so the deploy gate and the analysis live together in source control.
- Webhook hooks for testing: Run conformance tests, load tests, or smoke tests at each canary step. The webhook returns success or failure and Flagger uses the result to decide whether to advance. This makes Flagger especially good for teams who want to run scripted verification beyond pure metric analysis.
- FluxCD ecosystem fit: If you are already running Flux for GitOps, Flagger's Canary resources slot into the same operational pattern. Argo Rollouts integrates with Flux too but is a more natural fit with Argo CD.
- Choose Flagger when the mesh is already there: If you have invested in Istio or Linkerd, Flagger leverages that investment. If you do not have a mesh, Flagger's value is reduced and Argo Rollouts becomes more attractive.
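Flagger expresses all of the above, including the gate, the weights, and the test webhook, in a single Canary resource. The sketch below assumes Flagger's built-in `request-success-rate` and `request-duration` metrics (available with the supported meshes) and its load-tester add-on; the service name `checkout` and the `hey` command are illustrative placeholders.

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: checkout             # hypothetical service name
spec:
  targetRef:                 # the Deployment Flagger manages
    apiVersion: apps/v1
    kind: Deployment
    name: checkout
  service:
    port: 80
  analysis:
    interval: 1m             # how often metrics are checked
    threshold: 5             # failed checks before rollback
    stepWeight: 10           # shift traffic 10% at a time
    maxWeight: 50            # up to 50%, then promote
    metrics:
      - name: request-success-rate    # built-in mesh metric
        thresholdRange:
          min: 99                     # require >= 99% success
        interval: 1m
      - name: request-duration        # built-in latency metric (ms)
        thresholdRange:
          max: 500
        interval: 1m
    webhooks:
      - name: load-test               # scripted verification at each step
        url: http://flagger-loadtester.test/   # assumed load-tester endpoint
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://checkout-canary.test/"
```

The deploy gate and the analysis live together in this one manifest, which is the source-control property the text calls out.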
The choice between Argo Rollouts and Flagger usually comes down to which ecosystem you are already in (Argo CD vs Flux) and whether you have a service mesh. Both tools have feature parity for the basic canary and blue-green cases.
Native Kubernetes
Vanilla Kubernetes Deployments support rolling updates and not much else. For many use cases, that is enough. For anything that needs canary analysis, blue-green, or traffic-weighted shifts, the native controller is too limited.
- Default Deployment is rolling-only: The Deployment resource ships with a RollingUpdate strategy that replaces pods incrementally and a Recreate strategy that takes the service down between versions. There is no canary primitive, no traffic weighting, no analysis gate.
- Rolling is fine for most internal services: If the service has consistent behavior across versions, no schema-incompatible changes, and bounded blast radius, rolling deploys with proper readiness probes and a small surge buffer are usually sufficient. Most internal services live here.
- Limited for stateful or revenue-path services: When the cost of a regression in 10% of traffic for 30 seconds is meaningful, rolling alone is not enough. You need the metric-gated canary that Argo Rollouts or Flagger provides. Trying to bolt this onto vanilla Deployments produces fragile DIY solutions.
- Upgrade when the cost of a bad deploy crosses a threshold: The right time to add Argo Rollouts or Flagger is when the team has experienced the second or third bad deploy that a 5% canary would have caught. Before then, the operational complexity of progressive delivery is not worth the investment. After, it pays for itself in a single avoided incident.
- Migration is incremental: Both Argo Rollouts and Flagger let you migrate one service at a time. Start with the most reliability-critical workload, prove the pattern, then expand. There is no big-bang migration and no need to convert the entire fleet at once.
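For the internal services where rolling is enough, the "proper readiness probes and a small surge buffer" mentioned above look like this. The names, image, and probe path are placeholders for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: internal-api        # hypothetical service name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: internal-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # one extra pod during the rollout (the surge buffer)
      maxUnavailable: 0     # never drop below desired capacity mid-rollout
  template:
    metadata:
      labels:
        app: internal-api
    spec:
      containers:
        - name: api
          image: registry.example.com/internal-api:v2   # assumed image
          readinessProbe:                 # gate traffic on real readiness
            httpGet:
              path: /healthz              # assumed health endpoint
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
```

With `maxUnavailable: 0`, old pods are only terminated after their replacements pass the readiness probe, which is as much safety as the native controller can provide; everything beyond this (traffic weighting, metric gates) requires one of the tools above.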
Progressive delivery tools turn deploy-time risk into a measurable, gated process. Nova AI Ops integrates with Argo Rollouts, Flagger, and the underlying SLO data they query, so the canary analysis uses the same SLO definitions your team already trusts and the deploy gate fires on the same burn-rate threshold that pages the on-call.