The Canary Cookbook for High-Stakes Changes
Three canary patterns: percentage-based, geo-based, customer-segment-based. When to use each, with worked examples and the gotchas.
Percentage-based
A percentage canary splits traffic at random: 5 percent goes to the new version, 95 percent to the old. It is best for changes that affect all customers similarly (model swap, DB client upgrade, refactor). It can miss issues specific to important customers, because a random sample may not include them.
- 5 percent new, 95 percent old. Standard traffic split per canary. Measure the same metrics on both versions and compare the new against the old.
- Best for uniform changes. Model swap, DB client upgrade, refactor. Random split is the right shape.
- Gotcha: misses important customers. A random 5 percent may include zero enterprise customers, so their issues stay invisible.
- Documented coverage check per canary. Review "did this hit our key customers?" for every canary. Catches the random-split blind spot.
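The split and the coverage check above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names `in_canary` and `coverage_report`, the `salt` parameter, and the choice of SHA-256 bucketing are all assumptions made for the example. Hashing the customer ID keeps each customer on the same side of the split across requests, which a per-request coin flip would not.

```python
import hashlib


def in_canary(customer_id: str, percent: float, salt: str = "canary-v1") -> bool:
    """Deterministically place roughly `percent` percent of customers in the canary.

    Hypothetical helper: the salt and the 10,000-bucket resolution are
    illustrative choices, not taken from any particular rollout tool.
    """
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < percent * 100  # 5 percent -> buckets 0..499


def coverage_report(customer_ids, key_customers, percent, salt="canary-v1"):
    """Answer "did this hit our key customers?" for one canary."""
    canary = {c for c in customer_ids if in_canary(c, percent, salt)}
    key = set(key_customers)
    return {
        "canary_size": len(canary),
        "key_covered": sorted(key & canary),
        "key_missing": sorted(key - canary),  # the random-split blind spot
    }
```

A non-empty `key_missing` list is the signal to act on, for example by adding those customers to the canary explicitly or by noting that the canary says nothing about them. Changing `salt` reshuffles which customers land in the canary.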
Geo-based
A geo canary rolls out region by region. It is best for regional infrastructure changes (CDN tweaks, regional capacity). Regions are not interchangeable, so per-region soak windows are mandatory: a pass in us-east does not imply a pass in eu-west, where traffic shapes differ.
- One region first, measure, expand. Regional progression per canary. Blast radius stays bounded.
- Best for regional infrastructure. CDN tweaks, regional capacity changes. Regional changes deserve regional canaries.
- Gotcha: regions are not interchangeable. us-east passing does not imply eu-west passing. Different traffic shapes hide different bugs.
- Per-region soak window. Documented per-region observation duration. Catches the assume-uniformity error.
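The region-by-region progression with a per-region soak window can be sketched as below. The `RegionStage` type, the `next_region` function, and the soak durations in the usage example are hypothetical; the only point carried over from the text is that each region must finish its own soak and be healthy before the next one starts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class RegionStage:
    region: str
    soak: timedelta                        # documented observation duration
    started_at: Optional[datetime] = None  # None means not yet deployed
    healthy: bool = False                  # set by whatever health review you run


def next_region(stages: List[RegionStage], now: datetime) -> Optional[str]:
    """Return the next region to deploy, or None to hold (or when finished).

    A region is only eligible once every earlier region has completed its own
    soak window and is healthy. One region passing never skips another's soak.
    """
    for stage in stages:
        if stage.started_at is None:
            return stage.region
        soaked = now - stage.started_at >= stage.soak
        if not (soaked and stage.healthy):
            return None  # still soaking, or unhealthy: do not expand
    return None  # every region is deployed and has passed


# Usage with illustrative soak windows (assumed values, not recommendations).
t0 = datetime(2024, 1, 1, 12, 0)
plan = [
    RegionStage("us-east", timedelta(hours=24), started_at=t0, healthy=True),
    RegionStage("eu-west", timedelta(hours=48)),
]
print(next_region(plan, t0 + timedelta(hours=1)))   # None: us-east still soaking
print(next_region(plan, t0 + timedelta(hours=24)))  # eu-west
```

Giving each region its own `soak` value, rather than one global constant, keeps the assume-uniformity error visible in the plan itself.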
Customer-segment-based
A segment canary rolls out by user population: internal first, beta second, everyone third. It is best for UX or workflow changes where surprised user reactions are the risk. Internal users are biased toward acceptance because they have product context that customers lack, so weight customer feedback more heavily.
- Internal first, beta second, everyone third. Staged segment progression per canary. Trust before scale.
- Best for UX or workflow changes. User-facing features where surprise responses are the risk. Canary catches "what does this even do" reactions.
- Gotcha: internal users are biased. Internal feedback skews toward acceptance. Weight customer feedback more heavily.
- Named promotion gate per stage. "Promote when X" criteria per stage. Catches premature expansion.
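A named "promote when X" gate per stage could look like the sketch below. Everything specific here is an assumption for illustration: the `Gate` type, the `promote` function, the metric names `error_rate` and `confusion_tickets`, and the thresholds. The stage order (internal, beta, everyone) comes from the text above.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Metrics = Dict[str, float]


@dataclass(frozen=True)
class Gate:
    name: str                         # the written-down "promote when X"
    check: Callable[[Metrics], bool]  # the same criterion, executable


STAGES = ["internal", "beta", "everyone"]

# Hypothetical gates: metric names and thresholds are placeholders.
GATES = {
    "internal": Gate(
        "promote when error rate is under 1%",
        lambda m: m["error_rate"] < 0.01,
    ),
    "beta": Gate(
        "promote when error rate is under 1% and confusion tickets are under 5",
        lambda m: m["error_rate"] < 0.01 and m["confusion_tickets"] < 5,
    ),
}


def promote(current: str, metrics: Metrics) -> str:
    """Return the next stage if the current stage's gate passes, else stay put."""
    gate = GATES.get(current)
    if gate is None:
        return current  # "everyone" is the final stage; nothing to promote to
    if gate.check(metrics):
        return STAGES[STAGES.index(current) + 1]
    return current  # gate not met: no premature expansion
```

The beta gate is deliberately stricter than the internal one, reflecting the advice to weight customer feedback more heavily than internal feedback.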