Feature Flags as Deployment Strategy

Used well, feature flags are one of continuous delivery's strongest tools. Used badly, they are technical debt that compounds.

Why decouple deploy from release

Deploy puts code on prod; release puts the change in front of users. Without flags, the two happen together; with flags, they decouple. Decoupling unlocks dark launches, gradual ramps, instant kills, and A/B tests as routine product capabilities.

Ship dark. Code on prod, flag off; the team verifies in production without exposing the change.
Ramp gradually. 1%, 10%, 50%, 100%; observe at each stage; catch problems while blast radius is small.
Kill instantly. Flag flip beats deploy rollback; mitigation in seconds, not minutes.
A/B test. Feature flag becomes the experiment surface; treatment and control coexist in production.
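A gradual ramp with an instant kill can be sketched as a deterministic hash bucket. This is a minimal illustration, not any particular flag vendor's API: the names `bucket`, `is_enabled`, `rollout_percent`, and `killed` are hypothetical, and a real system would read the percentage and kill state from a flag service rather than from function arguments.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    # Use the first 8 bytes as an integer; the hash keeps assignment stable
    # across processes and restarts, unlike Python's built-in hash().
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int,
               killed: bool = False) -> bool:
    """Return True if the user falls inside the current ramp stage."""
    if killed:
        # Kill switch wins over any ramp percentage.
        return False
    return bucket(flag_name, user_id) < rollout_percent


if __name__ == "__main__":
    # Ramp stages from the text: 1%, 10%, 50%, 100%.
    for percent in (1, 10, 50, 100):
        print(percent, is_enabled("new-checkout", "user-42", percent))
```

Because the bucket depends only on the flag name and user ID, a user who sees the feature at 10% keeps seeing it at 50% and 100%. Including the flag name in the hash keeps different flags from always selecting the same early users.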

The four-flag taxonomy

Four flag types cover the legitimate use cases. Each type has its own lifecycle; managing all four under a single lifecycle is a common cause of flag debt.

Release flag. Wraps a new feature; toggles per user, segment, or percentage; lifespan is the rollout window.
Operational flag. Kill switch for risky features; the off-switch the on-call needs at 3am.
Permission flag. Gates by entitlement; tied to billing tier, role, or contract; lifespan matches the product.
Experiment flag. A/B test infrastructure; lifespan equals experiment duration; remove on conclusion.

Per-type lifecycle

Release flags: lifespan of 30 to 90 days; remove once the ramp reaches 100%.

Operational flags: indefinite lifespan; review annually.

Permission flags: lifespan tied to the product roadmap; remove when the entitlement is retired.

Experiment flags: lifespan equals the experiment duration; remove on conclusion.
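The per-type lifecycle can be encoded so that every flag gets a review date from its type at creation. The sketch below is one possible encoding under stated assumptions: `FlagType` and `review_date` are hypothetical names, the 90-day value is the upper bound of the release window above, and 365 days stands in for "annually".

```python
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class FlagType(Enum):
    RELEASE = "release"
    OPERATIONAL = "operational"
    PERMISSION = "permission"
    EXPERIMENT = "experiment"


def review_date(flag_type: FlagType, created: date,
                experiment_end: Optional[date] = None) -> Optional[date]:
    """Return the date by which the flag must be removed or reviewed."""
    if flag_type is FlagType.RELEASE:
        # Assumption: use the upper bound of the 30 to 90 day window.
        return created + timedelta(days=90)
    if flag_type is FlagType.OPERATIONAL:
        # Indefinite lifespan, but reviewed once a year.
        return created + timedelta(days=365)
    if flag_type is FlagType.EXPERIMENT:
        if experiment_end is None:
            raise ValueError("experiment flags need a planned end date")
        return experiment_end
    # Permission flags follow the product roadmap; no fixed date applies.
    return None


if __name__ == "__main__":
    print(review_date(FlagType.RELEASE, date(2024, 1, 1)))
```

Making the type a required argument forces the author to decide which lifecycle applies before the flag exists, which is the point of keeping the four types separate.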

Cleanup discipline

Tag each flag with an expiry date at creation. Each quarter, report on flags past expiry; for each one, either remove it or record a reason and a new expiry.

Without expiry discipline, flags accumulate; the flag count can exceed 100 within six months.
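The quarterly report reduces to a filter over flag metadata. The following is a minimal sketch assuming each flag record carries a name, an owner, and an expiry date; `FlagRecord` and `past_expiry` are hypothetical names, and the records would normally come from the flag system's own inventory rather than a literal list.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class FlagRecord:
    name: str
    owner: str
    expires: date


def past_expiry(flags: Iterable[FlagRecord], today: date) -> List[FlagRecord]:
    """Return flags whose expiry is before today, oldest expiry first."""
    overdue = [flag for flag in flags if flag.expires < today]
    return sorted(overdue, key=lambda flag: flag.expires)


if __name__ == "__main__":
    # Illustrative data only.
    inventory = [
        FlagRecord("new-checkout", "payments", date(2024, 3, 31)),
        FlagRecord("search-v2", "discovery", date(2024, 9, 30)),
    ]
    for flag in past_expiry(inventory, today=date(2024, 7, 1)):
        print(f"{flag.name} (owner: {flag.owner}) expired {flag.expires}")
```

Sorting by expiry puts the longest-overdue flags at the top of the report, and the owner field gives each overdue flag someone to ask.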

Antipatterns

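Flags without expiry. Nothing forces removal, so debt compounds.
Permission logic in release flags. Wrong flag type; the entitlement outlives the rollout window, so the flag cannot be removed on schedule.
Flag everything. Configuration becomes the codebase, and every flag combination is another code path to test.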

What to do this week

Three moves. (1) Apply the taxonomy and expiry tagging to one pipeline first. (2) Measure deploy frequency and MTTR before and after. (3) Document the outcome so the next team starts from data.
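The before-and-after measurement in step (2) can be computed from deploy timestamps and incident records. This sketch assumes deploys are available as a list of datetimes and incidents as (detected, resolved) pairs; the function names `deploys_per_week` and `mttr_minutes` are hypothetical, and MTTR is taken here as the mean time from detection to resolution.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import Iterable, Tuple


def deploys_per_week(deploy_times: Iterable[datetime],
                     window_start: datetime, window_end: datetime) -> float:
    """Average number of deploys per week inside [window_start, window_end)."""
    weeks = (window_end - window_start) / timedelta(weeks=1)
    count = sum(1 for t in deploy_times if window_start <= t < window_end)
    return count / weeks


def mttr_minutes(incidents: Iterable[Tuple[datetime, datetime]]) -> float:
    """Mean minutes from detection to resolution; needs at least one incident."""
    return mean((resolved - detected).total_seconds() / 60
                for detected, resolved in incidents)
```

Run both functions over a window before the change and a window of the same length after it, so the two numbers are comparable.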