Rollback vs Roll-Forward

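When a release fails in production, there are two recovery strategies: roll back to the previous version, or roll forward with a fix.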

Rollback brings the old version back

Rollback restores the last known-good state. It is the fastest, most predictable recovery path when the failure is real and the fix is not obvious.
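To make "last known-good" concrete, here is a minimal sketch of picking a rollback target from a release history. The `Release` record, its `healthy` flag, and the version strings are hypothetical and stand in for whatever your deploy tooling actually records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Release:
    version: str
    healthy: bool  # assumption: set by post-deploy health checks


def last_known_good(history: list[Release]) -> Optional[Release]:
    """Return the newest healthy release before the current one, or None.

    `history` is ordered oldest to newest; the last entry is the
    release that is currently live (and presumed failing).
    """
    for release in reversed(history[:-1]):
        if release.healthy:
            return release
    return None


# Illustrative history: v42 is live and failing.
history = [Release("v40", True), Release("v41", True), Release("v42", False)]
target = last_known_good(history)
```

In this example `target` is the `v41` release. If no earlier release is marked healthy, the function returns `None`, which is a signal that rollback is not an option.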

Roll-forward fixes the bug live

Roll-forward ships a fix on top of the failing version. It is the right call when rollback is harder than fixing, but it concentrates risk during an active incident.

How to decide

The default is rollback. Roll-forward is the exception, and the exception needs to be argued for explicitly in the incident channel.
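The decision rule above can be sketched as a small function. This is only an illustration: the `Incident` fields are hypothetical, and in practice the judgment is made by people in the incident channel, not by code.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    # Hypothetical fields, chosen for illustration only.
    rollback_harder_than_fix: bool = False
    justification: str = ""  # the argument for roll-forward, as posted in the channel


def choose_strategy(incident: Incident) -> str:
    """Default to rollback; allow roll-forward only when it has been argued for."""
    argued = bool(incident.justification.strip())
    if incident.rollback_harder_than_fix and argued:
        return "roll-forward"
    return "rollback"


default_choice = choose_strategy(Incident())
unargued_choice = choose_strategy(Incident(rollback_harder_than_fix=True))
argued_choice = choose_strategy(
    Incident(
        rollback_harder_than_fix=True,
        justification="Migration already dropped the old column; the fix is one line.",
    )
)
```

Note that an unargued roll-forward falls back to rollback even when rollback is the harder path. That mirrors the rule: the exception has to be stated, not assumed.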

Schema-aware rollback

Schema changes break the rollback story. Design migrations so code can roll back independently of data; the upfront cost pays back the first time you need it.
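One common way to keep code rollback independent of data is to make schema changes additive, so the previous release can still run against the new schema. The sketch below shows the idea with an in-memory SQLite database; the `users` table, the `email` column, and the insert helpers are hypothetical stand-ins for two releases of an application.

```python
import sqlite3


def old_code_insert(conn: sqlite3.Connection, name: str) -> None:
    # Previous release: knows nothing about the new column.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def new_code_insert(conn: sqlite3.Connection, name: str, email: str) -> None:
    # New release: writes the new column as well.
    conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))


def expand(conn: sqlite3.Connection) -> None:
    # Additive migration: a nullable column, with no rename and no drop.
    conn.execute("ALTER TABLE users ADD COLUMN email TEXT")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

old_code_insert(conn, "ada")                          # before the migration
expand(conn)                                          # the schema moves forward
new_code_insert(conn, "grace", "grace@example.com")   # the new release runs
old_code_insert(conn, "linus")                        # simulated code rollback

rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
```

The old insert still succeeds after the migration, so the code can be rolled back without touching the data. A destructive change, such as dropping or renaming a column the old release reads, would break that property and is better deferred to a later release.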

Operational rules

Rollback is only reliable if the team has practiced it. Regular drills and automation turn it from a 3 a.m. scramble into a one-button operation.
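A rollback drill can be as simple as running the automated rollback and checking that it lands on the expected version within a time budget. The sketch below assumes the real rollback is exposed as a callable that returns the version it restored; `fake_rollback`, the version string, and the 60-second budget are placeholders for illustration.

```python
import time
from typing import Callable


def run_rollback_drill(
    rollback: Callable[[], str],
    expected_version: str,
    budget_seconds: float,
) -> dict:
    """Run the rollback once and report correctness and elapsed time."""
    start = time.monotonic()
    landed_on = rollback()
    elapsed = time.monotonic() - start
    return {
        "landed_on": landed_on,
        "correct": landed_on == expected_version,
        "within_budget": elapsed <= budget_seconds,
        "elapsed_seconds": elapsed,
    }


def fake_rollback() -> str:
    # Hypothetical stand-in for the real one-button rollback.
    return "v41"


report = run_rollback_drill(fake_rollback, expected_version="v41", budget_seconds=60.0)
```

Running a drill like this on a schedule, and treating a failed or slow drill as a bug, is one way to keep the rollback path exercised between incidents.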