Database Migration in the Cloud: The Three-Phase Rule
Cloud database migrations carry specific risks. This article describes the three-phase pattern as adapted for cloud-native databases.
Three phases
The database migration rule of three is a phased approach for moving data between database systems with low risk. The pattern works because every step before the final cutover is reversible, so problems found along the way can be fixed without a catastrophic rollback. The discipline applies to cloud-to-cloud migrations, on-prem-to-cloud migrations, and engine changes.
What the three phases look like:
- Phase 1: dual-write to old and new.: The application writes to both databases. The old database remains the read source; the new database accumulates the same writes. Replication tooling backfills historical data so the new database catches up to the old.
- Backfill historical data.: The historical data is replicated from old to new. The replication tooling (AWS DMS, GCP Datastream, Debezium, or similar) handles this; the team configures and monitors it. Once backfill completes and the dual-write is in place, the new database is current.
- Phase 2: dual-read, with the new database as primary.: Reads start going to the new database. The old database remains as a fallback; if the new database returns errors or stale data, the application can fall back to it. The dual-read provides safety during validation.
- Phase 3: drop old.: Once the new database is fully validated, the old database is decommissioned. The dual-write stops; reads come exclusively from the new database; the old infrastructure is removed.
- Each phase is a deliberate decision.: The team verifies the prior phase before moving forward. Phase 1 to Phase 2 requires confidence that dual-write is working; Phase 2 to Phase 3 requires confidence that the new database serves all traffic correctly.
The three phases produce a managed migration. Each phase is bounded; each transition is deliberate; rollback is available until the final phase.
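The routing rules for the three phases can be sketched in a few lines. This is a minimal illustration, not a production implementation: `MigrationRouter` and `Phase` are hypothetical names, the two stores are plain dictionaries standing in for real database clients, and a `KeyError` stands in for "the new database returned an error".

```python
from enum import Enum


class Phase(Enum):
    DUAL_WRITE = 1  # phase 1: write to both, read from old
    DUAL_READ = 2   # phase 2: write to both, read from new, fall back to old
    NEW_ONLY = 3    # phase 3: old database decommissioned


class MigrationRouter:
    """Hypothetical router that sends reads and writes by migration phase."""

    def __init__(self, old_store, new_store, phase=Phase.DUAL_WRITE):
        self.old = old_store
        self.new = new_store
        self.phase = phase

    def write(self, key, value):
        # Phases 1 and 2 keep the old database current so rollback stays cheap.
        if self.phase is not Phase.NEW_ONLY:
            self.old[key] = value
        self.new[key] = value

    def read(self, key):
        if self.phase is Phase.DUAL_WRITE:
            return self.old[key]
        if self.phase is Phase.DUAL_READ:
            try:
                return self.new[key]
            except KeyError:
                # Assumption: a missing key models a failed read on the new side.
                return self.old[key]
        return self.new[key]
```

Moving between phases is a single assignment to `router.phase`, which mirrors the idea that each transition is a deliberate, reversible switch until phase 3.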
Cloud-specific
Cloud database migrations have specific tooling and patterns. The cloud providers offer managed migration services that handle the initial load and ongoing replication mechanics; the team uses these rather than building custom replication.
- Use AWS DMS, GCP Datastream, or equivalent.: AWS Database Migration Service handles the initial load and ongoing replication for AWS-bound migrations. GCP Datastream covers similar territory for GCP. Both continue replicating changes after the backfill; check each service's list of supported sources and targets before committing to one.
- Verify replication lag during phase 1.: The tooling reports replication lag. The team monitors it; lag spikes indicate problems; sustained low lag confirms the migration is progressing. The lag metric is the primary health indicator during phase 1.
- Production traffic should not see staleness.: Reads continue from the old database in phase 1, so users see up-to-date data. Replication lag matters for the phase 2 transition (where reads start hitting the new database); during phase 1 it is mostly an internal indicator.
- Schema differences require attention.: Migrations between different engine types (e.g., Oracle to PostgreSQL) often involve schema translation. The migration tool handles common patterns; complex schema constructs sometimes need custom handling. Plan time for schema review.
- Test the cutover in non-production.: A non-production environment exercises the same three-phase migration first. Issues surface in non-production; the production migration follows the validated path.
The cloud-specific tooling is mature. The team's job is configuration and monitoring; the heavy lifting is automated.
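The lag monitoring described above can be reduced to a small classification step. The sketch below assumes the lag samples (in seconds) have already been pulled from whatever metric the migration tool exposes; `assess_lag` and both thresholds are hypothetical and should be tuned per workload.

```python
def assess_lag(samples_seconds, spike_threshold=30.0, healthy_threshold=5.0):
    """Classify a window of replication lag samples.

    Returns one of: "no-data", "spike", "healthy", "elevated".
    Thresholds are illustrative assumptions, not vendor defaults.
    """
    if not samples_seconds:
        return "no-data"
    # Any single sample at or above the spike threshold flags a problem.
    if max(samples_seconds) >= spike_threshold:
        return "spike"
    # "Sustained low lag" means every sample in the window is low.
    if all(s <= healthy_threshold for s in samples_seconds):
        return "healthy"
    return "elevated"
```

A window that returns "healthy" repeatedly is the kind of evidence that supports moving from phase 1 to phase 2; "spike" or "elevated" is a reason to hold.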
Rollback
Each phase has a defined rollback path. Knowing the rollback path before starting each phase produces confidence; without rollback paths, the migration becomes high-stakes at every step.
- Phase 1: drop the new database.: Rollback from phase 1 means stopping the dual-write and dropping the new database. The old database has been the source of truth throughout; nothing is lost. The migration restarts from scratch once the issues are addressed.
- Phase 2: flip primary back.: Rollback from phase 2 means flipping reads back to the old database. The dual-write continues; the old database is current; the rollback is fast. Issues found during phase 2 do not produce data loss.
- Phase 3: no rollback.: Once phase 3 completes (old database decommissioned), rollback is no longer available. The team verifies thoroughly before phase 3; the verification is the substitute for rollback.
- Plan accordingly.: The phase 3 transition is the high-stakes moment. Phase 2 should run long enough to catch any latent issues. The team accepts the irreversibility consciously; phase 3 happens when they are confident.
- Document the verification.: The verification before phase 3 is documented. What was checked? What metrics confirmed the new database was healthy? Future audits and postmortems can reference the documentation.
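The rollback paths and the documented verification can be captured together as data. This is one possible shape, under stated assumptions: `ROLLBACK`, `VerificationRecord`, and the check names are hypothetical, and a real record would carry metric values and links rather than booleans alone.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Rollback action per phase, taken from the list above.
ROLLBACK = {
    1: "stop dual-write and drop the new database",
    2: "flip reads back to the old database",
    3: None,  # no rollback once the old database is decommissioned
}


def _utc_now():
    return datetime.now(timezone.utc).isoformat()


@dataclass
class VerificationRecord:
    """Hypothetical record of what was checked before leaving a phase."""

    phase: int
    checks: dict  # check name -> True/False
    recorded_at: str = field(default_factory=_utc_now)

    def passed(self):
        # An empty record never passes: no evidence is not evidence.
        return bool(self.checks) and all(self.checks.values())


def may_advance(record):
    """Allow the transition out of record.phase only if every check passed."""
    return record.passed()
```

For example, `may_advance(VerificationRecord(2, {"error_rate_ok": True, "row_counts_match": True}))` returns `True`, while a record with any failed or missing checks blocks the move to phase 3, where `ROLLBACK[3]` is `None`.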
The database migration rule of three is one of the most reliable patterns for moving data to and between clouds with low risk. Nova AI Ops integrates with database migration tooling, surfaces replication lag and dual-write health, and produces the per-phase verification report that informs each transition decision.