Expand-Contract Migrations

Code can be deployed and rolled back in minutes. Data cannot. A database schema migration, once applied, has reshaped the state that every running instance of the application depends on. If the migration is wrong, the running instances behave incorrectly. If the migration is irreversible, the recovery options shrink.

Expand-contract is the standard pattern for changing schemas (or any other piece of long-lived shared state) safely, without taking the system offline and without forcing a coordinated deploy.[1] The pattern has many names in different contexts: parallel change, expand-and-contract, or just safe schema migration. The shape is the same.

The Pattern

A safe schema change unfolds in four stages, each deployed and verified independently:

  1. Expand. Change the schema in a way that is backward-compatible. Add the new column, the new table, the new index, the new field. The old code continues to work because nothing it depends on has been removed.
  2. Dual-write or migrate code. Deploy application code that writes to both the old and the new shape, or that can read either. The system now has both representations available; nothing has been removed.
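  3. Backfill. Copy existing data into the new shape. This can run in the background, over time, without blocking writes. Rows written since stage 2 already exist in both shapes because of the dual-write; the backfill covers the rows that predate it, so the two shapes converge. Reads should move to the new shape only once the backfill is complete and verified.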
  4. Contract. Once the team is confident that nothing depends on the old shape and the new shape is correct, remove the old code paths and then the old schema elements. This is the only stage that destroys information.

Each stage is reversible until the contract step. If any stage uncovers a problem, the team can pause and fix it without committing to the whole migration.
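
The four stages can be sketched end to end against an in-memory SQLite database. This is a minimal illustration, not a prescribed implementation: the tables `prices_v1` and `prices_v2`, the dollars-to-cents change, and every function name are assumptions made up for the example. In a real system each function would ship as its own deploy.

```python
import sqlite3


def expand(conn):
    # Stage 1: additive only. The old table is untouched, so old code keeps working.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices_v2 ("
        "item_id INTEGER PRIMARY KEY, price_cents INTEGER NOT NULL)"
    )


def save_price(conn, item_id, dollars):
    # Stage 2: dual-write. Both shapes are written in one transaction,
    # so a failure leaves neither shape updated.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO prices_v1 (item_id, price_dollars) VALUES (?, ?)",
            (item_id, dollars),
        )
        conn.execute(
            "INSERT OR REPLACE INTO prices_v2 (item_id, price_cents) VALUES (?, ?)",
            (item_id, round(dollars * 100)),
        )


def backfill(conn):
    # Stage 3: copy rows that predate the dual-write. OR IGNORE skips rows the
    # dual-write already created, so re-running the backfill is harmless.
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO prices_v2 (item_id, price_cents) "
            "SELECT item_id, CAST(ROUND(price_dollars * 100) AS INTEGER) FROM prices_v1"
        )


def contract(conn):
    # Stage 4: the only destructive step. Code that still writes to prices_v1
    # (such as save_price above) must be removed before this runs.
    conn.execute("DROP TABLE prices_v1")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices_v1 (item_id INTEGER PRIMARY KEY, price_dollars REAL NOT NULL)"
)
conn.execute("INSERT INTO prices_v1 VALUES (1, 19.99)")  # row that predates the migration
conn.commit()

expand(conn)
save_price(conn, 2, 5.00)  # new write lands in both shapes
backfill(conn)             # old row is copied across
rows = conn.execute(
    "SELECT item_id, price_cents FROM prices_v2 ORDER BY item_id"
).fetchall()
contract(conn)
```

Up to the call to `contract`, undoing the migration means dropping `prices_v2` and reverting `save_price`; after it, the old representation is gone.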

Why It Matters

A few of the problems expand-contract solves:

  • Zero-downtime deployment. A rolling or blue-green deployment requires that old and new code can run simultaneously. They cannot if the new release's migration has already removed schema elements the old code depends on.
  • No coordinated cutover. With expand-contract, the team is not aiming at a single moment when everything must flip together. The migration is a sequence of independent small changes, each verifiable on its own.
  • Recoverability. A normal deploy can be rolled back. A schema change in the middle of a normal deploy is much harder to roll back. Expand-contract keeps rollback cheap until the final contract step.
  • Incremental learning. Each stage tells the team something. The backfill might reveal that the data is messier than expected; the dual-read might reveal a code path nobody knew about. Spreading the change in time lets the team learn while there is still time to adjust.

When the Pattern Gets Harder

Expand-contract is not free. Some situations make it more expensive or more involved:

  • Constraints and indexes. Adding a constraint, a unique index, or a foreign key may require validating existing data, which may be slow on a large table. The expand step has to be split further: add the structure without enforcement, validate, then enforce.
  • Very large tables. Backfills on tables with hundreds of millions of rows have to be paced and chunked, often running over days or weeks. The team needs to monitor for lock contention, replication lag, and the cost of the migration itself.
  • Schemas shared across services. When more than one service writes to the same shape, the dual-write and code-migration steps require coordination across teams. Expand-contract still works, but the calendar grows.
  • Strict consistency requirements. Systems where stale data is dangerous (financial ledgers, compliance audit trails) may require additional invariants during the dual-write phase: explicit reconciliation, transactional dual-writes, or write-through patterns to keep the two shapes synchronized.
  • Irreversible operations. Some changes, like dropping a column or deleting a table, are inherently destructive. The contract step is always the highest-risk stage. The team's discipline before pulling that lever determines whether the previous three stages bought any safety.
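
For the large-table case above, one way to pace a backfill is to walk the primary key in fixed-size batches, commit each batch in its own short transaction, and sleep between batches. The sketch below assumes a hypothetical `orders` table with an old `total_dollars` column and a new `total_cents` column; the function name, batch size, and pause are illustrative choices, not recommendations.

```python
import sqlite3
import time


def backfill_in_chunks(conn, batch_size=1000, pause_seconds=0.0, start_after=0):
    """Backfill total_cents in id order, yielding (last_id, rows_updated) per batch.

    The caller can stop iterating to pause, then resume later by passing the
    last yielded id as start_after.
    """
    last_id = start_after
    while True:
        # Keyset pagination: find the upper bound of the next batch.
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM orders WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, batch_size),
            )
        ]
        if not ids:
            return
        with conn:  # one short transaction per batch keeps locks brief
            cur = conn.execute(
                "UPDATE orders "
                "SET total_cents = CAST(ROUND(total_dollars * 100) AS INTEGER) "
                "WHERE id > ? AND id <= ? AND total_cents IS NULL",
                (last_id, ids[-1]),
            )
        last_id = ids[-1]
        yield last_id, cur.rowcount
        time.sleep(pause_seconds)  # pacing knob: raise it if replicas fall behind


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, total_dollars REAL NOT NULL, total_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (total_dollars) VALUES (?)",
    [(float(n),) for n in range(1, 11)],
)
conn.commit()

progress = list(backfill_in_chunks(conn, batch_size=4))
```

The `total_cents IS NULL` filter leaves rows already populated by dual-writes alone, and the yielded tuples give a monitoring loop something to log or export as metrics.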

Common Anti-Patterns

  • "Migrate during the maintenance window." Taking the system offline to run a schema change. Defensible occasionally; rarely the cheapest option once expand-contract is in the team's toolkit.
  • "We'll just deploy them together." Trying to ship application code and the breaking schema change in a single deploy. Works until the first time it doesn't.
  • Skipping the dual-write stage. Going directly from expand to contract because the volume seems small. The skipped stage was the safety net.
  • Forgetting the contract. Stopping at stage 2 or 3 because the new shape is working. The schema accumulates old columns and tables, each of which is a permanent piece of clutter and a potential source of future confusion.
  • No verification before contract. Removing the old shape without explicit evidence that nothing depends on it. The "evidence" should include queries against application logs, code search, and analytics, not just an assumption.
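
Part of that evidence can come from the data itself. The sketch below is a hypothetical reconciliation check that counts rows where the new shape is missing or disagrees with the old one; the `orders` table, its columns, and both function names are assumptions for illustration. It covers only the data side, so log queries and code search still have to be done separately.

```python
import sqlite3


def contract_blockers(conn):
    """Count rows that would make dropping the old column unsafe."""
    missing = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE total_cents IS NULL"
    ).fetchone()[0]
    mismatched = conn.execute(
        "SELECT COUNT(*) FROM orders "
        "WHERE total_cents IS NOT NULL "
        "AND total_cents != CAST(ROUND(total_dollars * 100) AS INTEGER)"
    ).fetchone()[0]
    return {"missing": missing, "mismatched": mismatched}


def safe_to_contract(conn):
    # Only the data check; it says nothing about readers of the old column.
    return all(count == 0 for count in contract_blockers(conn).values())


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, total_dollars REAL NOT NULL, total_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (total_dollars, total_cents) VALUES (?, ?)",
    [
        (10.00, 1000),  # consistent
        (2.50, None),   # never backfilled
        (3.00, 299),    # dual-write bug: the two shapes disagree
    ],
)
conn.commit()

blockers = contract_blockers(conn)
```

Running a check like this as a gate in the contract migration makes the decision explicit: a nonzero count stops the drop.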

What This Looks Like in Practice

  • Treat schema changes as first-class deploys. Each stage of expand-contract is its own deployable change, reviewed and shipped on its own.
  • Make the contract step a deliberate decision. Track which migrations are mid-flight and which are awaiting contract. Plan the contract step explicitly rather than letting it slip.
  • Build backfill tooling that is observable and pausable. Long-running backfills should report progress, expose metrics, and allow the team to slow them down or stop them.
  • Coordinate schema changes through migrations, not ad-hoc SQL. Versioned, reviewed, repeatable migration scripts are part of the deploy artifact, not a separate human process.
  • Practice the rollback paths. Stages 1 through 3 should be reversible. Verify this is true before you find out it isn't.
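
A rollback rehearsal can be as small as applying a migration, reverting it, and checking that the schema matches what it was before. The sketch below is a toy versioned-migration runner built on SQLite's `PRAGMA user_version`; the `MIGRATIONS` list, the table names, and the helper functions are hypothetical. Real migration tools track more state and make each step transactional.

```python
import sqlite3

# Each entry is (version, up_sql, down_sql). Version n lives at index n - 1.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)",
        "DROP TABLE users"),
    (2, "CREATE TABLE user_profiles (user_id INTEGER PRIMARY KEY, display_name TEXT)",
        "DROP TABLE user_profiles"),
]


def current_version(conn):
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate_to(conn, target):
    """Apply 'up' scripts or 'down' scripts until the schema is at target."""
    version = current_version(conn)
    while version < target:
        _, up_sql, _ = MIGRATIONS[version]  # next migration to apply
        conn.execute(up_sql)
        version += 1
        conn.execute(f"PRAGMA user_version = {version}")
    while version > target:
        _, _, down_sql = MIGRATIONS[version - 1]  # most recently applied
        conn.execute(down_sql)
        version -= 1
        conn.execute(f"PRAGMA user_version = {version}")


def table_names(conn):
    return sorted(
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    )


conn = sqlite3.connect(":memory:")
migrate_to(conn, 1)
before = table_names(conn)

migrate_to(conn, 2)  # the expand step
migrate_to(conn, 1)  # rehearse the rollback
after = table_names(conn)
```

If `before` and `after` differ, the "down" script is incomplete, and that is much cheaper to learn in a test than during an incident.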

Expensive Lesson

Schema changes are engineering changes, product changes, and operational changes simultaneously. A schema change shipped without expand-contract discipline can take the system offline, corrupt data, or force a coordinated cutover that costs much more than the work itself.

See also: Deployment Strategies, Trunk-Based Development, Feature Flags, Fix Forward, Maintainability.



  1. Danilo Sato, ParallelChange (also called "expand and contract"), published on Martin Fowler's bliki: https://martinfowler.com/bliki/ParallelChange.html. The pattern's canonical articulation, generalized beyond schema migrations to any change that affects long-lived shared state. The discipline of making the change reversible at every stage until the final contract step is the durable insight.