
Database Migration Patterns

Safe, reversible database schema changes for production systems.

When to Activate

  • Creating or altering database tables
  • Adding/removing columns or indexes
  • Running data migrations (backfill, transform)
  • Planning zero-downtime schema changes
  • Setting up migration tooling for a new project

Core Principles

  1. Every change is a migration — never alter production databases manually
  2. Migrations are forward-only in production — rollbacks use new forward migrations
  3. Schema and data migrations are separate — never mix DDL and DML in one migration
  4. Test migrations against production-sized data — a migration that works on 100 rows may lock on 10M
  5. Migrations are immutable once deployed — never edit a migration that has run in production
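Principles 1, 2, and 5 hinge on a ledger of applied migrations. A minimal sketch of that bookkeeping, using sqlite3 so it runs standalone; real tools (Prisma, Django, golang-migrate) keep an equivalent tracking table:

```python
import sqlite3

# Ordered, append-only list of migrations; deployed entries are never edited.
MIGRATIONS = [
    ("0001_create_users", "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)"),
    ("0002_add_avatar", "ALTER TABLE users ADD COLUMN avatar_url TEXT"),
]

def migrate(conn):
    """Apply each pending migration exactly once and record it in a ledger table."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT name FROM schema_migrations")}
    ran = []
    for name, sql in MIGRATIONS:
        if name in applied:
            continue  # already deployed: forward-only, never re-run or edit
        conn.execute(sql)
        conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (name,))
        conn.commit()
        ran.append(name)
    return ran

conn = sqlite3.connect(":memory:")
print(migrate(conn))  # ['0001_create_users', '0002_add_avatar']
print(migrate(conn))  # [] -- nothing pending on a second run
```

Because the ledger, not the schema, is the source of truth, every environment replays the same sequence and no change can bypass the audit trail.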

Migration Safety Checklist

Before applying any migration:
  • Migration has both UP and DOWN (or is explicitly marked irreversible)
  • No full table locks on large tables (use concurrent operations)
  • New columns have defaults or are nullable (never add NOT NULL without default)
  • Indexes created concurrently (not inline with CREATE TABLE for existing tables)
  • Data backfill is a separate migration from schema change
  • Tested against a copy of production data
  • Rollback plan documented
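Parts of this checklist can be automated in CI. A toy lint that flags two of the riskiest patterns in migration SQL (the regexes are deliberately simplistic and illustrative, not a substitute for review):

```python
import re

def lint_migration(sql):
    """Return warnings for risky patterns from the checklist above."""
    warnings = []
    # NOT NULL column added with no DEFAULT: would fail or rewrite existing rows
    if re.search(r"ADD COLUMN\s+\w+\s+\w+.*NOT NULL(?!.*DEFAULT)", sql, re.IGNORECASE):
        warnings.append("NOT NULL column added without a DEFAULT")
    # Index built without CONCURRENTLY: blocks writes on large tables
    if re.search(r"CREATE\s+(UNIQUE\s+)?INDEX\s+(?!CONCURRENTLY)", sql, re.IGNORECASE):
        warnings.append("index created without CONCURRENTLY")
    return warnings

print(lint_migration("ALTER TABLE users ADD COLUMN role TEXT NOT NULL;"))
print(lint_migration("CREATE INDEX CONCURRENTLY idx_users_email ON users (email);"))
```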

PostgreSQL Patterns

Adding a Column Safely

```sql
-- GOOD: Nullable column, no lock
ALTER TABLE users ADD COLUMN avatar_url TEXT;

-- GOOD: Column with default (Postgres 11+ is instant, no rewrite)
ALTER TABLE users ADD COLUMN is_active BOOLEAN NOT NULL DEFAULT true;

-- BAD: NOT NULL without default on existing table (requires full rewrite)
ALTER TABLE users ADD COLUMN role TEXT NOT NULL;
-- This locks the table and rewrites every row
```

Adding an Index Without Downtime

```sql
-- BAD: Blocks writes on large tables
CREATE INDEX idx_users_email ON users (email);

-- GOOD: Non-blocking, allows concurrent writes
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);

-- Note: CONCURRENTLY cannot run inside a transaction block
-- Most migration tools need special handling for this
```

Renaming a Column (Zero-Downtime)

Never rename directly in production. Use the expand-contract pattern:
```sql
-- Step 1: Add new column (migration 001)
ALTER TABLE users ADD COLUMN display_name TEXT;

-- Step 2: Backfill data (migration 002, data migration)
UPDATE users SET display_name = username WHERE display_name IS NULL;

-- Step 3: Update application code to read/write both columns
-- Deploy application changes

-- Step 4: Stop writing to old column, drop it (migration 003)
ALTER TABLE users DROP COLUMN username;
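Step 3 is application code rather than SQL. A sketch of what "read/write both columns" looks like during the expand phase (sqlite3 stands in for the real database, and the helper names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, display_name TEXT)")

def set_name(conn, user_id, name):
    # Dual-write: keep old and new columns in sync until the contract phase
    conn.execute(
        "UPDATE users SET username = ?, display_name = ? WHERE id = ?",
        (name, name, user_id),
    )

def get_name(conn, user_id):
    # Prefer the new column; fall back to the old one until backfill finishes
    row = conn.execute(
        "SELECT COALESCE(display_name, username) FROM users WHERE id = ?",
        (user_id,),
    ).fetchone()
    return row[0] if row else None

conn.execute("INSERT INTO users (id, username) VALUES (1, 'ada')")
print(get_name(conn, 1))  # 'ada' -- falls back to username
set_name(conn, 1, "Ada Lovelace")
print(get_name(conn, 1))  # 'Ada Lovelace' -- now served from display_name
```

Once every row has a `display_name` and all reads use it, the old column can be dropped safely.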

Removing a Column Safely

```sql
-- Step 1: Remove all application references to the column
-- Step 2: Deploy application without the column reference
-- Step 3: Drop column in next migration
ALTER TABLE orders DROP COLUMN legacy_status;

-- For Django: use SeparateDatabaseAndState to remove from model
-- without generating DROP COLUMN (then drop in next migration)
```

Large Data Migrations

```sql
-- BAD: Updates all rows in one transaction (locks table)
UPDATE users SET normalized_email = LOWER(email);

-- GOOD: Batch update with progress
DO $$
DECLARE
  batch_size INT := 10000;
  rows_updated INT;
BEGIN
  LOOP
    UPDATE users
    SET normalized_email = LOWER(email)
    WHERE id IN (
      SELECT id FROM users
      WHERE normalized_email IS NULL
      LIMIT batch_size
      FOR UPDATE SKIP LOCKED
    );
    GET DIAGNOSTICS rows_updated = ROW_COUNT;
    RAISE NOTICE 'Updated % rows', rows_updated;
    EXIT WHEN rows_updated = 0;
    COMMIT;
  END LOOP;
END $$;
```
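The same batching idea can be driven from application code, which is often easier to throttle and monitor than a server-side DO block. A sketch using keyset pagination on the primary key (sqlite3 here so the example is self-contained; the batch size is deliberately tiny for the demo):

```python
import sqlite3

def backfill_normalized_email(conn, batch_size=2):
    """Backfill in small batches, committing after each one."""
    last_id, total = 0, 0
    while True:
        # Keyset pagination: resume after the last processed id instead of
        # re-scanning from the start of the table on every batch
        rows = conn.execute(
            "SELECT id, email FROM users "
            "WHERE id > ? AND normalized_email IS NULL "
            "ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return total
        conn.executemany(
            "UPDATE users SET normalized_email = ? WHERE id = ?",
            [(email.lower(), row_id) for row_id, email in rows],
        )
        conn.commit()  # one short transaction per batch instead of one long lock
        last_id = rows[-1][0]
        total += len(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, normalized_email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)", [("A@X.COM",), ("B@X.COM",), ("C@X.COM",)])
print(backfill_normalized_email(conn))  # 3
```

A sleep between batches is a common addition when the database is under live traffic.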

Prisma (TypeScript/Node.js)

Workflow

```bash
# Create migration from schema changes
npx prisma migrate dev --name add_user_avatar

# Apply pending migrations in production
npx prisma migrate deploy

# Reset database (dev only)
npx prisma migrate reset

# Generate client after schema changes
npx prisma generate
```

Schema Example

```prisma
model User {
  id        String   @id @default(cuid())
  email     String   @unique
  name      String?
  avatarUrl String?  @map("avatar_url")
  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")
  orders    Order[]

  @@map("users")
  @@index([email])
}
```

Custom SQL Migration

For operations Prisma cannot express (concurrent indexes, data backfills):
```bash
# Create empty migration, then edit the SQL manually
npx prisma migrate dev --create-only --name add_email_index
```

```sql
-- migrations/20240115_add_email_index/migration.sql
-- Prisma cannot generate CONCURRENTLY, so we write it manually
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_email ON users (email);
```

Drizzle (TypeScript/Node.js)

Workflow

```bash
# Generate migration from schema changes
npx drizzle-kit generate

# Apply migrations
npx drizzle-kit migrate

# Push schema directly (dev only, no migration file)
npx drizzle-kit push
```

Schema Example

```typescript
import { pgTable, text, timestamp, uuid, boolean } from "drizzle-orm/pg-core";

export const users = pgTable("users", {
  id: uuid("id").primaryKey().defaultRandom(),
  email: text("email").notNull().unique(),
  name: text("name"),
  isActive: boolean("is_active").notNull().default(true),
  createdAt: timestamp("created_at").notNull().defaultNow(),
  updatedAt: timestamp("updated_at").notNull().defaultNow(),
});
```

Django (Python)

Workflow

```bash
# Generate migration from model changes
python manage.py makemigrations

# Apply migrations
python manage.py migrate

# Show migration status
python manage.py showmigrations

# Generate empty migration for custom SQL
python manage.py makemigrations --empty app_name -n description
```

Data Migration

```python
from django.db import migrations

def backfill_display_names(apps, schema_editor):
    User = apps.get_model("accounts", "User")
    batch_size = 5000
    users = User.objects.filter(display_name="")
    while users.exists():
        batch = list(users[:batch_size])
        for user in batch:
            user.display_name = user.username
        User.objects.bulk_update(batch, ["display_name"], batch_size=batch_size)

def reverse_backfill(apps, schema_editor):
    pass  # Data migration, no reverse needed

class Migration(migrations.Migration):
    dependencies = [("accounts", "0015_add_display_name")]

    operations = [
        migrations.RunPython(backfill_display_names, reverse_backfill),
    ]
```

SeparateDatabaseAndState

Remove a column from the Django model without dropping it from the database immediately:
```python
class Migration(migrations.Migration):
    operations = [
        migrations.SeparateDatabaseAndState(
            state_operations=[
                migrations.RemoveField(model_name="user", name="legacy_field"),
            ],
            database_operations=[],  # Don't touch the DB yet
        ),
    ]
```

golang-migrate (Go)

Workflow

```bash
# Create migration pair
migrate create -ext sql -dir migrations -seq add_user_avatar

# Apply all pending migrations
migrate -path migrations -database "$DATABASE_URL" up

# Rollback last migration
migrate -path migrations -database "$DATABASE_URL" down 1

# Force version (fix dirty state)
migrate -path migrations -database "$DATABASE_URL" force VERSION
```

Migration Files

```sql
-- migrations/000003_add_user_avatar.up.sql
ALTER TABLE users ADD COLUMN avatar_url TEXT;
CREATE INDEX CONCURRENTLY idx_users_avatar ON users (avatar_url) WHERE avatar_url IS NOT NULL;

-- migrations/000003_add_user_avatar.down.sql
DROP INDEX IF EXISTS idx_users_avatar;
ALTER TABLE users DROP COLUMN IF EXISTS avatar_url;
```

Zero-Downtime Migration Strategy

For critical production changes, follow the expand-contract pattern:
```
Phase 1: EXPAND
  - Add new column/table (nullable or with default)
  - Deploy: app writes to BOTH old and new
  - Backfill existing data

Phase 2: MIGRATE
  - Deploy: app reads from NEW, writes to BOTH
  - Verify data consistency

Phase 3: CONTRACT
  - Deploy: app only uses NEW
  - Drop old column/table in separate migration
```
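Phase 2's "verify data consistency" step can be as simple as counting rows where the old and new columns disagree before allowing the contract phase. A sketch (sqlite3 for a self-contained demo; column names follow the rename example earlier):

```python
import sqlite3

def count_mismatches(conn):
    # SQLite's IS NOT is a null-safe comparison: two NULLs compare equal
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM users WHERE display_name IS NOT username"
    ).fetchone()
    return n

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, display_name TEXT)")
conn.executemany(
    "INSERT INTO users (username, display_name) VALUES (?, ?)",
    [("ada", "ada"), ("alan", "alan"), ("grace", None)],  # one row not yet backfilled
)
print(count_mismatches(conn))  # 1 -- contract phase must wait
```

Only when the mismatch count reaches zero (and stays there under live writes) is it safe to drop the old column.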

Timeline Example

```
Day 1: Migration adds new_status column (nullable)
Day 1: Deploy app v2 — writes to both status and new_status
Day 2: Run backfill migration for existing rows
Day 3: Deploy app v3 — reads from new_status only
Day 7: Migration drops old status column
```

Anti-Patterns

| Anti-Pattern | Why It Fails | Better Approach |
|---|---|---|
| Manual SQL in production | No audit trail, unrepeatable | Always use migration files |
| Editing deployed migrations | Causes drift between environments | Create new migration instead |
| NOT NULL without default | Locks table, rewrites all rows | Add nullable, backfill, then add constraint |
| Inline index on large table | Blocks writes during build | CREATE INDEX CONCURRENTLY |
| Schema + data in one migration | Hard to rollback, long transactions | Separate migrations |
| Dropping column before removing code | Application errors on missing column | Remove code first, drop column next deploy |