
Database Migration Patterns


Safe, reversible database schema changes for production systems.

When to Activate


  • Creating or altering database tables
  • Adding/removing columns or indexes
  • Running data migrations (backfill, transform)
  • Planning zero-downtime schema changes
  • Setting up migration tooling for a new project

Core Principles


  1. Every change is a migration — never alter production databases manually
  2. Migrations are forward-only in production — rollbacks use new forward migrations
  3. Schema and data migrations are separate — never mix DDL and DML in one migration
  4. Test migrations against production-sized data — a migration that works on 100 rows may lock on 10M
  5. Migrations are immutable once deployed — never edit a migration that has run in production
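Principle 4 can be rehearsed cheaply before shipping: time the candidate migration against a table of production scale, not the dev fixture. A minimal sketch for Postgres; the scratch-table name, column names, and 10M row count are all illustrative:

```sql
-- Build a ~10M-row scratch table to rehearse the migration against.
-- Names and row count are illustrative, not from any real schema.
CREATE TABLE users_scratch AS
SELECT g AS id,
       'user' || g || '@example.com' AS email
FROM generate_series(1, 10000000) AS g;

-- Run the candidate migration against the scratch table and time it
-- (\timing on in psql) before letting it near production.
ALTER TABLE users_scratch ADD COLUMN is_active BOOLEAN NOT NULL DEFAULT true;
```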

Migration Safety Checklist


Before applying any migration:
  • Migration has both UP and DOWN (or is explicitly marked irreversible)
  • No full table locks on large tables (use concurrent operations)
  • New columns have defaults or are nullable (never add NOT NULL without default)
  • Indexes created concurrently (not inline with CREATE TABLE for existing tables)
  • Data backfill is a separate migration from schema change
  • Tested against a copy of production data
  • Rollback plan documented
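A practical guard to pair with this checklist, assuming Postgres: set conservative lock and statement timeouts on the migration session, so blocked DDL fails fast and can be retried instead of queueing behind long-running transactions. The exact values below are illustrative:

```sql
-- Illustrative session settings for a Postgres migration connection.
SET lock_timeout = '5s';         -- fail fast if a lock cannot be acquired
SET statement_timeout = '10min'; -- abort runaway statements
-- Without these, a queued ACCESS EXCLUSIVE lock blocks every read and
-- write behind it until the conflicting transaction finishes.
```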

PostgreSQL Patterns


Adding a Column Safely


```sql
-- GOOD: Nullable column, no lock
ALTER TABLE users ADD COLUMN avatar_url TEXT;

-- GOOD: Column with default (Postgres 11+ is instant, no rewrite)
ALTER TABLE users ADD COLUMN is_active BOOLEAN NOT NULL DEFAULT true;

-- BAD: NOT NULL without default on an existing table
ALTER TABLE users ADD COLUMN role TEXT NOT NULL;
-- Fails outright on a non-empty table (existing rows violate NOT NULL);
-- on Postgres < 11, even with a default this rewrote every row under a lock
```
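If the column genuinely must end up NOT NULL, the safe route is to split the change into add, backfill, and constrain steps. A sketch with a hypothetical `role` column and `'member'` value:

```sql
-- Step 1: add the column nullable (instant, no rewrite)
ALTER TABLE users ADD COLUMN role TEXT;

-- Step 2: backfill (batch this on large tables; see Large Data Migrations)
UPDATE users SET role = 'member' WHERE role IS NULL;

-- Step 3: add the constraint (scans the table to validate, but no rewrite)
ALTER TABLE users ALTER COLUMN role SET NOT NULL;
```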

Adding an Index Without Downtime


```sql
-- BAD: Blocks writes on large tables
CREATE INDEX idx_users_email ON users (email);

-- GOOD: Non-blocking, allows concurrent writes
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);

-- Note: CONCURRENTLY cannot run inside a transaction block
-- Most migration tools need special handling for this
```
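One caveat worth knowing: if `CREATE INDEX CONCURRENTLY` fails partway (deadlock, unique violation during the build), Postgres leaves an INVALID index behind that must be dropped before retrying. A sketch of the cleanup:

```sql
-- A failed CONCURRENTLY build leaves an INVALID index; drop it before retrying.
-- DROP INDEX CONCURRENTLY also avoids blocking writes during the cleanup.
DROP INDEX CONCURRENTLY IF EXISTS idx_users_email;
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);
```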

Renaming a Column (Zero-Downtime)


Never rename directly in production. Use the expand-contract pattern:
```sql
-- Step 1: Add new column (migration 001)
ALTER TABLE users ADD COLUMN display_name TEXT;

-- Step 2: Backfill data (migration 002, data migration)
UPDATE users SET display_name = username WHERE display_name IS NULL;

-- Step 3: Update application code to read/write both columns
-- Deploy application changes

-- Step 4: Stop writing to old column, drop it (migration 003)
ALTER TABLE users DROP COLUMN username;
```
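Between steps 2 and 4 the two columns can drift if any writer misses the dual-write path. One way to close that gap, assuming Postgres, is a temporary trigger that mirrors writes at the database level; the function and trigger names below are hypothetical:

```sql
-- Optional safety net during the dual-write window (drop it in step 4).
CREATE OR REPLACE FUNCTION users_mirror_display_name() RETURNS trigger AS $$
BEGIN
  -- Copy the old column into the new one on every write that misses it.
  NEW.display_name := COALESCE(NEW.display_name, NEW.username);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_users_mirror_display_name
  BEFORE INSERT OR UPDATE ON users
  FOR EACH ROW EXECUTE FUNCTION users_mirror_display_name();
```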

Removing a Column Safely


```sql
-- Step 1: Remove all application references to the column
-- Step 2: Deploy application without the column reference
-- Step 3: Drop column in next migration
ALTER TABLE orders DROP COLUMN legacy_status;

-- For Django: use SeparateDatabaseAndState to remove from model
-- without generating DROP COLUMN (then drop in next migration)
```

Large Data Migrations


```sql
-- BAD: Updates all rows in one transaction (locks table)
UPDATE users SET normalized_email = LOWER(email);

-- GOOD: Batch update with progress
-- Note: COMMIT inside a DO block requires Postgres 11+ and only works when
-- the block runs outside an explicit transaction (no surrounding BEGIN).
DO $$
DECLARE
  batch_size INT := 10000;
  rows_updated INT;
BEGIN
  LOOP
    UPDATE users
    SET normalized_email = LOWER(email)
    WHERE id IN (
      SELECT id FROM users
      WHERE normalized_email IS NULL
      LIMIT batch_size
      FOR UPDATE SKIP LOCKED
    );
    GET DIAGNOSTICS rows_updated = ROW_COUNT;
    RAISE NOTICE 'Updated % rows', rows_updated;
    EXIT WHEN rows_updated = 0;
    COMMIT;
  END LOOP;
END $$;
```

Prisma (TypeScript/Node.js)


Workflow


```bash
# Create migration from schema changes
npx prisma migrate dev --name add_user_avatar

# Apply pending migrations in production
npx prisma migrate deploy

# Reset database (dev only)
npx prisma migrate reset

# Generate client after schema changes
npx prisma generate
```

Schema Example


```prisma
model User {
  id        String   @id @default(cuid())
  email     String   @unique
  name      String?
  avatarUrl String?  @map("avatar_url")
  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")
  orders    Order[]

  @@map("users")
  @@index([email])
}
```

Custom SQL Migration


For operations Prisma cannot express (concurrent indexes, data backfills):
```bash
# Create empty migration, then edit the SQL manually
npx prisma migrate dev --create-only --name add_email_index
```

```sql
-- migrations/20240115_add_email_index/migration.sql
-- Prisma cannot generate CONCURRENTLY, so we write it manually
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_email ON users (email);
```

Drizzle (TypeScript/Node.js)


Workflow

```bash
# Generate migration from schema changes
npx drizzle-kit generate

# Apply migrations
npx drizzle-kit migrate

# Push schema directly (dev only, no migration file)
npx drizzle-kit push
```

Schema Example

```typescript
import { pgTable, text, timestamp, uuid, boolean } from "drizzle-orm/pg-core";

export const users = pgTable("users", {
  id: uuid("id").primaryKey().defaultRandom(),
  email: text("email").notNull().unique(),
  name: text("name"),
  isActive: boolean("is_active").notNull().default(true),
  createdAt: timestamp("created_at").notNull().defaultNow(),
  updatedAt: timestamp("updated_at").notNull().defaultNow(),
});
```
Kysely (TypeScript/Node.js)

Workflow (kysely-ctl)

```bash
# Initialize config file (kysely.config.ts)
kysely init

# Create a new migration file
kysely migrate make add_user_avatar

# Apply all pending migrations
kysely migrate latest

# Rollback last migration
kysely migrate down

# Show migration status
kysely migrate list
```

Migration File

```typescript
// migrations/2024_01_15_001_create_user_profile.ts
import { type Kysely, sql } from 'kysely'

// IMPORTANT: Always use Kysely<any>, not your typed DB interface.
// Migrations are frozen in time and must not depend on current schema types.
export async function up(db: Kysely<any>): Promise<void> {
  await db.schema
    .createTable('user_profile')
    .addColumn('id', 'serial', (col) => col.primaryKey())
    .addColumn('email', 'varchar(255)', (col) => col.notNull().unique())
    .addColumn('avatar_url', 'text')
    .addColumn('created_at', 'timestamp', (col) =>
      col.defaultTo(sql`now()`).notNull()
    )
    .execute()

  await db.schema
    .createIndex('idx_user_profile_avatar')
    .on('user_profile')
    .column('avatar_url')
    .execute()
}

export async function down(db: Kysely<any>): Promise<void> {
  await db.schema.dropTable('user_profile').execute()
}
```

Programmatic Migrator

```typescript
import { Migrator, FileMigrationProvider } from 'kysely'
import { promises as fs } from 'fs'
import * as path from 'path'
// ESM only — CJS can use __dirname directly
import { fileURLToPath } from 'url'
const migrationFolder = path.join(
  path.dirname(fileURLToPath(import.meta.url)),
  './migrations',
)

// `db` is your Kysely<any> database instance
const migrator = new Migrator({
  db,
  provider: new FileMigrationProvider({
    fs,
    path,
    migrationFolder,
  }),
  // WARNING: Only enable in development. Disables timestamp-ordering
  // validation, which can cause schema drift between environments.
  // allowUnorderedMigrations: true,
})

const { error, results } = await migrator.migrateToLatest()

results?.forEach((it) => {
  if (it.status === 'Success') {
    console.log(`migration "${it.migrationName}" executed successfully`)
  } else if (it.status === 'Error') {
    console.error(`failed to execute migration "${it.migrationName}"`)
  }
})

if (error) {
  console.error('migration failed', error)
  process.exit(1)
}
```
Django (Python)

Workflow

```bash
# Generate migration from model changes
python manage.py makemigrations

# Apply migrations
python manage.py migrate

# Show migration status
python manage.py showmigrations

# Generate empty migration for custom SQL
python manage.py makemigrations --empty app_name -n description
```
Data Migration

```python
from django.db import migrations

def backfill_display_names(apps, schema_editor):
    User = apps.get_model("accounts", "User")
    batch_size = 5000
    users = User.objects.filter(display_name="")
    while users.exists():
        batch = list(users[:batch_size])
        for user in batch:
            user.display_name = user.username
        User.objects.bulk_update(batch, ["display_name"], batch_size=batch_size)

def reverse_backfill(apps, schema_editor):
    pass  # Data migration, no reverse needed

class Migration(migrations.Migration):
    dependencies = [("accounts", "0015_add_display_name")]

    operations = [
        migrations.RunPython(backfill_display_names, reverse_backfill),
    ]
```

SeparateDatabaseAndState

Remove a column from the Django model without dropping it from the database immediately:

```python
class Migration(migrations.Migration):
    operations = [
        migrations.SeparateDatabaseAndState(
            state_operations=[
                migrations.RemoveField(model_name="user", name="legacy_field"),
            ],
            database_operations=[],  # Don't touch the DB yet
        ),
    ]
```

golang-migrate (Go)

Workflow

```bash
# Create migration pair
migrate create -ext sql -dir migrations -seq add_user_avatar

# Apply all pending migrations
migrate -path migrations -database "$DATABASE_URL" up

# Rollback last migration
migrate -path migrations -database "$DATABASE_URL" down 1

# Force version (fix dirty state)
migrate -path migrations -database "$DATABASE_URL" force VERSION
```

Migration Files

```sql
-- migrations/000003_add_user_avatar.up.sql
ALTER TABLE users ADD COLUMN avatar_url TEXT;
CREATE INDEX CONCURRENTLY idx_users_avatar ON users (avatar_url) WHERE avatar_url IS NOT NULL;

-- migrations/000003_add_user_avatar.down.sql
DROP INDEX IF EXISTS idx_users_avatar;
ALTER TABLE users DROP COLUMN IF EXISTS avatar_url;
```


Zero-Downtime Migration Strategy

For critical production changes, follow the expand-contract pattern:
Phase 1: EXPAND
  - Add new column/table (nullable or with default)
  - Deploy: app writes to BOTH old and new
  - Backfill existing data

Phase 2: MIGRATE
  - Deploy: app reads from NEW, writes to BOTH
  - Verify data consistency

Phase 3: CONTRACT
  - Deploy: app only uses NEW
  - Drop old column/table in separate migration
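The "verify data consistency" step in Phase 2 can be a single query comparing the old and new columns. A sketch using the `status`/`new_status` columns from the timeline as an illustration (the `users` table is hypothetical here):

```sql
-- Phase 2 check: count rows where old and new columns disagree.
-- IS DISTINCT FROM treats NULLs as comparable values.
SELECT COUNT(*) AS mismatched
FROM users
WHERE status IS DISTINCT FROM new_status;
-- Proceed to the CONTRACT phase only when this returns 0.
```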

Timeline Example

Day 1: Migration adds new_status column (nullable)
Day 1: Deploy app v2 — writes to both status and new_status
Day 2: Run backfill migration for existing rows
Day 3: Deploy app v3 — reads from new_status only
Day 7: Migration drops old status column

Anti-Patterns

| Anti-Pattern | Why It Fails | Better Approach |
| --- | --- | --- |
| Manual SQL in production | No audit trail, unrepeatable | Always use migration files |
| Editing deployed migrations | Causes drift between environments | Create new migration instead |
| NOT NULL without default | Locks table, rewrites all rows | Add nullable, backfill, then add constraint |
| Inline index on large table | Blocks writes during build | CREATE INDEX CONCURRENTLY |
| Schema + data in one migration | Hard to roll back, long transactions | Separate migrations |
| Dropping column before removing code | Application errors on missing column | Remove code first, drop column next deploy |