migration-planner
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseMigration Planner
迁移规划器
Execute safe, zero-downtime migrations with validation and rollback plans.
通过验证机制和回滚计划实现安全、零停机的迁移操作。
Migration Patterns
迁移模式
1. Feature Flag Migration (Safest)
1. Feature Flag迁移(最安全)
Phase 1: Deploy new code (disabled)
Phase 2: Enable for 1% traffic
Phase 3: Ramp to 10%, 50%, 100%
Phase 4: Remove old codePhase 1: Deploy new code (disabled)
Phase 2: Enable for 1% traffic
Phase 3: Ramp to 10%, 50%, 100%
Phase 4: Remove old code2. Dual Write Migration
2. 双写迁移
Phase 1: Write to both old and new
Phase 2: Backfill old → new
Phase 3: Read from new (write both)
Phase 4: Stop writing to old
Phase 5: Decommission oldPhase 1: Write to both old and new
Phase 2: Backfill old → new
Phase 3: Read from new (write both)
Phase 4: Stop writing to old
Phase 5: Decommission old3. Blue-Green Deployment
3. 蓝绿部署
Blue (current) → Green (new)
Switch traffic: Blue → Green
Rollback available: Green → BlueBlue (current) → Green (new)
Switch traffic: Blue → Green
Rollback available: Green → BlueComplete Migration Plan Template
完整迁移计划模板
markdown
undefinedmarkdown
undefinedMigration Plan: MySQL → PostgreSQL
迁移计划:MySQL → PostgreSQL
Overview
概述
What: Migrate user database from MySQL to PostgreSQL
Why: Better JSON support, improved performance
When: Q1 2024
Owner: Database Team
Risk Level: HIGH
内容:将用户数据库从MySQL迁移至PostgreSQL
原因:更好的JSON支持、性能提升
时间:2024年第一季度
负责人:数据库团队
风险等级:高
Current State
当前状态
- MySQL 8.0
- 500GB data
- 100K users
- 1000 writes/min
- 10,000 reads/min
- MySQL 8.0
- 500GB数据
- 10万用户
- 每分钟1000次写入
- 每分钟10000次读取
Target State
目标状态
- PostgreSQL 15
- Same data model
- No downtime
- Data validation 100% match
- PostgreSQL 15
- 相同的数据模型
- 零停机
- 数据验证100%匹配
Phases
迁移阶段
Phase 1: Dual Write (Week 1-2)
阶段1:双写(第1-2周)
Goal: Write to both databases
Steps:
- Deploy PostgreSQL cluster
- Create schema in PostgreSQL
- Deploy dual-write code
- Enable dual writes (MySQL primary, PostgreSQL secondary)
Code:
typescript
async function createUser(data: CreateUserDto) {
// Write to MySQL (primary)
const mysqlUser = await mysql.users.create(data);
// Write to PostgreSQL (secondary, fire and forget)
postgres.users.create(data).catch((err) => {
logger.error("PostgreSQL write failed", err);
});
return mysqlUser; // Still trust MySQL
}
**Validation:**
- Monitor PostgreSQL write success rate
- Compare row counts daily
- Alert if drift >0.1%
**Rollback:** Disable PostgreSQL writes目标:将数据写入两个数据库
步骤:
- 部署PostgreSQL集群
- 在PostgreSQL中创建数据库schema
- 部署双写代码
- 启用双写(MySQL为主库,PostgreSQL为从库)
代码:
typescript
async function createUser(data: CreateUserDto) {
// Write to MySQL (primary)
const mysqlUser = await mysql.users.create(data);
// Write to PostgreSQL (secondary, fire and forget)
postgres.users.create(data).catch((err) => {
logger.error("PostgreSQL write failed", err);
});
return mysqlUser; // Still trust MySQL
}
**验证措施**:
- 监控PostgreSQL写入成功率
- 每日对比行数
- 若数据偏差>0.1%则触发告警
**回滚方案**:禁用PostgreSQL写入Phase 2: Backfill (Week 3-4)
阶段2:数据回填(第3-4周)
Goal: Copy historical data
Steps:
- Take MySQL snapshot
- Run backfill script in batches
- Validate data integrity
- Resume from failure automatically
Script:
python
def backfill():
last_id = get_last_migrated_id()
batch_size = 1000
while True:
users = mysql.query(
"SELECT * FROM users WHERE id > %s LIMIT %s",
[last_id, batch_size]
)
if not users:
break
postgres.bulk_insert(users)
last_id = users[-1]['id']
save_checkpoint(last_id)
time.sleep(0.1) # Rate limitValidation:
- Row count match
- Random sample comparison (1000 rows)
- Checksum comparison
Rollback: Delete PostgreSQL data
目标:复制历史数据
步骤:
- 生成MySQL快照
- 分批运行回填脚本
- 验证数据完整性
- 支持从失败点自动恢复
脚本:
python
def backfill():
last_id = get_last_migrated_id()
batch_size = 1000
while True:
users = mysql.query(
"SELECT * FROM users WHERE id > %s LIMIT %s",
[last_id, batch_size]
)
if not users:
break
postgres.bulk_insert(users)
last_id = users[-1]['id']
save_checkpoint(last_id)
time.sleep(0.1) # Rate limit验证措施:
- 行数匹配
- 随机样本对比(1000行)
- 校验和对比
回滚方案:删除PostgreSQL中的数据
Phase 3: Dual Read (Week 5)
阶段3:双读(第5周)
Goal: Validate PostgreSQL reads
Steps:
- Deploy shadow read code
- Read from both (MySQL primary)
- Compare results
- Log mismatches
Code:
typescript
async function getUser(id: string) {
const mysqlUser = await mysql.users.findById(id);
// Shadow read from PostgreSQL
postgres.users.findById(id).then((pgUser) => {
if (!deepEqual(mysqlUser, pgUser)) {
logger.warn("Data mismatch", { id, mysqlUser, pgUser });
metrics.increment("migration.mismatch");
}
});
return mysqlUser; // Still trust MySQL
}Validation:
- Mismatch rate <0.01%
- PostgreSQL query performance acceptable
Rollback: Remove shadow reads
目标:验证PostgreSQL读取能力
步骤:
- 部署影子读取代码
- 从两个数据库读取数据(MySQL为主库)
- 对比结果
- 记录数据不匹配情况
代码:
typescript
async function getUser(id: string) {
const mysqlUser = await mysql.users.findById(id);
// Shadow read from PostgreSQL
postgres.users.findById(id).then((pgUser) => {
if (!deepEqual(mysqlUser, pgUser)) {
logger.warn("Data mismatch", { id, mysqlUser, pgUser });
metrics.increment("migration.mismatch");
}
});
return mysqlUser; // Still trust MySQL
}验证措施:
- 数据不匹配率<0.01%
- PostgreSQL查询性能符合要求
回滚方案:移除影子读取代码
Phase 4: Flip Read Traffic (Week 6)
阶段4:切换读取流量(第6周)
Goal: Read from PostgreSQL
Steps:
- Feature flag: read from PostgreSQL (1% traffic)
- Monitor errors, latency
- Ramp: 1% → 10% → 50% → 100%
- Still writing to both
Code:
typescript
async function getUser(id: string) {
if (featureFlags.readFromPostgres) {
return postgres.users.findById(id);
}
return mysql.users.findById(id);
}Validation:
- Error rate unchanged
- Latency p95 <500ms
- No user complaints
Rollback: Flip feature flag off
目标:从PostgreSQL读取数据
步骤:
- 通过功能标志:从PostgreSQL读取(1%流量)
- 监控错误率与延迟
- 逐步扩容:1% → 10% → 50% → 100%
- 仍保持双写状态
代码:
typescript
async function getUser(id: string) {
if (featureFlags.readFromPostgres) {
return postgres.users.findById(id);
}
return mysql.users.findById(id);
}验证措施:
- 错误率无变化
- 95分位延迟<500ms
- 无用户投诉
回滚方案:关闭功能标志
Phase 5: Stop MySQL Writes (Week 7)
阶段5:停止MySQL写入(第7周)
Goal: PostgreSQL is now primary
Steps:
- Stop writing to MySQL
- Keep MySQL running (read-only)
- Monitor for issues
Code:
typescript
async function createUser(data: CreateUserDto) {
return postgres.users.create(data);
// No longer writing to MySQL
}Validation:
- All operations working
- MySQL not receiving writes
Rollback: Re-enable MySQL writes
目标:PostgreSQL成为主库
步骤:
- 停止向MySQL写入数据
- 保持MySQL运行(只读状态)
- 监控运行状况
代码:
typescript
async function createUser(data: CreateUserDto) {
return postgres.users.create(data);
// No longer writing to MySQL
}验证措施:
- 所有操作正常运行
- MySQL不再接收写入请求
回滚方案:重新启用MySQL写入
Phase 6: Decommission (Week 8)
阶段6:下线旧系统(第8周)
Goal: Remove MySQL
Steps:
- Archive MySQL data
- Shutdown MySQL cluster
- Remove MySQL client code
Rollback: Not available (point of no return)
目标:移除MySQL
步骤:
- 归档MySQL数据
- 关闭MySQL集群
- 移除MySQL客户端代码
回滚方案:不可用(已到不可逆转点)
Validation Strategy
验证策略
Data Integrity Checks
数据完整性检查
python
def validate_migration():
# Row counts
mysql_count = mysql.query("SELECT COUNT(*) FROM users")[0]
pg_count = postgres.query("SELECT COUNT(*) FROM users")[0]
assert mysql_count == pg_count
# Random sampling
sample = mysql.query("SELECT * FROM users ORDER BY RAND() LIMIT 1000")
for row in sample:
pg_row = postgres.query("SELECT * FROM users WHERE id = %s", [row['id']])
assert row == pg_row
# Checksums
mysql_checksum = mysql.query("SELECT MD5(GROUP_CONCAT(id, email)) FROM users")
pg_checksum = postgres.query("SELECT MD5(STRING_AGG(id::text || email, '')) FROM users")
assert mysql_checksum == pg_checksumpython
def validate_migration():
# Row counts
mysql_count = mysql.query("SELECT COUNT(*) FROM users")[0]
pg_count = postgres.query("SELECT COUNT(*) FROM users")[0]
assert mysql_count == pg_count
# Random sampling
sample = mysql.query("SELECT * FROM users ORDER BY RAND() LIMIT 1000")
for row in sample:
pg_row = postgres.query("SELECT * FROM users WHERE id = %s", [row['id']])
assert row == pg_row
# Checksums
mysql_checksum = mysql.query("SELECT MD5(GROUP_CONCAT(id, email)) FROM users")
pg_checksum = postgres.query("SELECT MD5(STRING_AGG(id::text || email, '')) FROM users")
assert mysql_checksum == pg_checksumRollback Plans
回滚计划
Phase 1-3 Rollback (Easy)
阶段1-3回滚(简单)
- Disable PostgreSQL writes
- No impact to users
- Data in MySQL still valid
- 禁用PostgreSQL写入
- 对用户无影响
- MySQL中的数据仍有效
Phase 4 Rollback (Medium)
阶段4回滚(中等)
- Flip feature flag
- Route reads back to MySQL
- Minor user impact (seconds)
- 关闭功能标志
- 将读取流量切回MySQL
- 对用户影响极小(数秒)
Phase 5+ Rollback (Hard)
阶段5及以后回滚(复杂)
- Must re-enable MySQL writes
- Potential data loss (writes since phase 5)
- Requires dual-write resumption
- 必须重新启用MySQL写入
- 可能存在数据丢失(阶段5之后的写入数据)
- 需要恢复双写机制
Risk Mitigation
风险缓解
Risk 1: Data Loss
风险1:数据丢失
Mitigation:
- Dual writes until validated
- Transaction logs captured
- Continuous backups
缓解措施:
- 在验证完成前保持双写
- 捕获事务日志
- 持续备份
Risk 2: Performance Degradation
风险2:性能下降
Mitigation:
- Load test PostgreSQL
- Query optimization
- Connection pooling
缓解措施:
- 对PostgreSQL进行压力测试
- 查询优化
- 连接池配置
Risk 3: Schema Differences
风险3:Schema差异
Mitigation:
- Schema validation script
- Test migrations in staging
- Document data type differences
缓解措施:
- Schema验证脚本
- 在预发布环境测试迁移
- 记录数据类型差异
Communication Plan
沟通计划
Stakeholder Updates
利益相关方更新
markdown
**Week 0:** Migration announced
**Week 2:** Phase 1 complete (dual writes)
**Week 4:** Backfill complete
**Week 6:** Traffic shifted to PostgreSQL
**Week 8:** Migration completemarkdown
**Week 0:** Migration announced
**Week 2:** Phase 1 complete (dual writes)
**Week 4:** Backfill complete
**Week 6:** Traffic shifted to PostgreSQL
**Week 8:** Migration completeStatus Dashboard
状态仪表盘
- Current phase
- Data sync status (%)
- Validation results
- Error rates
- 当前迁移阶段
- 数据同步进度(%)
- 验证结果
- 错误率
Testing Plan
测试计划
Pre-Migration Testing
迁移前测试
- Test in development
- Full migration in staging
- Load test PostgreSQL
- Validate rollback procedures
- 在开发环境测试
- 在预发布环境完成完整迁移
- 对PostgreSQL进行压力测试
- 验证回滚流程
During Migration
迁移中测试
- Continuous monitoring
- Automated validation
- Manual spot checks
- User acceptance testing
- 持续监控
- 自动化验证
- 手动抽查
- 用户验收测试
Best Practices
最佳实践
- Small batches: Migrate incrementally
- Dual write: Keep both systems synchronized
- Feature flags: Control rollout
- Validate continuously: Don't trust, verify
- Rollback ready: Plan for worst case
- Monitor closely: Track metrics
- Communicate often: Keep stakeholders informed
- 小批量迁移:分批次完成迁移
- 双写机制:保持两个系统数据同步
- 功能标志:控制迁移进度
- 持续验证:不盲目信任,主动验证
- 做好回滚准备:提前规划最坏情况
- 密切监控:跟踪关键指标
- 及时沟通:保持利益相关方知情
Output Checklist
输出检查清单
- Migration phases defined (5-7 phases)
- Dual write implementation
- Backfill script ready
- Validation strategy
- Feature flags configured
- Rollback plans per phase
- Risk mitigation strategies
- Communication plan
- Monitoring dashboard
- Testing checklist
- 定义迁移阶段(5-7个阶段)
- 实现双写机制
- 准备好回填脚本
- 制定验证策略
- 配置功能标志
- 为每个阶段制定回滚计划
- 风险缓解策略
- 沟通计划
- 监控仪表盘
- 测试检查清单