disaster-recovery

Disaster Recovery

Implement disaster recovery strategies and procedures.

DR Metrics

yaml

recovery_metrics:
  RTO:
    name: Recovery Time Objective
    meaning:
      - Maximum acceptable downtime
      - How long it may take to restore service
  RPO:
    name: Recovery Point Objective
    meaning:
      - Maximum acceptable data loss, measured in time
      - How much recent data can be lost
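
As a worked illustration of the two metrics, the sketch below computes the achieved RPO and RTO for one incident from three Unix epoch timestamps. The function names and the sample timestamps are hypothetical; real values would come from your backup and monitoring systems.

```shell
# Hypothetical helpers: all arguments are Unix epoch seconds.

# Achieved RPO: time between the last good backup and the failure,
# i.e. the window of data that was lost.
achieved_rpo() {
  local last_backup="$1" failure="$2"
  echo $(( failure - last_backup ))
}

# Achieved RTO: time between the failure and service restoration.
achieved_rto() {
  local failure="$1" restored="$2"
  echo $(( restored - failure ))
}

# Succeeds (exit status 0) when a measured value is within its objective.
meets_objective() {
  local measured="$1" objective="$2"
  [ "$measured" -le "$objective" ]
}

# Sample incident (made-up numbers): backup at t=1000, failure at t=1900,
# service restored at t=4600.
rpo=$(achieved_rpo 1000 1900)
rto=$(achieved_rto 1900 4600)
echo "RPO=${rpo}s RTO=${rto}s"   # prints "RPO=900s RTO=2700s"
```

Comparing these measured values against the objectives with `meets_objective` shows whether an incident or a test stayed within target.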

DR Strategies

Strategy	RTO	RPO	Cost
Backup & Restore	Hours	Hours	$
Pilot Light	Minutes-Hours	Minutes	$$
Warm Standby	Minutes	Seconds	$$$
Multi-Site Active	Near-zero	Near-zero	$$$$
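
One way to read the table is as a lookup from targets to the cheapest strategy that can meet them. The sketch below does this under loudly assumed numeric cut-offs: the table only says "hours", "minutes", and "seconds", so the thresholds in seconds and the function name `choose_strategy` are illustrative assumptions, not part of the original guidance.

```shell
# Pick the cheapest strategy whose typical RTO and RPO fit the targets.
# Assumed cut-offs (not from the table): Backup & Restore needs targets of
# at least 3600 s each; Pilot Light needs RTO >= 600 s and RPO >= 60 s;
# Warm Standby needs RTO >= 60 s and RPO >= 1 s; anything tighter falls
# through to Multi-Site Active.
choose_strategy() {
  local rto_target="$1" rpo_target="$2"   # both in seconds
  if [ "$rto_target" -ge 3600 ] && [ "$rpo_target" -ge 3600 ]; then
    echo "Backup & Restore"
  elif [ "$rto_target" -ge 600 ] && [ "$rpo_target" -ge 60 ]; then
    echo "Pilot Light"
  elif [ "$rto_target" -ge 60 ] && [ "$rpo_target" -ge 1 ]; then
    echo "Warm Standby"
  else
    echo "Multi-Site Active"
  fi
}

choose_strategy 14400 86400   # 4 h RTO, 24 h RPO
choose_strategy 1800 300      # 30 min RTO, 5 min RPO
choose_strategy 300 5         # 5 min RTO, 5 s RPO
choose_strategy 10 0          # near-zero targets
```

Checking the cheapest option first mirrors the cost column: a more expensive strategy is chosen only when the cheaper ones cannot meet both targets.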

AWS Multi-Region

bash

# Cross-region RDS read replica.
# For a cross-region replica the source must be given as an ARN;
# 123456789012 is a placeholder account ID.
aws rds create-db-instance-read-replica \
  --db-instance-identifier dr-replica \
  --source-db-instance-identifier arn:aws:rds:us-east-1:123456789012:db:prod-db \
  --source-region us-east-1 \
  --region us-west-2

# S3 cross-region replication.
# Versioning must already be enabled on both the source and destination buckets.
aws s3api put-bucket-replication \
  --bucket source-bucket \
  --replication-configuration file://replication.json
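
The `put-bucket-replication` command reads its rules from `replication.json`, which the text does not show. The sketch below writes a minimal version of that file; the account ID, the role name `s3-replication-role`, the rule ID, and the bucket name `destination-bucket` are placeholders assumed for illustration.

```shell
# Write a minimal replication.json for `aws s3api put-bucket-replication`.
# Placeholders (assumptions): account ID 123456789012, role name
# s3-replication-role, rule ID dr-replicate-all, bucket destination-bucket.
cat > replication.json <<'EOF'
{
  "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
  "Rules": [
    {
      "ID": "dr-replicate-all",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": { "Prefix": "" },
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "Destination": { "Bucket": "arn:aws:s3:::destination-bucket" }
    }
  ]
}
EOF
echo "wrote replication.json"
```

The empty prefix makes the rule apply to every object in the source bucket. Versioning can be turned on beforehand with `aws s3api put-bucket-versioning --bucket source-bucket --versioning-configuration Status=Enabled`, repeated for the destination bucket.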

DR Testing

yaml

dr_test_schedule:
  tabletop: Quarterly
  component_failover: Monthly
  full_failover: Annually

test_checklist:
  - Verify backup integrity
  - Test failover procedures
  - Validate data consistency
  - Measure actual RTO/RPO
  - Document lessons learned
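
The first checklist item, verifying backup integrity, can be approximated with a checksum comparison. This is a minimal sketch that assumes `sha256sum` (GNU coreutils or BusyBox) is available; the helper names and the sample file are hypothetical, and a checksum only proves the file is unchanged, not that it restores cleanly.

```shell
# Record a checksum next to the backup at the time it is taken.
record_checksum() {
  local backup_file="$1"
  sha256sum "$backup_file" > "$backup_file.sha256"
}

# Verify the backup against its recorded checksum.
# Exit status 0 means the file still matches.
verify_backup() {
  local backup_file="$1"
  sha256sum -c "$backup_file.sha256" >/dev/null 2>&1
}

# Demo with a throwaway file standing in for a real backup.
demo_dir=$(mktemp -d)
printf 'sample backup data\n' > "$demo_dir/db.dump"
record_checksum "$demo_dir/db.dump"
verify_backup "$demo_dir/db.dump" && echo "backup intact"

# Simulate corruption and verify again.
printf 'corruption' >> "$demo_dir/db.dump"
verify_backup "$demo_dir/db.dump" || echo "backup corrupted"
rm -rf "$demo_dir"
```

A restore into a scratch environment remains the stronger test, since it also exercises the failover and data-consistency items on the checklist.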

Best Practices

Run DR tests on a regular schedule
Automate failover where possible
Document all procedures
Update runbooks after each test
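
To illustrate what automating failover can look like, the sketch below runs a health check in a loop and triggers a failover command after a number of consecutive failures. The function name `failover_watch`, its parameters, and the health URL in the comment are hypothetical; a production setup would more likely rely on managed health checks and alarms.

```shell
# Run check_cmd up to max_checks times, interval seconds apart.
# After max_failures consecutive failures, run failover_cmd once and
# return 0. Return 1 if no failover was triggered.
failover_watch() {
  local check_cmd="$1" failover_cmd="$2"
  local max_failures="$3" interval="$4" max_checks="$5"
  local failures=0 checks=0
  while [ "$checks" -lt "$max_checks" ]; do
    checks=$(( checks + 1 ))
    if sh -c "$check_cmd" >/dev/null 2>&1; then
      failures=0                      # a healthy check resets the streak
    else
      failures=$(( failures + 1 ))
    fi
    if [ "$failures" -ge "$max_failures" ]; then
      sh -c "$failover_cmd"
      return 0
    fi
    sleep "$interval"
  done
  return 1
}

# Demo with stand-in commands: the check always fails, so the failover
# command runs after the third check.
failover_watch "false" "echo failing over to DR" 3 0 10

# Example wiring (not run here; the health URL is hypothetical):
# failover_watch "curl -fsS https://prod.example.com/health" \
#   "aws rds promote-read-replica --db-instance-identifier dr-replica --region us-west-2" \
#   3 10 1000
```

Requiring several consecutive failures keeps one transient error from triggering a failover, at the cost of adding `max_failures × interval` seconds to the achieved RTO.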
