data-replication-setup
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseData Replication Setup
数据库复制设置
Overview
概述
Configure database replication for disaster recovery, load distribution, and high availability. Covers master-slave, multi-master replication, and monitoring strategies.
配置数据库复制以实现灾难恢复、负载分发和高可用性。涵盖主从复制、多主复制以及监控策略。
When to Use
适用场景
- High availability setup
- Disaster recovery planning
- Read replica configuration
- Multi-region replication
- Replication monitoring and maintenance
- Failover automation
- Cross-region backup strategies
- 高可用性架构搭建
- 灾难恢复规划
- 只读副本配置
- 多区域复制
- 复制监控与维护
- 故障转移自动化
- 跨区域备份策略
PostgreSQL Replication
PostgreSQL 复制
Master-Slave (Primary-Standby) Setup
主从(主备)架构设置
PostgreSQL - Configure Primary Server:
sql
-- On primary server: postgresql.conf
-- wal_level = replica
-- max_wal_senders = 10
-- wal_keep_size = 1GB
-- Create replication user
CREATE ROLE replication_user WITH REPLICATION ENCRYPTED PASSWORD 'secure_password';
-- Allow replication connections: pg_hba.conf
-- host replication replication_user standby_ip/32 md5
-- Enable WAL archiving for continuous backup
-- archive_mode = on
-- archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'PostgreSQL - Set Up Standby Server:
bash
undefinedPostgreSQL - 配置主服务器:
sql
-- On primary server: postgresql.conf
-- wal_level = replica
-- max_wal_senders = 10
-- wal_keep_size = 1GB
-- Create replication user
CREATE ROLE replication_user WITH REPLICATION ENCRYPTED PASSWORD 'secure_password';
-- Allow replication connections: pg_hba.conf
-- host replication replication_user standby_ip/32 md5
-- Enable WAL archiving for continuous backup
-- archive_mode = on
-- archive_command = 'test! -f /archive/%f && cp %p /archive/%f'PostgreSQL - 配置备服务器:
bash
undefinedOn standby server
On standby server
1. Stop PostgreSQL if running
1. Stop PostgreSQL if running
sudo systemctl stop postgresql
sudo systemctl stop postgresql
2. Take base backup from primary
2. Take base backup from primary
pg_basebackup -h primary_ip -D /var/lib/postgresql/14/main
-U replication_user -v -P -W
-U replication_user -v -P -W
pg_basebackup -h primary_ip -D /var/lib/postgresql/14/main
-U replication_user -v -P -W
-U replication_user -v -P -W
3. Create standby.signal file
3. Create standby.signal file
touch /var/lib/postgresql/14/main/standby.signal
touch /var/lib/postgresql/14/main/standby.signal
4. Configure recovery: recovery.conf
4. Configure recovery: recovery.conf
primary_conninfo = 'host=primary_ip user=replication_user password=password'
primary_conninfo = 'host=primary_ip user=replication_user password=password'
5. Start PostgreSQL
5. Start PostgreSQL
sudo systemctl start postgresql
**Monitor Replication Status:**
```sql
-- On primary: check connected standbys
SELECT pid, usename, application_name, client_addr, state
FROM pg_stat_replication;
-- On primary: check replication lag
SELECT slot_name, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;
-- On standby: check recovery status
SELECT pg_is_wal_replay_paused();
SELECT extract(EPOCH FROM (now() - pg_last_xact_replay_timestamp())) as replication_lag_seconds;sudo systemctl start postgresql
**监控复制状态:**
```sql
-- On primary: check connected standbys
SELECT pid, usename, application_name, client_addr, state
FROM pg_stat_replication;
-- On primary: check replication lag
SELECT slot_name, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;
-- On standby: check recovery status
SELECT pg_is_wal_replay_paused();
SELECT extract(EPOCH FROM (now() - pg_last_xact_replay_timestamp())) as replication_lag_seconds;Logical Replication
逻辑复制
PostgreSQL - Logical Replication Setup:
sql
-- On publisher (primary)
CREATE PUBLICATION users_publication FOR TABLE users, orders;
-- Create replication slot
SELECT * FROM pg_create_logical_replication_slot('users_slot', 'pgoutput');
-- On subscriber (standby)
CREATE SUBSCRIPTION users_subscription
CONNECTION 'host=publisher_ip dbname=mydb user=repuser password=pwd'
PUBLICATION users_publication
WITH (copy_data = true);
-- Check subscription status
SELECT subname, subenabled, subconninfo
FROM pg_subscription;
-- Monitor replication status
SELECT slot_name, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';PostgreSQL - 逻辑复制设置:
sql
-- On publisher (primary)
CREATE PUBLICATION users_publication FOR TABLE users, orders;
-- Create replication slot
SELECT * FROM pg_create_logical_replication_slot('users_slot', 'pgoutput');
-- On subscriber (standby)
CREATE SUBSCRIPTION users_subscription
CONNECTION 'host=publisher_ip dbname=mydb user=repuser password=pwd'
PUBLICATION users_publication
WITH (copy_data = true);
-- Check subscription status
SELECT subname, subenabled, subconninfo
FROM pg_subscription;
-- Monitor replication status
SELECT slot_name, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';MySQL Replication
MySQL 复制
Master-Slave Setup
主从架构设置
MySQL - Configure Master Server:
sql
-- In MySQL config (my.cnf / my.ini)
-- [mysqld]
-- server-id = 1
-- log-bin = mysql-bin
-- binlog-format = ROW
-- Create replication user
CREATE USER 'replication'@'%' IDENTIFIED BY 'replication_password';
GRANT REPLICATION SLAVE ON *.* TO 'replication'@'%';
FLUSH PRIVILEGES;
-- Get binary log position
SHOW MASTER STATUS;
-- File: mysql-bin.000001
-- Position: 154MySQL - Configure Slave Server:
sql
-- In MySQL config (my.cnf / my.ini)
-- [mysqld]
-- server-id = 2
-- relay-log = mysql-relay-bin
-- binlog-format = ROW
-- Configure replication
CHANGE MASTER TO
MASTER_HOST = '192.168.1.100',
MASTER_USER = 'replication',
MASTER_PASSWORD = 'replication_password',
MASTER_LOG_FILE = 'mysql-bin.000001',
MASTER_LOG_POS = 154;
-- Start replication
START SLAVE;
-- Check slave status
SHOW SLAVE STATUS\G
-- Should show: Slave_IO_Running: Yes, Slave_SQL_Running: YesMonitor MySQL Replication:
sql
-- Check slave replication status
SHOW SLAVE STATUS\G
-- Check for replication errors
SHOW SLAVE STATUS\G
-- Look at Last_Error field
-- Stop and resume replication
STOP SLAVE;
-- Fix any issues...
START SLAVE;
-- Monitor replication lag
SHOW SLAVE STATUS\G
-- Check: Seconds_Behind_MasterMySQL - 配置主服务器:
sql
-- In MySQL config (my.cnf / my.ini)
-- [mysqld]
-- server-id = 1
-- log-bin = mysql-bin
-- binlog-format = ROW
-- Create replication user
CREATE USER 'replication'@'%' IDENTIFIED BY 'replication_password';
GRANT REPLICATION SLAVE ON *.* TO 'replication'@'%';
FLUSH PRIVILEGES;
-- Get binary log position
SHOW MASTER STATUS;
-- File: mysql-bin.000001
-- Position: 154MySQL - 配置从服务器:
sql
-- In MySQL config (my.cnf / my.ini)
-- [mysqld]
-- server-id = 2
-- relay-log = mysql-relay-bin
-- binlog-format = ROW
-- Configure replication
CHANGE MASTER TO
MASTER_HOST = '192.168.1.100',
MASTER_USER = 'replication',
MASTER_PASSWORD = 'replication_password',
MASTER_LOG_FILE = 'mysql-bin.000001',
MASTER_LOG_POS = 154;
-- Start replication
START SLAVE;
-- Check slave status
SHOW SLAVE STATUS\G
-- Should show: Slave_IO_Running: Yes, Slave_SQL_Running: Yes监控MySQL复制:
sql
-- Check slave replication status
SHOW SLAVE STATUS\G
-- Check for replication errors
SHOW SLAVE STATUS\G
-- Look at Last_Error field
-- Stop and resume replication
STOP SLAVE;
-- Fix any issues...
START SLAVE;
-- Monitor replication lag
SHOW SLAVE STATUS\G
-- Check: Seconds_Behind_MasterMulti-Master Replication
多主复制
MySQL - Circular Replication:
sql
-- Server 1 (Master 1)
-- [mysqld]
-- server-id = 1
-- log-bin = mysql-bin
-- auto_increment_increment = 2
-- auto_increment_offset = 1
CHANGE MASTER TO
MASTER_HOST = '192.168.1.101',
MASTER_USER = 'replication',
MASTER_PASSWORD = 'password',
MASTER_LOG_FILE = 'mysql-bin.000001',
MASTER_LOG_POS = 154;
START SLAVE;
-- Server 2 (Master 2)
-- [mysqld]
-- server-id = 2
-- log-bin = mysql-bin
-- auto_increment_increment = 2
-- auto_increment_offset = 2
CHANGE MASTER TO
MASTER_HOST = '192.168.1.100',
MASTER_USER = 'replication',
MASTER_PASSWORD = 'password',
MASTER_LOG_FILE = 'mysql-bin.000001',
MASTER_LOG_POS = 154;
START SLAVE;MySQL - 环形复制:
sql
-- Server 1 (Master 1)
-- [mysqld]
-- server-id = 1
-- log-bin = mysql-bin
-- auto_increment_increment = 2
-- auto_increment_offset = 1
CHANGE MASTER TO
MASTER_HOST = '192.168.1.101',
MASTER_USER = 'replication',
MASTER_PASSWORD = 'password',
MASTER_LOG_FILE = 'mysql-bin.000001',
MASTER_LOG_POS = 154;
START SLAVE;
-- Server 2 (Master 2)
-- [mysqld]
-- server-id = 2
-- log-bin = mysql-bin
-- auto_increment_increment = 2
-- auto_increment_offset = 2
CHANGE MASTER TO
MASTER_HOST = '192.168.1.100',
MASTER_USER = 'replication',
MASTER_PASSWORD = 'password',
MASTER_LOG_FILE = 'mysql-bin.000001',
MASTER_LOG_POS = 154;
START SLAVE;Replication Monitoring
复制监控
PostgreSQL - Replication Health Check:
sql
-- Create monitoring function
CREATE OR REPLACE FUNCTION check_replication_health()
RETURNS TABLE (
slot_name name,
restart_lsn pg_lsn,
confirmed_flush_lsn pg_lsn,
lag_bytes bigint,
status text
) AS $$
BEGIN
RETURN QUERY
SELECT
rs.slot_name,
rs.restart_lsn,
rs.confirmed_flush_lsn,
(pg_wal_lsn_diff(pg_current_wal_lsn(), rs.confirmed_flush_lsn))::bigint,
CASE
WHEN pg_wal_lsn_diff(pg_current_wal_lsn(), rs.confirmed_flush_lsn) < 1048576 THEN 'HEALTHY'
WHEN pg_wal_lsn_diff(pg_current_wal_lsn(), rs.confirmed_flush_lsn) < 10485760 THEN 'WARNING'
ELSE 'CRITICAL'
END
FROM pg_replication_slots rs
WHERE slot_type = 'physical';
END;
$$ LANGUAGE plpgsql;
SELECT * FROM check_replication_health();MySQL - Replication Lag Monitoring:
sql
-- Monitor replication lag across multiple slaves
CREATE TABLE replication_monitoring (
slave_host VARCHAR(50),
slave_port INT,
master_log_file VARCHAR(50),
read_master_log_pos BIGINT,
relay_log_file VARCHAR(50),
relay_log_pos BIGINT,
seconds_behind_master INT,
checked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Insert monitoring data
INSERT INTO replication_monitoring
SELECT
@@hostname,
@@port,
Master_Log_File,
Read_Master_Log_Pos,
Relay_Log_File,
Relay_Log_Pos,
Seconds_Behind_Master,
CURRENT_TIMESTAMP
FROM INFORMATION_SCHEMA.TABLES
LIMIT 1; -- Use SHOW SLAVE STATUS valuesPostgreSQL - 复制健康检查:
sql
-- Create monitoring function
CREATE OR REPLACE FUNCTION check_replication_health()
RETURNS TABLE (
slot_name name,
restart_lsn pg_lsn,
confirmed_flush_lsn pg_lsn,
lag_bytes bigint,
status text
) AS $$
BEGIN
RETURN QUERY
SELECT
rs.slot_name,
rs.restart_lsn,
rs.confirmed_flush_lsn,
(pg_wal_lsn_diff(pg_current_wal_lsn(), rs.confirmed_flush_lsn))::bigint,
CASE
WHEN pg_wal_lsn_diff(pg_current_wal_lsn(), rs.confirmed_flush_lsn) < 1048576 THEN 'HEALTHY'
WHEN pg_wal_lsn_diff(pg_current_wal_lsn(), rs.confirmed_flush_lsn) < 10485760 THEN 'WARNING'
ELSE 'CRITICAL'
END
FROM pg_replication_slots rs
WHERE slot_type = 'physical';
END;
$$ LANGUAGE plpgsql;
SELECT * FROM check_replication_health();MySQL - 复制延迟监控:
sql
-- Monitor replication lag across multiple slaves
CREATE TABLE replication_monitoring (
slave_host VARCHAR(50),
slave_port INT,
master_log_file VARCHAR(50),
read_master_log_pos BIGINT,
relay_log_file VARCHAR(50),
relay_log_pos BIGINT,
seconds_behind_master INT,
checked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Insert monitoring data
INSERT INTO replication_monitoring
SELECT
@@hostname,
@@port,
Master_Log_File,
Read_Master_Log_Pos,
Relay_Log_File,
Relay_Log_Pos,
Seconds_Behind_Master,
CURRENT_TIMESTAMP
FROM INFORMATION_SCHEMA.TABLES
LIMIT 1; -- Use SHOW SLAVE STATUS valuesReplication Failover
复制故障转移
PostgreSQL - Promote Standby to Primary:
bash
undefinedPostgreSQL - 将备服务器提升为主服务器:
bash
undefinedOn standby server
On standby server
Promote standby to primary
Promote standby to primary
pg_ctl promote -D /var/lib/postgresql/14/main
pg_ctl promote -D /var/lib/postgresql/14/main
Or use SQL command
Or use SQL command
SELECT pg_promote();
**MySQL - Promote Slave to Master:**
```sql
-- On slave
-- 1. Stop slave and wait for replication to complete
STOP SLAVE;
SHOW SLAVE STATUS\G -- Verify Slave_IO_Running and Slave_SQL_Running are OFF
-- 2. Promote to master
RESET SLAVE ALL;
-- 3. Reset binary log
RESET MASTER;
-- 4. Old master becomes new slave
-- Configure old master as slave of new master
CHANGE MASTER TO
MASTER_HOST = 'new_master_ip',
MASTER_USER = 'replication',
MASTER_PASSWORD = 'password',
MASTER_AUTO_POSITION = 1;
START SLAVE;SELECT pg_promote();
**MySQL - 将从服务器提升为主服务器:**
```sql
-- On slave
-- 1. Stop slave and wait for replication to complete
STOP SLAVE;
SHOW SLAVE STATUS\G -- Verify Slave_IO_Running and Slave_SQL_Running are OFF
-- 2. Promote to master
RESET SLAVE ALL;
-- 3. Reset binary log
RESET MASTER;
-- 4. Old master becomes new slave
-- Configure old master as slave of new master
CHANGE MASTER TO
MASTER_HOST = 'new_master_ip',
MASTER_USER = 'replication',
MASTER_PASSWORD = 'password',
MASTER_AUTO_POSITION = 1;
START SLAVE;Replication Configuration Best Practices
复制配置最佳实践
PostgreSQL - postgresql.conf settings:
conf
undefinedPostgreSQL - postgresql.conf 设置:
conf
undefinedWAL configuration
WAL configuration
wal_level = replica
max_wal_senders = 10
wal_keep_size = 1GB
wal_receiver_timeout = 60s
wal_receiver_status_interval = 10s
wal_level = replica
max_wal_senders = 10
wal_keep_size = 1GB
wal_receiver_timeout = 60s
wal_receiver_status_interval = 10s
Hot standby
Hot standby
hot_standby = on
max_standby_streaming_delay = 3min
hot_standby = on
max_standby_streaming_delay = 3min
Replication timeout
Replication timeout
wal_sender_timeout = 300s
**MySQL - my.cnf settings:**
```conf
[mysqld]wal_sender_timeout = 300s
**MySQL - my.cnf 设置:**
```conf
[mysqld]Replication configuration
Replication configuration
server-id = 1
log-bin = mysql-bin
binlog_format = ROW
binlog-row-image = FULL
server-id = 1
log-bin = mysql-bin
binlog_format = ROW
binlog-row-image = FULL
Slave configuration
Slave configuration
relay-log = mysql-relay-bin
relay-log-index = mysql-relay-bin.index
relay-log-info-repository = TABLE
relay-log = mysql-relay-bin
relay-log-index = mysql-relay-bin.index
relay-log-info-repository = TABLE
Safety
Safety
log_replica_updates = ON
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK
undefinedlog_replica_updates = ON
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK
undefinedTroubleshooting Replication
复制故障排查
PostgreSQL - Replication Issues:
sql
-- Check for missing files
SELECT slot_name, restart_lsn, wal_status
FROM pg_replication_slots;
-- Restart replication slot
SELECT pg_replication_slot_advance('slot_name', pg_current_wal_lsn());
-- Synchronize replication
SYNCHRONOUS_COMMIT = remote_apply;MySQL - Common Issues:
sql
-- Check duplicate entry error
SHOW SLAVE STATUS\G
-- Look for Last_SQL_Error
-- Skip error
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;
-- Reset replication
RESET SLAVE;
RESET MASTER;PostgreSQL - 复制问题:
sql
-- Check for missing files
SELECT slot_name, restart_lsn, wal_status
FROM pg_replication_slots;
-- Restart replication slot
SELECT pg_replication_slot_advance('slot_name', pg_current_wal_lsn());
-- Synchronize replication
SYNCHRONOUS_COMMIT = remote_apply;MySQL - 常见问题:
sql
-- Check duplicate entry error
SHOW SLAVE STATUS\G
-- Look for Last_SQL_Error
-- Skip error
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;
-- Reset replication
RESET SLAVE;
RESET MASTER;Replication Verification
复制验证
- Test failover in non-production first
- Verify data consistency after replication
- Monitor replication lag continuously
- Document all replication configurations
- Test backup/recovery procedures
- Schedule regular replication audits
- 先在非生产环境测试故障转移
- 验证复制后的数据一致性
- 持续监控复制延迟
- 记录所有复制配置
- 测试备份/恢复流程
- 定期安排复制审计