designing-multi-region-applications
Designing Multi-Region Applications
Guides developers through selecting the right multi-region pattern for their CockroachDB application and implementing it with proper validation. Covers the decision model for choosing between regular regional tables, `REGIONAL BY ROW`, `GLOBAL` tables, and manual geo-partitioning, plus a hands-on demo framework for comparing approaches.
Complement to other skills: For transaction design patterns, see designing-application-transactions. For SQL syntax and schema design, see cockroachdb-sql.
When to Use This Skill
- Deciding how to model multi-region read/write behavior in CockroachDB
- Choosing between active-active and active-passive architectures
- Evaluating `REGIONAL BY ROW` vs manual geo-partitioning
- Understanding `GLOBAL` table behavior and trade-offs
- Designing for local reads and writes in multiple regions
- Building or presenting a multi-region demo or workshop
- Validating leaseholder placement and zone configurations
- Optimizing cross-region transaction latency
Do not use this skill when the question is only about SQL syntax, indexing, or generic schema design with no multi-region decision involved.
Prerequisites
- Understanding of CockroachDB range architecture and leaseholder concepts
- Multi-region cluster or `cockroach demo` with locality flags for testing
- Knowledge of application write patterns (single-region vs multi-region)
Pattern Selection
Step 1: Identify the Application Write Model
Ask first: is there one write home, or many?
- If the application has one primary region for read/write, start with a primary-region / regular regional-table model or a manually configured active-passive design.
- If the application needs low-latency read/write in multiple regions, evaluate manual geo-partitioning or `REGIONAL BY ROW`.
- If the table is mostly reference data that should read fast everywhere and the write path is not the main focus, consider `GLOBAL` tables.
Step 2: Choose the Pattern
A. Regular Regional Tables (Active-Passive)
Use when:
- The application has one primary region for RW
- Remote regions are secondary or read-mostly
- Simplicity matters more than region-local writes everywhere
Characteristics:
- All leaseholders stay in the active region
- Replicas in other regions provide resiliency and single-region-failure survival
- Indicative latency: ~20ms writes, ~2-5ms reads (local region)
Recommendation: Prefer the higher-level multi-region abstractions first unless the user explicitly needs manual control over partitions, voters, and lease preferences.
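A minimal sketch of the single-write-home setup using the higher-level abstractions, assuming the same three region names used elsewhere in this guide. The database name example_service_regional and the table accounts_regional are hypothetical and not part of the original material.

```sql
-- Hypothetical database and table names; region names match the demo cluster.
CREATE DATABASE IF NOT EXISTS example_service_regional;
ALTER DATABASE example_service_regional PRIMARY REGION 'NA-NE';
ALTER DATABASE example_service_regional ADD REGION 'NA-MW';
ALTER DATABASE example_service_regional ADD REGION 'NA-NW';

USE example_service_regional;

-- REGIONAL BY TABLE IN PRIMARY REGION is the default locality in a
-- multi-region database; it is spelled out here to make the intent explicit.
CREATE TABLE accounts_regional (
    account_id STRING(40) PRIMARY KEY,
    owner_id   STRING(40) NOT NULL,
    status     STRING(20) NOT NULL
) LOCALITY REGIONAL BY TABLE IN PRIMARY REGION;
```

With this layout, clients in NA-MW and NA-NW should expect cross-region latency on writes, which is the trade-off this pattern accepts in exchange for simplicity.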
B. Manual Geo-Partitioning with Region-Specific Leaseholders
Use when:
- The application is active-active
- The data model is region-keyed
- The team wants explicit operational control
- Understanding internal mechanics (partitions, voters, lease preferences) is important
Characteristics:
- Region-specific leaseholder pattern keeps writes at ~20ms and reads at ~2-5ms (indicative)
- The application must route reads and writes for a key to that key's home region
- More DDL and operational burden
- Best for teaching internals
Example DDL:
```sql
-- The region column leads the primary key so rows can be partitioned by region.
CREATE TABLE accounts_manual (
    account_id STRING(40),
    owner_id STRING(40) NOT NULL,
    status STRING(20) NOT NULL,
    region STRING(10) NOT NULL,
    CONSTRAINT accounts_manual_pkey PRIMARY KEY (region, account_id)
);

-- One list partition per region value.
ALTER INDEX accounts_manual@accounts_manual_pkey
    PARTITION BY LIST (region) (
        PARTITION na_ne VALUES IN ('NA-NE'),
        PARTITION na_mw VALUES IN ('NA-MW'),
        PARTITION na_nw VALUES IN ('NA-NW')
    );

-- Pin the na_ne partition's lease to NA-NE and spread its 5 voters 2/2/1.
ALTER PARTITION na_ne OF INDEX accounts_manual@accounts_manual_pkey
    CONFIGURE ZONE USING
        num_replicas = 5,
        num_voters = 5,
        voter_constraints = '{+region=NA-NE: 2, +region=NA-MW: 2, +region=NA-NW: 1}',
        lease_preferences = '[[+region=NA-NE]]';
```
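The example DDL configures only the na_ne partition. A sketch of the remaining two partitions follows. It assumes they mirror na_ne, with the two-voter share and the lease preference moved to each partition's own region; the exact voter split for the other regions is an assumption, not something the original specifies. The final query illustrates a region-scoped lookup, and the key value 'acct-0001' is hypothetical.

```sql
-- Assumption: na_mw and na_nw mirror the na_ne zone config.
ALTER PARTITION na_mw OF INDEX accounts_manual@accounts_manual_pkey
    CONFIGURE ZONE USING
        num_replicas = 5,
        num_voters = 5,
        voter_constraints = '{+region=NA-MW: 2, +region=NA-NE: 2, +region=NA-NW: 1}',
        lease_preferences = '[[+region=NA-MW]]';

ALTER PARTITION na_nw OF INDEX accounts_manual@accounts_manual_pkey
    CONFIGURE ZONE USING
        num_replicas = 5,
        num_voters = 5,
        voter_constraints = '{+region=NA-NW: 2, +region=NA-NE: 2, +region=NA-MW: 1}',
        lease_preferences = '[[+region=NA-NW]]';

-- Region-scoped access: the application supplies the region key itself,
-- so the read is served by the partition whose lease lives in NA-MW.
SELECT account_id, status
FROM accounts_manual
WHERE region = 'NA-MW' AND account_id = 'acct-0001';
```

Because the region is an ordinary STRING column here, nothing stops a client from writing a row under another region's key; the routing discipline stays with the application.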
C. REGIONAL BY ROW
Use when:
- The workload is active-active
- Each row naturally belongs to a region
- The team wants local RW in multiple regions without hand-managing partition zone configs
- The goal is the developer-facing multi-region abstraction
Characteristics:
- All configured regions are possible home/leaseholder regions
- Indicative latency: ~20ms writes, ~2-5ms reads (local region)
- Less manual configuration than geo-partitioning
- Default recommendation for region-affine application data
Example DDL:
```sql
CREATE DATABASE IF NOT EXISTS example_service_rbr;
ALTER DATABASE example_service_rbr PRIMARY REGION 'NA-NE';
ALTER DATABASE example_service_rbr ADD REGION 'NA-NW';
ALTER DATABASE example_service_rbr ADD REGION 'NA-MW';
ALTER DATABASE example_service_rbr SURVIVE REGION FAILURE;

USE example_service_rbr;

CREATE TABLE accounts_rbr (
    account_id STRING(40),
    owner_id STRING(40) NOT NULL,
    status STRING(20) NOT NULL,
    region crdb_internal_region
        NOT NULL
        DEFAULT gateway_region()::crdb_internal_region,
    CONSTRAINT accounts_rbr_pkey PRIMARY KEY (region, account_id)
) LOCALITY REGIONAL BY ROW AS region;
```
Local allocation pattern:
```sql
WITH candidate AS (
    SELECT id, resource_code
    FROM resource_pool
    WHERE allocated_at IS NULL
      AND region = gateway_region()::crdb_internal_region
    ORDER BY random()
    LIMIT 1
    FOR UPDATE
)
UPDATE resource_pool
SET allocated_at = now()
WHERE id = (SELECT id FROM candidate);
```
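The allocation query above reads from a resource_pool table that is not defined in this guide. One possible shape for it is sketched below. Every column type, the index, and the choice of a UUID primary key are assumptions made for illustration. The table has to be created in a multi-region database such as example_service_rbr, because the crdb_internal_region type only exists there.

```sql
-- Hypothetical definition of resource_pool; only the column names
-- id, resource_code, allocated_at, and region come from the query above.
CREATE TABLE resource_pool (
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    resource_code STRING(40) NOT NULL,
    allocated_at TIMESTAMPTZ,
    region crdb_internal_region
        NOT NULL
        DEFAULT gateway_region()::crdb_internal_region,
    CONSTRAINT resource_pool_pkey PRIMARY KEY (id),
    -- Helps locate unallocated rows; in a REGIONAL BY ROW table the
    -- index is partitioned by region along with the primary key.
    INDEX resource_pool_allocated_idx (allocated_at)
) LOCALITY REGIONAL BY ROW AS region;
```

Filtering the candidate on the gateway region is what keeps the allocation local: the row that gets locked and updated is one whose home region is the region the client is connected to.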
D. GLOBAL Tables
Use when:
- The table is global/reference-style data
- The workload is primarily about broad read locality rather than region-owned writes
Important constraint: `GLOBAL` tables optimize for fast reads everywhere. Do not position them as an "RW everywhere" pattern without verifying product-specific behavior in the official documentation.
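A minimal sketch of declaring a reference table as GLOBAL, assuming it is created inside an existing multi-region database. The table currency_codes and its columns are hypothetical.

```sql
-- Hypothetical reference table; run inside a multi-region database.
CREATE TABLE currency_codes (
    code STRING(3) PRIMARY KEY,
    name STRING(60) NOT NULL
) LOCALITY GLOBAL;

-- An existing table can be switched to the same locality instead:
-- ALTER TABLE currency_codes SET LOCALITY GLOBAL;

-- Confirm which locality was applied.
SHOW CREATE TABLE currency_codes;
```

This fits data that changes rarely and is read from every region; the write-latency cost should be checked against the official documentation before relying on it.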
E. Survival Goals
Choose the survival goal based on the trade-off between write latency and durability:
```sql
-- Survive any single zone failure (default, 3+ zones required):
ALTER DATABASE mydb SURVIVE ZONE FAILURE;
-- Survive an entire region going down (3+ regions required):
ALTER DATABASE mydb SURVIVE REGION FAILURE;
```
| Goal | Requirement | Write Latency | Data Safety |
|---|---|---|---|
| SURVIVE ZONE FAILURE | 3+ zones | Low (local consensus) | Survives 1 zone outage |
| SURVIVE REGION FAILURE | 3+ regions | Higher (cross-region consensus) | Survives 1 region outage |
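After changing the goal, it can help to confirm what the database actually has configured. A short check, assuming mydb (the placeholder name from the statements above) is already a multi-region database:

```sql
-- Regions attached to the database and which one is primary.
SHOW REGIONS FROM DATABASE mydb;

-- The survival goal currently in effect.
SHOW SURVIVAL GOAL FROM DATABASE mydb;
```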
Pattern Comparison
| Aspect | Regular Regional | Manual Geo-Partition | REGIONAL BY ROW | GLOBAL |
|---|---|---|---|---|
| Write model | Single primary region | Active-active, region-keyed | Active-active, row-affine | Write from primary region |
| Read locality | Local to primary | Local to partition | Local to row region | All regions |
| Operational burden | Low | High | Medium | Low |
| Configuration | Minimal | Explicit partitions, zones, lease prefs | Database-level abstractions | Table-level declaration |
| Best for | Simple primary-region apps | Full control over mechanics | Developer-facing multi-region | Reference data |
Live Demo Setup
For workshops and technical walkthroughs, use a 9-node local demo cluster to make multi-region locality observable.
Cluster Setup
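```bash
# Start a 9-node demo cluster: 3 regions x 3 zones.
# The --demo-locality value is a single colon-separated string, so the
# continuation lines below must start at column 0 (no leading whitespace).
cockroach demo \
  --nodes 9 \
  --no-example-database \
  --insecure \
  --demo-locality=\
region=NA-NE,zone=NA-NE-1:\
region=NA-NE,zone=NA-NE-2:\
region=NA-NE,zone=NA-NE-3:\
region=NA-MW,zone=NA-MW-1:\
region=NA-MW,zone=NA-MW-2:\
region=NA-MW,zone=NA-MW-3:\
region=NA-NW,zone=NA-NW-1:\
region=NA-NW,zone=NA-NW-2:\
region=NA-NW,zone=NA-NW-3
```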
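Before running any DDL, a quick sanity check from the demo SQL shell can confirm that the locality flags took effect. This is a sketch of what to look at rather than a required step:

```sql
-- Locality of the node serving the current session.
SHOW LOCALITY;

-- Regions and zones known to the cluster; the three demo regions
-- (NA-MW, NA-NE, NA-NW) should each list their three zones.
SHOW REGIONS FROM CLUSTER;
```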
Demo Flow
Recommended presentation order:
- Start with the manual geo-partitioning path
- Show explicit partitioning and zone configuration
- Run validation queries and confirm lease homing
- Switch to REGIONAL BY ROW
- Run RBR validations
- Compare operational surface area
Validation Queries
Manual partitioning validation:
```sql
SHOW RANGES FROM INDEX accounts_manual@accounts_manual_pkey WITH DETAILS;
```
Check that:
- All expected partition values are present
- Leaseholder locality matches the partition region
- Any mismatch is reported as FAIL, otherwise PASS
RBR validation:
```sql
SHOW RANGES FROM TABLE accounts_rbr WITH DETAILS;
```
Check that:
- Leaseholder locality coverage includes the expected regions
- There are no unexpected lease regions
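The checks above can be partly automated. The sketch below assumes that `SHOW RANGES ... WITH DETAILS` exposes a lease_holder_locality column formatted like region=NA-NE,zone=NA-NE-1; column names vary between CockroachDB versions, so verify them against your cluster's output first. The PASS/FAIL logic is an illustration, not the original demo's validation script.

```sql
-- Manual path: list partitions with their zone configs, then count
-- leaseholders per locality for the partitioned primary index.
SHOW PARTITIONS FROM TABLE accounts_manual;

SELECT lease_holder_locality, count(*) AS range_count
FROM [SHOW RANGES FROM INDEX accounts_manual@accounts_manual_pkey WITH DETAILS]
GROUP BY lease_holder_locality
ORDER BY lease_holder_locality;

-- RBR path: PASS only when leases cover all three expected regions
-- and no lease sits outside them.
WITH leases AS (
    -- Keep only the leading "region=..." tier of the locality string.
    SELECT split_part(lease_holder_locality, ',', 1) AS lease_region
    FROM [SHOW RANGES FROM TABLE accounts_rbr WITH DETAILS]
)
SELECT CASE
         WHEN count(DISTINCT lease_region) = 3
          AND bool_and(lease_region IN
                ('region=NA-NE', 'region=NA-MW', 'region=NA-NW'))
         THEN 'PASS'
         ELSE 'FAIL'
       END AS lease_region_check
FROM leases;
```

Leases can take a short while to settle after DDL or topology changes, so a FAIL immediately after setup is worth re-checking before treating it as a misconfiguration.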
Demo Talking Points
Manual path:
- Precise control over partitions, voters, replicas, and lease preferences
- More DDL and operational burden
- Best for teaching internals and understanding what the database does under the hood
RBR path:
- Keeps application intent front and center
- Less manual configuration
- Easier to explain for app teams
- Still grounded in the same topology
Cross-Region Latency Guidance
Transaction latency increases when the client is remote from the relevant leaseholder/quorum path.
| Client Location | Local RW Latency | Cross-Region RW Latency |
|---|---|---|
| Same region as leaseholder | ~10-20ms | — |
| Different region | — | ~50-150ms+ |
Guidance:
- Place latency-sensitive services close to their primary data locality
- Use follower reads for non-critical display/reporting queries
- Use multi-region table locality and zone configuration intentionally
- Do not assume "distributed" means "same latency everywhere"
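As an illustration of the follower-read guidance, the query below reads a row homed in another region from a nearby replica by accepting slightly stale data. It reuses the accounts_rbr table from the REGIONAL BY ROW example; the key value 'acct-0001' is hypothetical.

```sql
-- follower_read_timestamp() picks a timestamp old enough that a
-- non-leaseholder replica can serve the read.
SELECT account_id, status
FROM accounts_rbr AS OF SYSTEM TIME follower_read_timestamp()
WHERE region = 'NA-NW' AND account_id = 'acct-0001';
```

This suits dashboards and reports; anything that must observe the latest committed write should keep using a regular read.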
Output Expectations
A strong answer using this skill should include:
- The recommended pattern
- Why it fits the workload
- What the application must do (routing, row affinity, primary-region assumptions)
- What CockroachDB manages automatically vs manually
- Expected latency shape or locality behavior
- A warning when the user is asking for something the chosen pattern does not optimize for
Guardrails
- Do not claim that regular primary-region tables provide symmetric low-latency writes from all regions
- Do not claim that `GLOBAL` is the answer for all-region low-latency writes without supporting documentation
- When comparing manual geo-partitioning vs `REGIONAL BY ROW`, explicitly call out control vs simplicity
- When the user wants to understand internal mechanics, bias toward explaining the manual model first
- When the user wants the best default application pattern, bias toward `REGIONAL BY ROW` for region-affine data
- Keep region names and locality labels consistent across all SQL
- Do not mix manual and abstraction approaches in the same explanation unless explicitly comparing them
- Always include validation, not just DDL
Multi-Region Migration Checklist
For teams migrating from single-region PostgreSQL/Oracle to multi-region CockroachDB:
- Deploy nodes with `--locality=region=<region>,zone=<zone>`
- Set primary region: `ALTER DATABASE <db> PRIMARY REGION '<region>'`
- Add regions: `ALTER DATABASE <db> ADD REGION '<region>'` (for each)
- Set survival goal: `ALTER DATABASE <db> SURVIVE ZONE|REGION FAILURE`
- Classify tables: GLOBAL (reference data), REGIONAL BY ROW (row-affine), REGIONAL BY TABLE (default)
- Set localities: `ALTER TABLE <t> SET LOCALITY <locality>`
- Monitor leaseholder distribution in DB Console
- Test failover: kill a zone/region and verify survival goal holds
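The table-classification steps of the checklist might look like the sketch below once the database-level steps are done. The table names country_codes, orders, and audit_settings are hypothetical stand-ins for tables carried over from the source system.

```sql
-- Reference data read from every region.
ALTER TABLE country_codes SET LOCALITY GLOBAL;

-- Row-affine data; without an AS clause CockroachDB manages the
-- region column for the table.
ALTER TABLE orders SET LOCALITY REGIONAL BY ROW;

-- Everything else stays homed in the primary region (the default).
ALTER TABLE audit_settings SET LOCALITY REGIONAL BY TABLE IN PRIMARY REGION;

-- Confirm the resulting locality on a converted table.
SHOW CREATE TABLE orders;
```

Rows that existed before the conversion may not land in the region they logically belong to, so plan a backfill of the region column and verify the result on a staging cluster.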
Safety Considerations
- Multi-region configuration changes affect data placement across the cluster
- Test multi-region configurations on demo or staging clusters before production
- Validate leaseholder placement after configuration changes
- Allow time for range rebalancing after topology changes
References
- CockroachDB Multi-Region Overview
- REGIONAL BY ROW Tables
- GLOBAL Tables
- Follower Reads Documentation
- CockroachDB Transactions
- Performance Best Practices
- Cross-Regional Latency Impact on Transactions
- Query Parallelism with CockroachDB
- CockroachDB Best Practices & Anti-Patterns Demo -- Demo 10 covers multi-region patterns with runnable examples