designing-application-transactions

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Designing Application Transactions

应用事务设计

Guides application developers through the design principles and implementation patterns needed to build correct, performant, and resilient applications on CockroachDB. Covers the full spectrum from transaction scoping and retry logic to connection pooling and observability.

Complement to SQL skills: For SQL syntax, schema design, and query optimization, see cockroachdb-sql. For benchmarking transaction formulations under contention, see benchmarking-transaction-patterns.

指导应用开发者掌握在CockroachDB上构建正确、高性能且具备韧性的应用所需的设计原则与实现模式，涵盖从事务范围界定、重试逻辑到连接池配置与可观测性的全流程内容。

SQL技能补充： 关于SQL语法、schema设计和查询优化，请参阅cockroachdb-sql。关于在竞争环境下基准测试事务表述，请参阅benchmarking-transaction-patterns。

When to Use This Skill

何时使用此技能

Designing transaction boundaries for a CockroachDB application
Implementing client-side retry logic with exponential backoff
Deciding between implicit and explicit transactions
Choosing between optimistic and pessimistic concurrency control
Replacing read-modify-write loops with atomic SQL
Configuring connection pools (HikariCP, pgbouncer, etc.)
Implementing keyset pagination instead of OFFSET/LIMIT
Using follower reads for reporting and analytics queries
Separating business orchestration from database transactions
Using prepared statements for performance and security
Selecting explicit column projections instead of SELECT *
Testing application behavior under concurrency
Monitoring application-level database performance

为CockroachDB应用设计事务边界
实现带指数退避的客户端重试逻辑
决定使用隐式还是显式事务
在乐观与悲观并发控制之间做出选择
用原子SQL替代读取-修改-写入循环
配置连接池（HikariCP、pgbouncer等）
实现键集分页而非OFFSET/LIMIT
将follower reads用于报表与分析查询
分离业务编排与数据库事务
使用预处理语句提升性能与安全性
选择显式列投影而非SELECT *
在并发环境下测试应用行为
监控应用层数据库性能

Prerequisites

前置条件

Familiarity with CockroachDB's SERIALIZABLE isolation level
Understanding of ACID transaction semantics
Access to application source code for transaction design changes
SQL connection to a CockroachDB cluster (for testing and validation)

熟悉CockroachDB的SERIALIZABLE隔离级别
理解ACID事务语义
有权限访问应用源代码以修改事务设计
拥有与CockroachDB集群的SQL连接（用于测试与验证）

Steps

步骤

1. Keep Transactions Short-Lived

1. 保持事务短生命周期

Transactions must include only the minimal set of SQL operations needed for one atomic state change. Do not place remote API calls, service-to-service requests, loops, expensive computation, or artificial waits inside a CockroachDB transaction.

Long-lived transactions increase intent lifetime, contention, and retry probability in CockroachDB's distributed, optimistic-concurrency architecture.

Anti-pattern:

java

@Transactional
public void createOrder(Order order) {
    orderRepository.save(order);
    paymentGateway.charge(order); // external call inside TX
}

Correct approach — split the logic:

java

@Transactional
public void createOrderRecord(Order order) {
    orderRepository.save(order);
}

// Outside the transaction
paymentGateway.charge(order);

Why it matters:

Active intents block concurrent writers, reducing cluster throughput
Competing transactions are more likely to encounter
```
40001
```
retry errors
External work inside a retried transaction may run twice, causing duplicate side effects
Long transactions tie up connections and memory, reducing concurrency

事务应仅包含实现一次原子状态变更所需的最少SQL操作。不要将远程API调用、服务间请求、循环、耗时计算或人为等待放入CockroachDB事务中。

在CockroachDB的分布式乐观并发架构中，长生命周期事务会增加意向锁持有时间、竞争概率以及重试可能性。

反模式：

java

@Transactional
public void createOrder(Order order) {
    orderRepository.save(order);
    paymentGateway.charge(order); // external call inside TX
}

正确做法——拆分逻辑：

java

@Transactional
public void createOrderRecord(Order order) {
    orderRepository.save(order);
}

// Outside the transaction
paymentGateway.charge(order);

重要性：

活跃意向锁会阻塞并发写入，降低集群吞吐量
竞争事务更可能遇到
```
40001
```
重试错误
重试事务中的外部工作可能执行两次，导致重复副作用
长事务会占用连接与内存，降低并发能力

2. Use Implicit Transactions for Single Statements

2. 单语句使用隐式事务

CockroachDB automatically wraps each individual SQL statement as a transaction in autocommit mode. For single

INSERT

UPDATE

DELETE

, or

SELECT

statements, do not wrap in explicit

BEGIN

COMMIT

Preferred:

sql

INSERT INTO orders (id, status)
VALUES (gen_random_uuid(), 'open');

Avoid:

sql

BEGIN;
INSERT INTO orders (id, status)
VALUES (gen_random_uuid(), 'open');
COMMIT;

Benefits: Simpler code paths, lower latency (fewer round trips), less resource usage, and fewer retry concerns since single-statement transactions are easier for CockroachDB to retry automatically.

在自动提交模式下，CockroachDB会自动将每个单独的SQL语句包装为事务。对于单个

INSERT

、

UPDATE

、

DELETE

或

SELECT

语句，不要用显式

BEGIN

COMMIT

包裹。

推荐写法：

sql

INSERT INTO orders (id, status)
VALUES (gen_random_uuid(), 'open');

避免写法：

sql

BEGIN;
INSERT INTO orders (id, status)
VALUES (gen_random_uuid(), 'open');
COMMIT;

优势： 代码路径更简单、延迟更低（减少往返次数）、资源占用更少，且单语句事务更便于CockroachDB自动重试，减少重试相关问题。

3. Use Explicit Transactions for Grouped Statements and Handle Retries

3. 分组语句使用显式事务并处理重试

When multiple SQL operations must succeed or fail together, use explicit transactions with

BEGIN

COMMIT

. Because CockroachDB defaults to SERIALIZABLE isolation, transaction retries are a normal part of correct execution under contention.

sql

BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

Client-side retry loop with exponential backoff:

python

import random
import time

def execute_with_retry(conn, txn_logic):
    backoff = 0.1
    while True:
        try:
            with conn.transaction() as txn:
                txn_logic(txn)
            return
        except SerializationFailure:
            time.sleep(backoff + random.uniform(0, 0.1))
            backoff = min(backoff * 2, 2.0)

Advanced retry with the cockroach_restart savepoint protocol:

sql

BEGIN;
SAVEPOINT cockroach_restart;
-- transactional work
RELEASE SAVEPOINT cockroach_restart;
COMMIT;

WARNING: Generic savepoints do NOT work as retry mechanisms. CockroachDB aborts the entire transaction on a

serialization failure. Using

ROLLBACK TO SAVEPOINT

on a regular savepoint cannot recover -- the transaction remains in an aborted state. Only the special

SAVEPOINT cockroach_restart

protocol (where the client catches the error, rolls back to the savepoint, and re-executes the work) is supported. For most applications, a full-transaction retry loop is simpler and recommended.

SQLSTATE guidance:

Code	Meaning	Action
`40001`	Serialization / retryable	Retry the entire unit of work with backoff and jitter
`40003`	Ambiguous result / indeterminate commit	Do not blindly replay non-idempotent work
`08xx` / `57xx`	Network or server transient issues	Retry carefully, account for ambiguous commits
`23xxx`	Constraint and application errors	Usually should not be retried

当多个SQL操作必须同时成功或失败时，使用带

BEGIN

COMMIT

的显式事务。由于CockroachDB默认使用SERIALIZABLE隔离级别，事务重试是竞争环境下正确执行的正常环节。

sql

BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

带指数退避的客户端重试循环：

python

import random
import time

def execute_with_retry(conn, txn_logic):
    backoff = 0.1
    while True:
        try:
            with conn.transaction() as txn:
                txn_logic(txn)
            return
        except SerializationFailure:
            time.sleep(backoff + random.uniform(0, 0.1))
            backoff = min(backoff * 2, 2.0)

使用cockroach_restart保存点协议的高级重试：

sql

BEGIN;
SAVEPOINT cockroach_restart;
-- transactional work
RELEASE SAVEPOINT cockroach_restart;
COMMIT;

警告：通用保存点不能作为重试机制。 当出现

序列化失败时，CockroachDB会中止整个事务。对常规保存点使用

ROLLBACK TO SAVEPOINT

无法恢复——事务仍处于中止状态。仅支持特殊的

SAVEPOINT cockroach_restart

协议（客户端捕获错误，回滚到保存点，重新执行工作）。对于大多数应用，全事务重试循环更简单且推荐使用。

SQLSTATE指南：

代码	含义	操作
`40001`	序列化 / 可重试	带退避与抖动重试整个工作单元
`40003`	结果模糊 / 提交不确定	不要盲目重放非幂等工作
`08xx` / `57xx`	网络或服务器临时问题	谨慎重试，考虑提交不确定的情况
`23xxx`	约束与应用错误	通常不应重试

4. Mark Read-Only Transactions Where Applicable

4. 适用时标记只读事务

Read-only transactions perform retrieval only and make no writes. Marking them as read-only allows CockroachDB to avoid unnecessary write intents, reduce contention with writers, and enable follower or bounded-staleness reads.

sql

BEGIN;
SET TRANSACTION READ ONLY;
SELECT * FROM customers WHERE region = 'US-East';
COMMIT;

只读事务仅执行检索操作，不进行写入。将其标记为只读可让CockroachDB避免不必要的写入意向锁，减少与写入者的竞争，并支持follower reads或有界过期读取。

sql

BEGIN;
SET TRANSACTION READ ONLY;
SELECT * FROM customers WHERE region = 'US-East';
COMMIT;

5. Push Invariants into SQL — Avoid Read-Modify-Write Loops

5. 将不变式推入SQL——避免读取-修改-写入循环

Do not fetch state into application code, modify it in memory, and write it back. Prefer atomic SQL, constraints, guarded UPDATEs, UPSERT, INSERT ... ON CONFLICT, and CTE-based mutations.

Anti-pattern:

python

balance = db.fetch("SELECT balance FROM accounts WHERE id = 123")
balance += 100
db.execute("UPDATE accounts SET balance = %s WHERE id = 123", (balance,))

Preferred atomic SQL:

sql

UPDATE accounts
SET balance = balance + 100
WHERE id = 123;

Guarded write with invariant enforcement:

sql

UPDATE customer_daily_limits
SET used_total = used_total + $2
WHERE customer_id = $1
  AND day = current_date
  AND used_total + $2 <= daily_limit;

Atomic CTE pattern:

sql

WITH limit_row AS (
  SELECT customer_id, day
  FROM customer_daily_limits
  WHERE customer_id = $1 AND day = current_date
  FOR UPDATE
), spend AS (
  UPDATE customer_daily_limits AS l
  SET remaining_limit = l.remaining_limit - $2,
      used_total = l.used_total + $2
  FROM limit_row
  WHERE l.customer_id = limit_row.customer_id
    AND l.day = limit_row.day
    AND l.remaining_limit >= $2
  RETURNING l.customer_id, l.day
), ins AS (
  INSERT INTO transfers (customer_id, amount, direction, created_at)
  SELECT $1, $2, 'debit', now()
  FROM spend
  RETURNING id AS transfer_id
)
SELECT transfer_id FROM ins;

Key approaches:

Use atomic updates:
```
UPDATE ... SET col = col + 1
```
Use version or timestamp checks in WHERE clauses for optimistic concurrency
Enforce business rules with
```
UNIQUE
```
,
```
CHECK
```
,
```
NOT NULL
```
, and
```
FOREIGN KEY
```
constraints
Use
```
UPSERT
```
or
```
INSERT ... ON CONFLICT
```
instead of read-before-write existence checks
Use CTEs to keep multi-step logic atomic

不要将状态读取到应用代码中，在内存中修改后再写回数据库。优先使用原子SQL、约束、有条件的UPDATE、UPSERT、INSERT ... ON CONFLICT以及基于CTE的变更操作。

反模式：

python

balance = db.fetch("SELECT balance FROM accounts WHERE id = 123")
balance += 100
db.execute("UPDATE accounts SET balance = %s WHERE id = 123", (balance,))

推荐的原子SQL写法：

sql

UPDATE accounts
SET balance = balance + 100
WHERE id = 123;

带不变式约束的有条件写入：

sql

UPDATE customer_daily_limits
SET used_total = used_total + $2
WHERE customer_id = $1
  AND day = current_date
  AND used_total + $2 <= daily_limit;

原子CTE模式：

sql

WITH limit_row AS (
  SELECT customer_id, day
  FROM customer_daily_limits
  WHERE customer_id = $1 AND day = current_date
  FOR UPDATE
), spend AS (
  UPDATE customer_daily_limits AS l
  SET remaining_limit = l.remaining_limit - $2,
      used_total = l.used_total + $2
  FROM limit_row
  WHERE l.customer_id = limit_row.customer_id
    AND l.day = limit_row.day
    AND l.remaining_limit >= $2
  RETURNING l.customer_id, l.day
), ins AS (
  INSERT INTO transfers (customer_id, amount, direction, created_at)
  SELECT $1, $2, 'debit', now()
  FROM spend
  RETURNING id AS transfer_id
)
SELECT transfer_id FROM ins;

核心方法：

使用原子更新：
```
UPDATE ... SET col = col + 1
```
在WHERE子句中使用版本或时间戳检查实现乐观并发
使用
```
UNIQUE
```
、
```
CHECK
```
、
```
NOT NULL
```
和
```
FOREIGN KEY
```
约束执行业务规则
使用
```
UPSERT
```
或
```
INSERT ... ON CONFLICT
```
替代写入前的存在性检查
使用CTE保持多步骤逻辑的原子性

6. Use SELECT ... FOR UPDATE Selectively

6. 选择性使用SELECT ... FOR UPDATE

CockroachDB defaults to optimistic concurrency, which works well for most workloads. For hot rows or contention-heavy read-before-write paths,

SELECT ... FOR UPDATE

reduces retry churn by making contenders wait instead of race.

sql

BEGIN;
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

Use when:

The same rows are updated frequently by many concurrent transactions
Optimistic retries are causing thrashing
Consistency before write is required (inventory, financial transfers)

Counterintuitive contention insight: Adding more application pods or threads targeting the same hot rows does NOT increase throughput -- it decreases it. With N concurrent writers on the same row, only 1 can commit per round; the other N-1 are aborted with

and must retry. More concurrency on hot data means more wasted work and lower TPS. Solutions: use

SELECT ... FOR UPDATE

to serialize access, use atomic

UPDATE SET balance = balance + amount

to eliminate the read-modify-write cycle, or distribute writes across multiple rows.

Trade-off: Overusing pessimistic locks can introduce waiting chains or deadlocks. Reserve for hot paths and contention-heavy workloads.

CockroachDB默认使用乐观并发，适用于大多数工作负载。对于热点行或竞争激烈的读取-修改-写入路径，

SELECT ... FOR UPDATE

通过让竞争者等待而非竞争来减少重试次数。

sql

BEGIN;
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

适用场景：

同一行被多个并发事务频繁更新
乐观重试导致频繁失败
写入前需要一致性（库存、财务转账）

违反直觉的竞争洞察： 增加针对同一热点行的应用Pod或线程数量并不会提高吞吐量——反而会降低。当N个并发写入者针对同一行时，每轮只有1个能提交；其他N-1个会因

被中止并必须重试。热点数据上的并发越高，无效工作越多，TPS越低。解决方案：使用

SELECT ... FOR UPDATE

序列化访问，使用原子

UPDATE SET balance = balance + amount

消除读取-修改-写入循环，或将写入分散到多个行。

权衡： 过度使用悲观锁会引入等待链或死锁。仅保留用于热点路径和竞争激烈的工作负载。

7. Use Set-Based Operations Over Row-by-Row Loops

7. 使用基于集合的操作而非逐行循环

CockroachDB performs best with set-oriented SQL rather than many small client-driven statements. This reduces round trips, shortens contention windows, and improves throughput.

Row-by-row anti-pattern:

python

for row in rows:
    db.execute(
        "UPDATE accounts SET balance = balance + 10 WHERE id = %s",
        (row.id,)
    )

Set-based preferred:

sql

UPDATE accounts
SET balance = balance + 10
WHERE region = 'US-East';

Batch INSERT:

sql

INSERT INTO trades (id, symbol, price)
VALUES
  (1, 'AAPL', 180),
  (2, 'GOOG', 125),
  (3, 'AMZN', 140);

Batch UPDATE with UNNEST:

sql

WITH incoming AS (
  SELECT *
  FROM UNNEST(
    ARRAY['u1', 'u2', 'u3']::STRING[],
    ARRAY['active', 'inactive', 'active']::STRING[]
  ) AS t(id, new_status)
)
UPDATE users AS u
SET status = incoming.new_status,
    updated_at = now()
FROM incoming
WHERE u.id = incoming.id;

Maintenance batching with LIMIT:

sql

DELETE FROM sessions
WHERE expires_at < now()
LIMIT 10000;

JDBC batching (Java): Use

addBatch

executeBatch

instead of per-row

executeUpdate

. This sends all rows in a single network round trip rather than N individual round trips, eliminating idle time that can account for ~50% of transaction latency in chatty workloads.

Declarative TTL:

sql

ALTER TABLE events
SET (ttl_expiration_expression = 'created_at + INTERVAL ''7 DAY''');

CockroachDB在处理面向集合的SQL时性能最佳，而非大量客户端驱动的小语句。这减少了往返次数，缩短了竞争窗口，提高了吞吐量。

逐行反模式：

python

for row in rows:
    db.execute(
        "UPDATE accounts SET balance = balance + 10 WHERE id = %s",
        (row.id,)
    )

推荐的基于集合写法：

sql

UPDATE accounts
SET balance = balance + 10
WHERE region = 'US-East';

批量INSERT：

sql

INSERT INTO trades (id, symbol, price)
VALUES
  (1, 'AAPL', 180),
  (2, 'GOOG', 125),
  (3, 'AMZN', 140);

使用UNNEST的批量UPDATE：

sql

WITH incoming AS (
  SELECT *
  FROM UNNEST(
    ARRAY['u1', 'u2', 'u3']::STRING[],
    ARRAY['active', 'inactive', 'active']::STRING[]
  ) AS t(id, new_status)
)
UPDATE users AS u
SET status = incoming.new_status,
    updated_at = now()
FROM incoming
WHERE u.id = incoming.id;

带LIMIT的维护批量操作：

sql

DELETE FROM sessions
WHERE expires_at < now()
LIMIT 10000;

JDBC批量操作（Java）： 使用

addBatch

executeBatch

而非逐行

executeUpdate

。这会在单次网络往返中发送所有行，而非N次单独往返，消除了在频繁交互工作负载中可能占事务延迟~50%的空闲时间。

声明式TTL：

sql

ALTER TABLE events
SET (ttl_expiration_expression = 'created_at + INTERVAL ''7 DAY''');

8. Use Follower Reads for Non-Critical Queries

8. 将follower reads用于非关键查询

Many analytics, dashboard, and display-oriented queries do not need the absolute latest value. CockroachDB supports follower reads and bounded-staleness reads from follower replicas with lower latency.

Basic follower read:

sql

SELECT * FROM orders
AS OF SYSTEM TIME '-5s';

Bounded staleness:

sql

SELECT * FROM inventory
AS OF SYSTEM TIME with_max_staleness(INTERVAL '10s');

Read-write split pattern for heavy reads: When a workflow reads a large payload (e.g., KYC JSON document) and then updates a status field, split it into three phases: (1) read outside the transaction with

AS OF SYSTEM TIME

for a conflict-free snapshot, (2) process in the application layer, (3) start a short write-only transaction. This avoids holding write intents during the heavy read.

Use when: Dashboards, analytics, ETL, display-only reads, or large-payload workflows where the read and write can be separated.

Avoid when: The workflow requires the latest transactional state for a subsequent write decision.

许多分析、仪表板和展示类查询不需要绝对最新的值。CockroachDB支持从跟随者副本进行follower reads和有界过期读取，延迟更低。

基础follower read：

sql

SELECT * FROM orders
AS OF SYSTEM TIME '-5s';

有界过期：

sql

SELECT * FROM inventory
AS OF SYSTEM TIME with_max_staleness(INTERVAL '10s');

针对大量读取的读写分离模式： 当工作流读取大负载（如KYC JSON文档）然后更新状态字段时，将其拆分为三个阶段：(1) 在事务外使用

AS OF SYSTEM TIME

读取无冲突快照，(2) 在应用层处理，(3) 启动一个短的仅写入事务。这避免了在大量读取期间持有写入意向锁。

适用场景： 仪表板、分析、ETL、仅展示读取，或读取与写入可分离的大负载工作流。

避免场景： 工作流需要最新事务状态作为后续写入决策的前提。

9. Use Keyset Pagination Instead of OFFSET/LIMIT

9. 使用键集分页而非OFFSET/LIMIT

As the OFFSET grows, CockroachDB must scan and discard more rows. Keyset pagination uses the last row's ordered key values to jump directly to the next page.

OFFSET/LIMIT (inefficient at depth):

sql

SELECT id, order_date, customer_id
FROM orders
ORDER BY id
LIMIT 100 OFFSET 5000;

Keyset pagination (preferred):

sql

SELECT id, order_date, customer_id
FROM orders
WHERE id > 5000
ORDER BY id
LIMIT 100;

Multi-column keyset:

sql

SELECT id, created_at, customer_id
FROM orders
WHERE (created_at, id) > ('2025-01-01 00:00:00', 5000)
ORDER BY created_at, id
LIMIT 100;

Trade-off: Keyset pagination is ideal for next/previous navigation but not for arbitrary "jump to page 73" UX.

随着OFFSET增大，CockroachDB必须扫描并丢弃更多行。键集分页使用最后一行的有序键值直接跳转到下一页。

OFFSET/LIMIT（深度分页低效）：

sql

SELECT id, order_date, customer_id
FROM orders
ORDER BY id
LIMIT 100 OFFSET 5000;

键集分页（推荐）：

sql

SELECT id, order_date, customer_id
FROM orders
WHERE id > 5000
ORDER BY id
LIMIT 100;

多列键集：

sql

SELECT id, created_at, customer_id
FROM orders
WHERE (created_at, id) > ('2025-01-01 00:00:00', 5000)
ORDER BY created_at, id
LIMIT 100;

权衡： 键集分页非常适合下一页/上一页导航，但不适用于任意“跳转到第73页”的用户体验。

10. Use Prepared Statements for Performance and Security

10. 使用预处理语句提升性能与安全性

Prepared statements reuse query structure and bind new values, improving performance through plan reuse and protecting against SQL injection.

Unsafe dynamic string concatenation:

python

query = f"SELECT * FROM users WHERE username = '{user_input}'"
cursor.execute(query)

Prepared / parameterized execution:

python

cursor.execute("SELECT * FROM users WHERE username = %s;", (user_input,))

Plan reuse:

sql

PREPARE get_balance AS
SELECT balance FROM accounts WHERE id = $1;

EXECUTE get_balance(1001);
EXECUTE get_balance(2002);

预处理语句复用查询结构并绑定新值，通过计划复用提升性能，并防止SQL注入。

不安全的动态字符串拼接：

python

query = f"SELECT * FROM users WHERE username = '{user_input}'"
cursor.execute(query)

预处理/参数化执行：

python

cursor.execute("SELECT * FROM users WHERE username = %s;", (user_input,))

计划复用：

sql

PREPARE get_balance AS
SELECT balance FROM accounts WHERE id = $1;

EXECUTE get_balance(1001);
EXECUTE get_balance(2002);

11. Use Column Projections Instead of SELECT *

11. 使用列投影而非SELECT *

Select only the columns you need.

SELECT *

increases network payload, memory usage, CPU cost, and prevents narrower index-only scans.

sql

-- Avoid
SELECT * FROM users WHERE id = 101;

-- Preferred
SELECT name, email FROM users WHERE id = 101;

Schema evolution impact: If a later schema change adds

profile_picture BYTEA

, queries using

SELECT *

automatically pull that extra data. Explicit projections avoid this hidden performance regression.

仅选择需要的列。

SELECT *

会增加网络负载、内存占用、CPU成本，并阻止更窄的索引仅扫描。

sql

-- Avoid
SELECT * FROM users WHERE id = 101;

-- Preferred
SELECT name, email FROM users WHERE id = 101;

schema演进影响： 如果后续schema变更添加

profile_picture BYTEA

，使用

SELECT *

的查询会自动获取额外数据。显式投影可避免这种隐藏的性能退化。

12. Design Keys and Indexes to Distribute Load

12. 设计键与索引以分散负载

Sequential or monotonically increasing primary keys create write hotspots. Keys and indexes should distribute reads and writes across ranges evenly.

Hotspot anti-pattern:

sql

CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  customer_id UUID,
  region STRING
);

Randomized key:

sql

CREATE TABLE orders (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  customer_id UUID,
  region STRING
);

Hash-sharded index:

sql

CREATE INDEX orders_by_id_hash
ON orders (id)
USING HASH SHARDED WITH BUCKET_COUNT = 16;

Composite key for natural distribution:

sql

CREATE TABLE sales (
  region_id STRING,
  order_id UUID DEFAULT gen_random_uuid(),
  PRIMARY KEY (region_id, order_id)
);

Enforce explicit PKs cluster-wide:

sql

SET CLUSTER SETTING sql.defaults.require_explicit_primary_keys.enabled = true;

连续或单调递增的主键会造成写入热点。键与索引应将读写均匀分布到各个范围。

热点反模式：

sql

CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  customer_id UUID,
  region STRING
);

随机化键：

sql

CREATE TABLE orders (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  customer_id UUID,
  region STRING
);

哈希分片索引：

sql

CREATE INDEX orders_by_id_hash
ON orders (id)
USING HASH SHARDED WITH BUCKET_COUNT = 16;

自然分布的复合键：

sql

CREATE TABLE sales (
  region_id STRING,
  order_id UUID DEFAULT gen_random_uuid(),
  PRIMARY KEY (region_id, order_id)
);

集群范围内强制显式主键：

sql

SET CLUSTER SETTING sql.defaults.require_explicit_primary_keys.enabled = true;

13. Configure Connection Pooling

13. 配置连接池

Opening new database connections is expensive. Pooling reuses live connections to improve performance and prevent overload.

HikariCP guidance:

yaml

maximumPoolSize: (vCPUs * 4) / number_of_pool_instances
minimumIdle: equal to maximumPoolSize
maxLifetime: 30 min (add jitter +/- 5 min)
idleTimeout: 5-10 min typical
keepaliveTime: slightly shorter than infrastructure timeout (~5 min)
connectionTimeout: 10-30 s typical
autoCommit: true unless using explicit transactions only

Example stable configuration:

yaml

maximum-pool-size: 12
minimum-idle: 12
max-lifetime: 1800000
idle-timeout: 600000
keepalive-time: 300000
connection-timeout: 10000
auto-commit: true
pool-name: ingestionPool

打开新数据库连接成本很高。连接池复用活跃连接以提升性能并防止过载。

HikariCP指南：

yaml

maximumPoolSize: (vCPUs * 4) / number_of_pool_instances
minimumIdle: equal to maximumPoolSize
maxLifetime: 30 min (add jitter +/- 5 min)
idleTimeout: 5-10 min typical
keepaliveTime: slightly shorter than infrastructure timeout (~5 min)
connectionTimeout: 10-30 s typical
autoCommit: true unless using explicit transactions only

示例稳定配置：

yaml

maximum-pool-size: 12
minimum-idle: 12
max-lifetime: 1800000
idle-timeout: 600000
keepalive-time: 300000
connection-timeout: 10000
auto-commit: true
pool-name: ingestionPool

14. Separate Business Logic from Database Logic

14. 分离业务逻辑与数据库逻辑

CockroachDB should manage ACID reads, writes, and schema-level integrity. The application layer should orchestrate workflows, external services, queues, and long-running work.

Inside the transaction:

Reads, writes, constraints, short guarded state transitions

Outside the transaction:

HTTP calls, RPC/service calls, email, payment providers, queue publishing

Asynchronous workflow pattern:

python

def handle_order(order):
    db.execute("INSERT INTO orders (id, status) VALUES (%s, %s)", (order.id, 'PENDING'))
    publish_event('process_order', {'order_id': order.id})

CockroachDB应管理ACID读写和schema级完整性。应用层应编排工作流、外部服务、队列和长时间运行的工作。

事务内：

读取、写入、约束、短时间的有条件状态转换

事务外：

HTTP调用、RPC/服务调用、邮件、支付提供商、队列发布

异步工作流模式：

python

def handle_order(order):
    db.execute("INSERT INTO orders (id, status) VALUES (%s, %s)", (order.id, 'PENDING'))
    publish_event('process_order', {'order_id': order.id})

15. Respect the 16MB Transaction Payload Limit

15. 遵守16MB事务负载限制

CockroachDB has a practical limit of ~16MB per transaction payload. This limit applies to the TOTAL data written in a single transaction, not just individual rows.

Two ways to hit the limit:

One large row (e.g., a 15MB JSON document)
Many moderate rows in one transaction (e.g., 25 INSERTs of 500KB each = 12.5MB)

Guidelines:

Keep individual rows under 1MB
Keep total transaction payload under 4MB
Limit transactions to <10 statements
Chunk large documents into 64-256KB pieces
Store blobs >1MB in object storage (S3/GCS) with a database reference
Break multi-statement transactions into smaller batches (commit every 5-10 statements)

Exceeding the limit causes
split failed while applying backpressure to Put
errors: large Raft proposals block consensus, range splits stall, and the system applies backpressure.

CockroachDB的单个事务负载实际限制约为16MB。此限制适用于单个事务中写入的总数据，而非仅单行数据。

触发限制的两种情况：

单行数据过大（如15MB的JSON文档）
单个事务中包含多个中等大小的行（如25次插入500KB的数据，总计12.5MB）

指南：

保持单行数据在1MB以下
保持事务总负载在4MB以下
限制事务包含的语句数<10条
将大文档拆分为64-256KB的片段
将大于1MB的blob存储在对象存储（S3/GCS）中，仅在数据库中保存引用
将多语句事务拆分为更小的批次（每5-10条语句提交一次）

超过限制会导致
split failed while applying backpressure to Put
错误：大型Raft提案会阻塞共识，范围拆分停滞，系统会施加背压。

16. Use Session Guardrails

16. 使用会话防护

Set session-level guardrails to catch runaway queries and missing WHERE clauses during development and testing:

sql

SET transaction_rows_read_err = 10000;
SET transaction_rows_written_err = 1000;

These cause transactions that exceed the thresholds to fail with an explicit error rather than silently consuming cluster resources.

设置会话级防护以在开发和测试期间捕获失控查询和缺失WHERE子句的情况：

sql

SET transaction_rows_read_err = 10000;
SET transaction_rows_written_err = 1000;

这些设置会使超过阈值的事务失败并返回明确错误，而非静默消耗集群资源。

17. Test and Optimize Under Concurrency

17. 在并发环境下测试与优化

Single-user correctness is not sufficient. Test with realistic concurrency to surface retries, hotspots, contention, and workload-specific bottlenecks.

Quick start:

bash

cockroach workload init bank 'postgresql://root@localhost:26257?sslmode=disable'
cockroach workload run bank --concurrency=64 --duration=10m

See monitoring-and-concurrency-testing for detailed contention queries, validation checklists, and Prometheus metrics.

单用户正确性并不足够。使用真实并发场景测试以发现重试、热点、竞争和特定工作负载的瓶颈。

快速开始：

bash

cockroach workload init bank 'postgresql://root@localhost:26257?sslmode=disable'
cockroach workload run bank --concurrency=64 --duration=10m

有关详细的竞争查询、验证清单和Prometheus指标，请参阅monitoring-and-concurrency-testing。

18. Monitor for Performance and Contention

18. 监控性能与竞争

Actively monitor query latency, contention, retries, and data distribution using

EXPLAIN ANALYZE

crdb_internal.transaction_contention_events

, DB Console SQL Activity, and Key Visualizer.

See monitoring-and-concurrency-testing for live contention queries, Prometheus metrics, and external monitoring integration.

使用

EXPLAIN ANALYZE

、

crdb_internal.transaction_contention_events

、DB Console SQL Activity和Key Visualizer主动监控查询延迟、竞争、重试和数据分布。

有关实时竞争查询、Prometheus指标和外部监控集成，请参阅monitoring-and-concurrency-testing。

Decision Guide

决策指南

Scenario	Recommended Pattern
Single SQL statement	Implicit transaction (autocommit)
Multiple statements, all-or-nothing	Explicit transaction with retry loop
Read current state before write on hot rows	`SELECT ... FOR UPDATE`
Historical, display, or reporting read	`AS OF SYSTEM TIME` / follower reads
Batch of records in memory	`UNNEST` / `VALUES` / batch SQL
Multi-step business rule in one operation	Single-statement CTE

场景	推荐模式
单个SQL语句	隐式事务（自动提交）
多个语句，需原子执行	带重试循环的显式事务
热点行写入前读取当前状态	`SELECT ... FOR UPDATE`
历史数据、展示或报表读取	`AS OF SYSTEM TIME` / follower reads
内存中的批量记录	`UNNEST` / `VALUES` / 批量SQL
单操作中的多步骤业务规则	单语句CTE

Safety Considerations

安全注意事项

Always implement retry logic for
```
40001
```
serialization errors
Make operations idempotent so retries do not cause duplicate side effects (use
```
INSERT ... ON CONFLICT DO NOTHING
```
)
Do not use stale snapshot reads as authoritative preconditions for writes
Do not run
```
EXPLAIN ANALYZE
```
on production queries that modify data
Be cautious adding indexes to high-traffic tables during peak hours

始终为
```
40001
```
序列化错误实现重试逻辑
确保操作是幂等的，以便重试不会导致重复副作用（使用
```
INSERT ... ON CONFLICT DO NOTHING
```
）
不要将过时的快照读取作为写入的权威前提
不要在生产环境中对修改数据的查询运行
```
EXPLAIN ANALYZE
```
在高峰时段为高流量表添加索引时要谨慎

References

参考资料

CockroachDB Transactions Documentation
Advanced Client-Side Transaction Retries
SQL Performance Best Practices
Follower Reads and Bounded Staleness
Optimize Statement Performance
Row-Level TTL
Schema Design and Indexes
SQL Injection Prevention
Architecture: Transaction Layer
JPA Best Practices: Explicit and Implicit Transactions
Deep Dive into Transaction Retry Failures
Comparing Multi-Statement vs Single-Statement Transactions
Set-Based Operations with CockroachDB
Bulk Rewrites with the CockroachDB JDBC Driver
What is a Database Hotspot?
CockroachDB Transaction Demo
CockroachDB Best Practices & Anti-Patterns Demo -- 10 runnable Java demos covering retries, batching, PK hotspots, guardrails, chunking, and multi-region
CockroachDB JDBC Wrapper -- automatic retry library for Java/JDBC applications

CockroachDB Transactions Documentation
Advanced Client-Side Transaction Retries
SQL Performance Best Practices
Follower Reads and Bounded Staleness
Optimize Statement Performance
Row-Level TTL
Schema Design and Indexes
SQL Injection Prevention
Architecture: Transaction Layer
JPA Best Practices: Explicit and Implicit Transactions
Deep Dive into Transaction Retry Failures
Comparing Multi-Statement vs Single-Statement Transactions
Set-Based Operations with CockroachDB
Bulk Rewrites with the CockroachDB JDBC Driver
What is a Database Hotspot?
CockroachDB Transaction Demo
CockroachDB Best Practices & Anti-Patterns Demo -- 10个可运行的Java演示，涵盖重试、批量处理、主键热点、防护、分块和多区域
CockroachDB JDBC Wrapper -- 用于Java/JDBC应用的自动重试库