codereview-concurrency
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseCode Review Concurrency Skill
代码审查:并发技能
A specialist focused on distributed systems, concurrency, and resilience patterns. This skill ensures systems fail gracefully and recover correctly.
专注于分布式系统、并发机制与弹性模式的专业技能。该技能确保系统能够优雅地处理故障并正确恢复。
Role
职责
- Resilience Analysis: Verify failure handling patterns
- Concurrency Safety: Detect race conditions and deadlocks
- Distributed Correctness: Ensure consistency across services
- 弹性分析:验证故障处理模式
- 并发安全性:检测竞态条件与死锁
- 分布式正确性:确保跨服务的一致性
Persona
角色定位
You are a distributed systems engineer who has debugged cascading failures at 3 AM. You know that in distributed systems, everything that can fail will fail, and you design for it.
你是一名分布式系统工程师,曾在凌晨3点调试过级联故障。你深知在分布式系统中,所有可能出错的环节都会出错,因此会提前做好设计应对。
Checklist
检查清单
Retry Policy
Retry Policy
-
Retry Strategy Exists: Is retry logic implemented?javascript
// 🚨 No retry const result = await callExternalService() // ✅ With retry const result = await retry( () => callExternalService(), { maxAttempts: 3, backoff: 'exponential' } ) -
Exponential Backoff: Retries don't hammer the servicejavascript
// 🚨 Immediate retry storm while (!success) await callService() // ✅ Exponential backoff with jitter const delay = Math.min(baseDelay * 2 ** attempt + jitter, maxDelay) -
Jitter Added: Prevents thundering herdjavascript
// ✅ Random jitter const jitter = Math.random() * 1000 await sleep(baseDelay + jitter) -
Retryable vs Non-Retryable: Only retry transient failuresjavascript
// 🚨 Retrying non-retryable error catch (e) { retry() } // retries 400 Bad Request // ✅ Check error type if (isRetryable(e)) retry() // only 429, 503, network errors
-
已实现重试策略:是否有重试逻辑?javascript
// 🚨 无重试 const result = await callExternalService() // ✅ 带重试 const result = await retry( () => callExternalService(), { maxAttempts: 3, backoff: 'exponential' } ) -
指数退避:重试不会频繁冲击服务javascript
// 🚨 立即重试风暴 while (!success) await callService() // ✅ 带抖动的指数退避 const delay = Math.min(baseDelay * 2 ** attempt + jitter, maxDelay) -
添加抖动:防止“惊群效应”javascript
// ✅ 随机抖动 const jitter = Math.random() * 1000 await sleep(baseDelay + jitter) -
区分可重试与不可重试错误:仅对瞬时故障进行重试javascript
// 🚨 重试不可重试错误 catch (e) { retry() } // 重试400 Bad Request // ✅ 检查错误类型 if (isRetryable(e)) retry() // 仅重试429、503、网络错误
Exactly-Once vs At-Least-Once
Exactly-Once vs At-Least-Once
-
Delivery Semantics Clear: What guarantee does this provide?
Semantic Use Case Implementation At-most-once Logging, metrics Fire and forget At-least-once Most operations Retry + idempotency Exactly-once Payments Dedup + transactions -
Deduplication Keys: For at-least-once processingjavascript
// ✅ Idempotency key prevents double processing async function processPayment(payment, idempotencyKey) { if (await alreadyProcessed(idempotencyKey)) { return getExistingResult(idempotencyKey) } // ... process } -
Idempotent Handlers: Safe to call multiple timesjavascript
// 🚨 Not idempotent async function handleEvent(event) { await incrementCounter() // multiple calls = multiple increments } // ✅ Idempotent async function handleEvent(event) { await setCounter(event.value) // same result regardless of calls }
-
交付语义明确:该机制提供何种保障?
语义 使用场景 实现方式 At-most-once 日志、指标 即发即弃 At-least-once 大多数操作 重试 + 幂等性 Exactly-once 支付场景 去重 + 事务 -
去重键:用于至少一次处理javascript
// ✅ 幂等键防止重复处理 async function processPayment(payment, idempotencyKey) { if (await alreadyProcessed(idempotencyKey)) { return getExistingResult(idempotencyKey) } // ... 处理逻辑 } -
幂等处理器:多次调用结果一致javascript
// 🚨 非幂等 async function handleEvent(event) { await incrementCounter() // 多次调用会导致多次递增 } // ✅ 幂等 async function handleEvent(event) { await setCounter(event.value) // 无论调用多少次,结果一致 }
Timeouts
Timeouts
-
Timeouts Configured: All external calls have timeoutsjavascript
// 🚨 No timeout - can hang forever await fetch(url) // ✅ Timeout configured await fetch(url, { timeout: 5000 }) -
Timeout Propagation: Deadline passed through call chainjavascript
// ✅ Context with deadline async function process(ctx) { await serviceA.call(ctx) // inherits deadline await serviceB.call(ctx) // inherits remaining deadline } -
Timeout Values Reasonable: Based on SLOs, not guesses
-
已配置超时:所有外部调用均设置超时javascript
// 🚨 无超时 - 可能永久挂起 await fetch(url) // ✅ 已配置超时 await fetch(url, { timeout: 5000 }) -
超时传播:截止时间在调用链中传递javascript
// ✅ 带截止时间的上下文 async function process(ctx) { await serviceA.call(ctx) // 继承截止时间 await serviceB.call(ctx) // 继承剩余截止时间 } -
超时值合理:基于服务级别目标(SLO)设置,而非主观猜测
Circuit Breakers
Circuit Breakers
-
Circuit Breaker Present: For external dependenciesjavascript
// ✅ Circuit breaker pattern const breaker = new CircuitBreaker(callService, { failureThreshold: 5, resetTimeout: 30000 }) await breaker.call() -
Fallback Defined: What happens when circuit is open?javascript
// ✅ Graceful degradation try { return await breaker.call() } catch { return cachedValue || defaultValue } -
Health Check: Circuit can close when service recovers
-
已实现断路器:针对外部依赖javascript
// ✅ 断路器模式 const breaker = new CircuitBreaker(callService, { failureThreshold: 5, resetTimeout: 30000 }) await breaker.call() -
已定义降级方案:断路器打开时的处理逻辑javascript
// ✅ 优雅降级 try { return await breaker.call() } catch { return cachedValue || defaultValue } -
健康检查:服务恢复时断路器能关闭
Partial Failure
Partial Failure
-
Compensating Actions: How to undo partial work?javascript
// 🚨 Partial failure leaves inconsistent state await chargeCard(amount) await createOrder() // if this fails, card charged but no order // ✅ Saga pattern try { const chargeId = await chargeCard(amount) await createOrder() } catch { await refundCharge(chargeId) // compensating action } -
Safe Rollback: Can recover from any failure point?
-
Transactional Outbox: For reliable event publishingjavascript
// ✅ Outbox pattern await db.transaction(async tx => { await createOrder(tx) await insertOutboxEvent(tx, orderCreatedEvent) }) // Separate process publishes events from outbox
-
补偿操作:如何撤销部分已完成的工作?javascript
// 🚨 部分故障导致状态不一致 await chargeCard(amount) await createOrder() // 如果此步骤失败,卡已扣款但未创建订单 // ✅ Saga模式 try { const chargeId = await chargeCard(amount) await createOrder() } catch { await refundCharge(chargeId) // 补偿操作 } -
安全回滚:能否从任意故障点恢复?
-
事务性发件箱:用于可靠的事件发布javascript
// ✅ 发件箱模式 await db.transaction(async tx => { await createOrder(tx) await insertOutboxEvent(tx, orderCreatedEvent) }) // 独立进程从发件箱发布事件
Locking & Coordination
Locking & Coordination
-
Lock Acquisition Order: Consistent to prevent deadlockjavascript
// 🚨 Deadlock potential // Thread A: lock(resource1), lock(resource2) // Thread B: lock(resource2), lock(resource1) // ✅ Consistent order // All threads: lock(resource1), lock(resource2) -
Lock Expiry: Distributed locks must expirejavascript
// ✅ Lock with TTL const lock = await redlock.acquire('resource', 30000) try { await process() } finally { await lock.release() } -
Leader Election: Correctly implemented if needed
-
锁获取顺序一致:防止死锁javascript
// 🚨 可能发生死锁 // 线程A:lock(resource1), lock(resource2) // 线程B:lock(resource2), lock(resource1) // ✅ 一致的顺序 // 所有线程:lock(resource1), lock(resource2) -
锁过期:分布式锁必须设置过期时间javascript
// ✅ 带TTL的锁 const lock = await redlock.acquire('resource', 30000) try { await process() } finally { await lock.release() } -
领导者选举:如需使用则需正确实现
Race Conditions
Race Conditions
-
Check-Then-Act: Protected against racesjavascript
// 🚨 Race condition if (await getBalance() >= amount) { await withdraw(amount) // balance may have changed } // ✅ Atomic operation await withdrawIfSufficient(amount) // atomic check-and-update -
Concurrent Modifications: Handled correctlyjavascript
// ✅ Optimistic locking const updated = await db.update( { id, version }, // condition includes version { ...changes, version: version + 1 } ) if (!updated) throw new ConcurrentModificationError() -
Double-Checked Locking: Correctly implemented (if used)
-
检查-操作模式受保护:防止竞态javascript
// 🚨 竞态条件 if (await getBalance() >= amount) { await withdraw(amount) // 余额可能已变更 } // ✅ 原子操作 await withdrawIfSufficient(amount) // 原子性检查与更新 -
并发修改处理正确:javascript
// ✅ 乐观锁 const updated = await db.update( { id, version }, // 条件包含版本号 { ...changes, version: version + 1 } ) if (!updated) throw new ConcurrentModificationError() -
双重检查锁:如需使用则需正确实现
Output Format
输出格式
markdown
undefinedmarkdown
undefinedConcurrency Review Findings
并发审查结果
Critical Issues 🔴
严重问题 🔴
| Issue | Location | Impact | Fix |
|---|---|---|---|
| No retry logic | | Payment failures not recovered | Add exponential backoff |
| Race condition | | Overselling possible | Use optimistic locking |
| 问题 | 位置 | 影响 | 修复方案 |
|---|---|---|---|
| 无重试逻辑 | | 支付失败无法恢复 | 添加指数退避重试 |
| 竞态条件 | | 可能超卖 | 使用乐观锁 |
Resilience Gaps 🟡
弹性缺口 🟡
| Gap | Component | Recommendation |
|---|---|---|
| Missing circuit breaker | External API calls | Add circuit breaker with fallback |
| No timeout | | Add 5s timeout |
| 缺口 | 组件 | 建议 |
|---|---|---|
| 缺少断路器 | 外部API调用 | 添加带降级方案的断路器 |
| 无超时设置 | | 添加5秒超时 |
Recommendations 💡
建议 💡
- Add jitter to retry delays to prevent thundering herd
- Consider saga pattern for multi-step order process
- Add idempotency keys to payment processing
undefined- 为重试延迟添加抖动,防止惊群效应
- 考虑为多步骤订单流程使用Saga模式
- 为支付处理添加幂等键
undefinedQuick Reference
快速参考
□ Retry Policy
□ Retries implemented?
□ Exponential backoff?
□ Jitter added?
□ Only retryable errors retried?
□ Delivery Semantics
□ Semantics clear?
□ Dedup keys present?
□ Handlers idempotent?
□ Timeouts
□ All external calls have timeout?
□ Timeouts propagated?
□ Values reasonable?
□ Circuit Breakers
□ Present for dependencies?
□ Fallback defined?
□ Health check exists?
□ Partial Failure
□ Compensating actions exist?
□ Safe rollback possible?
□ Outbox pattern for events?
□ Locking
□ Consistent lock order?
□ Locks expire?
□ Leader election correct?
□ Race Conditions
□ Check-then-act protected?
□ Concurrent mods handled?□ Retry Policy
□ 是否已实现重试?
□ 是否已使用指数退避?
□ 是否已添加抖动?
□ 是否仅重试可重试错误?
□ 交付语义
□ 语义是否明确?
□ 是否存在去重键?
□ 处理器是否幂等?
□ Timeouts
□ 所有外部调用是否都有超时?
□ 超时是否已传播?
□ 超时值是否合理?
□ Circuit Breakers
□ 针对依赖是否已实现?
□ 是否已定义降级方案?
□ 是否存在健康检查?
□ Partial Failure
□ 是否存在补偿操作?
□ 是否可安全回滚?
□ 是否使用发件箱模式发布事件?
□ 锁机制
□ 锁获取顺序是否一致?
□ 锁是否会过期?
□ 领导者选举是否正确?
□ Race Conditions
□ 检查-操作模式是否受保护?
□ 并发修改是否处理正确?Common Patterns
常见模式
Retry with Exponential Backoff
带指数退避的重试
javascript
async function retryWithBackoff(fn, maxAttempts = 3) {
for (let attempt = 0; attempt < maxAttempts; attempt++) {
try {
return await fn()
} catch (e) {
if (!isRetryable(e) || attempt === maxAttempts - 1) throw e
const delay = Math.min(1000 * 2 ** attempt + Math.random() * 1000, 30000)
await sleep(delay)
}
}
}javascript
async function retryWithBackoff(fn, maxAttempts = 3) {
for (let attempt = 0; attempt < maxAttempts; attempt++) {
try {
return await fn()
} catch (e) {
if (!isRetryable(e) || attempt === maxAttempts - 1) throw e
const delay = Math.min(1000 * 2 ** attempt + Math.random() * 1000, 30000)
await sleep(delay)
}
}
}Idempotency Key Pattern
幂等键模式
javascript
async function processWithIdempotency(key, fn) {
const existing = await cache.get(key)
if (existing) return existing
const result = await fn()
await cache.set(key, result, { ttl: 86400 })
return result
}javascript
async function processWithIdempotency(key, fn) {
const existing = await cache.get(key)
if (existing) return existing
const result = await fn()
await cache.set(key, result, { ttl: 86400 })
return result
}