high-concurrency-scalability
High Concurrency & Scalability
When to Use
- Choose or refactor concurrency models—threads, async/await, actors, coroutines—for target throughput and latency
- Reduce lock contention and design low-contention, lock-free, or partitioned data paths
- Size connection pools, file descriptors, thread pools, and memory limits per dependency
- Design caching layers, TTL strategy, and stampede / thundering-herd mitigation
- Plan horizontal scaling, load balancing, session affinity, and stateless vs sticky tradeoffs
- Apply backpressure, bounded queues, rate limiting, and bulkheads under overload
- Scale the data layer—read replicas, routing, sharding concepts, pool tuning, hot keys
- Profile bottlenecks, model capacity, and tie scale triggers to SLOs and error budgets
- Define autoscaling signals, warm pools, and cold-start vs cost tradeoffs
- Architect multi-region read paths and CDN/edge caching at a design level
When NOT to Use
- Decompose monoliths into bounded contexts and inter-service contracts only → `microservices-developer`
- Event schemas, broker selection, and messaging topology only → `event-driven-architecture`
- General feature delivery, RFCs, or CRUD without scale focus → `senior-software-engineer`
- Org-wide SLO program, on-call, incident response, and error-budget policy → `site-reliability-engineer`
- Deep flame graphs, load-test harnesses, and p99 regression hunts as the main task → `performance-engineer`
- Kubernetes platform golden paths and IDP product work → `platform-engineer`
- VPC, managed service provisioning, and landing-zone IaC → `cloud-engineer`
- Cloud spend optimization and unit economics only → `cloud-economist`, `finops-analyst`
Related skills
| Need | Skill |
|---|---|
| Service boundaries, sagas, circuit breakers between services | |
| Brokers, topics, event contracts, outbox | |
| Profiling, load/soak tests, latency budgets | |
| SLI/SLO programs, incident reliability, toil | |
| Internal platform, K8s abstractions, golden paths | |
| Cloud compute, networking, DR multi-region deploy | |
| Application design and refactoring | |
Core Workflows
1. Scope and constraints
Clarify traffic shape, SLOs, statefulness, and failure modes.
See `references/high_concurrency_scalability_scope.md`.
2. Concurrency and synchronization
Pick execution model; partition work; minimize shared mutable state.
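One way to read "partition work; minimize shared mutable state" is to route every key to a single owning worker, so the hot path needs no lock. The sketch below is illustrative only: `NUM_PARTITIONS`, `partition_for`, and `count_events` are hypothetical names, and the thread-per-partition layout is an assumption rather than something this skill prescribes.

```python
import queue
import threading
import zlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical value; tune to the workload


def partition_for(key: str) -> int:
    # crc32 is stable across processes, unlike the built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def worker(inbox: queue.Queue, counts: Counter) -> None:
    # Each worker is the only writer to its own Counter, so no lock is needed.
    while True:
        key = inbox.get()
        if key is None:  # sentinel: no more work for this partition
            return
        counts[key] += 1


def count_events(events: list[str]) -> Counter:
    # Bounded inboxes: a slow partition blocks the producer instead of
    # letting its queue grow without limit.
    inboxes = [queue.Queue(maxsize=1024) for _ in range(NUM_PARTITIONS)]
    shards = [Counter() for _ in range(NUM_PARTITIONS)]
    threads = [
        threading.Thread(target=worker, args=(inboxes[i], shards[i]))
        for i in range(NUM_PARTITIONS)
    ]
    for thread in threads:
        thread.start()
    for key in events:
        inboxes[partition_for(key)].put(key)
    for inbox in inboxes:
        inbox.put(None)
    for thread in threads:
        thread.join()
    # Merge per-partition results only after all writers have stopped.
    total = Counter()
    for shard in shards:
        total.update(shard)
    return total
```

The point of the sketch is the ownership pattern, not raw speed: in CPython, threads do not run CPU-bound work in parallel, so the same routing would typically sit in front of processes, actors, or separate replicas.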
See `references/concurrency_models_and_synchronization.md`.
3. Caching and data-layer scale
Cache hierarchy, replica routing, sharding and hot-key mitigation.
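For the stampede side of caching, a common mitigation is to let one caller per key reload an expired entry while the others wait for its result. The following is a minimal sketch under stated assumptions: `SingleFlightCache` and its `loader` callback are hypothetical, the cache is in-process only, and the fixed TTL is a placeholder for whatever TTL strategy is chosen.

```python
import threading
import time


class SingleFlightCache:
    """In-process cache where at most one thread loads a given key at a time."""

    def __init__(self, loader, ttl_seconds: float = 30.0) -> None:
        self._loader = loader      # hypothetical callback: key -> value
        self._ttl = ttl_seconds    # assumed fixed TTL, for illustration
        self._lock = threading.Lock()
        self._values = {}          # key -> (expires_at, value)
        self._inflight = {}        # key -> Event set when the load finishes

    def get(self, key):
        while True:
            with self._lock:
                entry = self._values.get(key)
                if entry is not None and entry[0] > time.monotonic():
                    return entry[1]
                event = self._inflight.get(key)
                leader = event is None
                if leader:
                    event = threading.Event()
                    self._inflight[key] = event
            if not leader:
                # Another thread is loading this key; wait, then re-check.
                event.wait()
                continue
            try:
                value = self._loader(key)  # runs outside the lock
                with self._lock:
                    expires_at = time.monotonic() + self._ttl
                    self._values[key] = (expires_at, value)
                return value
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()
```

If the loader raises, the exception reaches only the leading caller; waiters loop and one of them becomes the next leader. Across many processes the same idea usually needs a shared lock or a stale-while-revalidate policy, which this sketch does not cover.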
See `references/caching_and_data_layer_scale.md`.
4. Throughput, backpressure, and queues
Bounded queues, shedding, rate limits, and async pipelines.
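Bounded queues and shedding can be illustrated with a queue that rejects new work once it is full instead of growing. This is a sketch, not a recommended implementation: `BoundedInbox`, `offer`, and `take` are hypothetical names, and rejecting the newest item is only one of several possible shedding policies.

```python
import queue
import threading


class BoundedInbox:
    """Fixed-depth work queue that sheds new items when full."""

    def __init__(self, max_depth: int) -> None:
        self._queue = queue.Queue(maxsize=max_depth)
        self._stats_lock = threading.Lock()
        self.accepted = 0
        self.shed = 0

    def offer(self, item) -> bool:
        """Try to enqueue without blocking; return False if the item was shed."""
        try:
            self._queue.put_nowait(item)
        except queue.Full:
            with self._stats_lock:
                self.shed += 1
            return False
        with self._stats_lock:
            self.accepted += 1
        return True

    def take(self, timeout=None):
        """Block until an item is available (or raise queue.Empty on timeout)."""
        return self._queue.get(timeout=timeout)

    def depth(self) -> int:
        # Approximate under concurrency, but useful as a saturation metric.
        return self._queue.qsize()
```

A caller would typically turn a `False` from `offer` into a fast rejection (for example an HTTP 429 or 503) so that clients back off, and export `depth()` and `shed` as overload signals.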
See `references/throughput_backpressure_and_queues.md`.
5. Horizontal scale and load distribution
Replicas, LB algorithms, affinity, autoscaling triggers.
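As one concrete instance of an LB algorithm, least-connections sends each request to the replica with the fewest requests in flight. The sketch below assumes an in-process balancer with a hypothetical `LeastConnectionsBalancer` class; real load balancers track this state differently, and the text does not say which algorithm to prefer.

```python
import threading


class LeastConnectionsBalancer:
    """Pick the backend with the fewest in-flight requests."""

    def __init__(self, backends: list[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._lock = threading.Lock()
        self._in_flight = {backend: 0 for backend in backends}

    def acquire(self) -> str:
        """Choose a backend and count one more in-flight request against it."""
        with self._lock:
            # On ties, min() returns the first backend in insertion order.
            backend = min(self._in_flight, key=self._in_flight.get)
            self._in_flight[backend] += 1
            return backend

    def release(self, backend: str) -> None:
        """Call when the request finishes, whether it succeeded or failed."""
        with self._lock:
            self._in_flight[backend] -= 1
```

Compared with round-robin, this adapts to uneven request cost, at the price of tracking per-backend state; session affinity would override the choice for sticky clients.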
See `references/horizontal_scaling_and_load_distribution.md`.
6. Capacity, observability, and SLO-driven scale
Metrics, headroom models, scale policies tied to objectives.
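A simple headroom model can be built on Little's Law (in-flight requests = arrival rate × time in system). The function below is a rough sketch: `required_replicas` and its parameters are hypothetical, the 0.7 default utilization is an arbitrary placeholder, and using mean latency ignores tail effects that a fuller model would include.

```python
import math


def required_replicas(
    peak_rps: float,
    mean_latency_s: float,
    per_replica_concurrency: int,
    target_utilization: float = 0.7,  # placeholder; leaves 30% headroom
) -> int:
    """Estimate replicas needed to serve peak load with headroom."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    if per_replica_concurrency <= 0:
        raise ValueError("per_replica_concurrency must be positive")
    # Little's Law: average requests in flight at peak.
    in_flight = peak_rps * mean_latency_s
    # Concurrency each replica may use while staying under the target.
    usable_per_replica = per_replica_concurrency * target_utilization
    return max(1, math.ceil(in_flight / usable_per_replica))
```

For example, 1000 requests per second at 0.25 s mean latency keeps about 250 requests in flight; with 50 concurrent requests per replica held to 50% utilization, that works out to 10 replicas.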
See `references/capacity_planning_observability_slo.md`.
Outputs
- Scale brief — workload profile, bottlenecks, target RPS/latency, state assumptions
- Concurrency note — model choice, pool sizes, contention risks, partitioning plan
- Cache and data plan — layers, TTL, invalidation, replica/shard routing, hot-key mitigations
- Overload matrix — backpressure, rate limits, bulkheads, degradation modes
- Capacity model — headroom, scale triggers, cold-start impact, cost sensitivity
- Observability checklist — saturation, queue depth, pool wait, cache hit rate, tail latency
Principles
- Measure saturation—CPU, memory, I/O, pool wait, queue depth—not averages alone
- Bound everything—connections, threads, queue length, in-flight requests
- Prefer partition over lock—shard by key, actor mailbox, or isolated replica
- Design for overload—shed load deliberately; never unbounded retry or queue growth
- Scale on SLO signals—error rate and tail latency, not CPU alone
- Keep hot paths stateless where possible; isolate stateful tiers explicitly
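The "scale on SLO signals" principle can be sketched as a decision function in which error rate and tail latency drive scale-out, and CPU is consulted only to confirm that scale-in is safe. Everything here is an assumption for illustration: `SloTargets`, `scale_decision`, the 50% comfort margin, and the `scale_in_cpu` threshold are hypothetical, not values from this skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloTargets:
    max_error_rate: float       # e.g. 0.01 for a 1% error objective
    max_p99_latency_ms: float


def scale_decision(
    error_rate: float,
    p99_latency_ms: float,
    cpu_utilization: float,
    targets: SloTargets,
    scale_in_cpu: float = 0.3,  # placeholder threshold
) -> str:
    """Return "scale_out", "scale_in", or "hold"."""
    # Any SLO breach triggers scale-out, regardless of CPU.
    if (
        error_rate > targets.max_error_rate
        or p99_latency_ms > targets.max_p99_latency_ms
    ):
        return "scale_out"
    # Scale in only when both SLO signals have a wide margin AND CPU is low.
    comfortable = (
        error_rate < 0.5 * targets.max_error_rate
        and p99_latency_ms < 0.5 * targets.max_p99_latency_ms
    )
    if comfortable and cpu_utilization < scale_in_cpu:
        return "scale_in"
    return "hold"
```

A production policy would normally add cooldown windows and minimum and maximum replica bounds so that noisy signals do not cause flapping.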