high-concurrency-scalability

High Concurrency & Scalability

When to Use

  • Choose or refactor concurrency models—threads, async/await, actors, coroutines—for target throughput and latency
  • Reduce lock contention and design low-contention, lock-free, or partitioned data paths
  • Size connection pools, file descriptors, thread pools, and memory limits per dependency
  • Design caching layers, TTL strategy, and stampede / thundering-herd mitigation
  • Plan horizontal scaling, load balancing, session affinity, and stateless vs sticky tradeoffs
  • Apply backpressure, bounded queues, rate limiting, and bulkheads under overload
  • Scale the data layer—read replicas, routing, sharding concepts, pool tuning, hot keys
  • Profile bottlenecks, model capacity, and tie scale triggers to SLOs and error budgets
  • Define autoscaling signals, warm pools, and cold-start vs cost tradeoffs
  • Architect multi-region read paths and CDN/edge caching at a design level

When NOT to Use

  • Decompose monoliths into bounded contexts and inter-service contracts only → microservices-developer
  • Event schemas, broker selection, and messaging topology only → event-driven-architecture
  • General feature delivery, RFCs, or CRUD without scale focus → senior-software-engineer
  • Org-wide SLO program, on-call, incident response, and error-budget policy → site-reliability-engineer
  • Deep flame graphs, load-test harnesses, and p99 regression hunts as the main task → performance-engineer
  • Kubernetes platform golden paths and IDP product work → platform-engineer
  • VPC, managed service provisioning, and landing-zone IaC → cloud-engineer
  • Cloud spend optimization and unit economics only → cloud-economist, finops-analyst

Related skills

Need → Skill
  • Service boundaries, sagas, circuit breakers between services → microservices-developer
  • Brokers, topics, event contracts, outbox → event-driven-architecture
  • Profiling, load/soak tests, latency budgets → performance-engineer
  • SLI/SLO programs, incident reliability, toil → site-reliability-engineer
  • Internal platform, K8s abstractions, golden paths → platform-engineer
  • Cloud compute, networking, DR multi-region deploy → cloud-engineer
  • Application design and refactoring → senior-software-engineer

Core Workflows

1. Scope and constraints

Clarify traffic shape, SLOs, statefulness, and failure modes. See references/high_concurrency_scalability_scope.md.

2. Concurrency and synchronization

Pick an execution model; partition work; minimize shared mutable state. See references/concurrency_models_and_synchronization.md.
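
A minimal sketch of "partition work; minimize shared mutable state", assuming a per-key counter workload. The class name `PartitionedCounter` and its parameters are hypothetical, not taken from the reference file. Each key is routed to exactly one worker thread, so the counts themselves are single-writer and need no lock; only the queue hand-off is synchronized.

```python
import queue
import threading
import zlib
from collections import Counter


class PartitionedCounter:
    """Route each key to one worker so its count has a single writer."""

    def __init__(self, partitions: int = 4, queue_size: int = 1024):
        # One bounded queue and one private Counter per partition.
        self._queues = [queue.Queue(maxsize=queue_size) for _ in range(partitions)]
        self._counts = [Counter() for _ in range(partitions)]
        self._threads = [
            threading.Thread(target=self._run, args=(i,), daemon=True)
            for i in range(partitions)
        ]
        for t in self._threads:
            t.start()

    def _partition(self, key: str) -> int:
        # crc32 is stable across processes, unlike the built-in hash() for str.
        return zlib.crc32(key.encode("utf-8")) % len(self._queues)

    def _run(self, index: int) -> None:
        q, counts = self._queues[index], self._counts[index]
        while True:
            key = q.get()
            if key is None:  # sentinel: stop this worker
                break
            counts[key] += 1  # only this thread ever touches `counts`

    def increment(self, key: str) -> None:
        # put() blocks when the partition queue is full, so producers slow down.
        self._queues[self._partition(key)].put(key)

    def close(self) -> Counter:
        """Stop the workers and merge the per-partition counts."""
        for q in self._queues:
            q.put(None)
        for t in self._threads:
            t.join()
        total = Counter()
        for c in self._counts:
            total.update(c)
        return total
```

A skewed key distribution still concentrates load on one partition, so this layout reduces lock contention but does not by itself address hot keys.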

3. Caching and data-layer scale

Cache hierarchy, replica routing, sharding, and hot-key mitigation. See references/caching_and_data_layer_scale.md.
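
One common way to blunt a cache stampede on a hot key is to let a single caller reload an expired entry while the others wait, and to add jitter to the TTL so entries do not all expire together. The sketch below is an in-process illustration under those assumptions; `SingleFlightCache`, `loader`, and the parameter names are hypothetical.

```python
import random
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple


class SingleFlightCache:
    """In-memory TTL cache where only one caller per key runs the loader."""

    def __init__(self, ttl_seconds: float, jitter: float = 0.1,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._jitter = jitter  # fraction of the TTL, e.g. 0.1 means +/-10%
        self._clock = clock
        self._lock = threading.Lock()  # guards the two dicts below
        self._entries: Dict[str, Tuple[float, Any]] = {}
        # Simplification: per-key locks are never evicted in this sketch.
        self._key_locks: Dict[str, threading.Lock] = {}

    def _fresh(self, key: str) -> Optional[Tuple[float, Any]]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry
        return None

    def get(self, key: str, loader: Callable[[str], Any]) -> Any:
        with self._lock:
            entry = self._fresh(key)
            if entry is not None:
                return entry[1]
            key_lock = self._key_locks.setdefault(key, threading.Lock())
        # Callers for the same key queue here; other keys are unaffected.
        with key_lock:
            with self._lock:
                entry = self._fresh(key)
            if entry is not None:
                return entry[1]  # another caller refreshed it while we waited
            value = loader(key)
            spread = random.uniform(-self._jitter, self._jitter)
            expires = self._clock() + self._ttl * (1.0 + spread)
            with self._lock:
                self._entries[key] = (expires, value)
            return value
```

This only collapses duplicate loads within one process. Across many replicas, the same idea usually needs a shared lock or a stale-while-revalidate policy in the cache tier.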

4. Throughput, backpressure, and queues

Bounded queues, load shedding, rate limits, and async pipelines. See references/throughput_backpressure_and_queues.md.
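
As an illustration of rate limiting combined with a bounded queue and shedding, the sketch below places a token bucket in front of a fixed-size queue and rejects work instead of letting the backlog grow. The names `TokenBucket` and `AdmissionGate` are hypothetical, and the limiter is written for a single submitting thread; a shared one would need a lock.

```python
import queue
import time
from typing import Any, Callable


class TokenBucket:
    """Allow `rate_per_sec` requests on average, with bursts up to `burst`."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate_per_sec
        self._capacity = float(burst)
        self._tokens = float(burst)
        self._clock = clock
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, never above the burst size.
        self._tokens = min(self._capacity,
                           self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class AdmissionGate:
    """Rate-limit first, then enqueue; shed when either limit is hit."""

    def __init__(self, limiter: TokenBucket, max_queued: int):
        self._limiter = limiter
        self.queue: "queue.Queue[Any]" = queue.Queue(maxsize=max_queued)
        self.shed = 0  # count of rejected jobs, worth exporting as a metric

    def submit(self, job: Any) -> bool:
        if not self._limiter.allow():
            self.shed += 1
            return False
        try:
            self.queue.put_nowait(job)  # never blocks, never grows unbounded
        except queue.Full:
            self.shed += 1
            return False
        return True
```

A caller that receives `False` can return a fast failure (for example HTTP 429 or 503) so that clients back off rather than pile up behind a slow consumer.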

5. Horizontal scale and load distribution

Replicas, LB algorithms, affinity, and autoscaling triggers. See references/horizontal_scaling_and_load_distribution.md.
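
Where affinity is required, consistent hashing is one LB approach that keeps most keys on the same replica when the replica set changes. The sketch below is a small hash ring with virtual nodes; `HashRing` and the replica names are hypothetical, and the reference file may recommend a different algorithm.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _point(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring; each node owns `vnodes` points for smoother spread."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 64):
        self._vnodes = vnodes
        self._ring: List[Tuple[int, str]] = []  # kept sorted by point
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_point(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First ring entry at or after the key's point, wrapping at the end.
        idx = bisect.bisect_left(self._ring, (_point(key),))
        return self._ring[idx % len(self._ring)][1]
```

Removing a node only remaps the keys that node owned, which is what makes this useful for sticky sessions and sharded caches during scale-in and scale-out.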

6. Capacity, observability, and SLO-driven scale

Metrics, headroom models, and scale policies tied to objectives. See references/capacity_planning_observability_slo.md.
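
A back-of-the-envelope headroom model can be built from Little's law (in-flight requests = arrival rate × mean latency). The sketch below is one such model plus an error-budget burn rate that a scale or alert policy could key on. The function names, the 0.7 utilization target, and the example figures are illustrative assumptions, not values from the reference file.

```python
import math


def required_replicas(peak_rps: float, mean_latency_s: float,
                      concurrency_per_replica: int,
                      target_utilization: float = 0.7) -> int:
    """Replicas needed to serve peak load while keeping headroom.

    Little's law gives the in-flight request count; each replica is only
    planned to `target_utilization` of its concurrency limit.
    """
    if not 0.0 < target_utilization <= 1.0:
        raise ValueError("target_utilization must be in (0, 1]")
    in_flight = peak_rps * mean_latency_s
    usable_per_replica = concurrency_per_replica * target_utilization
    return max(1, math.ceil(in_flight / usable_per_replica))


def burn_rate(errors: int, total: int, slo: float) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    1.0 means the budget is spent exactly on schedule; higher values mean
    it will run out early.
    """
    if total == 0:
        return 0.0
    allowed = 1.0 - slo
    return (errors / total) / allowed
```

For example, 2000 RPS at 50 ms mean latency is about 100 requests in flight; with 16 concurrent requests per replica planned at 70% utilization, `required_replicas(2000, 0.05, 16)` returns 9. Mean latency understates tail behavior, so treat the result as a floor to validate under load.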

Outputs

  • Scale brief — workload profile, bottlenecks, target RPS/latency, state assumptions
  • Concurrency note — model choice, pool sizes, contention risks, partitioning plan
  • Cache and data plan — layers, TTL, invalidation, replica/shard routing, hot-key mitigations
  • Overload matrix — backpressure, rate limits, bulkheads, degradation modes
  • Capacity model — headroom, scale triggers, cold-start impact, cost sensitivity
  • Observability checklist — saturation, queue depth, pool wait, cache hit rate, tail latency

Principles

  • Measure saturation—CPU, memory, I/O, pool wait, queue depth—not averages alone
  • Bound everything—connections, threads, queue length, in-flight requests
  • Prefer partition over lock—shard by key, actor mailbox, or isolated replica
  • Design for overload—shed load deliberately; never unbounded retry or queue growth
  • Scale on SLO signals—error rate and tail latency, not CPU alone
  • Keep hot paths stateless where possible; isolate stateful tiers explicitly
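
To make "bound everything" and "never unbounded retry" concrete, the sketch below caps in-flight calls with a semaphore and retries a fixed number of times with jittered exponential backoff. `BoundedCaller`, `Overloaded`, and the default limits are hypothetical names and values chosen for illustration.

```python
import random
import threading
import time
from typing import Any, Callable, Tuple, Type


class Overloaded(Exception):
    """Raised when the in-flight limit is reached and the call is shed."""


class BoundedCaller:
    """Cap concurrent calls and retry attempts for one dependency."""

    def __init__(self, max_in_flight: int, max_attempts: int = 3,
                 base_delay: float = 0.05, max_delay: float = 1.0,
                 retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
                 sleep: Callable[[float], None] = time.sleep):
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._max_attempts = max_attempts
        self._base_delay = base_delay
        self._max_delay = max_delay
        self._retry_on = retry_on
        self._sleep = sleep

    def call(self, fn: Callable[[], Any]) -> Any:
        # Shed immediately instead of queueing behind a saturated dependency.
        if not self._slots.acquire(blocking=False):
            raise Overloaded("in-flight limit reached")
        try:
            for attempt in range(self._max_attempts):
                try:
                    return fn()
                except self._retry_on:
                    if attempt == self._max_attempts - 1:
                        raise  # attempts exhausted: surface the failure
                    # Exponential backoff with full jitter, capped at max_delay.
                    cap = min(self._max_delay, self._base_delay * (2 ** attempt))
                    self._sleep(random.uniform(0.0, cap))
        finally:
            self._slots.release()
```

The slot stays held during backoff in this sketch, which keeps total pressure on the dependency bounded at the cost of shedding more callers while retries are pending.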