scalability

/scalability — Scalable Design Enforcement

Every design, plan, and implementation MUST handle current load efficiently AND accommodate 10x growth without architectural changes. Design for the load you expect in 18 months, not the load you have today.

Why this matters: Systems that aren't designed to scale hit walls — and those walls always appear at the worst time (launch day, viral moment, enterprise customer onboarding). Retrofitting scalability is 10-100x more expensive than building it in.

When to invoke: During PLANNING (after brainstorming, before or alongside writing-plans) and during REVIEW (as part of code review criteria). This skill applies to both new code and modifications to existing code.

The Rules

Rule 1: Stateless by Default

Every service, function, and handler MUST be stateless unless there is an explicit, documented reason for state.

No in-memory state that would break with multiple instances.
No local file system for data that must survive restarts or be shared.
Session state goes in a shared store (Redis, database), never in process memory.
Caches must be external (Redis, Memcached) or, if kept in-process, have an invalidation strategy that works across multiple instances.

Test: Can you run 5 instances of this service behind a load balancer, with any request routed to any instance and no state shared between processes? If not, fix it.
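A minimal sketch of the session rule, assuming a hypothetical `SessionStore` interface: the handler keeps nothing in process memory, so any instance can serve any request. `FakeSharedStore` is only a stand-in for Redis or a database so the example runs on its own; `CartHandler` and the key names are illustrative.

```python
from typing import Dict, List, Optional, Protocol


class SessionStore(Protocol):
    """Anything that can persist session data outside the process."""

    def get(self, session_id: str) -> Optional[dict]: ...

    def put(self, session_id: str, data: dict) -> None: ...


class FakeSharedStore:
    """Stand-in for Redis or a database (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, dict] = {}

    def get(self, session_id: str) -> Optional[dict]:
        return self._data.get(session_id)

    def put(self, session_id: str, data: dict) -> None:
        self._data[session_id] = dict(data)


class CartHandler:
    """Holds no per-user state; everything lives in the injected store."""

    def __init__(self, store: SessionStore) -> None:
        self._store = store

    def add_item(self, session_id: str, item: str) -> List[str]:
        session = self._store.get(session_id) or {"items": []}
        items = session["items"] + [item]
        self._store.put(session_id, {"items": items})
        return items


# Two "instances" behind a load balancer, sharing only the external store.
store = FakeSharedStore()
instance_a = CartHandler(store)
instance_b = CartHandler(store)

instance_a.add_item("sess-1", "book")
cart = instance_b.add_item("sess-1", "pen")  # a different instance serves the next request
```

The read-modify-write in `add_item` is not atomic; against a real shared store it would likely need an atomic append or a transaction.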

Rule 2: Efficient Data Access

Every database query and data access pattern MUST be designed for scale:

Pattern	Requirement
Queries	Must use indexes. No full table scans on tables that will grow.
Pagination	Required for any list endpoint. No unbounded `SELECT *` .
N+1 queries	Forbidden. Use joins, batch loading, or dataloader patterns.
Write amplification	Minimize. Don't update entire records when one field changes.
Connection pooling	Required. Never open/close connections per request.
Read replicas	Design for eventual consistency where appropriate.
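The N+1 row in the table above can be illustrated with batch loading. This is a sketch using Python's built-in `sqlite3` module; the `authors` and `posts` tables and the function name are assumptions made for the example. It loads one bounded page of posts and then fetches all needed authors in a single `IN` query, two queries in total regardless of page size.

```python
import sqlite3
from typing import Dict, List

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER NOT NULL, title TEXT NOT NULL);
    INSERT INTO authors (id, name) VALUES (1, 'Ada'), (2, 'Lin');
    INSERT INTO posts (id, author_id, title)
        VALUES (1, 1, 'First'), (2, 2, 'Second'), (3, 1, 'Third');
    """
)


def load_posts_with_authors(conn: sqlite3.Connection, limit: int = 100) -> List[Dict]:
    """Load one bounded page of posts plus their authors in two queries total."""
    posts = conn.execute(
        "SELECT id, author_id, title FROM posts ORDER BY id LIMIT ?", (limit,)
    ).fetchall()
    author_ids = sorted({author_id for _, author_id, _ in posts})
    if not author_ids:
        return []
    # One batched lookup instead of one query per post (the N+1 pattern).
    placeholders = ",".join("?" for _ in author_ids)
    rows = conn.execute(
        f"SELECT id, name FROM authors WHERE id IN ({placeholders})", author_ids
    ).fetchall()
    names = dict(rows)
    return [
        {"id": post_id, "title": title, "author": names[author_id]}
        for post_id, author_id, title in posts
    ]


page = load_posts_with_authors(conn)
```

A join would work equally well here; batch loading tends to be the better fit when the related rows come from a different service or data source.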

Test: Run `EXPLAIN` on every query. If it says "full table scan" on a table with >10K rows, add an index.
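As a rough illustration of this test, the sketch below uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module. The `orders` table, the index name, and the `query_plan` helper are assumptions for the example. Other databases word their plans differently (PostgreSQL reports `Seq Scan`, MySQL shows access type `ALL`), so the string to look for depends on the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")

QUERY = "SELECT id, total FROM orders WHERE customer_id = ?"


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple) -> str:
    """Return SQLite's plan text for a query (the last column of each plan row)."""
    rows = conn.execute(f"EXPLAIN QUERY PLAN {sql}", params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index on customer_id, SQLite has to scan the whole table.
plan_before = query_plan(conn, QUERY, (42,))

# After adding the index, the plan should name it instead of scanning.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, QUERY, (42,))
```

A check like this can run in a test suite so that a new query without a supporting index fails before it ships.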

Rule 3: Async Where Possible

Any operation that doesn't need an immediate response MUST be asynchronous:

Email/SMS sending — queue it.
Report generation — queue it, notify on completion.
External API calls — if the user doesn't need the result immediately, queue it.
Data processing — stream or batch, never block the request.
File uploads — accept, acknowledge, process asynchronously.

Synchronous is acceptable for: auth checks, data reads <100ms, input validation.
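The "queue it" pattern can be sketched with the standard library alone. Here an in-process `queue.Queue` and a thread stand in for a real message broker and worker process, which is an assumption for the sake of a runnable example; `handle_signup`, the status codes, and the `sent` list are illustrative. The handler only enqueues the job and acknowledges, and the bounded queue rejects work when full rather than growing without limit.

```python
import queue
import threading
from typing import List

# Bounded queue: when full, producers are pushed back instead of memory growing forever.
email_jobs: queue.Queue = queue.Queue(maxsize=1000)
sent: List[str] = []  # stands in for a real mail provider (hypothetical)


def email_worker() -> None:
    """Drain jobs until a None sentinel arrives."""
    while True:
        job = email_jobs.get()
        try:
            if job is None:
                return
            sent.append(job["to"])  # a real worker would call the provider here
        finally:
            email_jobs.task_done()


def handle_signup(email: str) -> dict:
    """Request handler: enqueue the welcome email and return immediately."""
    try:
        email_jobs.put_nowait({"to": email})
    except queue.Full:
        return {"status": 503, "detail": "try again later"}
    return {"status": 202, "detail": "accepted"}


worker = threading.Thread(target=email_worker, daemon=True)
worker.start()

response = handle_signup("user@example.com")

email_jobs.join()     # demo only: wait until the worker has processed the job
email_jobs.put(None)  # ask the worker to stop
worker.join(timeout=1)
```

In a multi-instance deployment the queue itself would need to live outside the process, in line with Rule 1.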

Rule 4: Caching Strategy

Every read-heavy path MUST have a caching strategy:

Cache layer	TTL	Use when
HTTP cache (CDN, browser)	Minutes to hours	Static assets, API responses that change infrequently
Application cache (Redis)	Seconds to minutes	Computed results, session data, frequent queries
Database query cache	Seconds	Identical queries hitting the DB frequently
No cache	—	Write paths, real-time data, personalized content

Every cache MUST have:

A defined TTL (no infinite caches).
An invalidation strategy (time-based, event-based, or both).
A cache-miss path that works correctly (no assumption that cache is always warm).
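The three requirements above can be sketched as a small read-through cache. `TTLCache`, its `max_entries` bound, and the injectable `clock` are assumptions made for illustration; an in-process dictionary is used only so the example runs by itself, and the same contract would apply to a Redis-backed cache.

```python
import time
from typing import Callable, Dict, Hashable, Tuple


class TTLCache:
    """Read-through cache with a mandatory TTL, a size bound, and explicit invalidation."""

    def __init__(
        self,
        ttl_seconds: float,
        max_entries: int = 1024,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if ttl_seconds <= 0:
            raise ValueError("ttl_seconds must be positive (no infinite caches)")
        if max_entries <= 0:
            raise ValueError("max_entries must be positive")
        self._ttl = ttl_seconds
        self._max_entries = max_entries
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, object]] = {}  # key -> (expiry, value)

    def get_or_load(self, key: Hashable, loader: Callable[[], object]) -> object:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit
        value = loader()  # miss or expired: fall back to the source of truth
        if key not in self._entries and len(self._entries) >= self._max_entries:
            # Evict the entry that would expire soonest to stay within the bound.
            soonest = min(self._entries, key=lambda k: self._entries[k][0])
            del self._entries[soonest]
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Event-based invalidation, for example called right after a write."""
        self._entries.pop(key, None)


# Demo with a fake clock so expiry can be shown without sleeping.
now = [0.0]
loads = []


def load_profile() -> str:
    loads.append(1)
    return f"profile-v{len(loads)}"


cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])
first = cache.get_or_load("user:1", load_profile)   # cold cache: loads from source
second = cache.get_or_load("user:1", load_profile)  # served from cache
now[0] = 31.0
third = cache.get_or_load("user:1", load_profile)   # TTL expired: reloads
cache.invalidate("user:1")
fourth = cache.get_or_load("user:1", load_profile)  # invalidated: reloads
```

Because every read goes through `get_or_load`, a cold or flushed cache degrades to slower reads rather than errors.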

Rule 5: Resource Limits

Every resource consumer MUST have explicit limits:

Resource	Limit	What happens at limit
HTTP request body	Max size (e.g., 10MB)	413 Payload Too Large
Query results	Max rows (e.g., 1000)	Pagination required
Batch operations	Max batch size (e.g., 100)	Split into chunks
Concurrent connections	Pool size (e.g., 20)	Queue or reject
Background jobs	Max concurrent (e.g., 10)	Queue with backpressure
File uploads	Max size + count	Reject with clear error

No unbounded anything. Every loop, query, queue, and buffer has a maximum.
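Two of the limits in the table lend themselves to small helpers. The sketch below clamps a caller-supplied page size and splits a large batch into chunks. The constants reuse the example values from the table, while the default page size of 50 and the function names are assumptions.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

MAX_PAGE_SIZE = 1000  # example value from the table above
MAX_BATCH_SIZE = 100  # example value from the table above


def clamp_page_size(requested: int, default: int = 50) -> int:
    """Never let a caller ask for more rows than the hard cap."""
    if requested <= 0:
        return default
    return min(requested, MAX_PAGE_SIZE)


def chunked(items: Sequence[T], size: int = MAX_BATCH_SIZE) -> Iterator[Sequence[T]]:
    """Split a large batch into chunks of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


chunks = list(chunked(list(range(250))))
chunk_sizes = [len(chunk) for chunk in chunks]
```

Keeping the caps in named constants makes them easy to find in review and to adjust per endpoint.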

Rule 6: Horizontal Scaling Design

Architecture MUST support horizontal scaling:

No singleton dependencies — no "there can be only one instance" of any service.
Idempotent operations — safe to retry, safe to run in parallel.
Distributed locking only when absolutely necessary (and with TTL).
Event-driven over request-driven for inter-service communication.
Partitionable data — design schemas so data can be sharded by tenant, region, or time.
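Two of the points above, idempotent operations and partitionable data, can be sketched briefly. `PaymentService`, the idempotency-key approach, and `shard_for` are illustrative assumptions rather than a prescribed design. The in-memory result map stands in for a shared store with an atomic "insert if absent" (for example a unique constraint), which a real multi-instance service would need for retries and parallel requests to be safe.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PaymentService:
    """Toy service: charges are deduplicated by a caller-supplied idempotency key."""

    balance_cents: int = 0
    _results: Dict[str, dict] = field(default_factory=dict)

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A retry with the same key returns the stored result instead of charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.balance_cents += amount_cents
        result = {"charged": amount_cents, "balance": self.balance_cents}
        self._results[idempotency_key] = result
        return result


def shard_for(tenant_id: str, shard_count: int) -> int:
    """Map a tenant to a shard with a hash that is stable across processes."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


service = PaymentService()
first = service.charge("order-123", 500)
retry = service.charge("order-123", 500)  # e.g. the client timed out and retried

shard = shard_for("tenant-a", 8)
```

`hashlib` is used instead of the built-in `hash()` because string hashing in Python is randomized per process by default, which would route the same tenant to different shards on different instances.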

Rule 7: Performance Budgets

Every user-facing operation MUST have a performance budget:

Operation type	Budget
API response (P95)	<200ms
Page load (LCP)	<2.5s
Database query	<50ms
Background job start	<1s from event
Search	<500ms

If an operation exceeds its budget, it MUST be optimized before shipping. "It works" is not the same as "it scales."
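One way to make a budget checkable is to compute the percentile from measured latencies and compare it to the limit. The sketch below uses the nearest-rank definition of a percentile, which is an assumption (monitoring tools may interpolate differently). The sample data and function names are made up for illustration.

```python
import math
from typing import Sequence

API_P95_BUDGET_MS = 200.0  # from the budget table above


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def within_budget(samples: Sequence[float], budget_ms: float = API_P95_BUDGET_MS) -> bool:
    """True when the P95 latency is strictly under the budget."""
    return percentile(samples, 95) < budget_ms


latencies_ms = [float(v) for v in range(10, 210, 10)]  # 10, 20, ..., 200 (20 samples)
p95 = percentile(latencies_ms, 95)
ok = within_budget(latencies_ms)
```

With these 20 samples the P95 is the 19th smallest value, 190 ms, so the check passes even though the slowest single request took 200 ms.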

Applying This Skill

During Planning (brainstorming / writing-plans)

During Implementation (executing-plans)

As you write code:

Run `EXPLAIN` on new queries. Add indexes proactively.
Add pagination to every list endpoint from day one.
Set explicit timeouts on every external call (HTTP, DB, cache).
Add resource limits to every input (body size, array length, string length).
Use connection pooling for every external resource.
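The input-limit item in the checklist above might look like the following sketch. The payload shape, the tag and title limits, the `LimitExceeded` exception, and the choice of status 400 for field violations are all assumptions. Only the 10MB body cap and the 413 status come from Rule 5.

```python
from typing import Any, Dict

MAX_BODY_BYTES = 10 * 1024 * 1024  # 10MB, matching the example in Rule 5
MAX_TAGS = 50                      # assumed limit on array length
MAX_TITLE_CHARS = 200              # assumed limit on string length


class LimitExceeded(Exception):
    """Carries the HTTP status a framework adapter could return."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(detail)
        self.status = status
        self.detail = detail


def check_body_size(raw_body: bytes) -> None:
    """Reject oversized bodies before any parsing happens."""
    if len(raw_body) > MAX_BODY_BYTES:
        raise LimitExceeded(413, "payload too large")


def validate_post(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Check string and array lengths, returning only the fields that were validated."""
    title = payload.get("title")
    tags = payload.get("tags", [])
    if not isinstance(title, str) or not title:
        raise LimitExceeded(400, "title is required")
    if len(title) > MAX_TITLE_CHARS:
        raise LimitExceeded(400, f"title longer than {MAX_TITLE_CHARS} characters")
    if not isinstance(tags, list) or len(tags) > MAX_TAGS:
        raise LimitExceeded(400, f"tags must be a list of at most {MAX_TAGS} items")
    return {"title": title, "tags": tags}


clean = validate_post({"title": "Scaling notes", "tags": ["db", "cache"]})
```

Checking the raw body size first avoids parsing a huge payload only to reject it afterwards.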

During Review (code-review / receiving-code-review)

Verify these as part of every code review:

No unbounded queries or loops
No in-process state that breaks with multiple instances
Proper caching with TTL and invalidation
Async processing for non-immediate operations
Resource limits on all inputs
Performance budgets documented and met

When Modifying Existing Code

If existing code violates these rules:

You are NOT required to fix all scalability issues in unrelated code.
You ARE required to not make scalability worse.
If adding a new query to an endpoint, ensure it's indexed and paginated.
If adding a new external dependency, ensure it has timeouts and connection pooling.

Anti-Patterns

Pattern	Problem	Fix
In-memory sessions	Breaks with multiple instances	External session store
Unbounded queries	Memory explosion at scale	Pagination + limits
Synchronous emails	Request blocked for seconds	Queue + async worker
No connection pooling	Connection exhaustion under load	Pool with limits
Cache without TTL	Stale data forever	TTL + invalidation strategy
SELECT *	Transfers unnecessary data	Select only needed columns
Fat payloads	Network bottleneck	Paginate, compress, or stream
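For the "Pool with limits" fix in the table above, here is a minimal sketch of a bounded pool. Most database drivers and frameworks ship their own pooling, so this hand-rolled `ConnectionPool` is only an illustration of the idea. The class, its parameters, and the use of in-memory SQLite connections are assumptions.

```python
import queue
import sqlite3
from contextlib import contextmanager
from typing import Callable, Iterator


class ConnectionPool:
    """Fixed-size pool: connections are created once and reused across requests."""

    def __init__(self, size: int, factory: Callable[[], sqlite3.Connection]) -> None:
        if size <= 0:
            raise ValueError("size must be positive")
        self._idle: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout_s: float = 1.0) -> Iterator[sqlite3.Connection]:
        """Borrow a connection, waiting at most timeout_s before rejecting."""
        try:
            conn = self._idle.get(timeout=timeout_s)
        except queue.Empty:
            raise TimeoutError("no pooled connection available") from None
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


pool = ConnectionPool(size=2, factory=lambda: sqlite3.connect(":memory:"))

with pool.connection() as conn:
    value = conn.execute("SELECT 1").fetchone()[0]
```

When all connections are in use, callers wait up to `timeout_s` and then fail fast, which matches the "Queue or reject" behavior listed under Rule 5.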

Rationalization Prevention

Excuse	Reality
"We only have 100 users"	You'll have 10,000 before you know it. Design now.
"We can optimize later"	Optimization is cheap. Redesigning architecture is not.
"Premature optimization"	Scalability design ≠ micro-optimization. These are architectural.
"It's fast enough on my machine"	Your machine has 1 user. Production has thousands.
"We'll add caching when we need it"	By then you'll need it urgently. Design the strategy now.
"This is just an internal tool"	Internal tools scale with the company. Design accordingly.
