nodejs-performance

Node.js Performance


Use this workflow to turn Node.js performance/resource investigations into safe, reviewable PRs.

Goals


  • Improve execution time first: reduce p50/p95/p99 latency and increase throughput without changing intended behavior.
  • Reduce CPU, memory, event-loop lag, I/O pressure, or lock contention when it supports execution-time gains.
  • Ship small, isolated changes with measurable impact.

Operating Rules


  • Work on one optimization per PR.
  • Always choose the highest expected-impact task first.
  • Confirm and respect intentional behaviors before changing them.
  • Prefer low-risk changes in high-frequency paths.
  • Prioritize request/job execution-path work over bootstrap/startup micro-optimizations unless startup is on the critical path at scale.
  • Include evidence: targeted tests + before/after benchmark.

Impact-First Selection


Before coding, rank candidates using this score:
priority = (frequency x blast_radius x expected_gain) / (risk x effort)
Use 1-5 for each factor:
  • frequency: how often the path runs in production.
  • blast_radius: how many requests/jobs/users are affected.
  • expected_gain: estimated latency/resource improvement.
  • risk: probability of a behavior regression.
  • effort: engineering time and change surface area.
Pick the top-ranked candidate, then validate with a baseline measurement.
If two candidates have similar scores, pick the one with the clearer end-to-end execution-time impact.
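The scoring formula above can be sketched as a small ranking helper. The factor names and the 1-5 scale come straight from the formula; the candidate entries themselves are hypothetical examples, not real measurements.

```javascript
// Rank optimization candidates by the impact-first score:
// priority = (frequency * blast_radius * expected_gain) / (risk * effort)
// Each factor is scored 1-5. Candidate names/values are hypothetical.
function priorityScore({ frequency, blastRadius, expectedGain, risk, effort }) {
  return (frequency * blastRadius * expectedGain) / (risk * effort);
}

const candidates = [
  { name: 'json-revalidate-per-request', frequency: 5, blastRadius: 5, expectedGain: 3, risk: 2, effort: 2 },
  { name: 'startup-config-parse',        frequency: 1, blastRadius: 2, expectedGain: 4, risk: 1, effort: 2 },
];

const ranked = candidates
  .map((c) => ({ ...c, score: priorityScore(c) }))
  .sort((a, b) => b.score - a.score);

console.log(ranked.map((c) => `${c.name}: ${c.score}`).join('\n'));
```

Note that a high-frequency, wide-blast-radius candidate beats a higher-gain but rarely-run one; that is the intended bias of the formula.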

Prioritization Targets


Start with code that runs on every request/job/task:
  • Request/job wrappers and middleware.
  • Retry/timeout/circuit-breaker code.
  • Connection pools (DB/Redis/HTTP) and socket reuse.
  • Stream/pipeline transformations and buffering.
  • Serialization/deserialization hot paths (JSON, parsers, schema validation).
  • Queue consumers, schedulers, and worker dispatch.
  • Event listener attach/detach lifecycle and cleanup logic.
Deprioritize unless justified by production profile:
  • One-time startup/bootstrap code.
  • Rare admin/debug-only flows.
  • Teardown paths that are not on the steady-state critical path.

Common Hot-Path Smells


  • Recomputing invariant values per invocation.
  • Re-parsing code/AST repeatedly.
  • Duplicate async lookups returning the same value.
  • Per-call heavy object allocation in common-case parsing.
  • Unnecessary awaits in teardown/close/dispose paths.
  • Missing fast paths for dominant input shapes.
  • Unbounded retries or retry storms under degraded dependencies.
  • Excessive concurrency causing memory spikes or downstream saturation.
  • Work done for logging/telemetry/metrics formatting even when disabled.
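Two of the smells above, recomputing invariant values and duplicate async lookups, share a common fix: memoize the in-flight promise so concurrent callers for the same key trigger one underlying fetch. A minimal sketch (`fetchConfig` is a hypothetical stand-in for a DB or remote lookup):

```javascript
// Coalesce concurrent lookups for the same key into one in-flight promise,
// so N simultaneous callers cause 1 underlying fetch instead of N.
const inFlight = new Map();

function dedupedLookup(key, loader) {
  if (inFlight.has(key)) return inFlight.get(key);
  const p = loader(key).finally(() => inFlight.delete(key));
  inFlight.set(key, p);
  return p;
}

// Hypothetical loader that would normally hit a DB or remote service.
let calls = 0;
async function fetchConfig(key) {
  calls += 1;
  return { key, value: 'v1' };
}

async function main() {
  const [a, b] = await Promise.all([
    dedupedLookup('tenant-42', fetchConfig),
    dedupedLookup('tenant-42', fetchConfig),
  ]);
  console.log(calls, a === b); // one underlying call, both callers share the result
}
main();
```

The `.finally` cleanup keeps this a request-scoped dedupe rather than a cache; add size/TTL controls before turning it into a long-lived cache.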

Execution Workflow


  1. Pick one candidate
  • Rank candidates and pick the highest priority score.
  • Explain the issue in one sentence.
  • State expected impact (CPU, latency, memory, event-loop lag, I/O, contention).
  2. Prove it is hot
  • Add a focused micro-benchmark or scenario benchmark.
  • Capture baseline numbers before editing.
  • Prefer scenario benchmarks that include real request/job flow when the goal is execution-time improvement.
  • For resource issues, capture process metrics (rss, heap, FD count, event-loop delay).
  3. Design minimal fix
  • Keep behavior-compatible defaults.
  • Add a fallback path for edge cases.
  • Avoid broad refactors in the same PR.
  4. Implement
  • Make the smallest patch that removes repeated work.
  • Keep interfaces stable unless change is necessary.
  5. Test
  • Add/adjust targeted tests for new behavior and regressions.
  • Run relevant package tests (not only the whole monorepo by default).
  • Add concurrency/degradation tests when the bug appears only under load.
  6. Benchmark again
  • Re-run the same benchmark with the same parameters.
  • Report absolute and relative deltas.
  • Include latency deltas first (p50/p95/p99, throughput), then resource deltas when applicable.
  7. Package PR
  • Branch naming: codex/perf-<area>-<change>.
  • Commit message: perf(<package>): <what changed>.
  • Include risk notes and rollback simplicity.
  8. Iterate
  • Wait for review, then move to the next isolated improvement.

Benchmarking Guidance

基准测试指南

  • Keep benchmark scope narrow to isolate one change.
  • Use warmup iterations.
  • Measure both:
    • micro: operation-level overhead.
    • scenario: request/job flow, concurrency, and degraded dependency conditions.
  • For execution-time work, scenario numbers are the decision-maker; micro numbers are supporting evidence.
  • Always print:
    • total time
    • per-op time
    • p50/p95/p99 latency when applicable
    • speedup ratio
    • iterations and workload shape
    • resource counters (rss, heap, handles, event-loop delay) when relevant

Resource Exhaustion Checklist


  • Cap concurrency at each boundary (ingress, queue, downstream clients).
  • Ensure timeout + cancellation are wired end-to-end.
  • Ensure retries are bounded and jittered.
  • Confirm listeners/timers/intervals are always cleaned up.
  • Confirm streams are closed/destroyed on success and error paths.
  • Confirm object caches have size/TTL controls.
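Two items on this checklist, bounded jittered retries and a concurrency cap at a boundary, can be sketched as follows. The attempt counts, delays, and limit values are illustrative, not recommendations.

```javascript
// Bounded retry with full jitter, plus a simple concurrency limiter.
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

async function retryWithJitter(fn, { maxAttempts = 3, baseDelayMs = 50 } = {}) {
  let lastErr;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // Full jitter: random delay in [0, base * 2^attempt) to avoid retry storms.
      await sleep(Math.random() * baseDelayMs * 2 ** attempt);
    }
  }
  throw lastErr; // bounded: give up after maxAttempts
}

// Cap concurrency at a boundary: at most `limit` tasks run at once;
// the rest queue until a slot frees up.
function createLimiter(limit) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= limit || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    task().then(resolve, reject).finally(() => { active--; next(); });
  };
  return (task) => new Promise((resolve, reject) => {
    queue.push({ task, resolve, reject });
    next();
  });
}
```

In production code, libraries such as p-limit provide the same capping behavior; the point here is that every ingress, queue consumer, and downstream client should pass through some such bound.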

CI / Flake Handling


  • If CI-only failures appear, add temporary diagnostic payloads in tests.
  • Serialize only affected flaky tests when resource contention is the cause.
  • Keep determinism improvements in test code, not production code, unless required.

Output Template


For each PR, report:
  1. Issue being fixed.
  2. Why it matters under load.
  3. Code locations changed.
  4. Tests run and results.
  5. Benchmark before/after numbers (execution first: p50/p95/p99 and throughput).
  6. Risk assessment.
  7. Next candidate optimization.