performance-optimization


Performance Optimization


Overview


Systematically identify and resolve performance bottlenecks using measurement-driven methodology. This skill enforces a strict MEASURE-IDENTIFY-OPTIMIZE-VERIFY cycle, preventing premature optimization and speculation. Every optimization must produce measurable improvement or be reverted.
Announce at start: "I'm using the performance-optimization skill to diagnose and resolve bottlenecks."


Phase 1: MEASURE (Establish Baseline)


Goal: Capture real metrics before changing anything.

Actions


```bash
# Web: Lighthouse CI
npx lighthouse https://your-app.com --output=json --output-path=baseline.json

# API: load test with k6
k6 run --out json=baseline.json loadtest.js
```

```sql
-- Database: slow query log (PostgreSQL), log queries > 100ms
SET log_min_duration_statement = 100;
```


Record these numbers. They are the baseline against which improvement is measured.


STOP — Do NOT proceed to Phase 2 until:


  • Baseline metrics are captured and saved
  • Specific metric targets are defined (e.g., LCP < 2.5s)
  • Measurement methodology is documented (so it can be repeated)


Phase 2: IDENTIFY (Find the Actual Bottleneck)


Goal: Use profiling tools to find WHERE time is spent. Do NOT guess.

Profiling Tool Selection Table


| Layer | Tool | What It Shows |
| --- | --- | --- |
| Frontend rendering | React DevTools Profiler, Chrome Performance tab | Component render times |
| Network | Chrome Network tab, WebPageTest | Request waterfall, TTFB |
| JavaScript | Chrome Performance tab, `console.time()` | Function execution time |
| Node.js server | `--prof` flag, clinic.js, 0x | CPU flame graphs |
| Database | `EXPLAIN ANALYZE`, pg_stat_statements | Query plans, slow queries |
| Memory | Chrome Memory tab, heapdump | Allocation patterns, leaks |
| Bundle size | webpack-bundle-analyzer, vite-bundle-visualizer | Module sizes |

The bottleneck is almost never where you assume it is. Measure first.

STOP — Do NOT proceed to Phase 3 until:


  • Profiling tool appropriate to the layer has been used
  • Specific bottleneck is identified with data
  • Bottleneck accounts for a significant portion of the problem


Phase 3: OPTIMIZE (Fix the Identified Bottleneck)


Goal: Apply the targeted fix. Change ONE thing at a time.

Optimization Decision Table


| Bottleneck Type | Optimization Approach | Example |
| --- | --- | --- |
| Large bundle | Code splitting, tree shaking, dynamic imports | `React.lazy(() => import('./HeavyComponent'))` |
| Slow API response | Caching, query optimization, pagination | Add Redis cache with 5min TTL |
| Slow database query | Add index, optimize query plan, materialized view | `CREATE INDEX idx_user_email ON users(email)` |
| Excessive re-renders | Memoization, virtualization, state restructuring | `React.memo`, `useMemo` |
| Large images | Compression, lazy loading, responsive images | `<img loading="lazy" srcset="...">` |
| Slow TTFB | Server-side caching, CDN, edge rendering | Stale-while-revalidate pattern |
| Memory leak | Fix event listener cleanup, weak references | Proper `useEffect` cleanup |
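The memoization row can be sketched outside React as a generic cache-by-arguments helper; `React.memo`/`useMemo` apply the same idea to component props and derived values. A sketch, assuming JSON-serializable arguments:

```javascript
// Generic memoization: cache results keyed by arguments.
// JSON.stringify as the cache key is a simplifying assumption (it breaks on
// functions, undefined, and key-order-sensitive objects).
function memoize(fn) {
  const cache = new Map();
  let hits = 0;
  const memoized = (...args) => {
    const key = JSON.stringify(args);
    if (cache.has(key)) {
      hits += 1;
      return cache.get(key);
    }
    const value = fn(...args);
    cache.set(key, value);
    return value;
  };
  memoized.stats = () => ({ size: cache.size, hits }); // for inspecting hit rate
  return memoized;
}

const slowSquare = memoize((n) => n * n);
slowSquare(4); // computed once
slowSquare(4); // served from cache
```

Memoize only after profiling shows the recomputation is actually hot; otherwise it is pure added complexity.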

STOP — Do NOT proceed to Phase 4 until:


  • Only ONE change has been made
  • Change directly targets the identified bottleneck
  • No unrelated changes were made alongside the optimization


Phase 4: VERIFY (Measure Again)


Goal: Re-run the exact same measurement from Phase 1.

Actions


  1. Run the same profiling/measurement as Phase 1
  2. Compare results:
    • Did the metric improve?
    • By how much?
    • Did any other metrics regress?
  3. If improvement is not measurable, REVERT the change.
Optimization that cannot be measured is not optimization.
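The comparison step can be mechanized. A sketch, where the metric names and the 1% noise threshold are illustrative assumptions and lower values are assumed to be better for every metric:

```javascript
// Compare a baseline metric snapshot against a post-optimization run.
function compareRuns(baseline, current, { noiseThreshold = 0.01 } = {}) {
  const report = {};
  for (const [metric, before] of Object.entries(baseline)) {
    const after = current[metric];
    const delta = (after - before) / before; // negative means improvement
    let verdict = 'unchanged';
    if (delta < -noiseThreshold) verdict = 'improved';
    else if (delta > noiseThreshold) verdict = 'regressed';
    report[metric] = { before, after, verdict };
  }
  return report;
}

const report = compareRuns(
  { lcp_ms: 3200, inp_ms: 180 },
  { lcp_ms: 2100, inp_ms: 181 },
);
// lcp_ms improved (about 34% lower); the inp_ms change is within the noise threshold
```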

STOP — Verification complete when:


  • Same measurement methodology used as Phase 1
  • Improvement is quantified (e.g., "LCP reduced from 3.2s to 2.1s")
  • No regressions in other metrics
  • If no improvement: change reverted


Caching Strategy Decision Table


| Cache Type | Use When | TTL Guidance | Invalidation |
| --- | --- | --- | --- |
| In-memory (LRU) | Single-instance, hot data, computed values | Seconds to minutes | Eviction policy |
| Redis/Memcached | Multi-instance, shared cache, sessions | Minutes to hours | Event-based or TTL |
| CDN | Static assets, public pages, API responses | Hours to days | Deploy-triggered purge |
| Browser | Repeat visits, static resources | Days to months (versioned) | Cache-busting hash |
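The in-memory (LRU) row can be sketched with a `Map`, which preserves insertion order. This is illustrative only; production code would typically use a library such as `lru-cache`. The clock is injectable so TTL behavior can be tested without real delays:

```javascript
// Minimal in-memory LRU cache with TTL (a sketch, not production code).
class LruTtlCache {
  constructor({ capacity = 100, ttlMs = 60_000, now = Date.now } = {}) {
    this.capacity = capacity;
    this.ttlMs = ttlMs;
    this.now = now;
    this.map = new Map(); // insertion order: oldest (least recently used) first
  }

  get(key) {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (this.now() > entry.expires) { // expired: drop it
      this.map.delete(key);
      return undefined;
    }
    this.map.delete(key); // refresh recency by re-inserting at the end
    this.map.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, { value, expires: this.now() + this.ttlMs });
    if (this.map.size > this.capacity) {
      // evict the least recently used entry (first key in insertion order)
      this.map.delete(this.map.keys().next().value);
    }
  }
}
```

Every hit refreshes recency, so hot keys survive eviction while cold ones age out.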

Cache-Control Headers


```
# Immutable assets (hashed filenames)
Cache-Control: public, max-age=31536000, immutable

# API responses (cacheable but must revalidate)
Cache-Control: public, max-age=0, must-revalidate
ETag: "abc123"

# Private user data
Cache-Control: private, no-store

# Stale-while-revalidate (fast response + background refresh)
Cache-Control: public, max-age=60, stale-while-revalidate=300
```

---
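The stale-while-revalidate header above maps onto a three-state serving decision. A sketch of that decision logic, with the 60s/300s windows taken from the header and the function name being illustrative:

```javascript
// Serving decision implied by:
//   Cache-Control: public, max-age=60, stale-while-revalidate=300
function swrDecision(ageSeconds, { maxAge = 60, swr = 300 } = {}) {
  if (ageSeconds <= maxAge) return 'fresh'; // serve from cache
  if (ageSeconds <= maxAge + swr) return 'stale-revalidate'; // serve stale, refresh in background
  return 'fetch'; // too old: block on an origin fetch
}
```

The user only ever waits on the network in the third state; the middle state trades a bounded amount of staleness for a fast response.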

Bundle Optimization Techniques


| Technique | Impact | Implementation |
| --- | --- | --- |
| Route-level code splitting | High | `React.lazy()` + `Suspense` per route |
| Tree shaking | High | ES modules only, `sideEffects: false` |
| Dynamic imports | Medium | `await import('heavy-lib')` on user action |
| Image optimization | High | next/image, WebP/AVIF, responsive srcset |
| Font optimization | Medium | `next/font`, `font-display: swap`, subset |
| Dependency replacement | Medium | day.js for moment.js, lodash-es for lodash |

Bundle Analysis Commands


```bash
# Webpack
npx webpack-bundle-analyzer stats.json

# Vite
npx vite-bundle-visualizer

# Next.js
ANALYZE=true next build
```

---

Database Query Tuning


Index Optimization


```sql
-- Find missing indexes (PostgreSQL)
SELECT schemaname, tablename, seq_scan, idx_scan
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan
ORDER BY seq_scan DESC;
```

Index Rules


| Rule | Explanation |
| --- | --- |
| Index WHERE, JOIN, ORDER BY columns | These are the columns the DB searches |
| Equality columns first in composite index | Most selective filtering first |
| Range columns last in composite index | Less selective, applied after equality |
| Remove unused indexes | They slow down writes |
| Use partial indexes for filtered queries | Smaller index, faster lookups |

Query Plan Red Flags


| Red Flag in EXPLAIN ANALYZE | Meaning | Fix |
| --- | --- | --- |
| Seq Scan on large table | Full table scan | Add index |
| Nested Loop with many rows | `O(n*m)` join | Add index or restructure query |
| Sort with high memory | Sorting in memory | Add index matching ORDER BY |
| Actual rows >> estimated rows | Stale statistics | Run ANALYZE |
| Hash Join with large build | Memory-intensive | Ensure join columns are indexed |


Web Vitals Targets


| Metric | Good | Needs Work | Poor |
| --- | --- | --- | --- |
| LCP (Largest Contentful Paint) | < 2.5s | 2.5-4s | > 4s |
| INP (Interaction to Next Paint) | < 200ms | 200-500ms | > 500ms |
| CLS (Cumulative Layout Shift) | < 0.1 | 0.1-0.25 | > 0.25 |

Web Vitals Optimization Table


| Metric | Optimization | Implementation |
| --- | --- | --- |
| LCP | Preload LCP resource | `<link rel="preload">` or `fetchpriority="high"` |
| LCP | Inline critical CSS | Extract above-fold CSS inline |
| LCP | Optimize TTFB | CDN, edge rendering, server caching |
| INP | Break long tasks | `requestIdleCallback`, `scheduler.yield()` |
| INP | Debounce input handlers | 100-300ms debounce on expensive handlers |
| INP | Web Workers | Move computation off main thread |
| CLS | Explicit dimensions | Set `width`/`height` on images and videos |
| CLS | Reserve space for dynamic content | Placeholder sizing for ads, embeds |
| CLS | Use transform animations | Avoid layout-triggering properties |

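The debounce row can be sketched as follows. The timer functions are injectable (an assumption made for testability); the defaults are the standard timers:

```javascript
// Debounce an expensive input handler: only the last call in a burst runs,
// after waitMs of quiet.
function debounce(fn, waitMs, timers = { set: setTimeout, clear: clearTimeout }) {
  let timer = null;
  return (...args) => {
    if (timer !== null) timers.clear(timer); // cancel the pending call
    timer = timers.set(() => {
      timer = null;
      fn(...args); // run with the latest arguments only
    }, waitMs);
  };
}
```

Attach the debounced function to the input event with a 100-300ms wait, per the table; the raw handler then runs once per pause rather than once per keystroke.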

Load Testing


Test Types


| Type | Users | Duration | Purpose |
| --- | --- | --- | --- |
| Smoke | 1-2 | 1 minute | Verify test works |
| Load | Expected traffic | 10-30 min | Normal performance |
| Stress | 2-3x expected | 10-30 min | Find breaking point |
| Soak | Normal load | 2-8 hours | Find memory leaks |

Key Metrics


  • Response time percentiles (p50, p95, p99) — not averages
  • Error rate under load
  • Throughput (requests per second)
  • Resource utilization (CPU, memory, connections)

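Percentiles can be computed from raw samples with the nearest-rank method; load-testing tools such as k6 report them directly and may interpolate slightly differently. A sketch that also shows why averages mislead:

```javascript
// Nearest-rank percentile over raw latency samples.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based nearest rank
  return sorted[Math.max(0, rank - 1)];
}

const latenciesMs = [12, 15, 11, 250, 14, 13, 16, 12, 900, 15];
const summary = {
  p50: percentile(latenciesMs, 50),
  p95: percentile(latenciesMs, 95),
  p99: percentile(latenciesMs, 99),
};
// The mean of these samples (about 126ms) hides the 900ms tail that p99 exposes.
```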

Anti-Patterns / Common Mistakes


| Anti-Pattern | Why It Is Wrong | Correct Approach |
| --- | --- | --- |
| Optimizing without measuring | You do not know what to fix | MEASURE first, always |
| Premature optimization | Wastes time on non-bottlenecks | Profile to find the actual bottleneck |
| Memoizing everything | Adds complexity without proven benefit | Profile first, memoize second |
| Caching without invalidation strategy | Stale data causes bugs | Define invalidation before adding cache |
| Optimizing averages instead of percentiles | Averages hide tail latency | Track p95 and p99 |
| Multiple optimizations at once | Cannot attribute improvement | One change at a time |
| Keeping optimizations that do not measurably help | Dead code and complexity | Revert if no measurable improvement |
| Adding indexes without checking query patterns | Unused indexes slow writes | Check slow query log first |


Subagent Dispatch Opportunities


| Task Pattern | Dispatch To | When |
| --- | --- | --- |
| Profiling different system layers concurrently | `Agent` tool with `subagent_type="Explore"` (one per layer) | When analyzing frontend, backend, and database independently |
| Bundle analysis and tree-shaking review | `Agent` tool with `subagent_type="general-purpose"` | When frontend bundle size is a concern |
| Database query optimization analysis | `Agent` tool dispatching the `database-architect` agent | When slow queries are identified across multiple tables |

Follow the `dispatching-parallel-agents` skill protocol when dispatching.


Integration Points


| Skill | Relationship |
| --- | --- |
| senior-frontend | Frontend performance uses bundle and Web Vitals optimization |
| senior-backend | Backend performance uses caching and query tuning |
| testing-strategy | Load tests are part of the testing pyramid |
| code-review | Review checks for performance regressions |
| systematic-debugging | Performance issues follow the same investigation methodology |
| acceptance-testing | Performance targets become acceptance criteria |


Skill Type


FLEXIBLE — Adapt the depth of optimization to the project context. The MEASURE-IDENTIFY-OPTIMIZE-VERIFY cycle is mandatory for every optimization. Revert any change that does not produce measurable improvement.