
Benchmark — Performance Baseline & Regression Detection


When to Use


  • Before and after a PR to measure performance impact
  • Setting up performance baselines for a project
  • When users report "it feels slow"
  • Before a launch — ensure you meet performance targets
  • Comparing your stack against alternatives

How It Works


Mode 1: Page Performance


Measures real browser metrics via browser MCP:
1. Navigate to each target URL
2. Measure Core Web Vitals:
   - LCP (Largest Contentful Paint) — target < 2.5s
   - CLS (Cumulative Layout Shift) — target < 0.1
   - INP (Interaction to Next Paint) — target < 200ms
   - FCP (First Contentful Paint) — target < 1.8s
   - TTFB (Time to First Byte) — target < 800ms
3. Measure resource sizes:
   - Total page weight (target < 1MB)
   - JS bundle size (target < 200KB gzipped)
   - CSS size
   - Image weight
   - Third-party script weight
4. Count network requests
5. Check for render-blocking resources
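The threshold checks in step 2 can be sketched as follows. This is an illustrative sketch, not the command's actual implementation: the `check_vitals` helper and the metric key names are hypothetical, and the `metrics` dict stands in for whatever the browser-MCP measurement step returns.

```python
# Core Web Vitals targets, mirroring the thresholds listed above.
CWV_TARGETS = {
    "lcp_s": 2.5,    # Largest Contentful Paint (seconds)
    "cls": 0.1,      # Cumulative Layout Shift (unitless)
    "inp_ms": 200,   # Interaction to Next Paint (milliseconds)
    "fcp_s": 1.8,    # First Contentful Paint (seconds)
    "ttfb_ms": 800,  # Time to First Byte (milliseconds)
}

def check_vitals(metrics: dict) -> dict:
    """Return PASS/FAIL per measured metric (lower is better for all)."""
    return {
        name: "PASS" if metrics[name] <= limit else "FAIL"
        for name, limit in CWV_TARGETS.items()
        if name in metrics
    }

print(check_vitals({"lcp_s": 1.2, "cls": 0.05, "inp_ms": 250}))
# → {'lcp_s': 'PASS', 'cls': 'PASS', 'inp_ms': 'FAIL'}
```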

Mode 2: API Performance


Benchmarks API endpoints:
1. Hit each endpoint 100 times
2. Measure: p50, p95, p99 latency
3. Track: response size, status codes
4. Test under load: 10 concurrent requests
5. Compare against SLA targets
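A minimal Python sketch of steps 1–2 (repeated requests, then p50/p95/p99 from the sorted latencies). The `benchmark_endpoint` and `summarize` helpers are hypothetical, concurrency (step 4) is omitted, and nearest-rank percentiles are used for simplicity:

```python
import time
import urllib.request

def benchmark_endpoint(url: str, runs: int = 100) -> dict:
    """Hit `url` `runs` times and report latency percentiles in ms."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()
        latencies.append((time.perf_counter() - start) * 1000)
    return summarize(latencies)

def summarize(latencies_ms: list) -> dict:
    """Nearest-rank p50/p95/p99 over a sample of latencies."""
    ordered = sorted(latencies_ms)
    def pct(p: float) -> float:
        return ordered[min(len(ordered) - 1, int(p / 100 * len(ordered)))]
    return {"p50": pct(50), "p95": pct(95), "p99": pct(99)}

# With 100 samples of 1..100 ms, nearest-rank gives:
print(summarize([float(i) for i in range(1, 101)]))
# → {'p50': 51.0, 'p95': 96.0, 'p99': 100.0}
```

For the concurrent-load step, the same `summarize` could be fed latencies collected from a `concurrent.futures.ThreadPoolExecutor` with 10 workers.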

Mode 3: Build Performance


Measures development feedback loop:
1. Cold build time
2. Hot reload time (HMR)
3. Test suite duration
4. TypeScript check time
5. Lint time
6. Docker build time
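Each of these is just a wall-clock timing of a command. A sketch under that assumption — the `time_command` helper and the step commands are placeholders, not the command's real internals:

```python
import subprocess
import sys
import time

def time_command(cmd: list) -> float:
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

# Hypothetical steps; substitute your project's real commands,
# e.g. ["npm", "run", "build"], ["tsc", "--noEmit"], ["eslint", "."].
steps = {"noop": [sys.executable, "-c", "pass"]}
for name, cmd in steps.items():
    print(f"{name}: {time_command(cmd):.2f}s")
```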

Mode 4: Before/After Comparison


Run before and after a change to measure impact:
/benchmark baseline    # saves current metrics

... make changes ...

/benchmark compare     # compares against baseline

Output:

| Metric | Before | After | Delta  | Verdict  |
|--------|--------|-------|--------|----------|
| LCP    | 1.2s   | 1.4s  | +200ms | ⚠ WARN   |
| Bundle | 180KB  | 175KB | -5KB   | ✓ BETTER |
| Build  | 12s    | 14s   | +2s    | ⚠ WARN   |

Output


Stores baselines in .ecc/benchmarks/ as JSON. The directory is git-tracked so the team shares baselines.
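The baseline/compare flow can be sketched in Python. This is an illustration of the idea, not the command's actual implementation: the `save_baseline`/`compare` helpers, the 5% tolerance, and the metric key names are all hypothetical, and lower values are assumed to be better.

```python
import json
from pathlib import Path

BASELINE_DIR = Path(".ecc/benchmarks")  # git-tracked, per the docs above

def save_baseline(name: str, metrics: dict) -> None:
    """Write the current metrics as the named JSON baseline."""
    BASELINE_DIR.mkdir(parents=True, exist_ok=True)
    (BASELINE_DIR / f"{name}.json").write_text(json.dumps(metrics, indent=2))

def compare(name: str, current: dict, tolerance: float = 0.05) -> dict:
    """Per-metric verdict vs. the saved baseline: BETTER / OK / WARN."""
    baseline = json.loads((BASELINE_DIR / f"{name}.json").read_text())
    verdicts = {}
    for key, before in baseline.items():
        after = current.get(key)
        if after is None:
            continue  # metric not measured this run
        if after < before:
            verdicts[key] = "BETTER"
        elif after <= before * (1 + tolerance):
            verdicts[key] = "OK"   # within tolerance of baseline
        else:
            verdicts[key] = "WARN"  # regressed beyond tolerance
    return verdicts
```

Using the numbers from the table above, `save_baseline("demo", {"lcp_ms": 1200, "bundle_kb": 180})` followed by `compare("demo", {"lcp_ms": 1400, "bundle_kb": 175})` yields a WARN for LCP and a BETTER for the bundle.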

Integration


  • CI: run /benchmark compare on every PR
  • Pair with /canary-watch for post-deploy monitoring
  • Pair with /browser-qa for the full pre-ship checklist