
Benchmark — Performance Baseline & Regression Detection


When to Use


  • Before and after a PR to measure performance impact
  • Setting up performance baselines for a project
  • When users report "it feels slow"
  • Before a launch — ensure you meet performance targets
  • Comparing your stack against alternatives

How It Works


Mode 1: Page Performance


Measures real browser metrics via browser MCP:
1. Navigate to each target URL
2. Measure Core Web Vitals:
   - LCP (Largest Contentful Paint) — target < 2.5s
   - CLS (Cumulative Layout Shift) — target < 0.1
   - INP (Interaction to Next Paint) — target < 200ms
   - FCP (First Contentful Paint) — target < 1.8s
   - TTFB (Time to First Byte) — target < 800ms
3. Measure resource sizes:
   - Total page weight (target < 1MB)
   - JS bundle size (target < 200KB gzipped)
   - CSS size
   - Image weight
   - Third-party script weight
4. Count network requests
5. Check for render-blocking resources
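The threshold checks in step 2 can be sketched as follows. This is an illustrative sketch, not the command's actual implementation: the `check_vitals` helper and the metric key names are hypothetical, and the `metrics` dict stands in for whatever the browser-MCP measurement step returns.

```python
# Core Web Vitals targets, mirroring the thresholds listed above.
CWV_TARGETS = {
    "lcp_s": 2.5,    # Largest Contentful Paint (seconds)
    "cls": 0.1,      # Cumulative Layout Shift (unitless)
    "inp_ms": 200,   # Interaction to Next Paint (milliseconds)
    "fcp_s": 1.8,    # First Contentful Paint (seconds)
    "ttfb_ms": 800,  # Time to First Byte (milliseconds)
}

def check_vitals(metrics: dict) -> dict:
    """Return PASS/FAIL per measured metric (lower is better for all)."""
    return {
        name: "PASS" if metrics[name] <= limit else "FAIL"
        for name, limit in CWV_TARGETS.items()
        if name in metrics
    }

print(check_vitals({"lcp_s": 1.2, "cls": 0.05, "inp_ms": 250}))
# → {'lcp_s': 'PASS', 'cls': 'PASS', 'inp_ms': 'FAIL'}
```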

Mode 2: API Performance


Benchmarks API endpoints:
1. Hit each endpoint 100 times
2. Measure: p50, p95, p99 latency
3. Track: response size, status codes
4. Test under load: 10 concurrent requests
5. Compare against SLA targets
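A minimal Python sketch of steps 1–2 (repeated requests, then p50/p95/p99 from the sorted latencies). The `benchmark_endpoint` and `summarize` helpers are hypothetical, concurrency (step 4) is omitted, and nearest-rank percentiles are used for simplicity:

```python
import time
import urllib.request

def benchmark_endpoint(url: str, runs: int = 100) -> dict:
    """Hit `url` `runs` times and report latency percentiles in ms."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()
        latencies.append((time.perf_counter() - start) * 1000)
    return summarize(latencies)

def summarize(latencies_ms: list) -> dict:
    """Nearest-rank p50/p95/p99 over a sample of latencies."""
    ordered = sorted(latencies_ms)
    def pct(p: float) -> float:
        return ordered[min(len(ordered) - 1, int(p / 100 * len(ordered)))]
    return {"p50": pct(50), "p95": pct(95), "p99": pct(99)}

# With 100 samples of 1..100 ms, nearest-rank gives:
print(summarize([float(i) for i in range(1, 101)]))
# → {'p50': 51.0, 'p95': 96.0, 'p99': 100.0}
```

For the concurrent-load step, the same `summarize` could be fed latencies collected from a `concurrent.futures.ThreadPoolExecutor` with 10 workers.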

Mode 3: Build Performance


Measures development feedback loop:
1. Cold build time
2. Hot reload time (HMR)
3. Test suite duration
4. TypeScript check time
5. Lint time
6. Docker build time
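Each of these is just a wall-clock timing of a command. A sketch under that assumption — the `time_command` helper and the step commands are placeholders, not the command's real internals:

```python
import subprocess
import sys
import time

def time_command(cmd: list) -> float:
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

# Hypothetical steps; substitute your project's real commands,
# e.g. ["npm", "run", "build"], ["tsc", "--noEmit"], ["eslint", "."].
steps = {"noop": [sys.executable, "-c", "pass"]}
for name, cmd in steps.items():
    print(f"{name}: {time_command(cmd):.2f}s")
```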

Mode 4: Before/After Comparison


Run before and after a change to measure impact:
/benchmark baseline    # saves current metrics

... make changes ...

/benchmark compare     # compares against baseline

Output:

| Metric | Before | After | Delta  | Verdict  |
|--------|--------|-------|--------|----------|
| LCP    | 1.2s   | 1.4s  | +200ms | ⚠ WARN   |
| Bundle | 180KB  | 175KB | -5KB   | ✓ BETTER |
| Build  | 12s    | 14s   | +2s    | ⚠ WARN   |

Output


Stores baselines in .ecc/benchmarks/ as JSON. The directory is git-tracked so the team shares baselines.
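The baseline/compare flow can be sketched in Python. This is an illustration of the idea, not the command's actual implementation: the `save_baseline`/`compare` helpers, the 5% tolerance, and the metric key names are all hypothetical, and lower values are assumed to be better.

```python
import json
from pathlib import Path

BASELINE_DIR = Path(".ecc/benchmarks")  # git-tracked, per the docs above

def save_baseline(name: str, metrics: dict) -> None:
    """Write the current metrics as the named JSON baseline."""
    BASELINE_DIR.mkdir(parents=True, exist_ok=True)
    (BASELINE_DIR / f"{name}.json").write_text(json.dumps(metrics, indent=2))

def compare(name: str, current: dict, tolerance: float = 0.05) -> dict:
    """Per-metric verdict vs. the saved baseline: BETTER / OK / WARN."""
    baseline = json.loads((BASELINE_DIR / f"{name}.json").read_text())
    verdicts = {}
    for key, before in baseline.items():
        after = current.get(key)
        if after is None:
            continue  # metric not measured this run
        if after < before:
            verdicts[key] = "BETTER"
        elif after <= before * (1 + tolerance):
            verdicts[key] = "OK"   # within tolerance of baseline
        else:
            verdicts[key] = "WARN"  # regressed beyond tolerance
    return verdicts
```

Using the numbers from the table above, `save_baseline("demo", {"lcp_ms": 1200, "bundle_kb": 180})` followed by `compare("demo", {"lcp_ms": 1400, "bundle_kb": 175})` yields a WARN for LCP and a BETTER for the bundle.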

Integration


  • CI: run /benchmark compare on every PR
  • Pair with /canary-watch for post-deploy monitoring
  • Pair with /browser-qa for the full pre-ship checklist