performance-testing

Performance Testing

<default_to_action> When testing performance or planning load tests:
  1. DEFINE SLOs: p95 response time, throughput, error rate targets
  2. IDENTIFY critical paths: revenue flows, high-traffic pages, key APIs
  3. CREATE realistic scenarios: user journeys, think time, varied data
  4. EXECUTE with monitoring: CPU, memory, DB queries, network
  5. ANALYZE bottlenecks and fix before production
Quick Test Type Selection:
  • Expected load validation → Load testing
  • Find breaking point → Stress testing
  • Sudden traffic spike → Spike testing
  • Memory leaks, resource exhaustion → Endurance/soak testing
  • Horizontal/vertical scaling → Scalability testing
Critical Success Factors:
  • Performance is a feature, not an afterthought
  • Test early and often, not just before release
  • Focus on user-impacting bottlenecks </default_to_action>

Quick Reference Card

When to Use

  • Before major releases
  • After infrastructure changes
  • Before scaling events (Black Friday)
  • When setting SLAs/SLOs

Test Types

Type | Purpose | When
Load | Expected traffic | Every release
Stress | Beyond capacity | Quarterly
Spike | Sudden surge | Before events
Endurance | Memory leaks | After code changes
Scalability | Scaling validation | Infrastructure changes
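
The test types above differ mainly in the shape of the load over time. The following is a minimal sketch, assuming k6-style stage entries (`duration`, `target`) like those used in the k6 example later in this document. The `buildStages` helper, its durations, and its multipliers are illustrative assumptions, not recommended values. Scalability testing is left out because it depends on how the infrastructure is resized between runs.

```javascript
// Hypothetical helper: returns k6-style stages for a given test type.
// `expectedVus` is the number of virtual users expected under normal traffic.
function buildStages(type, expectedVus) {
  switch (type) {
    case 'load':
      // Ramp to expected traffic, hold, then ramp down.
      return [
        { duration: '5m', target: expectedVus },
        { duration: '20m', target: expectedVus },
        { duration: '5m', target: 0 },
      ];
    case 'stress':
      // Step beyond expected capacity (here up to 3x) to find the breaking point.
      return [
        { duration: '5m', target: expectedVus },
        { duration: '5m', target: expectedVus * 2 },
        { duration: '5m', target: expectedVus * 3 },
        { duration: '5m', target: 0 },
      ];
    case 'spike':
      // Jump to a surge within seconds, hold briefly, then drop back.
      return [
        { duration: '1m', target: expectedVus },
        { duration: '10s', target: expectedVus * 5 },
        { duration: '3m', target: expectedVus * 5 },
        { duration: '10s', target: expectedVus },
        { duration: '1m', target: 0 },
      ];
    case 'endurance':
      // Hold expected traffic for a long period to expose leaks.
      return [
        { duration: '10m', target: expectedVus },
        { duration: '24h', target: expectedVus },
        { duration: '10m', target: 0 },
      ];
    default:
      throw new Error(`Unknown test type: ${type}`);
  }
}
```

The returned array can be assigned to `options.stages` in a k6 script.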

Key Metrics

Metric | Target | Why
p95 response | < 200ms | User experience
Throughput | 10k req/min | Capacity
Error rate | < 0.1% | Reliability
CPU | < 70% | Headroom
Memory | < 80% | Stability
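
A p95 target means 95% of requests complete within the limit, so a few slow outliers do not hide behind an average. The following is a minimal sketch of how such a percentile can be computed from raw samples with the nearest-rank method. Load tools may interpolate differently, so the function and the sample data are illustrative assumptions.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Made-up example: 20 response times in milliseconds with one slow outlier.
const durations = [
  110, 120, 125, 130, 135, 140, 142, 145, 150, 152,
  155, 160, 162, 165, 170, 175, 180, 185, 190, 900,
];
const p95 = percentile(durations, 95);
console.log(p95, p95 < 200 ? 'meets target' : 'misses target');
```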

Tools

  • k6: Modern, JS-based, CI/CD friendly
  • JMeter: Enterprise, feature-rich
  • Artillery: Simple YAML configs
  • Gatling: Scala, great reporting

Agent Coordination

  • qe-performance-tester: Load test orchestration
  • qe-quality-analyzer: Results analysis
  • qe-production-intelligence: Production comparison

Defining SLOs

Bad: "The system should be fast"
Good: "p95 response time < 200ms under 1,000 concurrent users"
javascript
export const options = {
  thresholds: {
    http_req_duration: ['p(95)<200'],  // 95% < 200ms
    http_req_failed: ['rate<0.01'],     // < 1% failures
  },
};

Realistic Scenarios

Bad: Every user hits homepage repeatedly
Good: Model actual user behavior
javascript
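// Realistic distribution
// 40% browse, 30% search, 20% details, 10% checkout
import { sleep } from 'k6';
// randomIntBetween comes from the k6-utils jslib
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.2.0/index.js';

// browse(), search(), viewProduct(), and checkout() are assumed to be
// scenario helpers defined elsewhere in the test suite.
export default function () {
  const action = Math.random();
  if (action < 0.4) browse();
  else if (action < 0.7) search();
  else if (action < 0.9) viewProduct();
  else checkout();

  sleep(randomIntBetween(1, 5)); // Think time: 1-5 seconds
}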
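
The cumulative cut-offs in the script above (0.4, 0.7, 0.9) can be derived from the weights instead of being written by hand, which makes the traffic mix easier to change. The following is a minimal sketch in plain JavaScript. `pickAction` and its injectable `rand` parameter are assumptions introduced for illustration and are not part of k6.

```javascript
// Picks an action name according to relative weights.
// `rand` is a number in [0, 1); injecting it keeps the function testable.
function pickAction(weights, rand = Math.random()) {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  let cumulative = 0;
  for (const [name, weight] of Object.entries(weights)) {
    cumulative += weight / total;
    if (rand < cumulative) return name;
  }
  // Guard against floating-point rounding when rand is very close to 1.
  return Object.keys(weights).pop();
}

// Same mix as the comment above: 40% browse, 30% search, 20% details, 10% checkout.
const mix = { browse: 40, search: 30, viewProduct: 20, checkout: 10 };
console.log(pickAction(mix));
```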

Common Bottlenecks

Database

Symptoms: Slow queries under load, connection pool exhaustion
Fixes: Add indexes, optimize N+1 queries, increase pool size, read replicas

N+1 Queries

javascript
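// Assumes Sequelize-style models with an Order -> Customer association.
// BAD: 100 orders = 101 queries
const orders = await Order.findAll();
for (const order of orders) {
  // findByPk replaces findById, which Sequelize removed in v5
  const customer = await Customer.findByPk(order.customerId);
}

// GOOD: 1 query (customers are joined through the association)
const ordersWithCustomers = await Order.findAll({ include: [Customer] });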
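
When eager loading is not available, batching the lookups gives a similar reduction in query count. The sketch below runs against a hypothetical in-memory `db` object that counts the queries it serves. The `db` API is invented for illustration and does not correspond to a real library.

```javascript
// Hypothetical in-memory data source that counts how many queries it serves.
function createDb(orders, customers) {
  return {
    queryCount: 0,
    findOrders() {
      this.queryCount += 1;
      return orders;
    },
    findCustomerById(id) {
      this.queryCount += 1;
      return customers.find((c) => c.id === id);
    },
    findCustomersByIds(ids) {
      this.queryCount += 1; // stands in for one "WHERE id IN (...)" query
      return customers.filter((c) => ids.includes(c.id));
    },
  };
}

// N+1: one query for the orders, then one more per order.
function loadNaive(db) {
  return db.findOrders().map((order) => ({
    ...order,
    customer: db.findCustomerById(order.customerId),
  }));
}

// Batched: one query for the orders, one for all referenced customers.
function loadBatched(db) {
  const orders = db.findOrders();
  const ids = [...new Set(orders.map((o) => o.customerId))];
  const byId = new Map(db.findCustomersByIds(ids).map((c) => [c.id, c]));
  return orders.map((order) => ({ ...order, customer: byId.get(order.customerId) }));
}
```

With 100 orders, `loadNaive` issues 101 queries against this stand-in and `loadBatched` issues 2, while both return the same data.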

Synchronous Processing

Problem: Blocking operations in request path (sending email during checkout)
Fix: Use message queues, process async, return immediately
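
The queue-based fix can be sketched with an in-memory array standing in for a message broker. The names `emailQueue`, `handleCheckout`, and `runEmailWorker` are hypothetical. A real deployment would use a durable queue and a separate worker process.

```javascript
// In-memory stand-in for a message queue; a real system would use a broker.
const emailQueue = [];

// Request handler: enqueues the email job and returns immediately,
// so the response time does not depend on the mail server.
function handleCheckout(order) {
  emailQueue.push({ type: 'order-confirmation', orderId: order.id, to: order.email });
  return { status: 'accepted', orderId: order.id };
}

// Worker: runs outside the request path and drains queued jobs.
// `sendEmail` is whatever function actually delivers the message.
function runEmailWorker(sendEmail) {
  let processed = 0;
  while (emailQueue.length > 0) {
    const job = emailQueue.shift();
    sendEmail(job);
    processed += 1;
  }
  return processed;
}
```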

Memory Leaks

Detection: Endurance testing, memory profiling
Common causes: Event listeners not cleaned, caches without eviction
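
A cache without eviction grows for as long as the process runs, which is the kind of leak an endurance test tends to surface. One possible remedy is a size-bounded cache. The following is a minimal sketch of least-recently-used eviction. The `LruCache` class is illustrative, not taken from a particular library.

```javascript
// Bounded cache with least-recently-used eviction, built on the
// insertion-order iteration that Map guarantees.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map();
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }

  get size() {
    return this.entries.size;
  }
}
```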

External Dependencies

Solutions: Aggressive timeouts, circuit breakers, caching, graceful degradation
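
Of the solutions listed, the circuit breaker is the least obvious to implement. The following is a minimal sketch for synchronous calls, with a fallback that provides graceful degradation. The class, its option names, and the default values are assumptions made for illustration. A production breaker would also wrap asynchronous calls and enforce timeouts.

```javascript
// Minimal circuit breaker. `now` is injectable so the cool-down can be
// tested without waiting.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetAfterMs = 10000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
    this.failures = 0;
    this.openedAt = null; // null means the circuit is closed
  }

  call(fn, fallback) {
    if (this.openedAt !== null) {
      if (this.now() - this.openedAt < this.resetAfterMs) {
        return fallback(); // open: fail fast without touching the dependency
      }
      this.openedAt = null; // cool-down elapsed: allow one trial call
    }
    try {
      const result = fn();
      this.failures = 0; // success closes the circuit again
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) this.openedAt = this.now();
      return fallback();
    }
  }
}
```

While the circuit is open, the dependency is not called at all, which keeps a slow or failing service from tying up request threads under load.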

k6 CI/CD Example

javascript
// performance-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '1m', target: 50 },   // Ramp up
    { duration: '3m', target: 50 },   // Steady
    { duration: '1m', target: 0 },    // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<200'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/products');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 200ms': (r) => r.timings.duration < 200,
  });
  sleep(1);
}

GitHub Actions

yaml
- name: Run k6 test
  uses: grafana/k6-action@v0.3.0
  with:
    filename: performance-test.js

Analyzing Results

Good Results

Load: 1,000 users | p95: 180ms | Throughput: 5,000 req/s
Error rate: 0.05% | CPU: 65% | Memory: 70%

Problems

Load: 1,000 users | p95: 3,500ms ❌ | Throughput: 500 req/s ❌
Error rate: 5% ❌ | CPU: 95% ❌ | Memory: 90% ❌
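
The pass/fail marks above can be produced mechanically by comparing each run with the upper limits from the Key Metrics table. The following is a minimal sketch with hypothetical field names. Throughput is left out because it is judged against an expected rate rather than an upper limit.

```javascript
// Upper limits taken from the Key Metrics table; a value at or above the
// limit counts as a violation.
const limits = { p95Ms: 200, errorRatePct: 0.1, cpuPct: 70, memoryPct: 80 };

// Returns the names of the metrics that breach their limit.
function findViolations(results, maxAllowed = limits) {
  return Object.keys(maxAllowed).filter((metric) => results[metric] >= maxAllowed[metric]);
}

// The two runs shown above, expressed with the hypothetical field names.
const goodRun = { p95Ms: 180, errorRatePct: 0.05, cpuPct: 65, memoryPct: 70 };
const badRun = { p95Ms: 3500, errorRatePct: 5, cpuPct: 95, memoryPct: 90 };

console.log(findViolations(goodRun));
console.log(findViolations(badRun));
```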

Root Cause Analysis

  1. Correlate metrics: When response time spikes, what changes?
  2. Check logs: Errors, warnings, slow queries
  3. Profile code: Where is time spent?
  4. Monitor resources: CPU, memory, disk
  5. Trace requests: End-to-end flow
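
Step 1 can be made concrete by correlating the response-time series with each resource metric sampled over the same intervals. The following is a minimal sketch using the Pearson coefficient. The sample series are made up, and a strong correlation only indicates where to look first, not a proven cause.

```javascript
// Pearson correlation of two equal-length series. The result is NaN if
// either series is constant.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error('need two series of equal length with at least 2 points');
  }
  const mean = (values) => values.reduce((sum, v) => sum + v, 0) / values.length;
  const meanX = mean(xs);
  const meanY = mean(ys);
  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  for (let i = 0; i < xs.length; i += 1) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  }
  return covariance / Math.sqrt(varianceX * varianceY);
}

// Made-up per-interval samples: p95 spikes in the last two intervals.
const responseMs = [120, 130, 125, 900, 950];
const dbConnections = [10, 12, 11, 50, 50];
const cpuPct = [42, 40, 43, 41, 42];

console.log('db connections:', pearson(responseMs, dbConnections).toFixed(2));
console.log('cpu:', pearson(responseMs, cpuPct).toFixed(2));
```

In this made-up data the connection count moves with the response time while CPU does not, which would point the investigation at the database first.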

Anti-Patterns

❌ Anti-Pattern | ✅ Better
Testing too late | Test early and often
Unrealistic scenarios | Model real user behavior
0 to 1000 users instantly | Ramp up gradually
No monitoring during tests | Monitor everything
No baseline | Establish and track trends
One-time testing | Continuous performance testing
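
Tracking a baseline only helps if each new run is compared against it. The following is a minimal sketch of such a comparison. The metric names and the 10% tolerance are illustrative assumptions, and the function assumes lower values are better and baseline values are non-zero.

```javascript
// Flags metrics that regressed by more than `tolerance` (a fraction, for
// example 0.1 for 10%) relative to a stored baseline.
function findRegressions(baseline, current, tolerance = 0.1) {
  return Object.keys(baseline)
    .filter((metric) => metric in current)
    .map((metric) => ({
      metric,
      baseline: baseline[metric],
      current: current[metric],
      change: (current[metric] - baseline[metric]) / baseline[metric],
    }))
    .filter((row) => row.change > tolerance);
}

const baselineRun = { p95Ms: 180, errorRatePct: 0.05 };
const latestRun = { p95Ms: 210, errorRatePct: 0.05 };
console.log(findRegressions(baselineRun, latestRun));
```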

Agent-Assisted Performance Testing

typescript
// Comprehensive load test; the result is kept for the bottleneck analysis below
const perfTest = await Task("Load Test", {
  target: 'https://api.example.com',
  scenarios: {
    checkout: { vus: 100, duration: '5m' },
    search: { vus: 200, duration: '5m' },
    browse: { vus: 500, duration: '5m' }
  },
  thresholds: {
    'http_req_duration': ['p(95)<200'],
    'http_req_failed': ['rate<0.01']
  }
}, "qe-performance-tester");

// Bottleneck analysis
await Task("Analyze Bottlenecks", {
  testResults: perfTest,
  metrics: ['cpu', 'memory', 'db_queries', 'network']
}, "qe-performance-tester");

// CI integration
await Task("CI Performance Gate", {
  mode: 'smoke',
  duration: '1m',
  vus: 10,
  failOn: { 'p95_response_time': 300, 'error_rate': 0.01 }
}, "qe-performance-tester");

Agent Coordination Hints

Memory Namespace

aqe/performance/
├── results/*       - Test execution results
├── baselines/*     - Performance baselines
├── bottlenecks/*   - Identified bottlenecks
└── trends/*        - Historical trends

Fleet Coordination

typescript
const perfFleet = await FleetManager.coordinate({
  strategy: 'performance-testing',
  agents: [
    'qe-performance-tester',
    'qe-quality-analyzer',
    'qe-production-intelligence',
    'qe-deployment-readiness'
  ],
  topology: 'sequential'
});

Pre-Production Checklist

  • Load test passed (expected traffic)
  • Stress test passed (2-3x expected)
  • Spike test passed (sudden surge)
  • Endurance test passed (24+ hours)
  • Database indexes in place
  • Caching configured
  • Monitoring and alerting set up
  • Performance baseline established

Related Skills

  • agentic-quality-engineering - Agent coordination
  • api-testing-patterns - API performance
  • chaos-engineering-resilience - Resilience testing

Remember

Performance is a feature: Test it like functionality
Test continuously: Not just before launch
Monitor production: Synthetic + real user monitoring
Fix what matters: Focus on user-impacting bottlenecks
Trend over time: Catch degradation early

With Agents: Agents automate load testing, analyze bottlenecks, and compare with production. Use agents to maintain performance at scale.