performance-regression-debugging
Performance Regression Debugging
Overview
A performance regression occurs when a code change degrades application performance. Detecting regressions early and resolving them quickly is critical.
When to Use
- Performance degrades after a deployment
- Metrics show a negative trend
- Users complain about slowness
- A/B testing shows a performance difference between variants
- Regular performance monitoring
Instructions
1. Detection & Measurement
```javascript
// Before: 500ms response time
// After: 1000ms response time (2x slower = regression)

// Capture baseline metrics
const baseline = {
  responseTime: 500,            // ms
  timeToInteractive: 2000,      // ms
  largestContentfulPaint: 1500, // ms
  memoryUsage: 50,              // MB
  bundleSize: 150               // KB gzipped
};

// Monitor after change (same units as the baseline)
const current = {
  responseTime: 1000,
  timeToInteractive: 4000,
  largestContentfulPaint: 3000,
  memoryUsage: 150,
  bundleSize: 200
};

// Calculate regression as relative change per metric
// (for every metric here, a higher value is worse)
const regressions = {};
for (const metric of Object.keys(baseline)) {
  const change = (current[metric] - baseline[metric]) / baseline[metric];
  if (change > 0.1) { // >10% degradation
    regressions[metric] = {
      baseline: baseline[metric],
      current: current[metric],
      percentChange: (change * 100).toFixed(1) + '%',
      severity: change > 0.5 ? 'Critical' : 'High'
    };
  }
}

// Results:
// responseTime: 500ms → 1000ms (+100.0%, Critical)
// timeToInteractive: 2000ms → 4000ms (+100.0%, Critical)
// largestContentfulPaint: 1500ms → 3000ms (+100.0%, Critical)
// memoryUsage: 50MB → 150MB (+200.0%, Critical)
// bundleSize: 150KB → 200KB (+33.3%, High)
```
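The baseline and current values above are hard-coded. In practice a single timing is noisy, so one plausible way to obtain such numbers is to time the operation several times and summarize the samples. The sketch below assumes a synchronous operation; `measure`, `percentile`, and `sampleWorkload` are illustrative names, not part of any library.

```javascript
// Nearest-rank percentile of an ascending-sorted array (p between 0 and 100).
function percentile(sorted, p) {
  if (sorted.length === 0) throw new Error('no samples');
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}

// Time a synchronous function `runs` times and summarize the durations (ms).
function measure(fn, runs = 30) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return {
    runs,
    median: percentile(samples, 50),
    p95: percentile(samples, 95),
    max: samples[samples.length - 1]
  };
}

// Hypothetical workload standing in for the code under test.
function sampleWorkload() {
  let total = 0;
  for (let i = 0; i < 1e5; i++) total += Math.sqrt(i);
  return total;
}

const summary = measure(sampleWorkload, 20);
console.log(summary);
```

Comparing medians (or p95 values) from the old and new builds is usually more stable than comparing single runs.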
2. Root Cause Identification
```yaml
Systematic Search:
  Step 1: Identify Changed Code
    - Check git commits between versions
    - Read code review comments
    - Identify risky changes
    - Prioritize by likelihood
  Step 2: Binary Search (Bisect)
    - Start with suspected change
    - Disable the change
    - Re-measure performance
    - If improves → this is the issue
    - If not → disable other changes
    git bisect start
    git bisect bad HEAD
    git bisect good v1.0.0
    # Test each commit, mark it with `git bisect good` or `git bisect bad`,
    # then run `git bisect reset` once the culprit is found
  Step 3: Profile the Change
    - Run profiler on old vs new code
    - Compare flame graphs
    - Identify expensive functions
    - Check allocation patterns
  Step 4: Analyze Impact
    - Code review the change
    - Understand what changed
    - Check for O(n²) algorithms
    - Look for new database queries
    - Check for missing indexes
---
Common Regressions:
  N+1 Query:
    Before: 1 query (10ms)
    After: 1000 queries (1000ms)
    Cause: Removed JOIN, now looping
    Fix: Restore JOIN or use eager loading
  Missing Index:
    Before: Index Scan (10ms)
    After: Seq Scan (500ms)
    Cause: New filter column, no index
    Fix: Add index
  Memory Leak:
    Before: 50MB memory
    After: 500MB after 1 hour
    Cause: Listener not removed, cache grows
    Fix: Clean up properly
  Bundle Size:
    Before: 150KB gzipped
    After: 250KB gzipped
    Cause: Added library without tree-shaking
    Fix: Use lighter alternative or split
  Algorithm Efficiency:
    Before: O(n) = 1ms for 1000 items
    After: O(n²) = 1000ms for 1000 items
    Cause: Nested loops added
    Fix: Use better algorithm
```
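The bisect step can be automated with `git bisect run <command>`, which marks each commit from the command's exit code: 0 means good, 125 means skip, and any other code from 1 to 127 means bad. The sketch below shows only the verdict logic such a script might use; `classifyCommit` and the 750ms threshold (halfway between the 500ms and 1000ms figures used earlier) are assumptions for illustration.

```javascript
// Exit codes understood by `git bisect run`.
const GOOD = 0;   // commit does not show the regression
const BAD = 1;    // commit shows the regression (any 1-127 except 125 works)
const SKIP = 125; // commit cannot be tested, e.g. the build fails

// Median of a list of numbers; does not modify the input.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Decide the verdict for the checked-out commit from timing samples (ms).
// `thresholdMs` is an assumed cut-off between known-good and known-bad timings.
function classifyCommit(samplesMs, thresholdMs) {
  if (!Array.isArray(samplesMs) || samplesMs.length === 0) return SKIP;
  return median(samplesMs) > thresholdMs ? BAD : GOOD;
}

console.log(classifyCommit([510, 495, 520], 750));   // 0
console.log(classifyCommit([990, 1010, 1005], 750)); // 1
console.log(classifyCommit([], 750));                // 125
```

A real script would gather the samples itself and finish with `process.exit(classifyCommit(samples, 750))`, then be invoked as `git bisect run node perf-check.js` (the file name is hypothetical). Keeping the script untracked or outside the repository avoids losing it as bisect checks out older commits.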
3. Fixing & Verification
```yaml
Fix Process:
  1. Understand the Problem
    - Profile and identify exactly what's slow
    - Measure impact quantitatively
    - Understand root cause
  2. Implement Fix
    - Make minimal changes
    - Don't introduce new issues
    - Test locally first
    - Measure improvement
  3. Verify Fix
    - Run the same measurement
    - Check that the regression is gone
    - Ensure no new issues
    - Compare metrics
    Before regression: 500ms
    After regression: 1000ms
    After fix: 550ms (acceptable, minor overhead)
  4. Prevent Recurrence
    - Add performance test
    - Set performance budget
    - Alert on regressions
    - Review code for performance impact
```
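The verification step can be made explicit by computing how much of the regression the fix recovered and how much overhead remains relative to the baseline. This is a minimal sketch using the 500ms / 1000ms / 550ms figures from the outline; `verifyFix` and the 10% tolerance are assumptions, chosen to match the 10% threshold used for detection.

```javascript
// Compare a fixed build against the baseline and the regressed build.
// All three values must be in the same unit (ms here).
function verifyFix({ baseline, regressed, fixed }, tolerance = 0.1) {
  if (regressed <= baseline) {
    throw new Error('regressed value must be worse than baseline');
  }
  const overhead = (fixed - baseline) / baseline;                 // remaining slowdown
  const recovered = (regressed - fixed) / (regressed - baseline); // share of regression undone
  return {
    overheadPercent: Number((overhead * 100).toFixed(1)),
    recoveredPercent: Number((recovered * 100).toFixed(1)),
    acceptable: overhead <= tolerance
  };
}

const result = verifyFix({ baseline: 500, regressed: 1000, fixed: 550 });
console.log(result);
// { overheadPercent: 10, recoveredPercent: 90, acceptable: true }
```

With these numbers the fix recovers 90% of the lost time and leaves 10% overhead, which sits exactly at the assumed tolerance; a stricter team might set the tolerance lower.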
4. Prevention Measures
```yaml
Performance Testing:
  Baseline Testing:
    - Establish baseline metrics
    - Record for each release
    - Track trends over time
    - Alert on degradation
  Load Testing:
    - Test with realistic load
    - Measure under stress
    - Identify bottlenecks
    - Catch regressions
  Performance Budgets:
    - Set max bundle size
    - Set max response time
    - Set max LCP/FCP
    - Enforce in CI/CD
  Monitoring:
    - Track real user metrics
    - Alert on degradation
    - Compare releases
    - Analyze trends
---
Checklist:
  [ ] Baseline metrics established
  [ ] Regression detected and measured
  [ ] Changed code identified
  [ ] Root cause found (code, data, infra)
  [ ] Fix implemented
  [ ] Fix verified
  [ ] No new issues introduced
  [ ] Performance test added
  [ ] Budget set
  [ ] Monitoring updated
  [ ] Team notified
  [ ] Prevention measures in place
```
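Enforcing a performance budget in CI can be as simple as comparing measured metrics against fixed limits and failing the job on any violation. The following is a sketch under assumed values: the `budgets` numbers and the `checkBudgets` helper are hypothetical, and the metric names follow the detection example.

```javascript
// Hypothetical budgets; tune these to the application.
const budgets = {
  bundleSize: 170,              // KB gzipped
  responseTime: 600,            // ms
  largestContentfulPaint: 2500  // ms
};

// Return one entry per budget that is exceeded or has no measurement.
function checkBudgets(metrics, limits) {
  const violations = [];
  for (const [name, limit] of Object.entries(limits)) {
    const value = metrics[name];
    if (value === undefined) {
      violations.push({ name, limit, reason: 'metric missing' });
    } else if (value > limit) {
      violations.push({ name, value, limit, reason: 'over budget' });
    }
  }
  return violations;
}

const violations = checkBudgets(
  { bundleSize: 200, responseTime: 550, largestContentfulPaint: 1500 },
  budgets
);
for (const v of violations) {
  console.error(`${v.name}: ${v.reason} (value: ${v.value}, limit: ${v.limit})`);
}
```

In a CI step, a non-empty `violations` list would typically set `process.exitCode = 1` so the pipeline fails. Treating a missing metric as a violation is a design choice here: it keeps a broken measurement step from silently passing the budget check.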
Key Points
- Establish baseline metrics for comparison
- Use binary search to find culprit commits
- Profile to identify exact bottleneck
- Measure before/after fix
- Add performance regression tests
- Set and enforce performance budgets
- Monitor production metrics
- Alert on significant degradation
- Document root cause
- Prevent through code review