perf-profile

Phase 1: Determine Scope

Read the argument:
  • System name → focus profiling on that specific system
  • full → run a comprehensive profile across all systems

Phase 2: Load Performance Budgets

Check for existing performance targets in design docs or CLAUDE.md:
  • Target FPS (e.g., 60fps = 16.67ms frame budget)
  • Memory budget (total and per-system)
  • Load time targets
  • Draw call budgets
  • Network bandwidth limits (if multiplayer)
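The frame budget follows directly from the target frame rate: the budget in milliseconds is 1000 divided by the FPS. A minimal sketch of that arithmetic, where `frame_budget_ms` is a hypothetical helper name and not part of any engine API:

```python
def frame_budget_ms(target_fps: float) -> float:
    """Return the per-frame time budget in milliseconds for a target frame rate."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


# Common targets, shown only to illustrate how the budget shrinks as FPS rises.
for fps in (30, 60, 120):
    print(f"{fps} fps -> {frame_budget_ms(fps):.2f} ms per frame")
```

All per-frame work (game logic, physics, rendering submission) has to fit inside this figure, so every system's estimate should be compared against it.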

Phase 3: Analyze Codebase

CPU Profiling Targets:
  • _process() / Update() / Tick() functions: list all and estimate cost
  • Nested loops over large collections
  • String operations in hot paths
  • Allocation patterns in per-frame code
  • Unoptimized search/sort over game entities
  • Expensive physics queries (raycasts, overlaps) every frame
Memory Profiling Targets:
  • Large data structures and their growth patterns
  • Texture/asset memory footprint estimates
  • Object pool vs instantiate/destroy patterns
  • Leaked references (objects that should be freed but aren't)
  • Cache sizes and eviction policies
Rendering Targets (if applicable):
  • Draw call estimates
  • Overdraw from overlapping transparent objects
  • Shader complexity
  • Unoptimized particle systems
  • Missing LODs or occlusion culling
I/O Targets:
  • Save/load performance
  • Asset loading patterns (sync vs async)
  • Network message frequency and size
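As a rough illustration of the first CPU target, per-frame entry points can be located with a plain text scan. This is only a sketch: the regular expression and the helper name `find_per_frame_functions` are assumptions made for the example, and the pattern matches call sites as well as definitions, so hits still need a manual read.

```python
import re

# Per-frame entry point names taken from the list above.
PER_FRAME_PATTERN = re.compile(r"\b(_process|Update|Tick)\s*\(")


def find_per_frame_functions(source: str) -> list[tuple[int, str]]:
    """Return (line_number, name) for each line mentioning a per-frame entry point."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = PER_FRAME_PATTERN.search(line)
        if match:
            hits.append((line_number, match.group(1)))
    return hits


# Made-up sample source mixing two engine styles, for demonstration only.
sample = """\
func _process(delta):
    for enemy in enemies:
        for bullet in bullets:
            check_hit(enemy, bullet)

void Update() {
    var label = "HP: " + hp;
}
"""

print(find_per_frame_functions(sample))
```

Each hit marks a function body worth reading for the other items in the list, such as the nested loop and the string concatenation in the sample.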

Phase 4: Generate Profiling Report

Performance Profile: [System or Full]

Generated: [Date]

Performance Budgets

| Metric | Budget | Estimated Current | Status |
|---|---|---|---|
| Frame time | [16.67ms] | [estimate] | [OK/WARNING/OVER] |
| Memory | [target] | [estimate] | [OK/WARNING/OVER] |
| Load time | [target] | [estimate] | [OK/WARNING/OVER] |
| Draw calls | [target] | [estimate] | [OK/WARNING/OVER] |

Hotspots Identified

| # | Location | Issue | Estimated Impact | Fix Effort |
|---|---|---|---|---|

Optimization Recommendations (Priority Order)

  1. [Title] — [Description]
    • Location: [file:line]
    • Expected gain: [estimate]
    • Risk: [Low/Med/High]
    • Approach: [How to implement]

Quick Wins (< 1 hour each)

  • [Simple optimization 1]

Requires Investigation

  • [Area that needs actual runtime profiling to confirm impact]

Output the report with a summary: top 3 hotspots, estimated headroom vs budget, and recommended next action.
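The summary figures can be derived mechanically from the budget table. The sketch below is one possible way to do it; the 80% warning threshold, the function names, and the sample hotspot entries are all assumptions made for illustration and do not come from the skill itself.

```python
def headroom(budget: float, estimate: float) -> float:
    """Remaining budget in the metric's own unit; negative means over budget."""
    return budget - estimate


def status(budget: float, estimate: float, warning_ratio: float = 0.8) -> str:
    """Classify an estimate against its budget (warning_ratio is an assumed threshold)."""
    if estimate > budget:
        return "OVER"
    if estimate >= warning_ratio * budget:
        return "WARNING"
    return "OK"


# Made-up hotspot estimates, in milliseconds per frame.
hotspots = [
    {"location": "enemy_ai.gd:42", "impact_ms": 3.0},
    {"location": "ui_hud.gd:10", "impact_ms": 0.5},
    {"location": "pathfinding.gd:88", "impact_ms": 4.5},
    {"location": "particles.gd:23", "impact_ms": 1.5},
]

# Top 3 hotspots by estimated impact, largest first.
top3 = sorted(hotspots, key=lambda h: h["impact_ms"], reverse=True)[:3]

frame_budget = 16.67
frame_estimate = sum(h["impact_ms"] for h in hotspots)
print([h["location"] for h in top3])
print(f"headroom: {headroom(frame_budget, frame_estimate):.2f} ms")
print(f"status: {status(frame_budget, frame_estimate)}")
```

Since these numbers come from static estimates, the resulting status is a starting point for runtime profiling and not a verdict.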

---

Phase 5: Scope and Timeline Decision

Activate this phase only if any hotspot has Fix Effort rated M or L.
Present significant-effort items and ask the user to choose for each:
  • A) Implement the optimization (proceed with fix now or schedule it)
  • B) Reduce feature scope (run /scope-check [feature] to analyze trade-offs)
  • C) Accept the performance hit and defer to Polish phase (log as known issue)
  • D) Escalate to technical-director for an architectural decision (run /architecture-decision)
If multiple items are deferred to Polish (choice C), record them under ### Deferred to Polish.
This skill is read-only — no files are written. Verdict: COMPLETE — performance profile generated.
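The activation condition amounts to a filter over the hotspot table. A minimal sketch, assuming Fix Effort uses S/M/L ratings (only M and L are named in the text; S is an assumption) and using made-up hotspot records:

```python
# Hypothetical hotspot records; the "S" rating is assumed, not stated in the skill.
hotspots = [
    {"location": "enemy_ai.gd:42", "fix_effort": "S"},
    {"location": "inventory.gd:118", "fix_effort": "M"},
    {"location": "world_streaming.gd:7", "fix_effort": "L"},
]


def significant_effort(items):
    """Return the hotspots whose Fix Effort is rated M or L."""
    return [h for h in items if h["fix_effort"] in ("M", "L")]


needs_decision = significant_effort(hotspots)
if needs_decision:
    for h in needs_decision:
        print(f"Ask user to choose A/B/C/D for {h['location']} (effort {h['fix_effort']})")
else:
    print("No M or L items: Phase 5 is skipped")
```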

Phase 6: Next Steps

  • If bottlenecks require architectural change: run /architecture-decision.
  • If scope reduction is needed: run /scope-check [feature].
  • To schedule optimizations: run /sprint-plan update.

Rules

  • Never optimize without measuring first — gut feelings about performance are unreliable
  • Recommendations must include estimated impact — "make it faster" is not actionable
  • Profile on target hardware, not just development machines
  • Static analysis (this skill) identifies candidates; runtime profiling confirms
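To make the "measure first" rule concrete, the sketch below times two ways of building the same string. It is written in Python as a stand-in; in a real project the engine's own profiler on target hardware would take its place. The helper `measure_ms` and both sample functions are hypothetical names introduced for this example.

```python
import time


def measure_ms(func, repeats: int = 100) -> float:
    """Return the average wall-clock time of func() in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeats


def build_label_concat(n: int = 1000) -> str:
    """Build a string by repeated concatenation inside a loop."""
    text = ""
    for i in range(n):
        text += str(i)
    return text


def build_label_join(n: int = 1000) -> str:
    """Build the same string with a single join."""
    return "".join(str(i) for i in range(n))


# Both variants produce identical output; only the measurement says which is cheaper.
print(f"concat: {measure_ms(build_label_concat):.4f} ms")
print(f"join:   {measure_ms(build_label_join):.4f} ms")
```

The timings vary by machine and runtime, which is the point of the rule: a recommendation should cite a measured figure and not an expectation about which variant ought to win.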