ln-811-performance-profiler
Paths: File paths (`shared/`, `references/`, `../ln-*`) are relative to the skills repo root. If not found at CWD, locate this SKILL.md directory and go up one level for the repo root.
Type: L3 Worker
Category: 8XX Optimization
Runtime profiler that executes the optimization target, measures multiple metrics (CPU, memory, I/O, time), instruments code for per-function breakdown, and produces a standardized performance map from real data.
Overview
| Aspect | Details |
|---|---|
| Input | Problem statement: target (file/endpoint/pipeline) + observed metric |
| Output | Performance map (multi-metric, per-function), suspicion stack, bottleneck classification |
| Pattern | Discover test → Baseline run → Static analysis → Deep profile → Performance map → Report |
Workflow
Phases: Test Discovery → Baseline Run → Static Analysis → Deep Profile → Performance Map → Report
Phase 0: Test Discovery/Creation
MANDATORY READ: Load `shared/references/ci_tool_detection.md` for test framework detection.
MANDATORY READ: Load `shared/references/benchmark_generation.md` for auto-generating benchmarks when none exist.
Find or create commands that exercise the optimization target. Two outputs: `test_command` (profiling/measurement) and `e2e_test_command` (functional safety gate).
Step 1: Discover test_command
| Priority | Method | Action |
|---|---|---|
| 1 | User-provided | User specifies test command or API endpoint |
| 2 | Discover existing E2E test | Grep test files for target entry point (stop at first match) |
| 3 | Create test script | Generate per `shared/references/benchmark_generation.md` |
E2E discovery protocol (stop at first match):
| Priority | Method | How |
|---|---|---|
| 1 | Route-based search | Grep e2e/integration test files for entry point route |
| 2 | Function-based search | Grep for entry point function name |
| 3 | Module-based search | Grep for import of entry point module |
Test creation (if no existing test found):
| Target Type | Generated Script |
|---|---|
| API endpoint | |
| Function | Stack-specific benchmark per `shared/references/benchmark_generation.md` |
| Pipeline | Full pipeline invocation with test input |
Step 2: Discover e2e_test_command
If `test_command` came from E2E discovery (Step 1 priority 2): `e2e_test_command = test_command`.
Otherwise, run the E2E discovery protocol again (same 3-priority table) to find a separate functional safety test.
If not found: `e2e_test_command = null`, log: `WARNING: No e2e test covers {entry_point}. Full test suite serves as functional gate.`
Output
| Field | Description |
|---|---|
| `test_command` | Command for profiling/measurement |
| `e2e_test_command` | Command for functional safety gate (may equal test_command, or null) |
| `source` | Discovery method: user / route / function / module / none |
Phase 1: Baseline Run (Multi-Metric)
Run `test_command` with system-level profiling. Capture simultaneously:
| Metric | How to Capture | When |
|---|---|---|
| Wall time | | Always |
| CPU time (user+sys) | | Always |
| Memory peak (RSS) | | Always |
| I/O bytes | | If I/O suspected |
| HTTP round-trips | Count from structured logs or application metrics | If network I/O in call graph |
| GPU utilization | | Only if CUDA/GPU detected in stack |
Baseline Protocol
| Parameter | Value |
|---|---|
| Runs | 3 |
| Metric | Median |
| Warm-up | 1 discarded run |
| Output | |
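A minimal sketch of this protocol (1 discarded warm-up run, then the median of 3 measured runs). It assumes a POSIX host and a shell-invocable `test_command`; note `ru_maxrss` is reported in KiB on Linux but bytes on macOS:

```python
import resource
import statistics
import subprocess
import time

def run_baseline(test_command, runs=3):
    """Run test_command runs+1 times, discard the warm-up, return per-metric medians."""
    samples = []
    for i in range(runs + 1):                       # +1 for the warm-up run
        before = resource.getrusage(resource.RUSAGE_CHILDREN)
        t0 = time.perf_counter()
        subprocess.run(test_command, shell=True, check=True, capture_output=True)
        wall_ms = (time.perf_counter() - t0) * 1000
        after = resource.getrusage(resource.RUSAGE_CHILDREN)
        cpu_ms = ((after.ru_utime - before.ru_utime) +
                  (after.ru_stime - before.ru_stime)) * 1000
        peak_mb = after.ru_maxrss / 1024            # KiB -> MiB on Linux
        if i > 0:                                   # discard warm-up
            samples.append({"wall_time_ms": wall_ms,
                            "cpu_time_ms": cpu_ms,
                            "memory_peak_mb": peak_mb})
    return {k: statistics.median(s[k] for s in samples) for k in samples[0]}
```

I/O bytes and HTTP round-trips would be layered on top (e.g. from structured logs), per the table above.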
Phase 2: Static Analysis → Instrumentation Points
MANDATORY READ: Load bottleneck_classification.md
Trace call chain from code + build suspicion stack. Purpose: guide WHERE to instrument in Phase 3.
Step 1: Trace Call Chain
Starting from entry point, trace depth-first (max depth 5). At each step, READ the full function body.
Cross-service tracing: If `service_topology` is available from the coordinator and a step makes an HTTP/gRPC call to another service whose code is accessible:
| Situation | Action |
|---|---|
| HTTP call to service with code in submodule/monorepo | Follow into that service's handler: resolve route → trace handler code (depth resets to 0 for the new service) |
| HTTP call to service without accessible code | Classify as External, record latency estimate |
| gRPC/message queue to known service | Same as HTTP — follow into handler if code accessible |
Record `service: "{service_name}"` on each entry in `steps` to track which service owns it. The performance_map tree can span multiple services.
Depth-First Rule: If code of the called service is accessible — ALWAYS profile INSIDE. NEVER classify an accessible service as "External/slow" without profiling its internals. "Slow" is a symptom, not a diagnosis.
5 Whys for each bottleneck: Before reporting a bottleneck, chain "why?" until you reach config/architecture level:
1. "What is slow?" → alignment service (5.9s)
2. "Why?" → 6 pairs × ~1s each
3. "Why ~1s per pair?" → O(n²) mwmf computation
4. "Why O(n²)?" → library default, not production config
5. "Why default?" → `matching_methods` not configured → root cause = config
Step 2: Classify & Suspicion Scan
For each step, classify by type (CPU, I/O-DB, I/O-Network, I/O-File, Architecture, External, Cache) and scan for performance concerns.
Suspicion checklist (a minimum, not a limit):
| Category | What to Look For |
|---|---|
| Connection management | Client created per-request? Missing pooling? Missing reuse? |
| Data flow | Data read multiple times? Over-fetching? Unnecessary transforms? |
| Async patterns | Sync I/O in async context? Sequential awaits without data dependency? |
| Resource lifecycle | Unclosed connections? Temp files? Memory accumulation in loop? |
| Configuration | Hardcoded timeouts? Default pool sizes? Missing batch size config? |
| Redundant work | Same validation at multiple layers? Same data loaded twice? |
| Architecture | N+1 in loop? Batch API unused? Cache infra unused? Sequential-when-parallel? |
| (open) | Anything else spotted — checklist does not limit findings |
Step 2b: Suspicion Deduplication
MANDATORY READ: Load `shared/references/output_normalization.md`.
After generating suspicions across all call chain steps, normalize and deduplicate per §1-§2:
- Normalize suspicion descriptions (replace specific values with placeholders)
- Group identical suspicions across different steps → merge into a single entry with `affected_steps: [list]`
- Example: "Missing connection pooling" found in steps 1.1, 1.2, 1.3 → one suspicion with `affected_steps: ["1.1", "1.2", "1.3"]`
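A minimal sketch of the merge step. The placeholder regexes stand in for the real §1 normalization rules, which live in `shared/references/output_normalization.md`:

```python
import re
from collections import defaultdict

def dedupe_suspicions(suspicions):
    """suspicions: [{"step": "1.1", "description": "..."}, ...] -> merged entries."""
    def normalize(desc):
        # Replace concrete measurements and counts with placeholders
        desc = re.sub(r"\d+(\.\d+)?\s*(ms|s|mb|kb)\b", "{value}", desc, flags=re.I)
        desc = re.sub(r"\d+", "{n}", desc)
        return desc
    groups = defaultdict(list)
    for s in suspicions:
        groups[normalize(s["description"])].append(s["step"])
    return [{"description": d, "affected_steps": steps}
            for d, steps in groups.items()]
```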
Step 3: Verify & Map to Instrumentation Points
FOR each suspicion:
1. VERIFY: follow code to confirm or dismiss
2. VERDICT: CONFIRMED → map to instrumentation point | DISMISSED → log reason
3. For each CONFIRMED suspicion, identify:
- function to wrap with timing
- I/O call to count
- memory allocation to track
Profiler Selection (per stack)
| Stack | Non-invasive profiler | Invasive (if non-invasive insufficient) |
|---|---|---|
| Python | | |
| Node.js | | |
| Go | | Usually not needed |
| .NET | | |
| Rust | | |
Stack detection: per `shared/references/ci_tool_detection.md`.
Phase 3: Deep Profile
Profiler Hierarchy (escalate as needed)
| Level | Tool Examples | What It Shows | When to Use |
|---|---|---|---|
| 1 | | Function-level hotspots | Always — first pass |
| 2 | | Line-level timing in hotspot function | Hotspot function found but cause unclear |
| 3 | | Per-line memory allocation | Memory metrics abnormal in baseline |
Step 1: Non-Invasive Profiling (preferred)
Run `test_command` with the Level 1 profiler to get a per-function breakdown without code changes.
test_command使用1级剖析工具运行,无需修改代码即可获取按函数拆分的分析结果。
test_commandStep 2: Escalation Decision
After Level 1 profiler run, evaluate result against suspicion stack from Phase 2:
| Profiler Result | Action |
|---|---|
| Hotspot function identified, time breakdown confirms suspicions | DONE — proceed to Phase 4 |
| Hotspot identified but internal cause unclear (CPU vs I/O inside one function) | Escalate to Level 2 (line-level timing) |
| Memory baseline abnormal (peak or delta) | Escalate to Level 3 (memory profiler) |
| Multiple suspicions unresolved — profiler granularity insufficient | Go to Step 3 (targeted instrumentation) |
| Profiler unavailable or overhead > 20% of wall time | Go to Step 3 (targeted instrumentation) |
Step 3: Targeted Instrumentation (proactive)
Add timing/logging along the call stack at instrumentation points identified in Phase 2 Step 3:
1. FOR each CONFIRMED suspicion without measured data:
Add timing wrapper around target function/I/O call
Add counter for I/O round-trips if network/DB suspected
(cross-service: instrument in the correct service's codebase)
2. Re-run test_command (3 runs, median)
3. Collect per-function measurements from logs
4. Record list of instrumented files (may span multiple services)
| Instrumentation Type | When | Example |
|---|---|---|
| Timing wrapper | Always for unresolved suspicions | |
| I/O call counter | Network or DB bottleneck suspected | Count HTTP requests, DB queries in loop |
| Memory snapshot | Memory accumulation suspected | |
KEEP instrumentation in place. The executor reuses it for post-optimization per-function comparison, then cleans up after the strike. Report `instrumented_files` in the output.
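Hedged sketches of the three instrumentation types. The `LN811` log prefix and the helper names (`timed`, `IOCounter`, `memory_snapshot`) are illustrative assumptions; any parseable per-function log line satisfies the step:

```python
import functools
import logging
import time
import tracemalloc

log = logging.getLogger("ln811.instrument")

def timed(step_id):
    """Timing wrapper for an unresolved suspicion (per-function wall time)."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            t0 = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                ms = (time.perf_counter() - t0) * 1000
                log.info("LN811 step=%s fn=%s wall_ms=%.1f", step_id, fn.__name__, ms)
        return wrapper
    return deco

class IOCounter:
    """Counter for HTTP/DB round-trips inside a suspected loop."""
    def __init__(self, step_id):
        self.step_id, self.count = step_id, 0
    def hit(self):
        self.count += 1
    def report(self):
        log.info("LN811 step=%s round_trips=%d", self.step_id, self.count)

def memory_snapshot(step_id):
    """Snapshot for suspected memory accumulation (tracemalloc must be started)."""
    current, peak = tracemalloc.get_traced_memory()
    log.info("LN811 step=%s mem_current_mb=%.1f mem_peak_mb=%.1f",
             step_id, current / 1e6, peak / 1e6)
```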
Phase 4: Build Performance Map
Standardized format — feeds into `.optimization/{slug}/context.md` for downstream consumption.

```yaml
performance_map:
  test_command: "uv run pytest tests/e2e/test_example.py -s"
  baseline:
    wall_time_ms: 7280
    cpu_time_ms: 850
    memory_peak_mb: 256
    memory_delta_mb: 45
    io_read_bytes: 1200000
    io_write_bytes: 500000
    http_round_trips: 13
  steps:  # service field present only in multi-service topology
    - id: "1"
      function: "process_job"
      location: "app/services/job_processor.py:45"
      service: "api"  # optional — which service owns this step
      wall_time_ms: 7200
      time_share_pct: 99
      type: "function_call"
      children:
        - id: "1.1"
          function: "translate_binary"
          wall_time_ms: 7100
          type: "function_call"
          children:
            - id: "1.1.1"
              function: "tikal_extract"
              service: "tikal"  # cross-service: code traced into submodule
              wall_time_ms: 2800
              type: "http_call"
              http_round_trips: 1
            - id: "1.1.2"
              function: "mt_translate"
              service: "mt-engine"
              wall_time_ms: 3500
              type: "http_call"
              http_round_trips: 13
  bottleneck_classification: "I/O-Network"
  bottleneck_detail: "13 sequential HTTP calls to MT service (3500ms)"
  top_bottlenecks:
    - step: "1.1.2", type: "I/O-Network", share: 48%
    - step: "1.1.1", type: "I/O-Network", share: 38%
```
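The top_bottlenecks entries can be derived mechanically from the steps tree. A sketch, assuming leaf steps carry measured `wall_time_ms` (the function name `top_bottlenecks` is illustrative):

```python
def top_bottlenecks(perf_map, n=3):
    """Flatten leaf steps, compute share of baseline wall time, return the top n."""
    total = perf_map["baseline"]["wall_time_ms"]
    leaves = []
    def walk(steps):
        for s in steps:
            children = s.get("children")
            if children:
                walk(children)       # only leaves carry attributable time
            else:
                leaves.append(s)
    walk(perf_map["steps"])
    leaves.sort(key=lambda s: s["wall_time_ms"], reverse=True)
    return [{"step": s["id"], "type": s["type"],
             "share": round(100 * s["wall_time_ms"] / total)}
            for s in leaves[:n]]
```

Applied to the example map above, this reproduces the 48% / 38% shares.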
Phase 5: Report
Report Structure
```yaml
profile_result:
  entry_point_info:
    type: <string>                  # "api_endpoint" | "function" | "pipeline"
    location: <string>              # file:line
    route: <string|null>            # API route (if endpoint)
    function: <string>              # Entry point function name
  performance_map: <object>         # Full map from Phase 4
  bottleneck_classification: <string>  # Primary bottleneck type
  bottleneck_detail: <string>       # Human-readable description
  top_bottlenecks:
    - step, type, share, description
  optimization_hints:               # CONFIRMED suspicions only (Phase 2)
    - hint with evidence
  suspicion_stack:                  # Full audit trail (confirmed + dismissed)
    - category: <string>
      location: <string>
      description: <string>
      verdict: <string>             # "confirmed" | "dismissed"
      evidence: <string>
      verification_note: <string>
  e2e_test:
    command: <string|null>          # E2E safety test command (from Phase 0)
    source: <string>                # user / route / function / module / none
  instrumented_files: [<string>]    # Files with active instrumentation (empty if non-invasive only)
  wrong_tool_indicators: []         # Empty = proceed, non-empty = exit
```
Wrong Tool Indicators
| Indicator | Condition |
|---|---|
| 90%+ measured time in external service, no batch/cache/parallel path |
| Measured time within expected range for operation type |
| Bottleneck is hardware (measured via system metrics) |
| Code already uses best patterns (confirmed by suspicion scan) |
Error Handling
| Error | Recovery |
|---|---|
| Cannot resolve entry point | Block: "file/function not found at {path}" |
| Test command fails on unmodified code | Block: "test fails before profiling — fix test first" |
| Profiler not available for stack | Fall back to invasive instrumentation (Phase 3 Step 2) |
| Instrumentation breaks tests | Revert immediately: |
| Call chain too deep (> 5 levels) | Stop at depth 5, note truncation |
| Cannot classify step type | Default to "Unknown", use measured time |
| No I/O detected (pure CPU) | Classify as CPU, focus on algorithm profiling |
References
- bottleneck_classification.md — classification taxonomy
- latency_estimation.md — latency heuristics (fallback for static-only mode)
- shared/references/ci_tool_detection.md — stack/tool detection
- shared/references/benchmark_generation.md — benchmark templates per stack
Definition of Done
- Test command discovered or created for optimization target
- E2E safety test discovered (or documented as unavailable)
- Baseline measured: wall time, CPU, memory (3 runs, median)
- Call graph traced and function bodies read
- Suspicion stack built: each suspicion verified and mapped to instrumentation point
- Deep profile completed (non-invasive preferred, invasive if needed)
- Instrumented files reported (cleanup deferred to executor)
- Performance map built in standardized format (real measurements)
- Top 3 bottlenecks identified from measured data
- Wrong tool indicators evaluated from real metrics
- optimization_hints contain only CONFIRMED suspicions with measurement evidence
- Report returned to coordinator
Version: 3.0.0
Last Updated: 2026-03-15