ln-814-optimization-executor
Paths: File paths (`shared/`, `references/`, `../ln-*`) are relative to the skills repo root. If not found at CWD, locate this SKILL.md directory and go up one level for the repo root. If `shared/` is missing, fetch files via WebFetch from `https://raw.githubusercontent.com/levnikolaevich/claude-code-skills/master/skills/{path}`.
ln-814-optimization-executor
Type: L3 Worker
Category: 8XX Optimization
Executes optimization hypotheses from the researcher using a keep/discard autoresearch loop. Supports multi-file changes, compound baselines, and any optimization type (algorithm, architecture, query, caching, batching).
Overview
| Aspect | Details |
|---|---|
| Input | |
| Output | Optimized code on isolated branch, per-hypothesis results, experiment log |
| Pattern | Strike-first: apply all → test → measure. Bisect only on failure. A/B only for contested alternatives |
Workflow
Phases: Pre-flight → Baseline → Strike-First Execution → Report → Gap Analysis
Phase 0: Pre-flight Checks
Slug Resolution
- If invoked via Agent with contextStore containing `slug`: use it directly.
- If invoked standalone: derive the slug from the context_file path or ask the user.
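The standalone case can be sketched as a small path parser. This is a minimal illustration that assumes the context file follows the `.hex-skills/optimization/{slug}/context.md` layout used elsewhere in this skill; the function name `derive_slug` and the example slug are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional


def derive_slug(context_file: str) -> Optional[str]:
    """Return the {slug} segment of .../optimization/{slug}/context.md, or None."""
    # Normalize Windows separators so the same logic works on any platform.
    parts = PurePosixPath(context_file.replace("\\", "/")).parts
    # Expected layout: ... / "optimization" / <slug> / "context.md"
    if len(parts) >= 3 and parts[-1] == "context.md" and parts[-3] == "optimization":
        return parts[-2]
    return None  # layout not recognized: the caller should ask the user


# "slow-checkout" is a made-up slug used only for illustration.
print(derive_slug(".hex-skills/optimization/slow-checkout/context.md"))
```

A `None` result corresponds to the "ask user" branch.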
Step 1: Load Context
Read `.hex-skills/optimization/{slug}/context.md` from the project root. It contains the problem statement, profiling results, research hypotheses, and target metric.
If the file is not found, check the conversation context for the same data (standalone invocation).
Step 2: Pre-flight Validation
| Check | Required | Action if Missing |
|---|---|---|
| Hypotheses provided (H1..H7) | Yes | Block — nothing to execute |
| Test infrastructure | Yes | Block (see ci_tool_detection.md) |
| Git clean state | Yes | Block (need clean baseline for revert) |
| Worktree isolation | Yes | Create per git_worktree_fallback.md |
| E2E safety test | No (recommended) | Read from context; WARN if null — full test suite as fallback gate |
MANDATORY READ: Load `shared/references/git_worktree_fallback.md` and use the optimization rows.
MANDATORY READ: Load `shared/references/ci_tool_detection.md` and use the Test Frameworks + Benchmarks sections.
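The blocking and warning rows of the pre-flight table can be expressed as a pure function. This is a rough sketch, not the skill's implementation: the `PreflightInput` type and its field names are assumptions, and worktree creation is left out because it is an action rather than a check. The only external fact relied on is that `git status --porcelain` prints nothing for a clean tree.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PreflightInput:
    hypotheses: List[str]              # hypothesis IDs, e.g. ["H1", "H2"]
    has_test_infra: bool               # a test command was detected
    git_status_porcelain: str          # raw output of `git status --porcelain`
    e2e_test_command: Optional[str]    # None when no E2E test was discovered


def preflight(p: PreflightInput) -> Tuple[List[str], List[str]]:
    """Return (blockers, warnings); execution may start only if blockers is empty."""
    blockers: List[str] = []
    warnings: List[str] = []
    if not p.hypotheses:
        blockers.append("no hypotheses: nothing to execute")
    if not p.has_test_infra:
        blockers.append("no test infrastructure")
    if p.git_status_porcelain.strip():
        blockers.append("git tree is dirty: need a clean baseline for revert")
    if p.e2e_test_command is None:
        warnings.append("no E2E test: full test suite is the fallback gate")
    return blockers, warnings


blockers, warnings = preflight(PreflightInput(["H1", "H2"], True, "", None))
print(blockers, warnings)
```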
E2E Safety Test
Read `e2e_test_command` from the context file (discovered by the profiler during the test discovery phase).
| Source | Action |
|---|---|
| Context has `e2e_test_command` | Use as functional safety gate in Phase 2 |
| Context has `e2e_test_command` = null | WARN: full test suite serves as fallback gate |
| Standalone (no context) | User must provide test command; block if missing |
Phase 1: Establish Baseline
Reuse baseline from performance map (already measured with real metrics).
From Context File
Read `performance_map.baseline` and `performance_map.test_command` from `.hex-skills/optimization/{slug}/context.md`.
| Field | Source |
|---|---|
| `test_command` | Discovered/created test command |
| `baseline` | Multi-metric snapshot: wall time, CPU, memory, I/O |
Verification Run
Run `test_command` once to confirm the baseline is still valid (code unchanged since profiling):
| Step | Action |
|---|---|
| 1 | Run `test_command` |
| 2 | IF result within 10% of `baseline` → reuse the stored baseline |
| 3 | IF result diverges > 10% → re-measure (3 runs, median) as new baseline |
| 4 | IF test FAILS → BLOCK: "test fails on unmodified code" |
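The verification run amounts to a tolerance check followed by an optional re-measurement. The sketch below assumes a single wall-time metric in milliseconds, that a result within tolerance means the stored baseline is reused, and that the hypothetical `run_once` callable raises if the test fails (which corresponds to the BLOCK case).

```python
from statistics import median
from typing import Callable


def verify_baseline(baseline_ms: float, run_once: Callable[[], float],
                    tolerance: float = 0.10) -> float:
    """Return the baseline to use for the rest of the run."""
    first = run_once()
    if abs(first - baseline_ms) <= tolerance * baseline_ms:
        return baseline_ms  # still valid: keep the stored value
    # Diverged by more than the tolerance: take 3 fresh runs and use the median.
    return median(run_once() for _ in range(3))
```

Whether the first diverging sample should count toward the three re-measurement runs is not stated in the source; this sketch takes three fresh runs.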
Phase 2: Strike-First Execution
MANDATORY READ: Load optimization_categories.md for pattern reference during implementation.
Apply maximum changes at once. Only fall back to A/B testing where sources genuinely disagree on approach.
Step 1: Triage Hypotheses
Split hypotheses from researcher into two groups:
| Group | Criteria | Action |
|---|---|---|
| Uncontested | Clear best approach, no conflicting alternatives | Apply directly in the strike |
| Contested | Multiple approaches exist (e.g., source A says cache, source B says batch) OR the hypothesis conflicts with another hypothesis | A/B test each alternative on top of full implementation |
Most hypotheses should be uncontested — the researcher already ranked them by evidence.
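The triage split can be sketched as a simple partition. The `Hypothesis` type and its `alternatives` and `conflicts_with` fields are assumptions made for illustration; the researcher's actual output format is not specified here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Hypothesis:
    id: str
    # Competing approaches for the same bottleneck (hypothetical field).
    alternatives: List[str] = field(default_factory=list)
    # IDs of hypotheses this one cannot coexist with (hypothetical field).
    conflicts_with: List[str] = field(default_factory=list)


def triage(hyps: List[Hypothesis]) -> Tuple[List[Hypothesis], List[Hypothesis]]:
    """Split hypotheses into (uncontested, contested)."""
    uncontested: List[Hypothesis] = []
    contested: List[Hypothesis] = []
    for h in hyps:
        if len(h.alternatives) > 1 or h.conflicts_with:
            contested.append(h)
        else:
            uncontested.append(h)
    return uncontested, contested


strike, ab = triage([
    Hypothesis("H1"),
    Hypothesis("H2", alternatives=["cache", "batch"]),
    Hypothesis("H3"),
])
print([h.id for h in strike], [h.id for h in ab])
```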
Step 2: Strike (Apply All Uncontested)
1. APPLY all uncontested hypotheses at once (all file edits)
2. VERIFY: Run full test suite
IF tests FAIL:
- IF fixable (typo, missing import) → fix & re-run ONCE
- IF fundamental → BISECT (see Step 4)
3. E2E GATE (if e2e_test_command not null):
IF FAIL → BISECT
4. MEASURE: 5 runs, median
5. COMPARE: improvement vs baseline
IF improvement meets target → DONE. Commit all:
git add {all_files}
git commit -m "perf: apply optimizations H1,H2,H3,... (+{improvement}%)"
IF no improvement → BISECT
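The decision logic of the strike can be condensed into one function. This is a hedged sketch assuming a lower-is-better metric in milliseconds; the function name and return labels are hypothetical, and the "fix once and re-run" branch is omitted. The source does not say what happens when the strike improves on the baseline but misses the target, so that case gets its own label here.

```python
from statistics import median
from typing import Optional, Sequence


def strike_outcome(tests_pass: bool, e2e_pass: Optional[bool],
                   baseline_ms: float, runs_ms: Sequence[float],
                   target_ms: float) -> str:
    """Map strike results to the next action (lower is better)."""
    if not tests_pass:
        return "bisect"            # fundamental test failure
    if e2e_pass is False:          # None means no E2E test was configured
        return "bisect"
    result = median(runs_ms)       # 5 runs, median
    if result >= baseline_ms:
        return "bisect"            # no improvement
    if result <= target_ms:
        return "done"              # target met: commit everything
    return "below_target"          # improved, but the target was not reached


print(strike_outcome(True, None, 7280, [3900, 3800, 3700, 3850, 3750], 5000))
```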
Step 3: Contested Alternatives (A/B on top of strike)
For each contested pair/group, with ALL uncontested changes already applied:
FOR each contested hypothesis group:
1. Apply alternative A → test → measure (5 runs, median)
2. Revert alternative A, apply alternative B → test → measure
3. KEEP the winner. Commit.
4. Winner becomes part of the baseline for the next contested group.
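Picking the winner of one contested group reduces to comparing medians. A minimal sketch, assuming a lower-is-better timing metric; the alternative names and timings below are invented for illustration.

```python
from statistics import median
from typing import Dict, List


def pick_winner(runs_ms: Dict[str, List[float]]) -> str:
    """Return the alternative with the lowest median (lower is better)."""
    return min(runs_ms, key=lambda name: median(runs_ms[name]))


# Illustrative numbers only: five runs per alternative.
winner = pick_winner({
    "A-cache": [410, 400, 420, 405, 415],
    "B-batch": [380, 390, 370, 385, 375],
})
print(winner)  # B-batch
```

The winner's median would then serve as the reference measurement for the next contested group.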
Step 4: Bisect (only on strike failure)
If strike fails tests or shows no improvement:
1. Revert all changes: git checkout -- . && git clean -fd
2. Binary search: apply first half of hypotheses → test
- IF passes → problem in second half
- IF fails → problem in first half
3. Narrow down to the breaking hypothesis
4. Remove it from strike, re-apply remaining → test → measure
5. Log removed hypothesis with reason
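The binary search in steps 2 and 3 can be sketched as follows. It assumes exactly one hypothesis breaks the tests and that hypotheses can be applied independently of one another; with several interacting culprits this simple form would not be enough. The `passes` callable is hypothetical and stands for "revert everything, apply this subset, run the tests".

```python
from typing import Callable, Sequence


def find_breaking(hyps: Sequence[str],
                  passes: Callable[[Sequence[str]], bool]) -> str:
    """Locate the single breaking hypothesis by halving the candidate set."""
    candidates = list(hyps)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        if passes(first_half):
            candidates = candidates[mid:]   # first half is fine: culprit is in the rest
        else:
            candidates = first_half         # culprit is in the first half
    return candidates[0]


# Simulated test runner in which H3 is the breaking hypothesis.
culprit = find_breaking(["H1", "H2", "H3", "H4", "H5"],
                        lambda subset: "H3" not in subset)
print(culprit)  # H3
```

After the culprit is found, the remaining hypotheses are re-applied together and measured, as in steps 4 and 5.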
Scope Rules
| Rule | Description |
|---|---|
| File scope | Multiple files allowed (not limited to single function) |
| Signature changes | Allowed if tests still pass |
| New files | Allowed (cache wrapper, batch adapter, utility) |
| New dependencies | Allowed if already in project ecosystem (e.g., using configured Redis) |
| Time budget | 45 minutes total |
Revert Protocol
| Scope | Command |
|---|---|
| Full revert | |
| Single hypothesis | |
Safety Rules
| Rule | Description |
|---|---|
| Traceability | Commit message lists all applied hypothesis IDs |
| Isolation | All work in isolated worktree; never modify main worktree |
| Bisect only on failure | Do NOT test hypotheses individually unless strike fails or alternatives genuinely conflict |
| Crash triage | Runtime crash → fix once if trivial (typo, import), else bisect to find cause |
Stop Conditions (Execution Loop)
| Condition | Action |
|---|---|
| Strike passes + improvement meets target | STOP — commit, proceed to Report |
| All contested alternatives tested | STOP — commit winner, proceed to Report |
| Bisect removes all hypotheses | STOP — report "all hypotheses failed" with profiling data |
| Time budget exceeded (45 min) | STOP — report partial results with remaining hypotheses |
| All tests fail after strike + bisect | STOP — full revert, report diagnostic value only |
Phase 3: Report Results
Report Schema
| Field | Description |
|---|---|
| baseline | Original measurement (metric + value) |
| final | Final measurement after optimizations |
| total_improvement_pct | Overall percentage improvement |
| target_met | Boolean — did we reach the target metric? |
| strike_result | |
| hypotheses_applied | List of hypothesis IDs applied in strike |
| hypotheses_removed | List removed during bisect (with reasons) |
| contested_results | Per-contested group: alternatives tested, winner, measurement |
| branch | Worktree branch name |
| files_modified | All changed files |
| e2e_test | |
Results Comparison (mandatory)
Show baseline vs final for EVERY metric from `performance_map.baseline`. Include both percentage and multiplier.
| Metric | Baseline | After Strike | Improvement |
|--------|----------|-------------|-------------|
| Wall time | 7280ms | 3800ms | 47.8% (1.9x) |
| CPU time | 850ms | 720ms | 15.3% (1.2x) |
| Memory peak | 256MB | 245MB | 4.3% |
| HTTP round-trips | 13 | 2 | 84.6% (6.5x) |
Target: 5000ms → Achieved: 3800ms ✓ TARGET MET
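The percentage and multiplier columns follow from two ratios: the relative reduction `(baseline - final) / baseline` and the speedup `baseline / final`. A minimal sketch, assuming lower-is-better metrics; the helper name `improvement` is hypothetical.

```python
from typing import Tuple


def improvement(baseline: float, final: float) -> Tuple[float, float]:
    """Return (percent reduction, speedup multiplier), each rounded to 1 decimal."""
    pct = (baseline - final) / baseline * 100
    mult = baseline / final
    return round(pct, 1), round(mult, 1)


print(improvement(7280, 3800))  # (47.8, 1.9)  wall time row
print(improvement(13, 2))       # (84.6, 6.5)  HTTP round-trips row
```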
Per-Function Delta (if instrumentation available)
If `instrumented_files` from context is non-empty, run `test_command` once more AFTER the strike to capture per-function timing with the same instrumentation the profiler placed:
| Function | Before (ms) | After (ms) | Delta |
|----------|------------|------------|-------|
| mt_translate | 3500 | 450 | -87% (7.8x) |
| tikal_extract | 2800 | 2800 | 0% (unchanged) |
Then clean up: run `git checkout -- {instrumented_files}` to remove all profiling instrumentation before the final commit.
Present both tables to the user. This is the primary deliverable: the numbers the user sees first.
Experiment Log
Write to `{project_root}/.hex-skills/optimization/{slug}/ln-814-log.tsv`:
| Column | Description |
|---|---|
| timestamp | ISO 8601 |
| phase | |
| hypotheses | Comma-separated IDs applied in this round |
| baseline_ms | Baseline before this round |
| result_ms | Measurement after changes |
| improvement_pct | Percentage change |
| status | |
| commit | Git commit hash |
| files | Comma-separated modified files |
| e2e_status | pass / fail / skipped |
Append to existing file if present (enables tracking across multiple runs).
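Appending to the log while writing the header only once can be sketched with the standard `csv` module. The column order follows the table above; the function names are hypothetical, and the allowed values for `phase` and `status` are not listed in this section, so callers must supply them.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

COLUMNS = ["timestamp", "phase", "hypotheses", "baseline_ms", "result_ms",
           "improvement_pct", "status", "commit", "files", "e2e_status"]


def now_iso() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def append_log(path: Path, row: dict) -> None:
    """Append one row to the TSV log, creating the file and header if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS, delimiter="\t")
        if is_new:
            writer.writeheader()
        # Missing keys are written as empty cells; unknown keys raise ValueError.
        writer.writerow(row)
```

Opening in append mode leaves earlier rows intact, which is what makes tracking across multiple runs possible.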
Phase 4: Gap Analysis (If Target Not Met)
If target metric not reached after all hypotheses:
| Section | Content |
|---|---|
| Achievement | What was achieved (original → final, improvement %) |
| Remaining bottlenecks | From time map: which steps still dominate |
| Remaining cycles | If coordinator runs multi-cycle: "{remaining} optimization cycles available for remaining bottlenecks" |
| Infrastructure recommendations | If bottleneck requires infra changes (scaling, caching layer, CDN) |
| Further research | Optimization directions not explored in this run |
Error Handling
| Error | Recovery |
|---|---|
| Strike fails all tests | Bisect to find breaking hypothesis, remove it, retry |
| Strike shows no improvement | Bisect to identify ineffective hypotheses |
| Measurement inconsistent (high variance) | Increase runs to 10, use median |
| Worktree creation fails | Fall back to branch per git_worktree_fallback.md |
| Time budget exceeded | Stop loop, report partial results with hypotheses remaining |
| Multi-file revert fails | |
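The high-variance recovery can be sketched as an adaptive measurement loop. The source does not define "high variance", so the coefficient-of-variation threshold of 0.10 is an assumption, as is reusing the first 5 samples toward the total of 10; `run_once` is a hypothetical callable returning one timing.

```python
from statistics import mean, median, pstdev
from typing import Callable


def measure(run_once: Callable[[], float], runs: int = 5,
            max_runs: int = 10, cv_threshold: float = 0.10) -> float:
    """Median of `runs` samples, extended to `max_runs` when samples are noisy."""
    samples = [run_once() for _ in range(runs)]
    # Coefficient of variation: standard deviation relative to the mean.
    if pstdev(samples) / mean(samples) > cv_threshold:
        samples += [run_once() for _ in range(max_runs - runs)]
    return median(samples)
```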
References
- optimization_categories.md — optimization pattern checklist
- `shared/references/ci_tool_detection.md` (test + benchmark detection)
- `shared/references/git_worktree_fallback.md` (worktree isolation)
Runtime Summary Artifact
MANDATORY READ: Load `shared/references/coordinator_summary_contract.md`
Write `.hex-skills/runtime-artifacts/runs/{run_id}/optimization-execution/{slug}.json` before finishing.
Definition of Done
- Baseline established using same metric type as observed problem
- Hypotheses triaged: uncontested vs contested
- Strike applied: all uncontested hypotheses implemented at once
- Tests pass after strike
- Contested alternatives A/B tested on top of full implementation
- Bisect performed only if strike fails (not preemptively)
- E2E safety test passes (or documented as unavailable)
- Experiment log written to `.hex-skills/optimization/{slug}/ln-814-log.tsv`
- Report returned with baseline, final, improvement%, strike result
- All changes on isolated branch, pushed to remote
- Gap analysis provided if target metric not met
- Optimization execution artifact written to the shared location
Version: 2.0.0
Last Updated: 2026-03-14