ln-814-optimization-executor

Paths: File paths (`shared/`, `references/`, `../ln-*`) are relative to skills repo root. If not found at CWD, locate this SKILL.md directory and go up one level for repo root. If `shared/` is missing, fetch files via WebFetch from `https://raw.githubusercontent.com/levnikolaevich/claude-code-skills/master/skills/{path}`.
ln-814-optimization-executor

Type: L3 Worker Category: 8XX Optimization
Executes optimization hypotheses from the researcher using a keep/discard autoresearch loop. Supports multi-file changes, compound baselines, and any optimization type (algorithm, architecture, query, caching, batching).

Overview

| Aspect | Details |
|--------|---------|
| Input | `.hex-skills/optimization/{slug}/context.md` OR conversation context (standalone invocation) |
| Output | Optimized code on isolated branch, per-hypothesis results, experiment log |
| Pattern | Strike-first: apply all → test → measure. Bisect only on failure. A/B only for contested alternatives |

Workflow

Phases: Pre-flight → Baseline → Strike-First Execution → Report → Gap Analysis

Phase 0: Pre-flight Checks

Slug Resolution

  • If invoked via Agent with contextStore containing `slug`: use it directly.
  • If invoked standalone: derive slug from the context_file path or ask the user.
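
The resolution order above can be sketched as a small helper. This is a minimal illustration, not the skill's actual implementation: the function name `resolve_slug`, the dict-shaped `context_store`, and the `None` return (meaning "ask the user") are all assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_slug(context_store: dict, context_file: Optional[str] = None) -> Optional[str]:
    """Return the slug, or None when the caller has to ask the user.

    Hypothetical helper: the names and the dict-shaped store are assumptions.
    """
    # 1. Agent invocation: the slug is passed in directly.
    slug = context_store.get("slug")
    if slug:
        return slug

    # 2. Standalone: derive it from .hex-skills/optimization/{slug}/context.md
    if context_file:
        parts = PurePosixPath(context_file).parts
        if (
            len(parts) >= 4
            and parts[-1] == "context.md"
            and parts[-3] == "optimization"
            and parts[-4] == ".hex-skills"
        ):
            return parts[-2]

    # 3. Nothing usable: signal that the user must be asked.
    return None
```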

Step 1: Load Context

Read `.hex-skills/optimization/{slug}/context.md` from project root. It contains the problem statement, profiling results, research hypotheses, and target metric.
If the file is not found, check the conversation context for the same data (standalone invocation).

Step 2: Pre-flight Validation

| Check | Required | Action if Missing |
|-------|----------|-------------------|
| Hypotheses provided (H1..H7) | Yes | Block: nothing to execute |
| Test infrastructure | Yes | Block (see ci_tool_detection.md) |
| Git clean state | Yes | Block (need clean baseline for revert) |
| Worktree isolation | Yes | Create per git_worktree_fallback.md |
| E2E safety test | No (recommended) | Read from context; WARN if null, with the full test suite as fallback gate |

MANDATORY READ: Load `shared/references/git_worktree_fallback.md` and use the optimization rows.
MANDATORY READ: Load `shared/references/ci_tool_detection.md` and use the Test Frameworks + Benchmarks sections.

E2E Safety Test

Read `e2e_test_command` from context file (discovered by profiler during test discovery phase).

| Source | Action |
|--------|--------|
| Context has `e2e_test_command` | Use as functional safety gate in Phase 2 |
| Context has `e2e_test_command = null` | WARN: full test suite serves as fallback gate |
| Standalone (no context) | User must provide test command; block if missing |

Phase 1: Establish Baseline

Reuse baseline from performance map (already measured with real metrics).

From Context File

Read `performance_map.baseline` and `performance_map.test_command` from `.hex-skills/optimization/{slug}/context.md`.

| Field | Source |
|-------|--------|
| `test_command` | Discovered/created test command |
| `baseline` | Multi-metric snapshot: wall time, CPU, memory, I/O |

Verification Run

Run `test_command` once to confirm baseline is still valid (code unchanged since profiling):

| Step | Action |
|------|--------|
| 1 | Run `test_command` |
| 2 | IF result within 10% of `baseline.wall_time_ms` → baseline confirmed |
| 3 | IF result diverges > 10% → re-measure (3 runs, median) as new baseline |
| 4 | IF test FAILS → BLOCK: "test fails on unmodified code" |
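
The 10% check and the three-run re-measurement can be expressed in a few lines. The sketch below assumes a hypothetical `run_once` callable that executes `test_command`, returns wall time in milliseconds, and raises if the test fails (which corresponds to the BLOCK case in step 4). The function name and return shape are illustrative only.

```python
import statistics
from typing import Callable, Tuple


def verify_baseline(
    baseline_ms: float,
    run_once: Callable[[], float],
    tolerance: float = 0.10,
) -> Tuple[float, bool]:
    """Return (baseline to use, whether the stored baseline was confirmed).

    `run_once` is a hypothetical callable: it runs test_command, returns
    wall time in ms, and raises if the test fails on unmodified code.
    """
    first = run_once()
    # Steps 1-2: a single run within 10% confirms the stored baseline.
    if abs(first - baseline_ms) <= tolerance * baseline_ms:
        return baseline_ms, True
    # Step 3: otherwise re-measure with 3 runs and adopt the median.
    samples = [run_once() for _ in range(3)]
    return statistics.median(samples), False
```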

Phase 2: Strike-First Execution

MANDATORY READ: Load optimization_categories.md for pattern reference during implementation.
Apply maximum changes at once. Only fall back to A/B testing where sources genuinely disagree on approach.

Step 1: Triage Hypotheses

Split hypotheses from researcher into two groups:

| Group | Criteria | Action |
|-------|----------|--------|
| Uncontested | Clear best approach, no conflicting alternatives | Apply directly in the strike |
| Contested | Multiple approaches exist (e.g., source A says cache, source B says batch) OR `conflicts_with` another hypothesis | A/B test each alternative on top of full implementation |

Most hypotheses should be uncontested, because the researcher already ranked them by evidence.
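
As a rough illustration of the triage rule, the sketch below splits a list of hypotheses using the two criteria from the table. The `Hypothesis` dataclass and its `alternatives` field are assumptions for the example; only `conflicts_with` is named in the source. A hypothesis is also treated as contested when another hypothesis names it in `conflicts_with`, which is an interpretation rather than something the text states.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Hypothesis:
    # Hypothetical shape; only `conflicts_with` comes from the source text.
    id: str
    conflicts_with: List[str] = field(default_factory=list)
    alternatives: List[str] = field(default_factory=list)  # competing approaches


def triage(hypotheses: List[Hypothesis]) -> Tuple[List[Hypothesis], List[Hypothesis]]:
    """Split hypotheses into (uncontested, contested)."""
    # Collect every ID involved in a conflict, in either direction.
    conflicted = set()
    for h in hypotheses:
        if h.conflicts_with:
            conflicted.add(h.id)
            conflicted.update(h.conflicts_with)

    uncontested, contested = [], []
    for h in hypotheses:
        if h.id in conflicted or len(h.alternatives) > 1:
            contested.append(h)
        else:
            uncontested.append(h)
    return uncontested, contested
```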

Step 2: Strike (Apply All Uncontested)

1. APPLY all uncontested hypotheses at once (all file edits)
2. VERIFY: Run full test suite
   IF tests FAIL:
     - IF fixable (typo, missing import) → fix & re-run ONCE
     - IF fundamental → BISECT (see Step 4)
3. E2E GATE (if e2e_test_command not null):
   IF FAIL → BISECT
4. MEASURE: 5 runs, median
5. COMPARE: improvement vs baseline
   IF improvement meets target → DONE. Commit all:
     git add {all_files}
     git commit -m "perf: apply optimizations H1,H2,H3,... (+{improvement}%)"
   IF no improvement → BISECT
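
The MEASURE step (5 runs, median) could look roughly like the following. It is a sketch under assumptions: `run` is a hypothetical callable that executes the test command and raises on failure, and wall time is taken around the whole call rather than from any metric the test itself reports.

```python
import statistics
import time
from typing import Callable


def measure_median_ms(run: Callable[[], None], runs: int = 5) -> float:
    """Time `run` several times and return the median wall time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run()  # hypothetical: executes test_command, raises if it fails
        samples.append((time.perf_counter() - start) * 1000.0)
    # The median damps the effect of a single slow or fast outlier run.
    return statistics.median(samples)
```

In practice `run` might wrap something like `subprocess.run(test_command, shell=True, check=True)`. The median is used instead of the mean so one noisy run does not decide whether the strike is kept.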

Step 3: Contested Alternatives (A/B on top of strike)

For each contested pair/group, with ALL uncontested changes already applied:
FOR each contested hypothesis group:
  1. Apply alternative A → test → measure (5 runs, median)
  2. Revert alternative A, apply alternative B → test → measure
  3. KEEP the winner. Commit.
  4. Winner becomes part of the baseline for next contested group.
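
The rolling-baseline behaviour of this loop is easy to get wrong, so here is a minimal sketch. It assumes a hypothetical `measure(alternative_id)` callable that applies one alternative on top of the current tree, runs the tests, takes the 5-run median in ms, and reverts. Lower time wins.

```python
from typing import Callable, Dict, List, Tuple


def run_contested(
    groups: List[List[str]],
    measure: Callable[[str], float],
    baseline_ms: float,
) -> Tuple[List[Dict], float]:
    """A/B each contested group; return per-group results and the final baseline."""
    results = []
    for group in groups:
        # Measure every alternative against the same starting state.
        timings = {alt: measure(alt) for alt in group}
        winner = min(timings, key=timings.get)
        results.append(
            {"alternatives": timings, "winner": winner, "baseline_ms": baseline_ms}
        )
        # The kept winner defines the baseline for the next group.
        baseline_ms = timings[winner]
    return results, baseline_ms
```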

Step 4: Bisect (only on strike failure)

If strike fails tests or shows no improvement:
1. Revert all changes: git checkout -- . && git clean -fd
2. Binary search: apply first half of hypotheses → test
   - IF passes → problem in second half
   - IF fails → problem in first half
3. Narrow down to the breaking hypothesis
4. Remove it from strike, re-apply remaining → test → measure
5. Log removed hypothesis with reason
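
The binary search in steps 2 and 3 can be sketched as below. It assumes exactly one breaking hypothesis and a hypothetical `passes(subset)` callable that applies the subset to a clean tree, runs the tests, reverts, and reports success. With several independent culprits the search would need to be repeated after each removal.

```python
from typing import Callable, Optional, Sequence


def find_breaking(
    hypotheses: Sequence[str],
    passes: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Return the single hypothesis that breaks the strike, or None."""
    if passes(hypotheses):
        return None  # the full strike is fine, nothing to bisect

    # Invariant: hypotheses[:lo] passes, hypotheses[:hi] fails.
    lo, hi = 0, len(hypotheses)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(hypotheses[:mid]):
            lo = mid  # first part is clean, so the culprit is later
        else:
            hi = mid  # the culprit is inside the first part
    return hypotheses[lo]
```

Testing prefixes rather than arbitrary halves keeps earlier hypotheses applied, which matters when a later change builds on an earlier one.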

Scope Rules

| Rule | Description |
|------|-------------|
| File scope | Multiple files allowed (not limited to single function) |
| Signature changes | Allowed if tests still pass |
| New files | Allowed (cache wrapper, batch adapter, utility) |
| New dependencies | Allowed if already in project ecosystem (e.g., using configured Redis) |
| Time budget | 45 minutes total |

Revert Protocol

| Scope | Command |
|-------|---------|
| Full revert | `git checkout -- . && git clean -fd` (safe in worktree) |
| Single hypothesis | `git checkout -- {files}` (only during bisect) |

Safety Rules

| Rule | Description |
|------|-------------|
| Traceability | Commit message lists all applied hypothesis IDs |
| Isolation | All work in isolated worktree; never modify main worktree |
| Bisect only on failure | Do NOT test hypotheses individually unless strike fails or alternatives genuinely conflict |
| Crash triage | Runtime crash → fix once if trivial (typo, import), else bisect to find cause |

Stop Conditions (Execution Loop)

| Condition | Action |
|-----------|--------|
| Strike passes + improvement meets target | STOP: commit, proceed to Report |
| All contested alternatives tested | STOP: commit winner, proceed to Report |
| Bisect removes all hypotheses | STOP: report "all hypotheses failed" with profiling data |
| Time budget exceeded (45 min) | STOP: report partial results with remaining hypotheses |
| All tests fail after strike + bisect | STOP: full revert, report diagnostic value only |

Phase 3: Report Results

Report Schema

| Field | Description |
|-------|-------------|
| baseline | Original measurement (metric + value) |
| final | Final measurement after optimizations |
| total_improvement_pct | Overall percentage improvement |
| target_met | Boolean: did we reach the target metric? |
| strike_result | `clean` (all applied) / `bisected` (some removed) / `failed` |
| hypotheses_applied | List of hypothesis IDs applied in strike |
| hypotheses_removed | List removed during bisect (with reasons) |
| contested_results | Per-contested group: alternatives tested, winner, measurement |
| branch | Worktree branch name |
| files_modified | All changed files |
| e2e_test | `{ command, source, baseline_passed, final_passed }` or null |

Results Comparison (mandatory)

Show baseline vs final for EVERY metric from `performance_map.baseline`. Include both percentage and multiplier.

| Metric | Baseline | After Strike | Improvement |
|--------|----------|-------------|-------------|
| Wall time | 7280ms | 3800ms | 47.8% (1.9x) |
| CPU time | 850ms | 720ms | 15.3% (1.2x) |
| Memory peak | 256MB | 245MB | 4.3% (1.0x) |
| HTTP round-trips | 13 | 2 | 84.6% (6.5x) |

Target: 5000ms → Achieved: 3800ms ✓ TARGET MET
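
The Improvement column combines a percentage reduction with a speed-up multiplier. A small helper, shown here only as an illustration (the function name is an assumption), reproduces the figures in the table above:

```python
def format_improvement(baseline: float, final: float) -> str:
    """Format a lower-is-better metric as 'pct% (mult x)'."""
    pct = (baseline - final) / baseline * 100.0  # share of the baseline removed
    mult = baseline / final                      # how many times smaller it got
    return f"{pct:.1f}% ({mult:.1f}x)"


print(format_improvement(7280, 3800))  # 47.8% (1.9x)
print(format_improvement(13, 2))       # 84.6% (6.5x)
```

Both numbers describe the same change. The percentage is easier to compare against a target, while the multiplier reads better for large reductions such as the round-trip count.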

Per-Function Delta (if instrumentation available)

If `instrumented_files` from context is non-empty, run `test_command` once more AFTER strike to capture per-function timing with the same instrumentation the profiler placed:

| Function | Before (ms) | After (ms) | Delta |
|----------|------------|------------|-------|
| mt_translate | 3500 | 450 | -87% (7.8x) |
| tikal_extract | 2800 | 2800 | 0% (unchanged) |

Then clean up with `git checkout -- {instrumented_files}` to remove all profiling instrumentation before the final commit.
Present both tables to the user. This is the primary deliverable: the numbers the user sees first.

Experiment Log

Write to `{project_root}/.hex-skills/optimization/{slug}/ln-814-log.tsv`:

| Column | Description |
|--------|-------------|
| timestamp | ISO 8601 |
| phase | `strike` / `bisect` / `contested` |
| hypotheses | Comma-separated IDs applied in this round |
| baseline_ms | Baseline before this round |
| result_ms | Measurement after changes |
| improvement_pct | Percentage change |
| status | `applied` / `removed` / `alternative_a` / `alternative_b` |
| commit | Git commit hash |
| files | Comma-separated modified files |
| e2e_status | pass / fail / skipped |

Append to existing file if present (enables tracking across multiple runs).
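
Appending a row with the columns above might be done as follows. This is a sketch, not the skill's own writer: the helper name is hypothetical, and writing a header line when the file is new or empty is an assumption the source does not specify.

```python
import csv
import os
from datetime import datetime, timezone

COLUMNS = [
    "timestamp", "phase", "hypotheses", "baseline_ms", "result_ms",
    "improvement_pct", "status", "commit", "files", "e2e_status",
]


def append_log_row(log_path: str, row: dict) -> None:
    """Append one experiment row to the TSV log, creating it if needed."""
    os.makedirs(os.path.dirname(log_path) or ".", exist_ok=True)
    # Assumption: emit a header only when the file is new or empty.
    is_new = not os.path.exists(log_path) or os.path.getsize(log_path) == 0
    with open(log_path, "a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS, delimiter="\t")
        if is_new:
            writer.writeheader()
        # Default timestamp is the current UTC time in ISO 8601 form.
        stamped = {"timestamp": datetime.now(timezone.utc).isoformat(), **row}
        writer.writerow(stamped)
```

Opening in append mode keeps earlier runs intact, which is what makes tracking across multiple runs possible.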

Phase 4: Gap Analysis (If Target Not Met)

If target metric not reached after all hypotheses:

| Section | Content |
|---------|---------|
| Achievement | What was achieved (original → final, improvement %) |
| Remaining bottlenecks | From time map: which steps still dominate |
| Remaining cycles | If coordinator runs multi-cycle: "{remaining} optimization cycles available for remaining bottlenecks" |
| Infrastructure recommendations | If bottleneck requires infra changes (scaling, caching layer, CDN) |
| Further research | Optimization directions not explored in this run |

Error Handling

| Error | Recovery |
|-------|----------|
| Strike fails all tests | Bisect to find breaking hypothesis, remove it, retry |
| Strike shows no improvement | Bisect to identify ineffective hypotheses |
| Measurement inconsistent (high variance) | Increase runs to 10, use median |
| Worktree creation fails | Fall back to branch per git_worktree_fallback.md |
| Time budget exceeded | Stop loop, report partial results with hypotheses remaining |
| Multi-file revert fails | `git checkout -- .` in worktree (safe: worktree is isolated) |

References

  • optimization_categories.md (optimization pattern checklist)
  • `shared/references/ci_tool_detection.md` (test + benchmark detection)
  • `shared/references/git_worktree_fallback.md` (worktree isolation)

Runtime Summary Artifact

MANDATORY READ: Load `shared/references/coordinator_summary_contract.md`
Write `.hex-skills/runtime-artifacts/runs/{run_id}/optimization-execution/{slug}.json` before finishing.

Definition of Done

  • Baseline established using same metric type as observed problem
  • Hypotheses triaged: uncontested vs contested
  • Strike applied: all uncontested hypotheses implemented at once
  • Tests pass after strike
  • Contested alternatives A/B tested on top of full implementation
  • Bisect performed only if strike fails (not preemptively)
  • E2E safety test passes (or documented as unavailable)
  • Experiment log written to `.hex-skills/optimization/{slug}/ln-814-log.tsv`
  • Report returned with baseline, final, improvement%, strike result
  • All changes on isolated branch, pushed to remote
  • Gap analysis provided if target metric not met
  • Optimization execution artifact written to the shared location

Version: 2.0.0 Last Updated: 2026-03-14