ln-814-optimization-executor

Paths: File paths (`shared/`, `references/`, `../ln-*`) are relative to the skills repo root. If not found at CWD, locate this SKILL.md directory and go up one level for the repo root. If `shared/` is missing, fetch files via WebFetch from `https://raw.githubusercontent.com/levnikolaevich/claude-code-skills/master/skills/{path}`.

Type: L3 Worker Category: 8XX Optimization

Executes optimization hypotheses from the researcher using a keep/discard autoresearch loop. Supports multi-file changes, compound baselines, and any optimization type (algorithm, architecture, query, caching, batching).

Overview

| Aspect | Details |
|--------|---------|
| Input | `.hex-skills/optimization/{slug}/context.md` OR conversation context (standalone invocation) |
| Output | Optimized code on isolated branch, per-hypothesis results, experiment log |
| Pattern | Strike-first: apply all → test → measure. Bisect only on failure. A/B only for contested alternatives |

Workflow

Phases: Pre-flight → Baseline → Strike-First Execution → Report → Gap Analysis

Phase 0: Pre-flight Checks

Slug Resolution

If invoked via Agent with contextStore containing `slug`: use it directly.
If invoked standalone: derive the slug from the context_file path or ask the user.
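
The resolution order above can be sketched in a few lines. This is a minimal illustration, not the skill's actual implementation: the function name `resolve_slug` and the dict-shaped `context_store` are assumptions, and the path layout is taken from the `.hex-skills/optimization/{slug}/context.md` convention used in this document.

```python
from pathlib import Path
from typing import Optional


def resolve_slug(context_store: Optional[dict], context_file: Optional[str]) -> Optional[str]:
    """Return the slug, or None when the caller has to ask the user."""
    # 1. Agent invocation: contextStore already carries the slug.
    if context_store and context_store.get("slug"):
        return context_store["slug"]
    # 2. Standalone: derive it from .../optimization/{slug}/context.md
    if context_file:
        path = Path(context_file)
        if path.name == "context.md" and path.parent.parent.name == "optimization":
            return path.parent.name
    # 3. Nothing to derive from: the caller should ask the user.
    return None
```

Returning `None` instead of raising keeps the "ask user" fallback in the hands of the caller.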

Step 1: Load Context

Read `.hex-skills/optimization/{slug}/context.md` from the project root. It contains the problem statement, profiling results, research hypotheses, and target metric.

If file not found: check conversation context for the same data (standalone invocation).

Step 2: Pre-flight Validation

| Check | Required | Action if Missing |
|-------|----------|-------------------|
| Hypotheses provided (H1..H7) | Yes | Block: nothing to execute |
| Test infrastructure | Yes | Block (see ci_tool_detection.md) |
| Git clean state | Yes | Block (need clean baseline for revert) |
| Worktree isolation | Yes | Create per git_worktree_fallback.md |
| E2E safety test | No (recommended) | Read from context; WARN if null (full test suite as fallback gate) |

MANDATORY READ: Load `shared/references/git_worktree_fallback.md` and use the optimization rows.
MANDATORY READ: Load `shared/references/ci_tool_detection.md` and use the Test Frameworks + Benchmarks sections.

E2E Safety Test

Read `e2e_test_command` from the context file (discovered by the profiler during the test discovery phase).

| Source | Action |
|--------|--------|
| Context has `e2e_test_command` | Use as functional safety gate in Phase 2 |
| Context has `e2e_test_command = null` | WARN: full test suite serves as fallback gate |
| Standalone (no context) | User must provide test command; block if missing |

Phase 1: Establish Baseline

Reuse baseline from performance map (already measured with real metrics).

From Context File

Read `performance_map.baseline` and `performance_map.test_command` from `.hex-skills/optimization/{slug}/context.md`:

| Field | Source |
|-------|--------|
| `test_command` | Discovered/created test command |
| `baseline` | Multi-metric snapshot: wall time, CPU, memory, I/O |

Verification Run

Run `test_command` once to confirm the baseline is still valid (code unchanged since profiling):

| Step | Action |
|------|--------|
| 1 | Run `test_command` |
| 2 | IF result within 10% of `baseline.wall_time_ms` → baseline confirmed |
| 3 | IF result diverges > 10% → re-measure (3 runs, median) as new baseline |
| 4 | IF test FAILS → BLOCK: "test fails on unmodified code" |
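
The verification steps above can be sketched as follows. This is an illustration under stated assumptions: `run_once` is a hypothetical callable that runs `test_command` and returns wall time in milliseconds (or `None` when the test fails), and the three re-measurement runs are taken as fresh runs rather than reusing the first sample, which the source does not specify.

```python
from statistics import median
from typing import Callable, Optional, Tuple


def verify_baseline(
    baseline_ms: float,
    run_once: Callable[[], Optional[float]],  # hypothetical: runs test_command, returns ms or None on failure
    tolerance: float = 0.10,
    remeasure_runs: int = 3,
) -> Tuple[str, float]:
    """Return (status, baseline_ms) following the verification table."""
    first = run_once()
    if first is None:
        # Step 4: a failing test on unmodified code blocks the whole run.
        raise RuntimeError("test fails on unmodified code")
    if abs(first - baseline_ms) <= tolerance * baseline_ms:
        # Step 2: within 10% of the profiled baseline.
        return "confirmed", baseline_ms
    # Step 3: diverged, so re-measure and adopt the median as the new baseline.
    samples = []
    for _ in range(remeasure_runs):
        sample = run_once()
        if sample is None:
            raise RuntimeError("test fails on unmodified code")
        samples.append(sample)
    return "remeasured", median(samples)
```

Using the median rather than the mean keeps a single slow outlier run from shifting the new baseline.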

Phase 2: Strike-First Execution

MANDATORY READ: Load optimization_categories.md for pattern reference during implementation.

Apply maximum changes at once. Only fall back to A/B testing where sources genuinely disagree on approach.

Step 1: Triage Hypotheses

Split hypotheses from researcher into two groups:

| Group | Criteria | Action |
|-------|----------|--------|
| Uncontested | Clear best approach, no conflicting alternatives | Apply directly in the strike |
| Contested | Multiple approaches exist (e.g., source A says cache, source B says batch) OR `conflicts_with` another hypothesis | A/B test each alternative on top of full implementation |

Most hypotheses should be uncontested — the researcher already ranked them by evidence.
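
A minimal sketch of the triage rule follows. The `Hypothesis` record and its `alternatives` field are assumptions made for illustration; only `conflicts_with` is named in the source, and the real context file may represent hypotheses differently.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Hypothesis:
    id: str
    # Assumed field: competing approaches proposed for the same problem.
    alternatives: List[str] = field(default_factory=list)
    # IDs of hypotheses this one conflicts with.
    conflicts_with: List[str] = field(default_factory=list)


def triage(hypotheses: List[Hypothesis]) -> Tuple[List[Hypothesis], List[Hypothesis]]:
    """Split hypotheses into (uncontested, contested), preserving order."""
    uncontested: List[Hypothesis] = []
    contested: List[Hypothesis] = []
    for h in hypotheses:
        if len(h.alternatives) > 1 or h.conflicts_with:
            contested.append(h)
        else:
            uncontested.append(h)
    return uncontested, contested
```

The uncontested list feeds the strike directly; the contested list is deferred to the A/B step.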

Step 2: Strike (Apply All Uncontested)

1. APPLY all uncontested hypotheses at once (all file edits)
2. VERIFY: Run full test suite
   IF tests FAIL:
     - IF fixable (typo, missing import) → fix & re-run ONCE
     - IF fundamental → BISECT (see Step 4)
3. E2E GATE (if e2e_test_command not null):
   IF FAIL → BISECT
4. MEASURE: 5 runs, median
5. COMPARE: improvement vs baseline
   IF improvement meets target → DONE. Commit all:
     git add {all_files}
     git commit -m "perf: apply optimizations H1,H2,H3,... (+{improvement}%)"
   IF no improvement → BISECT
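
The measure and compare steps of the strike can be sketched as below. Both function names are hypothetical. The sketch assumes the metric is wall time (lower is better) and that `test_command` is a shell string. The source does not say what happens when the strike improves on the baseline but misses the target, so the `improved_below_target` outcome is an assumption that simply hands over to gap analysis.

```python
import subprocess
import time
from statistics import median
from typing import Optional


def measure_median_ms(command: str, runs: int = 5) -> float:
    """Run the command `runs` times and return the median wall time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, shell=True, check=True, capture_output=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


def strike_outcome(
    tests_pass: bool,
    e2e_pass: Optional[bool],  # None means no e2e_test_command, gate skipped
    baseline_ms: float,
    result_ms: float,
    target_ms: float,
) -> str:
    """Map strike results onto the next action."""
    if not tests_pass or e2e_pass is False:
        return "bisect"
    if result_ms >= baseline_ms:
        return "bisect"  # no improvement
    if result_ms <= target_ms:
        return "commit"  # improvement meets target
    return "improved_below_target"  # assumption: keep changes, go to gap analysis
```

Keeping the decision function free of side effects makes the branch logic easy to check without running any benchmark.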

Step 3: Contested Alternatives (A/B on top of strike)

For each contested pair/group, with ALL uncontested changes already applied:

FOR each contested hypothesis group:
  1. Apply alternative A → test → measure (5 runs, median)
  2. Revert alternative A, apply alternative B → test → measure
  3. KEEP the winner. Commit.
  4. Winner becomes part of the baseline for next contested group.
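
The loop above can be sketched as follows. `evaluate` is a hypothetical callable: given an alternative and the winners kept so far, it is assumed to apply the alternative on top of the strike plus those winners, run the tests, measure (5 runs, median), revert, and return `(passed, wall_time_ms)`. Lower wall time is assumed to win.

```python
from typing import Callable, Dict, List, Tuple


def run_contested(
    groups: Dict[str, List[str]],
    evaluate: Callable[[str, List[str]], Tuple[bool, float]],  # hypothetical, see lead-in
) -> Dict[str, str]:
    """Return the winning alternative per contested group."""
    winners: Dict[str, str] = {}
    for group_id, alternatives in groups.items():
        results: Dict[str, float] = {}
        for alt in alternatives:
            # Earlier winners are part of the baseline for this group.
            passed, ms = evaluate(alt, list(winners.values()))
            if passed:
                results[alt] = ms
        if results:
            # KEEP the fastest passing alternative.
            winners[group_id] = min(results, key=results.get)
    return winners
```

A group in which every alternative fails its tests contributes no winner, so the strike result stands unchanged for that group.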

Step 4: Bisect (only on strike failure)

If strike fails tests or shows no improvement:

1. Revert all changes: git checkout -- . && git clean -fd
2. Binary search: apply first half of hypotheses → test
   - IF passes → problem in second half
   - IF fails → problem in first half
3. Narrow down to the breaking hypothesis
4. Remove it from strike, re-apply remaining → test → measure
5. Log removed hypothesis with reason
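
A sketch of the binary search in steps 2 and 3, under two assumptions that the source does not state: exactly one hypothesis breaks the tests, and any prefix of the list that contains it fails. `passes` is a hypothetical callable that applies a subset of hypotheses to a clean tree and runs the tests. The sketch tests growing prefixes, so the "second half" is always checked on top of the first half.

```python
from typing import Callable, List, Sequence


def find_breaking(
    hypotheses: Sequence[str],
    passes: Callable[[List[str]], bool],  # hypothetical: apply subset on clean tree, run tests
) -> str:
    """Return the ID of the breaking hypothesis using O(log n) test runs."""
    # Invariant: hypotheses[:lo] passes (the clean tree does),
    # hypotheses[:hi] fails (the full strike did).
    lo, hi = 0, len(hypotheses)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(list(hypotheses[:mid])):
            lo = mid  # problem is in the later part
        else:
            hi = mid  # problem is in the earlier part
    return hypotheses[hi - 1]
```

For example, with five hypotheses where H4 breaks the tests, the search runs the tests on prefixes of length 2, 3, and 4 before returning `"H4"`. If several hypotheses break the tests, the search would have to be repeated after removing each one found.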

Scope Rules

| Rule | Description |
|------|-------------|
| File scope | Multiple files allowed (not limited to single function) |
| Signature changes | Allowed if tests still pass |
| New files | Allowed (cache wrapper, batch adapter, utility) |
| New dependencies | Allowed if already in project ecosystem (e.g., using configured Redis) |
| Time budget | 45 minutes total |

Revert Protocol

| Scope | Command |
|-------|---------|
| Full revert | `git checkout -- . && git clean -fd` (safe in worktree) |
| Single hypothesis | `git checkout -- {files}` (only during bisect) |

Safety Rules

| Rule | Description |
|------|-------------|
| Traceability | Commit message lists all applied hypothesis IDs |
| Isolation | All work in isolated worktree; never modify main worktree |
| Bisect only on failure | Do NOT test hypotheses individually unless strike fails or alternatives genuinely conflict |
| Crash triage | Runtime crash → fix once if trivial (typo, import), else bisect to find cause |

Stop Conditions (Execution Loop)

| Condition | Action |
|-----------|--------|
| Strike passes + improvement meets target | STOP: commit, proceed to Report |
| All contested alternatives tested | STOP: commit winner, proceed to Report |
| Bisect removes all hypotheses | STOP: report "all hypotheses failed" with profiling data |
| Time budget exceeded (45 min) | STOP: report partial results with remaining hypotheses |
| All tests fail after strike + bisect | STOP: full revert, report diagnostic value only |

Phase 3: Report Results

Report Schema

| Field | Description |
|-------|-------------|
| baseline | Original measurement (metric + value) |
| final | Final measurement after optimizations |
| total_improvement_pct | Overall percentage improvement |
| target_met | Boolean: did we reach the target metric? |
| strike_result | `clean` (all applied) / `bisected` (some removed) / `failed` |
| hypotheses_applied | List of hypothesis IDs applied in strike |
| hypotheses_removed | List removed during bisect (with reasons) |
| contested_results | Per-contested group: alternatives tested, winner, measurement |
| branch | Worktree branch name |
| files_modified | All changed files |
| e2e_test | `{ command, source, baseline_passed, final_passed }` or null |

Results Comparison (mandatory)

Show baseline vs final for EVERY metric from `performance_map.baseline`. Include both percentage and multiplier.

| Metric | Baseline | After Strike | Improvement |
|--------|----------|-------------|-------------|
| Wall time | 7280ms | 3800ms | 47.8% (1.9x) |
| CPU time | 850ms | 720ms | 15.3% (1.2x) |
| Memory peak | 256MB | 245MB | 4.3% (1.0x) |
| HTTP round-trips | 13 | 2 | 84.6% (6.5x) |

Target: 5000ms → Achieved: 3800ms ✓ TARGET MET
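
The Improvement column combines two figures derived from the same pair of measurements. A small sketch of the arithmetic, assuming positive values of a lower-is-better metric (the function name is illustrative):

```python
from typing import Tuple


def improvement(baseline: float, final: float) -> Tuple[float, float]:
    """Return (percentage reduction, speedup multiplier), each rounded to one decimal."""
    pct = (baseline - final) / baseline * 100.0
    multiplier = baseline / final
    return round(pct, 1), round(multiplier, 1)


print(improvement(7280, 3800))  # wall time row: (47.8, 1.9)
print(improvement(13, 2))       # HTTP round-trips row: (84.6, 6.5)
```

The two figures answer different questions: the percentage shows how much of the original cost was removed, while the multiplier shows how many times faster the result is.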

Per-Function Delta (if instrumentation available)

If `instrumented_files` from context is non-empty, run `test_command` once more AFTER the strike to capture per-function timing with the same instrumentation the profiler placed:

| Function | Before (ms) | After (ms) | Delta |
|----------|------------|------------|-------|
| mt_translate | 3500 | 450 | -87% (7.8x) |
| tikal_extract | 2800 | 2800 | 0% (unchanged) |

Then clean up: run `git checkout -- {instrumented_files}` to remove all profiling instrumentation before the final commit.

Present both tables to user. This is the primary deliverable — numbers the user sees first.

Experiment Log

Write to `{project_root}/.hex-skills/optimization/{slug}/ln-814-log.tsv`:

| Column | Description |
|--------|-------------|
| timestamp | ISO 8601 |
| phase | `strike` / `bisect` / `contested` |
| hypotheses | Comma-separated IDs applied in this round |
| baseline_ms | Baseline before this round |
| result_ms | Measurement after changes |
| improvement_pct | Percentage change |
| status | `applied` / `removed` / `alternative_a` / `alternative_b` |
| commit | Git commit hash |
| files | Comma-separated modified files |
| e2e_status | pass / fail / skipped |

Append to existing file if present (enables tracking across multiple runs).
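
Appending a row to the log might look like the sketch below. The column order follows the table above; writing a header row when the file is new is an assumption, since the source does not say whether the TSV carries one. `append_log_row` and `now_iso` are illustrative names.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

COLUMNS = [
    "timestamp", "phase", "hypotheses", "baseline_ms", "result_ms",
    "improvement_pct", "status", "commit", "files", "e2e_status",
]


def now_iso() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def append_log_row(log_path: Path, row: dict) -> None:
    """Append one experiment row, creating the file (with header) if needed."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not log_path.exists() or log_path.stat().st_size == 0
    with log_path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(
            fh, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
        )
        if is_new:
            writer.writeheader()  # assumption: header only on first write
        writer.writerow(row)
```

Opening in append mode is what lets repeated runs accumulate in the same file, so earlier experiments stay visible for comparison.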

Phase 4: Gap Analysis (If Target Not Met)

If target metric not reached after all hypotheses:

| Section | Content |
|---------|---------|
| Achievement | What was achieved (original → final, improvement %) |
| Remaining bottlenecks | From time map: which steps still dominate |
| Remaining cycles | If coordinator runs multi-cycle: "{remaining} optimization cycles available for remaining bottlenecks" |
| Infrastructure recommendations | If bottleneck requires infra changes (scaling, caching layer, CDN) |
| Further research | Optimization directions not explored in this run |

Error Handling

| Error | Recovery |
|-------|----------|
| Strike fails all tests | Bisect to find breaking hypothesis, remove it, retry |
| Strike shows no improvement | Bisect to identify ineffective hypotheses |
| Measurement inconsistent (high variance) | Increase runs to 10, use median |
| Worktree creation fails | Fall back to branch per git_worktree_fallback.md |
| Time budget exceeded | Stop loop, report partial results with hypotheses remaining |
| Multi-file revert fails | `git checkout -- .` in worktree (safe: the worktree is isolated) |

References

optimization_categories.md (optimization pattern checklist)
`shared/references/ci_tool_detection.md` (test + benchmark detection)
`shared/references/git_worktree_fallback.md` (worktree isolation)

Runtime Summary Artifact

MANDATORY READ: Load `shared/references/coordinator_summary_contract.md`.

Write `.hex-skills/runtime-artifacts/runs/{run_id}/optimization-execution/{slug}.json` before finishing.

Definition of Done

Baseline established using same metric type as observed problem
Hypotheses triaged: uncontested vs contested
Strike applied: all uncontested hypotheses implemented at once
Tests pass after strike
Contested alternatives A/B tested on top of full implementation
Bisect performed only if strike fails (not preemptively)
E2E safety test passes (or documented as unavailable)

Experiment log written to `.hex-skills/optimization/{slug}/ln-814-log.tsv`

Report returned with baseline, final, improvement%, strike result
All changes on isolated branch, pushed to remote
Gap analysis provided if target metric not met
Optimization execution artifact written to the shared location

Version: 2.0.0 Last Updated: 2026-03-14
