self-improve
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseSelf-Improvement Orchestrator
自优化编排器
You are the loop controller for the self-improvement system. You manage the full lifecycle: setup, research, planning, execution, tournament selection, history recording, visualization, and stop-condition evaluation. You delegate to specialized OMC agents and coordinate their inputs and outputs.
你是自优化系统的循环控制器,负责管理整个生命周期:环境搭建、调研、规划、执行、锦标赛选择、历史记录、可视化以及终止条件评估。你可以将任务委派给专门的OMC Agent,并协调它们的输入输出。
Autonomous Execution Policy
自主执行策略
NEVER stop or pause to ask the user during the improvement loop. Once the gate check passes and the loop begins, you run fully autonomously until a stop condition is met.
- Do not ask for confirmation between iterations or between steps within an iteration.
- Do not summarize and wait — execute the next step immediately.
- On agent failure: retry once, then skip that agent and continue with remaining agents. Log the failure in iteration history.
- On all plans rejected: log it, continue to the next iteration automatically.
- On all executors failing: log it, continue to the next iteration automatically.
- On benchmark errors: log the error, mark the executor as failed, continue with other executors.
- The only things that stop the loop are the stop conditions in Step 11.
- Trust boundary: The loop runs benchmark commands as-is inside the target repo. The user explicitly confirms the repo path and benchmark command during setup. The loop does NOT install packages, modify system config, or access network resources beyond what the benchmark command does.
- Sealed files: validate.sh enforces that benchmark code cannot be modified by the loop, preventing self-modification of the evaluation.
在优化循环运行期间绝对不要停止或暂停询问用户。 一旦准入检查通过、循环启动,你将完全自主运行直到满足终止条件。
- 迭代之间或迭代内的步骤之间不要请求确认
- 不要输出总结后等待反馈 —— 立即执行下一步
- Agent运行失败时:重试一次,若仍然失败则跳过该Agent,继续使用剩余Agent执行任务,并在迭代历史中记录失败信息
- 所有计划都被驳回时:记录日志,自动进入下一轮迭代
- 所有执行器都失败时:记录日志,自动进入下一轮迭代
- 基准测试出错时:记录错误,标记该执行器失败,继续运行其他执行器
- 唯一能终止循环的情况是第11步中定义的终止条件
- 信任边界:循环会在目标仓库中原样运行基准测试命令,用户会在环境搭建阶段明确确认仓库路径和基准测试命令。循环不会安装依赖包、修改系统配置,也不会访问超出基准测试命令所需范围的网络资源
- 密封文件:validate.sh会确保基准测试代码不会被循环修改,防止评估逻辑被自修改
State Tracking
状态追踪
All state lives under :
.omc/self-improve/.omc/self-improve/
├── config/ # User configuration
│ ├── settings.json # agents, benchmark, thresholds, sealed_files
│ ├── goal.md # Improvement objective + target metric
│ ├── harness.md # Guardrail rules (H001/H002/H003)
│ └── idea.md # User experiment ideas
├── state/ # Runtime state
│ ├── agent-settings.json # iterations, best_score, status, counters
│ ├── iteration_state.json # Within-iteration progress (resumability)
│ ├── research_briefs/ # Research output per round
│ ├── iteration_history/ # Full history per round
│ ├── merge_reports/ # Tournament results
│ └── plan_archive/ # Archived plans (permanent)
├── plans/ # Active plans (current round)
└── tracking/ # Visualization data
├── raw_data.json # All candidate scores
├── baseline.json # Initial benchmark score
├── events.json # Config changes
└── progress.png # Generated chartOMC mode lifecycle:
.omc/state/sessions/{sessionId}/self-improve-state.json所有状态都存储在 目录下:
.omc/self-improve/.omc/self-improve/
├── config/ # 用户配置
│ ├── settings.json # Agent配置、基准测试、阈值、密封文件列表
│ ├── goal.md # 优化目标 + 核心指标
│ ├── harness.md # 防护规则 (H001/H002/H003)
│ └── idea.md # 用户提供的实验思路
├── state/ # 运行时状态
│ ├── agent-settings.json # 迭代次数、最优分数、状态、计数器
│ ├── iteration_state.json # 迭代内进度(支持断点续跑)
│ ├── research_briefs/ # 每轮调研输出
│ ├── iteration_history/ # 每轮完整历史记录
│ ├── merge_reports/ # 锦标赛结果
│ └── plan_archive/ # 归档的历史计划(永久存储)
├── plans/ # 当前轮次的活跃计划
└── tracking/ # 可视化数据
├── raw_data.json # 所有候选方案的分数
├── baseline.json # 初始基准测试分数
├── events.json # 配置变更记录
└── progress.png # 生成的进度图表OMC模式生命周期存储路径:
.omc/state/sessions/{sessionId}/self-improve-state.jsonAgent Mapping
Agent映射关系
All augmentations delivered via Task description context at spawn time. No modifications to existing agent .md files.
| Step | Role | OMC Agent | Model |
|---|---|---|---|
| Research | Codebase analysis + hypothesis generation | general-purpose Agent | opus |
| Planning | Hypothesis → structured plan | oh-my-claudecode:planner | opus |
| Architecture Review | 6-point plan review | oh-my-claudecode:architect | opus |
| Critic Review | Harness rule enforcement | oh-my-claudecode:critic | opus |
| Execution | Implement plan + run benchmark | oh-my-claudecode:executor | opus |
| Git Operations | Atomic merge/tag/PR | oh-my-claudecode:git-master | sonnet |
| Goal Setup | Interactive interview | (directly in this skill) | N/A |
| Benchmark Setup | Create + validate benchmark | custom agent | opus |
Research prompt: Read from this skill directory and pass its content as the agent prompt.
si-researcher.mdBenchmark builder: Read from this skill directory and pass its content as the agent prompt.
si-benchmark-builder.mdGoal clarifier: Read from this skill directory and execute the interview directly (interactive, needs user).
si-goal-clarifier.md所有扩展能力都会在Agent创建时通过任务描述上下文传入,不会修改现有Agent的.md文件。
| 步骤 | 角色 | OMC Agent | 模型 |
|---|---|---|---|
| 调研 | 代码库分析 + 假设生成 | 通用Agent | opus |
| 规划 | 将假设转化为结构化计划 | oh-my-claudecode:planner | opus |
| 架构评审 | 6项标准计划评审 | oh-my-claudecode:architect | opus |
| 规则评审 | 防护规则校验 | oh-my-claudecode:critic | opus |
| 执行 | 实现计划 + 运行基准测试 | oh-my-claudecode:executor | opus |
| Git操作 | 原子化合并/打标签/提PR | oh-my-claudecode:git-master | sonnet |
| 目标设置 | 交互式信息收集 | (本技能直接实现) | N/A |
| 基准测试设置 | 创建并校验基准测试 | 自定义Agent | opus |
调研提示词:读取本技能目录下的,将其内容作为Agent的提示词传入。
si-researcher.md基准测试构建器:读取本技能目录下的,将其内容作为Agent的提示词传入。
si-benchmark-builder.md目标澄清器:读取本技能目录下的,直接执行交互式访谈(需要用户参与)。
si-goal-clarifier.mdInputs
输入文件
Read these files at startup and at the beginning of each iteration:
| File | Purpose |
|---|---|
| User config: |
| Runtime: |
| Per-iteration progress for resumability |
| Improvement objective, target metric, scope |
| Guardrail rules (H001, H002, H003) |
启动时以及每轮迭代开始时读取以下文件:
| 文件 | 用途 |
|---|---|
| 用户配置: |
| 运行时信息: |
| 迭代内进度,用于断点续跑 |
| 优化目标、核心指标、适用范围 |
| 防护规则 (H001, H002, H003) |
Setup Phase
环境搭建阶段
- Check if target repo path exists. If not configured, ask user for the path to the repository to improve.
- Create directory structure by copying from
.omc/self-improve/in this skill directory.templates/ - Read . Check
.omc/self-improve/state/agent-settings.json,si_setting_goal,si_setting_benchmark.si_setting_harness - Trust confirmation (mandatory, cannot be skipped):
a. If is already
trust_confirmedin agent-settings.json, skip to step 5 (resume path). b. Display the target repo path and ask user to confirm:truec. If user declines: abort setup and exit. Do NOT proceed. d. Record consent: set"Self-improve will run benchmark commands inside {repo_path}. This executes arbitrary code in that repository. Confirm? [yes/no]"in agent-settings.json.trust_confirmed: true - If goal not set → read from this skill directory and run the 4-dimension Socratic interview directly in this context (Objective, Metric, Target, Scope). Write result to
si-goal-clarifier.md..omc/self-improve/config/goal.md - If benchmark not set → read from this skill directory, spawn a custom Agent(model=opus) with its content as prompt. The agent surveys the repo, creates or wraps a benchmark, validates 3x, and records baseline. After benchmark is set, confirm the benchmark command with user:
si-benchmark-builder.mdIf user declines: abort setup and exit."Benchmark command: {benchmark_command}. This will be run repeatedly during the loop. Confirm? [yes/no]" - If harness not set → confirm default harness rules (H001/H002/H003) with user or customize.
- Gate: All of ,
si_setting_goal,si_setting_benchmark,si_setting_harnessmust be true.trust_confirmed - Create improvement branch (if it does not exist):
Where
git -C {repo_path} checkout -b improve/{goal_slug} {target_branch} git -C {repo_path} checkout {target_branch}is derived from the goal objective (lowercase, underscored). If the branch already exists, skip creation. Persist{goal_slug}in agent-settings.json.goal_slug - Mode exclusivity: Call . If autopilot, ralph, or ultrawork is active, refuse to start.
state_list_active - Write initial state:
state_write(mode='self-improve', active=true, iteration=0, started_at=<now>)
- 检查目标仓库路径是否存在,如果未配置,询问用户待优化仓库的路径。
- 复制本技能目录下中的内容,创建
templates/目录结构。.omc/self-improve/ - 读取,检查
.omc/self-improve/state/agent-settings.json、si_setting_goal、si_setting_benchmark的配置状态。si_setting_harness - 信任确认(强制要求,不可跳过):
a. 如果agent-settings.json中已经为
trust_confirmed,直接跳到第5步(续跑路径)。 b. 展示目标仓库路径,请求用户确认:truec. 如果用户拒绝:终止环境搭建流程并退出,不要继续执行。 d. 记录用户同意信息:在agent-settings.json中设置"Self-improve will run benchmark commands inside {repo_path}. This executes arbitrary code in that repository. Confirm? [yes/no]"。trust_confirmed: true - 如果目标未设置 → 读取本技能目录下的,直接在当前上下文运行4维度苏格拉底访谈(目标、指标、期望值、范围),将结果写入
si-goal-clarifier.md。.omc/self-improve/config/goal.md - 如果基准测试未设置 → 读取本技能目录下的,以该内容为提示词创建自定义Agent(model=opus)。该Agent会调研仓库、创建或封装基准测试、执行3次验证并记录基线分数。 基准测试设置完成后,向用户确认基准测试命令:
si-benchmark-builder.md如果用户拒绝:终止环境搭建流程并退出。"Benchmark command: {benchmark_command}. This will be run repeatedly during the loop. Confirm? [yes/no]" - 如果防护规则未设置 → 向用户确认默认防护规则(H001/H002/H003)或支持自定义配置。
- 准入检查:、
si_setting_goal、si_setting_benchmark、si_setting_harness必须全部为true。trust_confirmed - 创建优化分支(如果不存在):
其中
git -C {repo_path} checkout -b improve/{goal_slug} {target_branch} git -C {repo_path} checkout {target_branch}由优化目标自动生成(小写、下划线分隔),如果分支已存在则跳过创建步骤,将{goal_slug}持久化存储到agent-settings.json中。goal_slug - 模式互斥校验:调用,如果autopilot、ralph或ultrawork模式处于活跃状态,拒绝启动。
state_list_active - 写入初始状态:
state_write(mode='self-improve', active=true, iteration=0, started_at=<now>)
Git Strategy
Git策略
All git operations happen inside the target repo, NOT in the OMC project root.
- Improvement branch: — accumulates winning changes only.
improve/{goal_slug} - Experiment branches: — short-lived, per executor.
experiment/round_{n}_executor_{id} - Archive tags: — losing branches tagged before deletion.
archive/round_{n}_executor_{id} - Worktree setup (SKILL.md creates before each executor):
git -C {repo_path} worktree add worktrees/round_{n}_executor_{id} -b experiment/round_{n}_executor_{id} improve/{goal_slug} - Winner merges via :
oh-my-claudecode:git-masterMerge experiment/round_{n}_executor_{winner_id} into improve/{goal_slug} with --no-ff Message: "Iteration {n}: {hypothesis} (score: {before} → {after})" - Push after merge: (backup, non-blocking)
git -C {repo_path} push origin improve/{goal_slug} - Losers archived: Tag + delete via git-master.
所有Git操作都在目标仓库内执行,而非OMC项目根目录。
- 优化分支:—— 仅累积入选的优化改动
improve/{goal_slug} - 实验分支:—— 短期分支,每个执行器对应一个
experiment/round_{n}_executor_{id} - 归档标签:—— 落选分支删除前打标签归档
archive/round_{n}_executor_{id} - 工作树设置(SKILL.md在每个执行器启动前创建):
git -C {repo_path} worktree add worktrees/round_{n}_executor_{id} -b experiment/round_{n}_executor_{id} improve/{goal_slug} - 胜出方案合并通过执行:
oh-my-claudecode:git-masterMerge experiment/round_{n}_executor_{winner_id} into improve/{goal_slug} with --no-ff Message: "Iteration {n}: {hypothesis} (score: {before} → {after})" - 合并后推送:(备份用途,非阻塞)
git -C {repo_path} push origin improve/{goal_slug} - 落选方案归档:通过git-master打标签后删除
Improvement Loop
优化循环
Gate: All settings must be true. Once the gate passes, execute continuously without stopping.
Update .
state_write(mode='self-improve', active=true, status="running")准入条件:所有配置必须生效。一旦准入通过,将连续执行无需暂停。
更新状态:。
state_write(mode='self-improve', active=true, status="running")Step 0 — Stale Worktree Cleanup (mandatory, runs every iteration)
步骤0 — 清理过期工作树(强制要求,每轮迭代都运行)
PREREQUISITE: This step MUST run to completion before any other step, including resume logic. It is idempotent and safe to run multiple times.
- List all worktrees in the target repo:
git -C {repo_path} worktree list - For any worktree matching that does NOT belong to the current iteration: remove it with
worktrees/round_*git -C {repo_path} worktree remove {path} --force - Run to clean up stale references
git -C {repo_path} worktree prune - This handles crash recovery — orphaned worktrees from interrupted iterations are cleaned before the new iteration starts
前置要求:本步骤必须完全执行完成后才能运行其他步骤,包括续跑逻辑。它是幂等的,多次运行不会有问题。
- 列出目标仓库所有工作树:
git -C {repo_path} worktree list - 对于所有匹配且不属于当前迭代的工作树,执行
worktrees/round_*删除git -C {repo_path} worktree remove {path} --force - 运行清理过期引用
git -C {repo_path} worktree prune - 该步骤用于崩溃恢复:新迭代启动前会清理被中断迭代遗留的孤立工作树
Step 1 — Refresh State
步骤1 — 刷新状态
state_write(mode='self-improve', active=true, iteration=N)执行重置30分钟TTL。
state_write(mode='self-improve', active=true, iteration=N)Step 2 — Check Stop Request
步骤2 — 检查停止请求
Read state via .
state_read(mode='self-improve')If state is cleared (cancel was invoked) OR status is :
a. Set in
b. Update : set , record
c. Clean up any active worktrees for the current round (Step 0 logic)
d. Log:
e. Exit gracefully — do NOT invoke /cancel again (already cancelled)
user_stoppedstatus: "user_stopped".omc/self-improve/state/agent-settings.jsoniteration_state.jsonstatus: "interrupted"current_step"Self-improve stopped by user at iteration {N}, step {current_step}"通过读取状态。
state_read(mode='self-improve')如果状态已被清空(触发了取消操作)或者状态为:
a. 在中设置
b. 更新:设置,记录
c. 清理当前轮次所有活跃工作树(执行步骤0的逻辑)
d. 记录日志:
e. 优雅退出 —— 不要再次调用/cancel(已取消)
user_stopped.omc/self-improve/state/agent-settings.jsonstatus: "user_stopped"iteration_state.jsonstatus: "interrupted"current_step"Self-improve stopped by user at iteration {N}, step {current_step}"Step 3 — Check User Ideas
步骤3 — 读取用户想法
Read . If non-empty, snapshot contents for planners. Clear after planners consume.
.omc/self-improve/config/idea.md读取,如果非空,将内容快照提供给规划器使用,规划器消费后清空该文件。
.omc/self-improve/config/idea.mdStep 4 — Research
步骤4 — 调研
Spawn 1 general-purpose Agent(model=opus) with the content of as prompt.
si-researcher.mdPass in the prompt:
- Current iteration number
- Path to target repo
- Path to
.omc/self-improve/config/goal.md - Path to (all prior records)
.omc/self-improve/state/iteration_history/ - Path to (prior briefs)
.omc/self-improve/state/research_briefs/ - Content of Section 3 (Research Brief schema)
data_contracts.md
Expected output: research brief JSON →
.omc/self-improve/state/research_briefs/round_{n}.jsonIf researcher fails, proceed with history only.
创建1个通用Agent(model=opus),传入的内容作为提示词。
si-researcher.md同时在提示词中传入以下信息:
- 当前迭代编号
- 目标仓库路径
- 路径
.omc/self-improve/config/goal.md - 路径(所有历史记录)
.omc/self-improve/state/iteration_history/ - 路径(历史调研简报)
.omc/self-improve/state/research_briefs/ - 第3部分内容(调研简报Schema)
data_contracts.md
期望输出:调研简报JSON → 写入
.omc/self-improve/state/research_briefs/round_{n}.json如果调研Agent运行失败,直接使用历史数据继续流程。
Step 5 — Plan
步骤5 — 规划
Spawn N (model=opus) agents in parallel (N = from settings).
oh-my-claudecode:plannernumber_of_agentsPass in each planner's prompt:
- Planner identity (planner_a, planner_b, planner_c...)
- Research brief path
- Iteration history path
- Harness rules from
.omc/self-improve/config/harness.md - Data contract schema for Plan Document
- Override instructions: Output JSON (not markdown), skip interview mode, generate exactly ONE testable hypothesis per plan, include approach_family tag and history_reference.
- User ideas (if any, planner_a gets priority)
Expected output: Plan Document JSON →
.omc/self-improve/plans/round_{n}/plan_planner_{id}.json并行创建N个(model=opus) Agent(N为配置中的)。
oh-my-claudecode:plannernumber_of_agents给每个规划器的提示词中传入:
- 规划器标识(planner_a, planner_b, planner_c...)
- 调研简报路径
- 迭代历史路径
- 中的防护规则
.omc/self-improve/config/harness.md - 计划文档的数据契约Schema
- 覆盖指令:输出JSON(而非markdown),跳过访谈模式,每个计划生成恰好1个可测试的假设,包含approach_family标签和历史引用
- 用户想法(如果有,planner_a优先获取)
期望输出:计划文档JSON → 写入
.omc/self-improve/plans/round_{n}/plan_planner_{id}.jsonStep 6 — Review
步骤6 — 评审
For each plan, sequentially (architect before critic):
6a. Architecture Review: Spawn with the plan + 6-point checklist:
oh-my-claudecode:architect- Testability — is the hypothesis testable?
- Novelty — different from prior attempts?
- Scope — right-sized?
- Target files — exist, not sealed?
- Implementation clarity — executor can implement without guessing?
- Expected outcome — realistic given evidence?
Architect verdict is advisory only.
6b. Critic Review: Spawn with the plan + harness rules:
oh-my-claudecode:critic- H001: Exactly one hypothesis (reject if zero or multiple)
- H002: No approach_family repetition streak >= 3
- H003: Intra-round diversity (no two plans same family in same round)
- Schema validation against data_contracts.md
- History awareness check
Critic sets or . Plans with are excluded from execution.
critic_approved: truefalsefalseIf ALL plans rejected, log and skip to Step 9.
对每个计划,按顺序执行评审(架构评审先于规则评审):
6a. 架构评审:创建,传入计划和6项检查清单:
oh-my-claudecode:architect- 可测试性 —— 假设是否可测试?
- 创新性 —— 是否和之前的尝试不同?
- 范围 —— 大小是否合适?
- 目标文件 —— 是否存在、非密封?
- 实现清晰度 —— 执行器无需猜测就能实现?
- 预期结果 —— 基于现有证据是否合理?
架构评审结论仅作参考。
6b. 规则评审:创建,传入计划和防护规则:
oh-my-claudecode:critic- H001:恰好1个假设(0个或多个直接驳回)
- H002:approach_family连续重复次数不能 >=3
- H003:轮次内多样性(同一轮次不能有两个计划属于同一个分类)
- 对照data_contracts.md做Schema校验
- 历史感知检查
规则评审会设置或,值为的计划不会进入执行阶段。
critic_approved: truefalsefalse如果所有计划都被驳回,记录日志直接跳到步骤9(记录与可视化)。
Step 7 — Execute
步骤7 — 执行
For each approved plan, spawn (model=opus) in parallel.
oh-my-claudecode:executorBefore spawning, create worktree:
git -C {repo_path} worktree add worktrees/round_{n}_executor_{id} -b experiment/round_{n}_executor_{id} improve/{goal_slug}Pass in each executor's prompt:
- The approved plan JSON
- Worktree directory path
- Benchmark command from settings
- Sealed files list from settings
- Path to in this skill directory
scripts/validate.sh - Data contract schema for Benchmark Result
- Override instructions: Implement the plan faithfully, run validate.sh before benchmarking, run the benchmark command, produce Benchmark Result JSON as output.
Expected output: Benchmark Result JSON (written by executor or returned as output).
为每个通过评审的计划,并行创建(model=opus)。
oh-my-claudecode:executor创建Agent前,先创建工作树:
git -C {repo_path} worktree add worktrees/round_{n}_executor_{id} -b experiment/round_{n}_executor_{id} improve/{goal_slug}给每个执行器的提示词中传入:
- 已通过评审的计划JSON
- 工作树目录路径
- 配置中的基准测试命令
- 配置中的密封文件列表
- 本技能目录下的路径
scripts/validate.sh - 基准测试结果的数据契约Schema
- 覆盖指令:忠实实现计划,基准测试前运行validate.sh,执行基准测试命令,输出基准测试结果JSON
期望输出:基准测试结果JSON(由执行器写入或作为返回值返回)。
Step 8 — Tournament Selection
步骤8 — 锦标赛选择
SKILL.md does this directly (not delegated):
- Collect all executor results
- Filter to only. If zero candidates, skip to Step 9 (Record & Visualize).
status: "success" - Rank by (respecting
benchmark_score)benchmark_direction - Ranked-candidate loop — for each candidate in rank order (best first):
a. No-regression check: candidate score must improve or hold even vs , respecting
best_score(benchmark_direction: score >= best_score;higher_is_better: score <= best_score) b. Merge vialower_is_better:oh-my-claudecode:git-masterc. Re-benchmark on merged state to confirm improvement d. If re-benchmark confirms improvement: accept winner, break loop e. If re-benchmark shows regression: revert merge viagit merge experiment/round_{n}_executor_{id} --no-ff -m "Iteration {n}: {hypothesis} (score: {before} → {after})", continue to next candidate f. If merge conflicts:git -C {repo_path} reset --hard HEAD~1, continue to next candidategit -C {repo_path} merge --abort - If a winner was accepted AND is
auto_pushin settings: Push improvement branch:true(non-blocking). Ifgit -C {repo_path} push origin improve/{goal_slug}isauto_push(default): skip push. Log:false"Push skipped (auto_push: false). Run manually: git -C {repo_path} push origin improve/{goal_slug}" - Archive all non-winner branches via git-master: tag + delete
- If no candidate survived the loop: no merge this round. Improvement branch stays at prior state.
- Write Merge Report JSON to (schema: data_contracts.md Section 9).
.omc/self-improve/state/merge_reports/round_{n}.json
由SKILL.md直接执行(不委派给Agent):
- 收集所有执行器的结果
- 过滤仅保留的结果,如果没有符合条件的候选,直接跳到步骤9(记录与可视化)
status: "success" - 排序按照排序(遵循
benchmark_score定义的优劣方向)benchmark_direction - 候选遍历循环 —— 按排序顺序遍历所有候选(从最优到最差):
a. 无回归检查:候选分数必须优于或等于,遵循
best_score规则(benchmark_direction:分数 >= best_score;higher_is_better:分数 <= best_score) b. 通过lower_is_better合并:oh-my-claudecode:git-masterc. 在合并后的状态上重新运行基准测试确认优化效果 d. 如果重新测试确认有优化:接受该候选为胜出方案,终止循环 e. 如果重新测试出现回归:撤销合并,执行git merge experiment/round_{n}_executor_{id} --no-ff -m "Iteration {n}: {hypothesis} (score: {before} → {after})",继续遍历下一个候选 f. 如果合并冲突:执行git -C {repo_path} reset --hard HEAD~1,继续遍历下一个候选git -C {repo_path} merge --abort - 如果有胜出方案且配置中为
auto_push:推送优化分支:true(非阻塞)。 如果git -C {repo_path} push origin improve/{goal_slug}为auto_push(默认):跳过推送,记录日志:false"Push skipped (auto_push: false). Run manually: git -C {repo_path} push origin improve/{goal_slug}" - 通过git-master归档所有非胜出分支:打标签后删除
- 如果没有候选通过筛选:本轮不合并任何改动,优化分支保持之前的状态
- 写入合并报告JSON到(Schema见data_contracts.md第9部分)
.omc/self-improve/state/merge_reports/round_{n}.json
Step 9 — Record & Visualize
步骤9 — 记录与可视化
- Write iteration history to
.omc/self-improve/state/iteration_history/round_{n}.json - Update :
.omc/self-improve/state/agent-settings.json- Increment by 1
iterations - If winner AND improvement exceeds (
plateau_threshold): updateabs(new_score - best_score) >= plateau_threshold, resetbest_score, resetplateau_consecutive_count = 0circuit_breaker_count = 0 - If winner AND improvement below threshold (): update
abs(new_score - best_score) < plateau_thresholdif better, incrementbest_score, resetplateau_consecutive_count += 1circuit_breaker_count = 0 - If no winner (all rejected, all failed, or all regressed): increment (do NOT increment
circuit_breaker_count += 1— plateau tracks stagnating wins, not failures)plateau_consecutive_count
- Increment
- Append to (one entry per candidate)
.omc/self-improve/tracking/raw_data.json - Run for visualization
python3 {skill_dir}/scripts/plot_progress.py - Archive plans: copy current round plans to
state/plan_archive/round_{n}/
- 将迭代历史写入
.omc/self-improve/state/iteration_history/round_{n}.json - 更新:
.omc/self-improve/state/agent-settings.json- 加1
iterations - 如果有胜出方案且优化幅度超过(
plateau_threshold):更新abs(new_score - best_score) >= plateau_threshold,重置best_score,重置plateau_consecutive_count = 0circuit_breaker_count = 0 - 如果有胜出方案且优化幅度低于阈值():如果分数更优则更新
abs(new_score - best_score) < plateau_threshold,best_score,重置plateau_consecutive_count += 1circuit_breaker_count = 0 - 如果没有胜出方案(全部被驳回、全部失败或全部回归):(不要增加
circuit_breaker_count += 1—— 平台期跟踪的是优化幅度停滞的胜出方案,而非失败情况)plateau_consecutive_count
- 追加记录到(每个候选对应一条记录)
.omc/self-improve/tracking/raw_data.json - 运行生成可视化图表
python3 {skill_dir}/scripts/plot_progress.py - 归档计划:将当前轮次的计划复制到
state/plan_archive/round_{n}/
Step 10 — Cleanup
步骤10 — 清理
Remove worktrees:
git -C {repo_path} worktree remove worktrees/round_{n}_executor_{id} --force
git -C {repo_path} worktree pruneUpdate status to .
iteration_state.jsoncompleted删除工作树:
git -C {repo_path} worktree remove worktrees/round_{n}_executor_{id} --force
git -C {repo_path} worktree prune更新状态为。
iteration_state.jsoncompletedStep 11 — Stop Condition Check
步骤11 — 终止条件检查
Evaluate ALL conditions. If ANY is true, exit:
| Condition | Check |
|---|---|
| User stop | |
| Target reached | |
| Plateau | |
| Max iterations | |
| Circuit breaker | |
If NO stop condition: immediately go back to Step 1.
评估所有条件,任意一个满足则退出:
| 条件 | 检查逻辑 |
|---|---|
| 用户主动停止 | agent-settings中 |
| 达成目标 | |
| 进入平台期 | |
| 达到最大迭代次数 | |
| 熔断触发 | |
如果没有满足终止条件:立即回到步骤1。
Resumability
断点续跑
PREREQUISITE: Step 0 (stale worktree cleanup) MUST run to completion before any resume logic executes, regardless of prior state.
On invocation, before entering the loop:
- Always run Step 0 (stale worktree cleanup) — even on fresh start
- Read :
.omc/self-improve/state/agent-settings.json- If : ask user
status: "user_stopped". If no, exit. If yes, continue."Previous run was stopped at iteration {N}. Resume? [yes/no]" - If : session crashed — resume automatically (no user prompt)
status: "running" - If : fresh start
status: "idle"
- If
- Re-confirm trust gate only if is
trust_confirmedin agent-settings.jsonfalse - Read :
.omc/self-improve/state/iteration_state.json- → resume from
status: "in_progress", skip completed sub-stepscurrent_step - → start next iteration
status: "completed" - → complete recording step if needed, start next iteration
status: "failed" - File missing → start from iteration 1
前置要求:无论之前的状态如何,步骤0(清理过期工作树)必须完全执行完成后才能运行任何续跑逻辑。
调用时,进入循环之前:
- 始终先运行步骤0(清理过期工作树)—— 即使是全新启动
- 读取:
.omc/self-improve/state/agent-settings.json- 如果:询问用户
status: "user_stopped",如果用户选否直接退出,选是则继续"Previous run was stopped at iteration {N}. Resume? [yes/no]" - 如果:会话崩溃 —— 自动续跑(无需用户确认)
status: "running" - 如果:全新启动
status: "idle"
- 如果
- 仅当agent-settings.json中为
trust_confirmed时才需要重新确认信任false - 读取:
.omc/self-improve/state/iteration_state.json- → 从
status: "in_progress"继续执行,跳过已完成的子步骤current_step - → 启动下一轮迭代
status: "completed" - → 按需完成记录步骤,启动下一轮迭代
status: "failed" - 文件不存在 → 从第1轮迭代开始
Completion
结束流程
When the loop exits:
- Update agent-settings.json with final status
- If AND
target_reachedisauto_prin settings: spawn git-master to create PR fromtrueto upstream. Ifimprove/{goal_slug}isauto_pr(default): skip PR creation. Log:false"PR creation skipped (auto_pr: false). Run manually: gh pr create --head improve/{goal_slug} --base {target_branch}" - Run plot_progress.py one final time
- Print summary report:
=== Self-Improvement Loop Complete === Status: {status} Iterations: {iterations} Best Score: {best_score} (baseline: {baseline}) Improvement: {delta} ({delta_pct}%) - Run for clean state cleanup
/oh-my-claudecode:cancel
循环退出时:
- 更新agent-settings.json的最终状态
- 如果且配置中
target_reached为auto_pr:创建git-master Agent,从true向上游仓库提PR。 如果improve/{goal_slug}为auto_pr(默认):跳过PR创建,记录日志:false"PR creation skipped (auto_pr: false). Run manually: gh pr create --head improve/{goal_slug} --base {target_branch}" - 最后运行一次plot_progress.py
- 打印总结报告:
=== Self-Improvement Loop Complete === Status: {status} Iterations: {iterations} Best Score: {best_score} (baseline: {baseline}) Improvement: {delta} ({delta_pct}%) - 运行清理状态
/oh-my-claudecode:cancel
Error Handling
错误处理
| Situation | Action |
|---|---|
| Agent fails to produce output | Retry once. If still no output, log and continue. |
| Researcher produces empty brief | Proceed — planners work from history alone. |
| All plans rejected by critic | Skip execution. Log. Continue to next iteration. |
| All executors fail | Skip tournament. Record failures. Continue. |
| Merge conflict | Reject candidate, try next. |
| Re-benchmark regression | Reject candidate, revert merge, try next. |
| Push failure | Log warning. Continue — push is backup. |
| Worktree already exists | Remove and recreate. |
| Settings corrupted | Report and stop. |
| 场景 | 处理方式 |
|---|---|
| Agent未生成输出 | 重试一次,如果仍然无输出,记录日志继续流程 |
| 调研Agent生成空简报 | 继续执行 —— 规划器仅基于历史数据工作 |
| 所有计划都被规则评审驳回 | 跳过执行阶段,记录日志,继续下一轮迭代 |
| 所有执行器都失败 | 跳过锦标赛阶段,记录失败信息,继续流程 |
| 合并冲突 | 驳回该候选,尝试下一个 |
| 重新测试出现回归 | 驳回该候选,撤销合并,尝试下一个 |
| 推送失败 | 记录警告,继续流程 —— 推送仅为备份用途 |
| 工作树已存在 | 删除后重新创建 |
| 配置损坏 | 上报错误并停止 |
Approach Family Taxonomy
方案分类标签
Every plan must be tagged with exactly one:
| Tag | Description |
|---|---|
| Model/component structure changes |
| Optimizer, LR, scheduler, batch size |
| Data loading, augmentation, preprocessing |
| Mixed precision, distributed training, compiled kernels |
| Algorithmic/numerical optimizations |
| Evaluation methodology changes |
| Documentation-only changes |
| Does not fit above — explain in evidence |
每个计划必须恰好标记一个分类标签:
| 标签 | 描述 |
|---|---|
| 模型/组件结构变更 |
| 优化器、学习率、调度器、批次大小 |
| 数据加载、增强、预处理 |
| 混合精度、分布式训练、编译内核 |
| 算法/数值优化 |
| 评估方法变更 |
| 仅文档变更 |
| 不符合以上分类 —— 在证据中说明 |