auto-research-loop

Auto Research Loop — Autonomous Iteration Engine

Combines Karpathy's autoresearch with Ralph Loop infrastructure. Modify, Verify, Keep/Discard, Repeat.

Invocation

/auto-research-loop [PROMPT] [FLAGS]: run the loop.
  "${CLAUDE_PLUGIN_ROOT}/scripts/setup-auto-research-loop.sh" $ARGUMENTS
Then follow the injected instructions. The stop hook auto-installs and intercepts exit to re-feed the prompt.

/auto-research-loop-plan: interactive planning wizard.
Don't run the setup script. Instead, read ${CLAUDE_PLUGIN_ROOT}/skills/auto-research-loop/references/plan-workflow.md and walk the user through 7 phases to build a validated configuration:
  1. Capture goal
  2. Analyze codebase context
  3. Define scope (which files to modify)
  4. Define metric (must be mechanical: a command that outputs a number)
  5. Define direction (higher or lower is better)
  6. Define verify command (dry-run it to confirm it works)
  7. Confirm and launch: output a ready-to-paste /auto-research-loop command
Use this wizard when the user says "help me set up", "plan a run", "what should my metric be", or invokes the plan command.
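
The "mechanical metric" requirement in phases 4 and 6 can be illustrated with a small dry-run check. This is only a sketch: the helper names `extract_metric` and `dry_run`, the "take the last number printed" rule, and the timeout are assumptions made for illustration, not part of the plugin.

```python
import re
import subprocess

# Matches integers and decimals such as 75, -3, or 12.5.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def extract_metric(output: str) -> float:
    """Return the last number found in the command output.

    Assumption: the metric is the final number printed, as in a line
    like 'TOTAL  120  30  75%'. A real setup may need a stricter rule.
    """
    matches = NUMBER.findall(output)
    if not matches:
        raise ValueError("verify command printed no number")
    return float(matches[-1])


def dry_run(verify_command: str, timeout_s: float = 60.0) -> float:
    """Run the verify command once and confirm it yields a number."""
    completed = subprocess.run(
        verify_command,
        shell=True,  # verify commands are typically shell pipelines
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    if completed.returncode != 0:
        raise RuntimeError(f"verify command failed: {completed.stderr.strip()}")
    return extract_metric(completed.stdout)
```

If `dry_run` raises, the candidate metric is not mechanical yet and the wizard should go back to phase 4.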

Two Modes

          Metric Mode                               Task Mode
When      --metric + --verify provided              No metric provided
Decision  Metric improved? Keep. Worse? git revert  Accumulate toward completion
Exit      Max iterations or manual                  Completion promise or max iterations
Journal   autoresearch-results.tsv                  IMPLEMENTATION_PLAN.md
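
The mode choice described in the table can be sketched as a small function. The name `select_mode` is hypothetical, and rejecting `--metric` without `--verify` is an assumption: the table only covers "both flags provided" and "no metric provided".

```python
from typing import Optional

# Journal file per mode, as listed in the table above.
JOURNAL_FILES = {
    "metric": "autoresearch-results.tsv",
    "task": "IMPLEMENTATION_PLAN.md",
}


def select_mode(metric: Optional[str], verify: Optional[str]) -> str:
    """Pick the loop mode from the values of --metric and --verify."""
    if metric and verify:
        return "metric"
    if metric and not verify:
        # Assumption: a metric with no verify command cannot be measured,
        # so this sketch treats it as a configuration error.
        raise ValueError("--metric requires --verify")
    return "task"  # no metric provided
```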

The Loop

LOOP:
  0. Scratchpad: READ .claude/auto-research-loop-scratchpad.md
  1. Review: State + git log + results/plan
  2. Ideate: Fix crashes > exploit wins > explore > simplify > radical
  3. Modify: ONE focused change
  4. Commit: git commit BEFORE verification
  5. Verify: Metric command (metric) or gate commands (task)
  6. Decide:
     Metric: IMPROVED -> keep. WORSE -> git reset --hard HEAD~1
     Task: Gates pass + promise true -> exit. Else -> continue
  7. Log: Results TSV (metric) or update plan (task)
  8. Scratchpad: UPDATE before exit
  9. Repeat
Read ${CLAUDE_PLUGIN_ROOT}/skills/auto-research-loop/references/autonomous-loop-protocol.md for the full protocol.
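
The metric-mode part of the loop (steps 3 to 7) can be sketched as below. This is an illustration under assumptions, not the plugin's implementation: every function name here is hypothetical, the scratchpad steps are omitted, and the steps are passed in as callables so the control flow can be exercised without a real repository. The two git helpers show what `commit` and `rollback` would typically wrap.

```python
import subprocess
from typing import Callable, List, Tuple


def git_commit_all(message: str) -> None:
    """Step 4: commit before verification (fails if nothing changed)."""
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "-m", message], check=True)


def git_discard_last_commit() -> None:
    """Step 6, WORSE branch: drop the commit that was just verified."""
    subprocess.run(["git", "reset", "--hard", "HEAD~1"], check=True)


def run_metric_loop(
    modify: Callable[[int], None],
    commit: Callable[[int], None],
    verify: Callable[[], float],
    rollback: Callable[[], None],
    baseline: float,
    higher_is_better: bool,
    max_iterations: int,
) -> Tuple[float, List[Tuple[int, float, str]]]:
    """Run modify -> commit -> verify -> keep/discard for each iteration."""
    best = baseline
    journal: List[Tuple[int, float, str]] = []
    for iteration in range(1, max_iterations + 1):
        modify(iteration)   # step 3: one focused change
        commit(iteration)   # step 4: commit before verification
        value = verify()    # step 5: mechanical metric
        improved = value > best if higher_is_better else value < best
        if improved:
            best = value
            journal.append((iteration, value, "keep"))
        else:
            rollback()      # step 6: discard the change
            journal.append((iteration, value, "discard"))
    return best, journal
```

In a real run, `commit` could be `lambda i: git_commit_all(f"iteration {i}")` and `rollback` could be `git_discard_last_commit`; the returned journal corresponds to the rows that step 7 would append to the results TSV.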

Critical Rules

  1. NEVER STOP — loop until interrupted, max iterations, or promise met
  2. One change per iteration — atomic, attributable
  3. Mechanical verification only — no subjective judgment
  4. Simplicity wins — equal results + less code = KEEP
  5. Git is memory — commit before verify, revert on failure
  6. Scratchpad is mandatory — read at start, update before exit
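
Rules 3 and 4 together suggest a keep/discard decision with a simplicity tie-break. The sketch below is one possible reading: the function name, the `tolerance` parameter, and the use of added versus removed line counts as the measure of "less code" are assumptions.

```python
def should_keep(
    previous: float,
    current: float,
    higher_is_better: bool,
    lines_added: int,
    lines_removed: int,
    tolerance: float = 0.0,
) -> bool:
    """Decide whether to keep a change using only mechanical inputs."""
    # Normalize so that a positive gain always means "better".
    gain = current - previous if higher_is_better else previous - current
    if gain > tolerance:
        return True   # metric improved: keep
    if gain < -tolerance:
        return False  # metric got worse: discard
    # Equal results: keep only if the change shrinks the code (rule 4).
    return lines_removed > lines_added
```

For example, a refactor that leaves coverage at 75% but deletes more lines than it adds is kept, while an equal-metric change that grows the code is discarded.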

Domain Adaptation

Domain         Metric   Direction  Verify                          Scope
Test coverage  %        higher     pytest --cov | grep TOTAL       src/**/*.py
Bundle size    KB       lower      npm run build | grep size       src/**/*.ts
ML training    val_bpb  lower      uv run train.py | grep val_bpb  train.py
Performance    ms       lower      npm run bench | grep p95        target files
To manually stop:
rm .claude/auto-research-loop.local.md
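
The manual stop can be pictured as removing a marker file. The sketch below assumes that the stop hook treats the presence of `.claude/auto-research-loop.local.md` as the "keep looping" signal; the source only gives the `rm` command, so that mechanism and both function names are assumptions.

```python
from pathlib import Path

# State file named in the manual-stop instruction above.
STATE_FILE = Path(".claude/auto-research-loop.local.md")


def loop_is_active(project_root: Path) -> bool:
    """Assumed check: the loop continues while the state file exists."""
    return (project_root / STATE_FILE).is_file()


def stop_loop(project_root: Path) -> bool:
    """Python equivalent of the `rm` command; True if a file was removed."""
    target = project_root / STATE_FILE
    if target.is_file():
        target.unlink()
        return True
    return False
```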