autoresearch-create

Autoresearch

Autonomous experiment loop: try ideas, keep what works, discard what doesn't, never stop.

Tools

- `init_experiment`: configure session (name, metric, unit, direction). Call again to re-initialize with a new baseline when the optimization target changes.
- `run_experiment`: runs command, times it, captures output.
- `log_experiment`: records result. `keep` auto-commits. `discard` / `crash` / `checks_failed` → `git checkout -- .` to revert. Always include secondary `metrics` dict. Dashboard: ctrl+x.

Setup

1. Ask (or infer): Goal, Command, Metric (+ direction), Files in scope, Constraints.
2. `git checkout -b autoresearch/<goal>-<date>`
3. Read the source files. Understand the workload deeply before writing anything.
4. Write `autoresearch.md` and `autoresearch.sh` (see below). Commit both.
5. `init_experiment` → run baseline → `log_experiment` → start looping immediately.
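
The branch naming in the setup steps can be sketched as a small helper. This is only an illustration: the function name `autoresearch_branch_name`, the slug rules, and the `YYYY-MM-DD` date format are assumptions, since the steps only specify the `autoresearch/<goal>-<date>` shape.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a branch name of the form autoresearch/<goal>-<date>.
# Slug rules and the YYYY-MM-DD date format are assumptions for illustration.
autoresearch_branch_name() {
  local goal="$1"
  local slug
  # Lowercase, turn runs of non-alphanumerics into "-", trim "-" at both ends.
  slug=$(printf '%s' "$goal" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')
  printf 'autoresearch/%s-%s\n' "$slug" "$(date +%Y-%m-%d)"
}

# Usage (inside a git repository):
#   git checkout -b "$(autoresearch_branch_name "Speed up test suite")"
autoresearch_branch_name "Speed up test suite"
```

Keeping the name generation in a function makes it easy to reuse the same branch name later, for example when resuming a session.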

autoresearch.md

This is the heart of the session. A fresh agent with no context should be able to read this file and run the loop effectively. Invest time making it excellent.

Autoresearch: <goal>

Objective

<Specific description of what we're optimizing and the workload.>

Metrics

Primary: <name> (<unit>, lower/higher is better)
Secondary: <name>, <name>, ...

How to Run

`./autoresearch.sh` outputs `METRIC name=number` lines.

Files in Scope

<Every file the agent may modify, with a brief note on what it does.>

Off Limits

<Everything that must not be touched.>

Constraints

<Hard rules: tests must pass, no new dependencies, etc.>

What's Been Tried

<Update this section as experiments progress. Record key wins, dead ends, and architectural insights so the agent does not repeat failed approaches.>

Update `autoresearch.md` periodically, especially the "What's Been Tried" section, so resuming agents have full context.
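
One way to start the file is to scaffold the template from the shell and then fill it in by hand. The sketch below is a hypothetical helper, not part of the tooling: the name `scaffold_autoresearch_md` and the use of `#`/`##` markdown headings are assumptions, since the template above does not show heading levels.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write a skeleton autoresearch.md for a given goal.
# Function name and markdown heading levels are assumptions for illustration.
scaffold_autoresearch_md() {
  local goal="$1" out="${2:-autoresearch.md}"
  cat > "$out" <<EOF
# Autoresearch: ${goal}

## Objective

## Metrics
- Primary: <name> (<unit>, lower/higher is better)
- Secondary: <name>, <name>, ...

## How to Run
\`./autoresearch.sh\` outputs \`METRIC name=number\` lines.

## Files in Scope

## Off Limits

## Constraints

## What's Been Tried
EOF
}

# Usage:
#   scaffold_autoresearch_md "faster-build"
#   git add autoresearch.md
```

The skeleton only fixes the section order; the value of the file comes from the detail written into each section afterwards.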

autoresearch.sh

Bash script (`set -euo pipefail`) that: pre-checks fast (syntax errors in <1s), runs the benchmark, outputs `METRIC name=number` lines. Keep it fast: every second is multiplied by hundreds of runs. Update it during the loop as needed.
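
A minimal sketch of such a script might look like the following. Everything specific here is an assumption for illustration: the `workload` function stands in for the real benchmark, `total_ms` is a made-up metric name, and the files handed to `precheck` are whatever scripts the project wants syntax-checked.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Fast pre-check: "bash -n" parses each given script without executing it.
precheck() {
  local f
  for f in "$@"; do
    bash -n "$f"
  done
}

# Hypothetical workload: replace with the real benchmark command.
workload() {
  seq 1 20000 | sort -rn | tail -n 1
}

# Millisecond clock. Uses GNU date's %N when available, else whole seconds.
now_ms() {
  local ns
  ns=$(date +%s%N)
  if [[ "$ns" =~ ^[0-9]+$ ]]; then
    echo $(( ns / 1000000 ))
  else
    echo $(( $(date +%s) * 1000 ))
  fi
}

run_benchmark() {
  local start end
  start=$(now_ms)
  workload > /dev/null
  end=$(now_ms)
  # One "METRIC name=number" line per measurement.
  echo "METRIC total_ms=$(( end - start ))"
}

precheck "$@"   # e.g. ./autoresearch.sh path/to/script.sh
run_benchmark
```

Because `set -euo pipefail` is active, a failed pre-check stops the script before the benchmark runs, so a syntax error costs well under a second instead of a full benchmark run.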

autoresearch.checks.sh (optional)

Bash script (`set -euo pipefail`) for backpressure/correctness checks: tests, types, lint, etc. Only create this file when the user's constraints require correctness validation (e.g., "tests must pass", "types must check").

When this file exists:

- Runs automatically after every passing benchmark in `run_experiment`.
- If checks fail, `run_experiment` reports it clearly; log as `checks_failed`.
- Its execution time does NOT affect the primary metric.
- You cannot `keep` a result when checks have failed.
- Has a separate timeout (default 300s, configurable via `checks_timeout_seconds`).

When this file does not exist, everything behaves exactly as before — no changes to the loop.

Keep output minimal. Only the last 80 lines of checks output are fed back to the agent on failure. Suppress verbose progress/success output and let only errors through. This keeps context lean and helps the agent pinpoint what broke.

```bash
#!/bin/bash
set -euo pipefail
# Example: run tests and typecheck; suppress success output, only show errors
pnpm test --run --reporter=dot 2>&1 | tail -50
pnpm typecheck 2>&1 | grep -i error || true
```
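
The advice to keep checks output minimal can also be approached with a small wrapper that stays silent on success and prints only the tail of the output on failure. This is a sketch under assumptions: the name `quiet` and the `QUIET_TAIL_LINES` variable are hypothetical, and the 80-line default simply mirrors the limit described above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run a command silently; on failure, print only the tail of its output.
# "quiet" and QUIET_TAIL_LINES are illustrative names, not part of the tooling.
quiet() {
  local log status=0
  log=$(mktemp)
  "$@" >"$log" 2>&1 || status=$?
  if (( status != 0 )); then
    echo "FAILED (exit ${status}): $*" >&2
    tail -n "${QUIET_TAIL_LINES:-80}" "$log" >&2
  fi
  rm -f "$log"
  return "$status"
}

# Hypothetical usage inside autoresearch.checks.sh:
#   quiet pnpm test --run
#   quiet pnpm typecheck
quiet true
```

Unlike filtering with `grep`, this wrapper preserves the command's exit status, so a failing check still fails the script.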

Loop Rules

LOOP FOREVER. Never ask "should I continue?" — the user expects autonomous work.

- **Primary metric is king.** Improved → `keep`. Worse/equal → `discard`. Secondary metrics rarely affect this.
- **Simpler is better.** Removing code for equal perf = keep. Ugly complexity for tiny gain = probably discard.
- **Don't thrash.** Repeatedly reverting the same idea? Try something structurally different.
- **Crashes:** fix if trivial, otherwise log and move on. Don't over-invest.
- **Think longer when stuck.** Re-read source files, study the profiling data, reason about what the CPU is actually doing. The best ideas come from deep understanding, not from trying random variations.
- **Resuming:** if `autoresearch.md` exists, read it + git log, continue looping.

NEVER STOP. The user may be away for hours. Keep going until interrupted.
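
The primary-metric rule can be sketched as a comparison that takes the optimization direction into account. The function name `decide` and its argument order are assumptions for illustration. The sketch covers only the metric comparison; the "simpler is better" exception for equal performance remains a judgment call outside it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Decide keep vs. discard from the primary metric alone.
# Arguments: direction ("lower" or "higher"), best value so far, new value.
# Name and argument order are assumptions for illustration.
decide() {
  local direction="$1" best="$2" new="$3"
  awk -v d="$direction" -v b="$best" -v n="$new" 'BEGIN {
    # "+ 0" forces numeric comparison, so decimals compare correctly.
    improved = (d == "lower") ? (n + 0 < b + 0) : (n + 0 > b + 0)
    print (improved ? "keep" : "discard")
  }'
}

decide lower 12.5 11.9    # improved
decide lower 12.5 12.5    # equal, so not an improvement
decide higher 0.81 0.84   # improved
```

`awk` is used because shell arithmetic is integer-only, while metrics are often fractional.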

Ideas Backlog

When you discover complex but promising optimizations that you won't pursue right now, append them as bullets to `autoresearch.ideas.md`. Don't let good ideas get lost.

On resume (context limit, crash), check `autoresearch.ideas.md`: prune stale/tried entries, experiment with the rest. When all paths are exhausted, delete the file and write a final summary.
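
Appending to the backlog can be as small as a one-line helper. The names `add_idea` and `IDEAS_FILE` below are hypothetical; only the file name `autoresearch.ideas.md` and the bullet format come from the text above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Backlog file; overridable for testing. IDEAS_FILE is an illustrative name.
IDEAS_FILE="${IDEAS_FILE:-autoresearch.ideas.md}"

# Append one idea as a markdown bullet; the file is created on first use.
add_idea() {
  printf -- '- %s\n' "$*" >> "$IDEAS_FILE"
}

# Usage:
#   add_idea "Cache parsed config between runs (needs invalidation design)"
```

Appending with `>>` never rewrites existing entries, so ideas recorded earlier survive even if a later step crashes.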

User Messages During Experiments

If the user sends a message while an experiment is running, finish the current `run_experiment` + `log_experiment` cycle first, then incorporate their feedback in the next iteration. Don't abandon a running experiment.

