# Autoresearch


Autonomous experiment loop: try ideas, keep what works, discard what doesn't, never stop.

## Setup

1. Ask (or infer): Goal, Command, Metric (+ direction), Files in scope, Constraints.
2. `git checkout -b autoresearch/<goal>-<date>`
3. Read the source files. Understand the workload deeply before writing anything.
4. `mkdir -p experiments`, then write `autoresearch.md`, `autoresearch.sh`, and `experiments/worklog.md` (see below). Commit all three.
5. Initialize the experiment (write the config header to `autoresearch.jsonl`) → run baseline → log result → start looping immediately.

## autoresearch.md

This is the heart of the session. A fresh agent with no context should be able to read this file and run the loop effectively. Invest time making it excellent.

```markdown
# Autoresearch: <goal>

## Objective

<Specific description of what we're optimizing and the workload.>

## Metrics

- Primary: <name> (<unit>, lower/higher is better)
- Secondary: <name>, <name>, ...

## How to Run

`./autoresearch.sh` — outputs `METRIC name=number` lines.

## Files in Scope

<Every file the agent may modify, with a brief note on what it does.>

## Off Limits

<What must NOT be touched.>

## Constraints

<Hard rules: tests must pass, no new deps, etc.>

## What's Been Tried

<Update this section as experiments accumulate. Note key wins, dead ends, and architectural insights so the agent doesn't repeat failed approaches.>
```

Update `autoresearch.md` periodically — especially the "What's Been Tried" section — so resuming agents have full context.

## autoresearch.sh

Bash script (`set -euo pipefail`) that: pre-checks fast (catches syntax errors in <1s), runs the benchmark, and outputs `METRIC name=number` lines. Keep it fast — every second is multiplied across hundreds of runs. Update it during the loop as needed.
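A minimal sketch of such a script, assuming a placeholder workload (the `seq`/`awk` pipeline and the metric names are illustrative only, not part of the protocol):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Pre-check (fast): fail in under a second on syntax errors before paying
# for the benchmark. A real session would check the workload's own sources.
bash -n "${BASH_SOURCE[0]:-/dev/null}"

# Benchmark: placeholder workload standing in for the session's real command.
start=$(date +%s%N)
seq 1 10000 | awk '{s += $1} END {print s}' > /tmp/bench.out
end=$(date +%s%N)

# Emit metrics in the METRIC name=number format the loop parses.
echo "METRIC runtime_s=$(echo "scale=3; ($end - $start) / 1000000000" | bc)"
echo "METRIC checksum=$(cat /tmp/bench.out)"
```

Everything between the pre-check and the `echo` lines is session-specific; only the `METRIC name=number` output contract matters to the loop.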

## JSONL State Protocol

All experiment state lives in `autoresearch.jsonl`. This is the source of truth for resuming across sessions.

### Config Header

The first line (and any re-initialization line) is a config header:

```json
{"type":"config","name":"<session name>","metricName":"<primary metric name>","metricUnit":"<unit>","bestDirection":"lower|higher"}
```

Rules:

- The first line of the file is always a config header.
- Each subsequent config header (re-init) starts a new segment. The segment index increments with each config header.
- The baseline for a segment is the first result line after the config header.

### Result Lines

Each experiment result is appended as a JSON line:

```json
{"run":1,"commit":"abc1234","metric":42.3,"metrics":{"secondary_metric":123},"status":"keep","description":"baseline","timestamp":1234567890,"segment":0}
```

Fields:

- `run`: sequential run number (1-indexed, across all segments)
- `commit`: 7-char git short hash (the commit hash AFTER the auto-commit for keeps, or current HEAD for discard/crash)
- `metric`: primary metric value (0 for crashes)
- `metrics`: object of secondary metric values — once you start tracking a secondary metric, include it in every subsequent result
- `status`: `keep` | `discard` | `crash`
- `description`: short description of what this experiment tried
- `timestamp`: Unix epoch seconds
- `segment`: current segment index
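As a sketch of how a resuming agent might consume these lines, the following recovers the best kept result of the latest segment (the sample records and `/tmp` path are illustrative):

```bash
# Sample state file standing in for a real autoresearch.jsonl.
cat > /tmp/autoresearch.jsonl <<'EOF'
{"type":"config","name":"demo","metricName":"runtime_s","metricUnit":"s","bestDirection":"lower"}
{"run":1,"commit":"abc1234","metric":42.3,"metrics":{},"status":"keep","description":"baseline","timestamp":1234567890,"segment":0}
{"run":2,"commit":"def5678","metric":40.1,"metrics":{},"status":"keep","description":"optimize hot loop","timestamp":1234567891,"segment":0}
{"run":3,"commit":"def5678","metric":43.0,"metrics":{},"status":"discard","description":"try vectorization","timestamp":1234567892,"segment":0}
EOF

# Walk the file: each config header starts a new segment and resets the best.
best=$(python3 - <<'EOF'
import json

best, direction = None, "lower"
for line in open("/tmp/autoresearch.jsonl"):
    rec = json.loads(line)
    if rec.get("type") == "config":
        best, direction = None, rec["bestDirection"]
    elif rec["status"] == "keep":
        improved = best is None or (
            rec["metric"] < best["metric"] if direction == "lower"
            else rec["metric"] > best["metric"])
        if improved:
            best = rec
print(f'{best["commit"]} {best["metric"]}' if best else "no kept runs")
EOF
)
echo "best: $best"   # best: def5678 40.1
```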

### Initialization (equivalent of `init_experiment`)

To initialize, write the config header to `autoresearch.jsonl`:

```bash
echo '{"type":"config","name":"<name>","metricName":"<metric>","metricUnit":"<unit>","bestDirection":"<lower|higher>"}' > autoresearch.jsonl
```

To re-initialize (change the optimization target), append a new config header:

```bash
echo '{"type":"config","name":"<name>","metricName":"<metric>","metricUnit":"<unit>","bestDirection":"<lower|higher>"}' >> autoresearch.jsonl
```

## Data Integrity Protocol

CRITICAL: JSONL data must never be corrupted or lost.

### Pre-Write Validation (before appending to JSONL)

Before writing any new experiment result, validate the JSONL file:

```bash
# Validate JSONL file before writing
validate_jsonl() {
    local jsonl_file="autoresearch.jsonl"
    if [[ -f "$jsonl_file" ]]; then
        # Count existing runs
        local run_count
        run_count=$(grep -c '"run":' "$jsonl_file" 2>/dev/null || echo 0)
        echo "Current runs in JSONL: $run_count" >&2

        # Verify the last 5 lines are valid JSON. Process substitution keeps
        # the loop in the current shell so `return 1` actually fails the
        # function (piping into `while` would run it in a subshell).
        while IFS= read -r line; do
            if ! echo "$line" | python3 -m json.tool >/dev/null 2>&1; then
                echo "WARNING: Invalid JSON found in state file" >&2
                return 1
            fi
        done < <(tail -n 5 "$jsonl_file" 2>/dev/null)

        echo "JSONL validation: OK" >&2
        return 0
    fi
    return 0  # File doesn't exist yet, that's OK
}

# Call validation before any write
validate_jsonl || echo "WARNING: JSONL validation failed. Proceeding with caution." >&2
```

### Atomic Write Pattern

Never append directly to the JSONL. Use the atomic write pattern:

```bash
write_jsonl_entry() {
    local entry="$1"
    local jsonl_file="autoresearch.jsonl"
    local temp_file="${jsonl_file}.tmp.$$"

    # Validate the new entry before touching anything
    if ! echo "$entry" | python3 -m json.tool >/dev/null 2>&1; then
        echo "WARNING: Invalid JSON entry, not writing" >&2
        return 1
    fi

    # Copy current state to a temp file and append the entry there
    cat "$jsonl_file" > "$temp_file" 2>/dev/null || touch "$temp_file"
    echo "$entry" >> "$temp_file"

    # Atomic move (all-or-nothing on the same filesystem)
    mv "$temp_file" "$jsonl_file"

    # Verify the write succeeded
    local new_count
    new_count=$(grep -c '"run":' "$jsonl_file" 2>/dev/null || echo 0)
    echo "Write verification: $new_count runs in JSONL" >&2

    return 0
}
```

### Post-Write Verification

After every write operation, verify the data was written correctly:

```bash
verify_write() {
    local expected_run=$1
    local jsonl_file="autoresearch.jsonl"

    if [[ -f "$jsonl_file" ]]; then
        local actual_count
        actual_count=$(grep -c '"run":' "$jsonl_file" 2>/dev/null || echo 0)

        if [[ "$actual_count" -lt "$expected_run" ]]; then
            echo "WARNING: Run count mismatch! Expected $expected_run, got $actual_count" >&2
            echo "This may indicate data loss in previous writes." >&2
            return 1
        fi

        echo "Write verification: OK (run $expected_run present)" >&2
        return 0
    fi
    return 1
}
```

### User-Confirmable Actions

Before any user-confirmable action (e.g., manual intervention, major changes, discarding multiple experiments), create a backup:

```bash
# Backup state before a user-confirmable action
backup_before_confirm() {
    echo "User confirmation required. Creating backup..." >&2

    # Use the backup utility if available
    if [[ -f "./scripts/backup-state.sh" ]]; then
        ./scripts/backup-state.sh backup autoresearch.jsonl 2>/dev/null || true
    else
        # Fallback: simple backup
        cp autoresearch.jsonl "autoresearch.jsonl.backup.$(date +%s)" 2>/dev/null || true
    fi

    echo "Backup created. Awaiting user confirmation..." >&2
}
```

**Always call `backup_before_confirm` before any operation that requires user approval.**

---

### Dashboard Data Consistency Check

When generating the dashboard, check for data consistency. If the number of runs in `autoresearch.jsonl` doesn't match the number of entries in `experiments/worklog.md`:

1. Check for backups: `scripts/backup-state.sh list autoresearch.jsonl`
2. If backups exist: restore with `scripts/backup-state.sh restore-auto`
3. If no backups: manually recreate the missing runs from worklog notes
4. Note the discrepancy in the dashboard header

Add this warning banner to the dashboard when an inconsistency is detected:

```markdown
**DATA INCONSISTENCY DETECTED**

- **Worklog documents**: <WORKLOG_RUN_COUNT> experiments
- **JSONL contains**: <JSONL_RUN_COUNT> runs
- **Missing**: <DIFF> runs **LOST!**

**Recovery steps:**
1. Check backups: `scripts/backup-state.sh list autoresearch.jsonl`
2. Restore if available: `scripts/backup-state.sh restore-auto`
3. Otherwise, manually recreate missing runs from the worklog
```

## Running Experiments (equivalent of `run_experiment`)

Run the benchmark command, capturing timing and output:

```bash
START_TIME=$(date +%s%N)
bash -c "./autoresearch.sh" 2>&1 | tee /tmp/autoresearch-output.txt
EXIT_CODE=${PIPESTATUS[0]}  # exit code of the script, not of tee
END_TIME=$(date +%s%N)
DURATION=$(echo "scale=3; ($END_TIME - $START_TIME) / 1000000000" | bc)
echo "Duration: ${DURATION}s, Exit code: ${EXIT_CODE}"
```

After running:

- Parse `METRIC name=number` lines from the output to extract metric values
- If the exit code != 0 → this is a crash
- Read the output to understand what happened
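The parsing step can be sketched as follows (the sample output file stands in for a real run):

```bash
# Sample captured output standing in for a real benchmark run.
printf 'warming up\nMETRIC runtime_s=40.1\nMETRIC peak_mb=512\n' > /tmp/autoresearch-output.txt

# Keep only well-formed METRIC lines and strip the prefix, leaving name=value pairs.
grep -E '^METRIC [A-Za-z_]+=-?[0-9]+(\.[0-9]+)?$' /tmp/autoresearch-output.txt \
    | sed 's/^METRIC //' > /tmp/metrics.txt

cat /tmp/metrics.txt   # runtime_s=40.1 and peak_mb=512, one per line
```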

## Logging Results (equivalent of `log_experiment`)

After each experiment run, follow this exact protocol:

### 1. Determine status

- keep: primary metric improved (lower if `bestDirection=lower`, higher if `bestDirection=higher`)
- discard: primary metric worse than or equal to the best kept result
- crash: command failed (non-zero exit code)

Secondary metrics are for monitoring only — they almost never affect keep/discard decisions. Only discard a primary improvement if a secondary metric degraded catastrophically, and explain why in the description.

### 2. Git operations

If keep:

```bash
git add -A
git diff --cached --quiet && echo "nothing to commit" || git commit -m "<description>

Result: {\"status\":\"keep\",\"<metricName>\":<value>,<secondary metrics>}"
```

Then get the new commit hash:

```bash
git rev-parse --short=7 HEAD
```

If discard or crash:

```bash
git checkout -- .
git clean -fd
```

Use the current HEAD hash (before the revert) as the commit field.

### 3. Append result to JSONL

```bash
echo '{"run":<N>,"commit":"<hash>","metric":<value>,"metrics":{<secondaries>},"status":"<status>","description":"<desc>","timestamp":'$(date +%s)',"segment":<seg>}' >> autoresearch.jsonl
```

### 4. Update dashboard

After every log, regenerate `autoresearch-dashboard.md` (see the Dashboard section below).

### 5. Append to worklog

After every experiment, append a concise entry to `experiments/worklog.md`. This file survives context compactions and crashes, giving any resuming agent (or the user) a complete narrative of the session. Format:

```markdown
### Run N: <short description> — <primary_metric>=<value> (<STATUS>)

- Timestamp: YYYY-MM-DD HH:MM
- What changed: <1-2 sentences describing the code/config change>
- Result: <metric values>, <delta vs best>
- Insight: <what was learned, why it worked/failed>
- Next: <what to try next based on this result>
```

Also update the "Key Insights" and "Next Ideas" sections at the bottom of the worklog when you learn something new.

**On setup**, create `experiments/worklog.md` with the session header, data summary, and baseline result. **On resume**, read `experiments/worklog.md` to recover context.

### 6. Secondary metric consistency

Once you start tracking a secondary metric, you MUST include it in every subsequent result. Parse the JSONL to discover which secondary metrics have been tracked and ensure all are present.

Adding a new secondary metric mid-session is fine — but from that point forward, always include it.
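Discovering the tracked secondaries can be sketched like this (sample records; a real session reads its own `autoresearch.jsonl`):

```bash
# Sample state file in which a second secondary metric appears at run 2.
cat > /tmp/autoresearch.jsonl <<'EOF'
{"type":"config","name":"demo","metricName":"runtime_s","metricUnit":"s","bestDirection":"lower"}
{"run":1,"commit":"abc1234","metric":42.3,"metrics":{"peak_mb":512},"status":"keep","description":"baseline","timestamp":0,"segment":0}
{"run":2,"commit":"def5678","metric":40.1,"metrics":{"peak_mb":498,"cache_hits":90},"status":"keep","description":"opt","timestamp":0,"segment":0}
EOF

# Union of all secondary metric names seen so far; every new result must
# include all of them.
tracked=$(python3 - <<'EOF'
import json

names = set()
for line in open("/tmp/autoresearch.jsonl"):
    rec = json.loads(line)
    if rec.get("type") != "config":
        names.update(rec.get("metrics", {}))
print(" ".join(sorted(names)))
EOF
)
echo "tracked secondaries: $tracked"   # tracked secondaries: cache_hits peak_mb
```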

## Dashboard

After each experiment, regenerate `autoresearch-dashboard.md`:

```markdown
# Autoresearch Dashboard: <name>

**Runs:** 12 | **Kept:** 8 | **Discarded:** 3 | **Crashed:** 1
**Baseline:** <metric_name>: <value><unit> (#1)
**Best:** <metric_name>: <value><unit> (#8, -26.2%)

| # | commit  | <metric_name> | status  | description       |
|---|---------|---------------|---------|-------------------|
| 1 | abc1234 | 42.3s         | keep    | baseline          |
| 2 | def5678 | 40.1s (-5.2%) | keep    | optimize hot loop |
| 3 | abc1234 | 43.0s (+1.7%) | discard | try vectorization |
| ... |       |               |         |                   |
```

Include delta percentages vs baseline for each metric value. Show ALL runs in the current segment (not just recent ones).

---
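The summary counters can be derived directly from the state file; a sketch with sample records:

```bash
# Sample state file: one keep, one discard, one crash after the config header.
cat > /tmp/autoresearch.jsonl <<'EOF'
{"type":"config","name":"demo","metricName":"runtime_s","metricUnit":"s","bestDirection":"lower"}
{"run":1,"commit":"abc1234","metric":42.3,"metrics":{},"status":"keep","description":"baseline","timestamp":0,"segment":0}
{"run":2,"commit":"abc1234","metric":43.0,"metrics":{},"status":"discard","description":"slower","timestamp":0,"segment":0}
{"run":3,"commit":"abc1234","metric":0,"metrics":{},"status":"crash","description":"segfault","timestamp":0,"segment":0}
EOF

# Count runs by status for the dashboard header line.
runs=$(grep -c '"run":' /tmp/autoresearch.jsonl)
kept=$(grep -c '"status":"keep"' /tmp/autoresearch.jsonl)
discarded=$(grep -c '"status":"discard"' /tmp/autoresearch.jsonl)
crashed=$(grep -c '"status":"crash"' /tmp/autoresearch.jsonl)

summary="Runs: $runs | Kept: $kept | Discarded: $discarded | Crashed: $crashed"
echo "$summary"   # Runs: 3 | Kept: 1 | Discarded: 1 | Crashed: 1
```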

## State File Backup (Enhanced)

BEFORE user-confirmable actions, create backups:

```bash
# Before any major operation requiring user confirmation
if [[ -f "./scripts/backup-state.sh" ]]; then
    ./scripts/backup-state.sh backup autoresearch.jsonl 2>/dev/null || true
else
    cp autoresearch.jsonl "autoresearch.jsonl.backup.$(date +%s)" 2>/dev/null || true
fi
```

**Best practices**:
- Always back up before major changes or user confirmations
- Keep the last 5 backups (delete older ones)
- Restore from a backup if the experiment crashes or state becomes corrupted

**Automated cleanup**:

```bash
# Keep only the last 5 backups (glob matches the fallback naming above)
ls -t autoresearch.jsonl.backup.* 2>/dev/null | tail -n +6 | xargs rm -f 2>/dev/null || true
```

**Warning**: If JSONL data loss is detected, check backups immediately before continuing.

---

## Data Loss Detection and Recovery

If you detect data loss (e.g., the dashboard shows an inconsistency, or the JSONL count doesn't match the worklog):

1. Immediate actions:

   ```bash
   # Check for data loss
   JSONL_COUNT=$(grep -c '"run":' autoresearch.jsonl 2>/dev/null || echo 0)
   WORKLOG_COUNT=$(grep -c "^### Run" experiments/worklog.md 2>/dev/null || echo 0)

   if [[ "$JSONL_COUNT" -ne "$WORKLOG_COUNT" ]]; then
       echo "DATA LOSS DETECTED: JSONL has $JSONL_COUNT runs, worklog has $WORKLOG_COUNT runs" >&2
   fi
   ```

2. Check backups:

   ```bash
   ./scripts/backup-state.sh list autoresearch.jsonl
   ```

3. Recovery options:
   - Best: restore from a backup if it is recent enough
   - Alternative: manually recreate the missing runs from worklog notes
   - Last resort: start a new segment with a new config header

4. Prevention: always back up before user-confirmable actions (see "User-Confirmable Actions" above)

## Loop Rules

LOOP FOREVER. Never ask "should I continue?" — the user expects autonomous work.

- Primary metric is king. Improved → `keep`. Worse/equal → `discard`. Secondary metrics rarely affect this.
- Simpler is better. Removing code for equal perf = keep. Ugly complexity for a tiny gain = probably discard.
- Don't thrash. Repeatedly reverting the same idea? Try something structurally different.
- Crashes: fix if trivial, otherwise log and move on. Don't over-invest.
- Think longer when stuck. Re-read the source files, study the profiling data, reason about what the CPU is actually doing. The best ideas come from deep understanding, not from trying random variations.
- Resuming: if `autoresearch.md` exists, first check whether `autoresearch.jsonl` exists:
  - If it exists: read it + `experiments/worklog.md` + the git log, then continue looping
  - If it doesn't exist: see the "Missing State File" section below (fallback behavior)

NEVER STOP. The user may be away for hours. Keep going until interrupted.

## Missing State File

If `autoresearch.jsonl` is missing when resuming:

1. Preserve context from `autoresearch.md` — read the objective, metrics, and files in scope
2. Ask for user confirmation — "State file missing. Options:
   - A) Create new state (fresh start)
   - B) Continue with autoresearch.md context only
   - C) Restore from backup (if available)"
3. If fresh start: initialize a new JSONL with a config header
4. If continuing with context only: proceed with the autoresearch.md data but note the limitation

## Ideas Backlog

When you discover complex but promising optimizations that you decide not to pursue right now, append them as bullet points to `autoresearch.ideas.md`. Don't let good ideas get lost.

If the loop stops (context limit, crash, etc.) and `autoresearch.ideas.md` exists, you'll be asked to:

1. Read the ideas file and use it as inspiration for new experiment paths
2. Prune ideas that are duplicated, already tried, or clearly bad
3. Create experiments based on the remaining ideas
4. If nothing is left, try to come up with your own new ideas
5. If all paths are exhausted, delete `autoresearch.ideas.md` and write a final summary report

When there is no `autoresearch.ideas.md` file and the loop ends, the research is complete.

## User Steers

User messages sent while an experiment is running should be noted and incorporated into the NEXT experiment. Finish the current experiment first — don't stop or ask for confirmation.

## Updating autoresearch.md

Periodically update `autoresearch.md` — especially the "What's Been Tried" section — so that a fresh agent resuming the loop has full context on what worked, what didn't, and what architectural insights have been gained. Do this every 5-10 experiments or after any significant breakthrough.