Loading...
Loading...
Set up and run an autonomous experiment loop for any optimization target. Use when asked to start autoresearch or run experiments.
npx skill4agent add dabiggm0e/autoresearch-opencode autoresearchgit checkout -b autoresearch/<goal>-<date>mkdir -p experimentsautoresearch.mdautoresearch.shexperiments/worklog.mdautoresearch.jsonlautoresearch.md# Autoresearch: <goal>
## Objective
<Specific description of what we're optimizing and the workload.>
## Metrics
- **Primary**: <name> (<unit>, lower/higher is better)
- **Secondary**: <name>, <name>, ...
## How to Run
`./autoresearch.sh` — outputs `METRIC name=number` lines.
## Files in Scope
<Every file the agent may modify, with a brief note on what it does.>
## Off Limits
<What must NOT be touched.>
## Constraints
<Hard rules: tests must pass, no new deps, etc.>
## What's Been Tried
<Update this section as experiments accumulate. Note key wins, dead ends,
and architectural insights so the agent doesn't repeat failed approaches.>autoresearch.mdautoresearch.shset -euo pipefailMETRIC name=numberautoresearch.jsonl{"type":"config","name":"<session name>","metricName":"<primary metric name>","metricUnit":"<unit>","bestDirection":"lower|higher"}{"run":1,"commit":"abc1234","metric":42.3,"metrics":{"secondary_metric":123},"status":"keep","description":"baseline","timestamp":1234567890,"segment":0}runcommitmetricmetricsstatuskeepdiscardcrashdescriptiontimestampsegmentinit_experimentautoresearch.jsonlecho '{"type":"config","name":"<name>","metricName":"<metric>","metricUnit":"<unit>","bestDirection":"<lower|higher>"}' > autoresearch.jsonlecho '{"type":"config","name":"<name>","metricName":"<metric>","metricUnit":"<unit>","bestDirection":"<lower|higher>"}' >> autoresearch.jsonl# Validate JSONL file before writing
validate_jsonl() {
local jsonl_file="autoresearch.jsonl"
if [[ -f "$jsonl_file" ]]; then
# Count existing runs
local run_count=$(grep -c '"run":' "$jsonl_file" 2>/dev/null || echo 0)
echo "Current runs in JSONL: $run_count" >&2
# Verify last 5 lines are valid JSON
tail -n 5 "$jsonl_file" 2>/dev/null | while IFS= read -r line; do
if ! echo "$line" | python3 -m json.tool >/dev/null 2>&1; then
echo "WARNING: Invalid JSON found in state file" >&2
return 1
fi
done
echo "JSONL validation: OK" >&2
return 0
fi
return 0 # File doesn't exist yet, that's OK
}
# Call validation before any write
validate_jsonl || {
echo " WARNING: JSONL validation failed. Proceeding with caution." >&2
}write_jsonl_entry() {
local entry="$1"
local jsonl_file="autoresearch.jsonl"
local temp_file="${jsonl_file}.tmp.$$"
# Create temp file
cat "$jsonl_file" > "$temp_file" 2>/dev/null || touch "$temp_file"
# Append entry
echo "$entry" >> "$temp_file"
# Validate the new entry
if ! echo "$entry" | python3 -m json.tool >/dev/null 2>&1; then
rm -f "$temp_file"
echo " WARNING: Invalid JSON entry, not writing" >&2
return 1
fi
# Atomic move (guaranteed all-or-nothing)
mv "$temp_file" "$jsonl_file"
# Verify write succeeded
local new_count=$(grep -c '"run":' "$jsonl_file" 2>/dev/null || echo 0)
echo "Write verification: $new_count runs in JSONL" >&2
return 0
}verify_write() {
local expected_run=$1
local jsonl_file="autoresearch.jsonl"
if [[ -f "$jsonl_file" ]]; then
local actual_count=$(grep -c '"run":' "$jsonl_file" 2>/dev/null || echo 0)
if [[ "$actual_count" -lt "$expected_run" ]]; then
echo " WARNING: Run count mismatch! Expected $expected_run, got $actual_count" >&2
echo "This may indicate data loss in previous writes." >&2
return 1
fi
echo "Write verification: OK (run $expected_run present)" >&2
return 0
fi
return 1
}# Backup state before user-confirmable action
backup_before_confirm() {
echo " User confirmation required. Creating backup..." >&2
# Use backup utility if available
if [[ -f "./scripts/backup-state.sh" ]]; then
./scripts/backup-state.sh backup autoresearch.jsonl 2>/dev/null || true
else
# Fallback: simple backup
cp autoresearch.jsonl "autoresearch.jsonl.backup.$(date +%s)" 2>/dev/null || true
fi
echo "Backup created. Awaiting user confirmation..." >&2
}backup_before_confirmautoresearch.jsonlexperiments/worklog.mdscripts/backup-state.sh list autoresearch.jsonlscripts/backup-state.sh restore-auto **DATA INCONSISTENCY DETECTED**
- **Worklog documents**: <WORKLOG_RUN_COUNT> experiments
- **JSONL contains**: <JSONL_RUN_COUNT> runs
- **Missing**: <DIFF> runs **LOST!**
**Recovery steps:**
1. Check backups: `scripts/backup-state.sh list autoresearch.jsonl`
2. Restore if available: `scripts/backup-state.sh restore-auto`
3. Otherwise, manually recreate missing runs from worklogrun_experimentSTART_TIME=$(date +%s%N)
bash -c "./autoresearch.sh" 2>&1 | tee /tmp/autoresearch-output.txt
EXIT_CODE=$?
END_TIME=$(date +%s%N)
DURATION=$(echo "scale=3; ($END_TIME - $START_TIME) / 1000000000" | bc)
echo "Duration: ${DURATION}s, Exit code: ${EXIT_CODE}"METRIC name=numberlog_experimentbestDirection=lowerbestDirection=highergit add -A
git diff --cached --quiet && echo "nothing to commit" || git commit -m "<description>
Result: {\"status\":\"keep\",\"<metricName>\":<value>,<secondary metrics>}"git rev-parse --short=7 HEADgit checkout -- .
git clean -fdecho '{"run":<N>,"commit":"<hash>","metric":<value>,"metrics":{<secondaries>},"status":"<status>","description":"<desc>","timestamp":'$(date +%s)',"segment":<seg>}' >> autoresearch.jsonlautoresearch-dashboard.mdexperiments/worklog.md### Run N: <short description> — <primary_metric>=<value> (<STATUS>)
- Timestamp: YYYY-MM-DD HH:MM
- What changed: <1-2 sentences describing the code/config change>
- Result: <metric values>, <delta vs best>
- Insight: <what was learned, why it worked/failed>
- Next: <what to try next based on this result>experiments/worklog.mdexperiments/worklog.mdautoresearch-dashboard.md# Autoresearch Dashboard: <name>
**Runs:** 12 | **Kept:** 8 | **Discarded:** 3 | **Crashed:** 1
**Baseline:** <metric_name>: <value><unit> (#1)
**Best:** <metric_name>: <value><unit> (#8, -26.2%)
| # | commit | <metric_name> | status | description |
|---|--------|---------------|--------|-------------|
| 1 | abc1234 | 42.3s | keep | baseline |
| 2 | def5678 | 40.1s (-5.2%) | keep | optimize hot loop |
| 3 | abc1234 | 43.0s (+1.7%) | discard | try vectorization |
...# Before any major operation requiring user confirmation
if [[ -f "./scripts/backup-state.sh" ]]; then
./scripts/backup-state.sh backup autoresearch.jsonl 2>/dev/null || true
else
cp autoresearch.jsonl "autoresearch.jsonl.backup.$(date +%s)" 2>/dev/null || true
fi# Keep only last 5 backups
ls -t autoresearch.jsonl.bak.* 2>/dev/null | tail -n +6 | xargs rm -f 2>/dev/null || true# Check for data loss
JSONL_COUNT=$(grep -c '"run":' autoresearch.jsonl 2>/dev/null || echo 0)
WORKLOG_COUNT=$(grep -c "^### Run" experiments/worklog.md 2>/dev/null || echo 0)
if [[ "$JSONL_COUNT" -ne "$WORKLOG_COUNT" ]]; then
echo " DATA LOSS DETECTED: JSONL has $JSONL_COUNT runs, worklog has $WORKLOG_COUNT runs" >&2
fi./scripts/backup-state.sh list autoresearch.jsonlkeepdiscardautoresearch.mdautoresearch.jsonlexperiments/worklog.mdautoresearch.jsonlautoresearch.mdautoresearch.ideas.mdautoresearch.ideas.mdautoresearch.ideas.mdautoresearch.ideas.mdautoresearch.md