Runs ML experiments reproducibly — single runs or autonomous BFS batches. Single mode: isolated venv, time-budgeted, failure-handled, logs to RESEARCH.md. BFS mode (opt-in): designs N hypotheses, runs each for a fixed budget, compares via a single verifiable metric, keeps improvements and git-resets failures — fully autonomous until done. Respects the RESEARCH.md supervision policy for notifications, approvals, and stop limits. Trigger phrases: "run experiment", "train model", "explore design space", "find best config", "autoresearch".
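The keep-or-discard step of BFS mode can be sketched in a few lines. This is only an illustration under stated assumptions, not the skill's actual implementation: the function name `bfs_select`, the `higher_is_better` flag, and the idea of passing already-measured metrics as a list are all hypothetical. No git commands are run here; the sketch only decides which hypotheses would be kept and which would be reset.

```python
import math
from typing import List, Optional, Tuple


def bfs_select(
    baseline: float,
    results: List[Tuple[str, Optional[float]]],
    higher_is_better: bool = True,  # assumption: e.g. accuracy; set False for loss-like metrics
) -> Tuple[float, List[Tuple[str, str]]]:
    """Decide keep/discard for each hypothesis against the running best metric.

    `results` holds (description, metric) pairs; a metric of None, NaN, or Inf
    is treated as a failed run and discarded.
    """
    best = baseline
    decisions: List[Tuple[str, str]] = []
    for description, metric in results:
        failed = metric is None or math.isnan(metric) or math.isinf(metric)
        if failed:
            improved = False
        elif higher_is_better:
            improved = metric > best
        else:
            improved = metric < best
        if improved:
            best = metric  # the improvement becomes the new bar to beat
            decisions.append((description, "keep"))
        else:
            decisions.append((description, "discard"))  # the real skill would git-reset here
    return best, decisions


best, decisions = bfs_select(
    0.90,
    [("lr 0.01", 0.923), ("lr 0.1", float("nan")), ("weight decay", 0.91)],
)
print(best, decisions)
```

Comparing each run against the running best rather than the original baseline means a later hypothesis is kept only if it beats everything kept so far; whether the skill works this way is an assumption of the sketch.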
npx skill4agent add zy-ning/oh_my_co-researcher experiment

RESEARCH.md
RESEARCH.md
## Supervision Policy
python -m py_compile path/to/script.py
NaN
Inf
error_threshold
RESEARCH.md
RESEARCH.md
train.py
val_bpb
val_acc
experiment-start
bfs-start
Approve
wild
git commit -am "hypothesis: <one-line description>"
grep "^<metric_key>:" run.log | awk '{print $2}'
keep
results.tsv
git reset --hard HEAD~1
discard
results.tsv
commit_hash metric_value status description
RESEARCH.md
target_reached
<hash>
<metric>=<value>
python train.py --lr 0.01
2026-03-29 14:32 — val_acc=0.923, hash=abc1234
train.py
val_bpb
results.tsv
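The metric-extraction pipeline (`grep "^<metric_key>:" run.log | awk '{print $2}'`) and the `results.tsv` columns (`commit_hash metric_value status description`) mentioned above can be mirrored in Python. This is a rough sketch, not the skill's own code: the helper names `extract_metric` and `results_row` are hypothetical, taking the last matching log line is an assumption (the shell pipeline prints every match), and tab separation is inferred only from the `.tsv` extension.

```python
from typing import Optional


def extract_metric(log_text: str, metric_key: str) -> Optional[float]:
    """Return the second whitespace-separated field of the last line that
    starts with '<metric_key>:', or None if no such line parses as a number."""
    value: Optional[float] = None
    for line in log_text.splitlines():
        if line.startswith(metric_key + ":"):
            fields = line.split()
            if len(fields) >= 2:
                try:
                    value = float(fields[1])
                except ValueError:
                    pass  # skip lines whose second field is not numeric
    return value


def results_row(commit_hash: str, metric_value: float, status: str, description: str) -> str:
    """Format one row in the column order commit_hash, metric_value, status, description."""
    return "\t".join([commit_hash, str(metric_value), status, description])


log = "step 100\nval_acc: 0.900\nstep 200\nval_acc: 0.923\n"
metric = extract_metric(log, "val_acc")
if metric is not None:
    print(results_row("abc1234", metric, "keep", "lr 0.01"))
```

Returning `None` for a missing or unparseable metric gives the caller an explicit signal to treat the run as failed instead of comparing against a stale value.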