Transform Claude Code into an AI Scientist that orchestrates research workflows using tree-based hypothesis exploration. Triggers on "research project", "scientific experiment", "run experiments", "AI scientist", "tree search experimentation", "systematic study".

npx skill4agent add sundial-org/skills ai-co-scientist

python scripts/tree.py init <project_path>

python scripts/visualize.py <project_path>
open <project_path>/.co-scientist/viz/index.html

| Stage | Name | Goal |
|---|---|---|
| 0 | Literature Review | Search for prior work, identify gaps |
| 1 | Hypothesis Formulation | Define clear, falsifiable hypothesis |
| 2 | Experimental Design | Identify variables, establish baselines |
| 3 | Systematic Experimentation | Tree-based exploration of hypothesis space |
| 4 | Validation & Synthesis | Validate findings, synthesize conclusions |
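
The five stages, together with the `loop-back` command, can be pictured as a small state machine. The sketch below is illustrative only: `Stage`, `StageTracker`, and its methods are hypothetical names chosen for this example, and nothing here is taken from the actual `scripts/tree.py` implementation.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """The five workflow stages listed in the table above."""
    LITERATURE_REVIEW = 0
    HYPOTHESIS_FORMULATION = 1
    EXPERIMENTAL_DESIGN = 2
    SYSTEMATIC_EXPERIMENTATION = 3
    VALIDATION_AND_SYNTHESIS = 4


@dataclass
class StageTracker:
    """Hypothetical tracker; an assumption, not the real tree.py logic."""
    current: Stage = Stage.LITERATURE_REVIEW
    history: list = field(default_factory=list)

    def complete(self, outcome: str) -> Stage:
        """Record the outcome and advance to the next stage, if there is one."""
        self.history.append((self.current, outcome))
        if self.current < Stage.VALIDATION_AND_SYNTHESIS:
            self.current = Stage(self.current + 1)
        return self.current

    def loop_back(self, target: Stage, reason: str) -> Stage:
        """Return to an earlier stage, keeping the reason in the history."""
        if target >= self.current:
            raise ValueError("loop-back target must be an earlier stage")
        self.history.append((self.current, f"loop-back: {reason}"))
        self.current = target
        return self.current
```

Recording the reason for each loop-back keeps the stage history auditable, which matches the document's practice of passing a `"<reason>"` argument to `loop-back`.
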
Before we proceed with Experimental Design, please confirm:
- Independent variables (what we manipulate): [list them]
- Dependent variables (what we measure): [list them]
- Control variables (what we hold constant): [list them]
- Resource budget: [max iterations, compute time]
Do these look correct? Any adjustments needed?

python scripts/tree.py complete-stage <project_path> success

git add -A
git commit -m "$(cat <<'EOF'
[Co-Scientist] Stage N: <Stage Name> - <Brief Summary>
<Detailed description of what was accomplished>
Key findings:
- <Finding 1>
- <Finding 2>
Next steps: <What Stage N+1 will address>
EOF
)"

[Co-Scientist] Stage 0: Literature Review - Data augmentation for robustness
Reviewed 12 papers on data augmentation and adversarial robustness.
Key findings:
- Most prior work focuses on geometric transforms
- Gap: limited study of aggressive augmentation (>50%)
- Candidate methods: RandAugment, AutoAugment, AugMax
Next steps: Formulate testable hypothesis about augmentation intensity

[Co-Scientist] Stage 3: Experimentation - 15 experiments completed
Tree exploration complete with 15 nodes (12 successful, 3 buggy).
Key findings:
- Best result: 75% augmentation achieves 58.9% adversarial accuracy
- Diminishing returns above 75% with clean accuracy degradation
- Geometric transforms outperform color-only
Next steps: Validate 75% configuration with multiple seeds

python scripts/tree.py loop-back <target_stage> "<reason>"

python scripts/tree.py get-candidates

python scripts/tree.py add-node <parent_id> "<plan>" <code_file>

python scripts/tree.py update <node_id> --status=success --metrics='{"value": 0.85, "name": "accuracy", "maximize": true}' --analysis="<analysis>"

python scripts/tree.py mark-buggy <node_id> "<error_description>"

python scripts/tree.py commit <node_id>

python scripts/visualize.py <project_path>

# Project management
python scripts/tree.py init <project_path>
python scripts/tree.py load <project_path>
# Stage management
python scripts/tree.py start-stage <stage_num>
python scripts/tree.py complete-stage <outcome>
python scripts/tree.py loop-back <target_stage> "<reason>"
# Node operations
python scripts/tree.py add-node <parent_id> "<plan>" <code_file>
python scripts/tree.py update <node_id> [--status=...] [--metrics=...] [--analysis=...]
python scripts/tree.py mark-buggy <node_id> "<error>"
python scripts/tree.py commit <node_id>
# Query operations
python scripts/tree.py get-best <top_k>
python scripts/tree.py get-candidates
python scripts/tree.py export-trees

bash scripts/compile_latex.sh <paper_path>

python scripts/tree.py load <project_path>

<project_path>/.co-scientist/
- project.json
- stage_history.json
- trees/
- viz/index.html

User: "I want to research whether data augmentation improves model robustness"
AI Co-Scientist:
1. Initialize project
2. Stage 0: Search for prior work on data augmentation and robustness
3. Checkpoint: "Here's what I found. Gaps include X, Y. Shall we proceed?"
4. **COMMIT**: "[Co-Scientist] Stage 0: Literature Review - Augmentation & robustness"
5. Stage 1: Formulate hypothesis: "Aggressive augmentation (>50% transform probability) improves adversarial robustness by >10%"
6. Checkpoint: "Does this hypothesis look testable? What would refute it?"
7. **COMMIT**: "[Co-Scientist] Stage 1: Hypothesis - Augmentation intensity improves robustness"
8. Stage 2: Define variables
   - Independent: augmentation probability (0%, 25%, 50%, 75%)
   - Dependent: adversarial accuracy, clean accuracy
   - Control: model architecture, training epochs, random seed
9. Checkpoint: "Please verify these variables and set resource budget"
10. **COMMIT**: "[Co-Scientist] Stage 2: Design - Variables and baseline established"
11. Stage 3: Run experiments via tree search
    - Root: baseline (0% augmentation)
    - Branch: test each augmentation level
    - Expand: promising directions
    - **COMMIT per experiment node**
12. Checkpoint after tree exploration: "Results suggest X. Continue or loop back?"
13. **COMMIT**: "[Co-Scientist] Stage 3: Experimentation - 15 nodes, best=75%"
14. Stage 4: Validate best configuration with multiple seeds, ablations
15. **COMMIT**: "[Co-Scientist] Stage 4: Validation - Results confirmed"
16. Synthesize conclusions and optionally write paper

| Action | Command |
|---|---|
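| Start new project | `python scripts/tree.py init <project_path>` |
| View visualization | `python scripts/visualize.py <project_path>` |
| Add experiment | `python scripts/tree.py add-node <parent_id> "<plan>" <code_file>` |
| Mark success | `python scripts/tree.py update <node_id> --status=success` |
| Commit node | `python scripts/tree.py commit <node_id>` |
| Get best results | `python scripts/tree.py get-best <top_k>` |
| Advance stage | `python scripts/tree.py complete-stage <outcome>` |
| Commit stage | `git add -A && git commit -m "[Co-Scientist] Stage N: <Stage Name> - <Brief Summary>"` |
| Loop back | `python scripts/tree.py loop-back <target_stage> "<reason>"` |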
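
The metrics JSON passed to `update` (`value`, `name`, `maximize`) suggests one way that a query such as `get-best <top_k>` could rank experiment nodes. The following is a minimal sketch under stated assumptions: the node dictionary layout (`id`, `status`, `metrics`) and the `rank_nodes` function are hypothetical, and the ranking logic actually used by `scripts/tree.py` is not shown in this document.

```python
def rank_nodes(nodes, top_k=3):
    """Return up to top_k successful nodes, best metric first.

    Assumes every node reports the same metric, shaped like the
    --metrics JSON: {"value": float, "name": str, "maximize": bool}.
    """
    # Buggy or unscored nodes cannot be compared, so leave them out.
    scored = [
        n for n in nodes
        if n.get("status") == "success" and n.get("metrics")
    ]

    def sort_key(node):
        metric = node["metrics"]
        # Negate minimized metrics so that "larger is better" always holds.
        sign = 1.0 if metric.get("maximize", True) else -1.0
        return sign * metric["value"]

    return sorted(scored, key=sort_key, reverse=True)[:top_k]


if __name__ == "__main__":
    # Illustrative values only; these are not results from the document.
    example_nodes = [
        {"id": "n1", "status": "success",
         "metrics": {"value": 0.70, "name": "accuracy", "maximize": True}},
        {"id": "n2", "status": "buggy", "metrics": None},
        {"id": "n3", "status": "success",
         "metrics": {"value": 0.85, "name": "accuracy", "maximize": True}},
    ]
    for node in rank_nodes(example_nodes, top_k=2):
        print(node["id"], node["metrics"]["value"])
```

Excluding buggy nodes from the ranking mirrors the document's separation between `mark-buggy` and `update --status=success`. Whether the real tool treats them this way is an assumption.
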