experiment-code

Experiment Code

Generate and iteratively improve ML experiment code for research papers.

Input

  • $0: task (generate, improve, debug, or plot)
  • $1: research plan, idea description, or error message

References

  • Experiment prompts and patterns: ~/.claude/skills/experiment-code/references/experiment-prompts.md
  • Code patterns (error handling, repair, hill-climbing): ~/.claude/skills/experiment-code/references/code-patterns.md

Action: generate

Generate initial experiment code following this structure:
  1. Plan experiments first — List all runs needed (hyperparameter sweeps, ablations, baselines)
  2. Write self-contained code — All code in project directory, no external imports from reference repos
  3. Include proper logging — Save results to JSON, print intermediate metrics
  4. Generate figures — At minimum Figure_1.png and Figure_2.png
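
Step 1 can be made concrete by enumerating every run before any training code is written. The sketch below is only an illustration: the helper `plan_runs`, the hyperparameter names (`lr`, `batch_size`), the ablation flag `use_augmentation`, and the choice to number the baseline as run_1 are all hypothetical and not prescribed by this skill.

```python
import itertools


def plan_runs(baseline, sweep, ablations):
    """Return (run_name, config) pairs: the baseline, then the sweep grid, then ablations."""
    runs = [("run_1", dict(baseline))]  # assumption: the baseline is the first run
    keys = sorted(sweep)
    for values in itertools.product(*(sweep[k] for k in keys)):
        config = dict(baseline)
        config.update(zip(keys, values))
        if config != baseline:  # the baseline itself is already planned
            runs.append((f"run_{len(runs) + 1}", config))
    for name, override in ablations.items():
        config = dict(baseline)
        config.update(override)
        config["ablation"] = name
        runs.append((f"run_{len(runs) + 1}", config))
    return runs


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    baseline = {"lr": 1e-3, "batch_size": 64, "use_augmentation": True}
    sweep = {"lr": [1e-3, 1e-4], "batch_size": [64, 128]}
    ablations = {"no_augmentation": {"use_augmentation": False}}
    for run_name, config in plan_runs(baseline, sweep, ablations):
        print(run_name, config)
```

Writing the plan down as data makes it easy to copy into notes.txt and to check that no baseline or ablation was forgotten.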

Mandatory Structure

project/
├── experiment.py      # Main experiment script
├── plot.py            # Visualization script
├── notes.txt          # Experiment descriptions and results
├── run_1/             # Results from run 1
│   └── final_info.json
├── run_2/
└── ...
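
A minimal sketch of how experiment.py might produce this layout is shown below. It only illustrates the output structure: `run_experiment` and the `example_metric` key are hypothetical stand-ins, the `--seed` flag is an assumption, and the contents of final_info.json are not specified by this skill. A real script would train and evaluate a model on an actual dataset at that point.

```python
import argparse
import json
import os
import random


def run_experiment(seed):
    """Stand-in for the real training loop; returns the metrics to be saved."""
    rng = random.Random(seed)
    # Hypothetical metric, for illustration only.
    return {"seed": seed, "example_metric": rng.random()}


def main(argv=None):
    parser = argparse.ArgumentParser(description="Run one experiment")
    parser.add_argument("--out_dir", type=str, default="run_1")
    parser.add_argument("--seed", type=int, default=0)  # assumed flag, not in the skill
    args = parser.parse_args(argv)

    os.makedirs(args.out_dir, exist_ok=True)
    final_info = run_experiment(args.seed)
    path = os.path.join(args.out_dir, "final_info.json")
    with open(path, "w") as f:
        json.dump(final_info, f, indent=2)
    print(f"Saved results to {path}")
    return path


if __name__ == "__main__":
    main()
```

Keeping one directory per run means plot.py can later discover every result by pattern instead of relying on a hand-maintained list.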

Constraints

  • No placeholder code (pass, ..., raise NotImplementedError)
  • Must use actual datasets (not toy data unless explicitly requested)
  • PyTorch or scikit-learn preferred (no TensorFlow/Keras)
  • Each run uses: python experiment.py --out_dir=run_i
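
The first constraint can be checked mechanically. The following is a heuristic sketch, not part of the skill: `find_placeholders` is a hypothetical helper that walks the syntax tree with the standard `ast` module and reports the three placeholder forms listed above. A flagged `pass` can occasionally be legitimate, so the output is best treated as a list of lines to review.

```python
import ast


def find_placeholders(source):
    """Return sorted (line_number, kind) pairs for placeholder statements in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Pass):
            found.append((node.lineno, "pass"))
        elif (isinstance(node, ast.Expr)
              and isinstance(node.value, ast.Constant)
              and node.value.value is Ellipsis):
            found.append((node.lineno, "..."))
        elif isinstance(node, ast.Raise) and node.exc is not None:
            # Handle both `raise NotImplementedError` and `raise NotImplementedError(...)`.
            exc = node.exc.func if isinstance(node.exc, ast.Call) else node.exc
            if isinstance(exc, ast.Name) and exc.id == "NotImplementedError":
                found.append((node.lineno, "raise NotImplementedError"))
    return sorted(found)


if __name__ == "__main__":
    sample = "def train():\n    pass\n"
    for lineno, kind in find_placeholders(sample):
        print(f"line {lineno}: placeholder '{kind}'")
```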

Action: improve

Improve existing experiment code:
  1. Read current code and results
  2. Reflect on what worked and what didn't
  3. Apply targeted edits (prefer small edits over full rewrites)
  4. Re-run and compare scores
  5. Keep the best-performing code variant
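
The loop above is a form of greedy hill-climbing, which can be sketched as follows. The callbacks `propose_edit` and `evaluate` are hypothetical: in practice the first would apply a small edit to the code and the second would re-run the experiment and return its score. The sketch also assumes that a higher score is better.

```python
import random


def improve(initial, propose_edit, evaluate, steps=10):
    """Greedy hill-climbing: keep a candidate only if it scores strictly higher."""
    best, best_score = initial, evaluate(initial)
    history = [best_score]
    for _ in range(steps):
        candidate = propose_edit(best)  # small targeted edit to the current best
        score = evaluate(candidate)     # re-run and measure
        if score > best_score:
            best, best_score = candidate, score
        history.append(best_score)
    return best, best_score, history


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in: the "code variant" is a number and the score peaks at 3.0.
    best, score, history = improve(
        initial=0.0,
        propose_edit=lambda x: x + rng.choice([-0.5, 0.5]),
        evaluate=lambda x: -(x - 3.0) ** 2,
        steps=50,
    )
    print(best, score)
```

Because every candidate is derived from the current best, a failed edit costs one run and is then discarded, which is the reason small edits are preferred over full rewrites.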

Action: debug

Fix experiment code errors:
  1. Read the error message (if it is very long, keep only the last 1500 characters)
  2. Identify the root cause
  3. Apply a minimal fix
  4. Retry up to 4 times before changing approach
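
One way to sketch this repair loop is shown below. The names `truncate_error`, `run_with_repair`, and the `apply_fix` callback are hypothetical, and the sketch reads "4 retries" as four re-runs after the initial failed attempt, which is an interpretation rather than something the skill states.

```python
import subprocess
import sys

MAX_ERROR_CHARS = 1500  # tail length taken from step 1
MAX_RETRIES = 4         # retry budget taken from step 4


def truncate_error(message, limit=MAX_ERROR_CHARS):
    """Keep only the tail of a long error message, where a traceback names the exception."""
    return message if len(message) <= limit else message[-limit:]


def run_with_repair(command, apply_fix, max_retries=MAX_RETRIES):
    """Run command; on failure pass the truncated stderr to apply_fix and retry.

    Returns (succeeded, retries_used).
    """
    retries = 0
    while True:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode == 0:
            return True, retries
        if retries == max_retries:
            return False, retries  # budget exhausted: time to change approach
        apply_fix(truncate_error(result.stderr))
        retries += 1


if __name__ == "__main__":
    # A command that always fails, so the fix callback only logs the error tail.
    failing = [sys.executable, "-c", "raise ValueError('boom')"]
    ok, used = run_with_repair(failing, apply_fix=lambda err: print(err.strip().splitlines()[-1]))
    print(ok, used)
```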

Action: plot

Generate publication-quality plots from experiment results:
  1. Read all run_*/final_info.json files
  2. Generate comparison plots with proper labels
  3. Use the figure-generation skill for styling
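
Steps 1 and 2 might look like the sketch below. It assumes each final_info.json holds a flat mapping from metric name to a number, which this skill does not specify, and the function names and the `metric` argument are hypothetical. Plotting requires matplotlib, so it is imported only inside `plot_metric`. Styling is left to the figure-generation skill.

```python
import glob
import json
import os


def load_results(project_dir="."):
    """Map each run directory name to the parsed contents of its final_info.json."""
    results = {}
    pattern = os.path.join(project_dir, "run_*", "final_info.json")
    for path in sorted(glob.glob(pattern)):
        run_name = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            results[run_name] = json.load(f)
    return results


def plot_metric(results, metric, out_path="Figure_1.png"):
    """Save a labeled bar chart of one scalar metric per run (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    runs = list(results)
    values = [results[run][metric] for run in runs]
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(runs, values)
    ax.set_xlabel("Run")
    ax.set_ylabel(metric)
    ax.set_title(f"{metric} by run")
    fig.tight_layout()
    fig.savefig(out_path, dpi=300)
    plt.close(fig)
    return out_path


if __name__ == "__main__":
    results = load_results(".")
    print(f"Found {len(results)} run(s): {', '.join(results)}")
```

Note that the paths are sorted as strings, so run_10 is listed before run_2. A numeric sort key would be needed once there are ten or more runs.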

Rules

  • Always plan experiments before writing code
  • After each run, document results in notes.txt
  • Include print statements explaining what results show
  • The method MUST NOT get 0% accuracy; a 0% result usually points to a bug, so verify the accuracy calculation
  • Use seeds for reproducibility
  • Before each experiment, include a print statement explaining exactly what the results are meant to show
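
The seeding, accuracy, and print-statement rules can be illustrated together. This is a sketch under assumptions: `set_seed`, `accuracy`, and `check_accuracy` are hypothetical helpers, predictions and labels are assumed to be plain lists, and NumPy and PyTorch are seeded only if they happen to be installed.

```python
import random
from contextlib import suppress


def set_seed(seed):
    """Seed the standard library RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    with suppress(ImportError):
        import numpy as np
        np.random.seed(seed)
    with suppress(ImportError):
        import torch
        torch.manual_seed(seed)


def accuracy(predictions, labels):
    """Fraction of predictions equal to their labels (both given as lists)."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def check_accuracy(value, name="accuracy"):
    """Fail loudly on values that usually indicate a bug in the metric code."""
    if not 0.0 < value <= 1.0:
        raise ValueError(f"{name}={value!r} is suspicious; verify the accuracy calculation")
    return value


if __name__ == "__main__":
    set_seed(0)
    print("Experiment: compare predictions with labels. "
          "The result is meant to show that the accuracy code returns a sane, non-zero value.")
    predictions, labels = [1, 0, 1, 1], [1, 1, 1, 0]  # illustrative values only
    acc = check_accuracy(accuracy(predictions, labels))
    print(f"accuracy = {acc:.2f}")
```

Common causes of a spurious 0% include comparing values of different types (for example strings against integers) or misaligned prediction and label order, which is why the check raises instead of silently recording the number.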

Related Skills

  • Upstream: experiment-design, algorithm-design
  • Downstream: data-analysis, backward-traceability
  • See also: code-debugging, paper-to-code