paper-to-code

Paper to Code

Convert a research paper into a complete, runnable code repository.

Input

`$0`: paper PDF path, paper text, or paper URL

References

Paper2Code prompts (planning, analysis, coding stages):

~/.claude/skills/paper-to-code/references/paper-to-code-prompts.md

Workflow (from Paper2Code)

Stage 1: Planning

Four-turn conversation to create a comprehensive plan:

Overall Plan: Extract methodology, experiments, datasets, hyperparameters, evaluation metrics
Architecture Design: Generate file list, Mermaid classDiagram, sequenceDiagram
Task Breakdown: Logic analysis per file, dependency-ordered task list, required packages
Configuration: Extract training details into `config.yaml`
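
The dependency-ordered task list from the Task Breakdown turn can be illustrated with a topological sort. This is a minimal sketch, not the Paper2Code implementation: the `dependencies` mapping and the `dependency_ordered_tasks` helper are assumptions made for illustration, and in practice the ordering would come from the planning conversation itself.

```python
from graphlib import TopologicalSorter

# Hypothetical per-file dependencies, as a Task Breakdown turn might report them:
# each key is a file, each value is the set of files it imports from.
dependencies = {
    "dataset_loader.py": set(),
    "model.py": set(),
    "trainer.py": {"dataset_loader.py", "model.py"},
    "evaluation.py": {"dataset_loader.py", "model.py"},
    "main.py": {"trainer.py", "evaluation.py"},
}


def dependency_ordered_tasks(deps):
    """Return files so that every file appears after the files it depends on."""
    # static_order() raises graphlib.CycleError if the design has circular imports.
    return list(TopologicalSorter(deps).static_order())


task_list = dependency_ordered_tasks(dependencies)
print(task_list)
```

Files with no dependency between them (here `trainer.py` and `evaluation.py`) may come out in either order; only the "dependencies first" property is guaranteed.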

Stage 2: Analysis

For each file in the task list (dependency order):

Conduct detailed logic analysis
Map paper methodology to code structure
Reference the config.yaml for all settings
Follow the UML class diagram interfaces strictly
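
One way to make "reference the config.yaml for all settings" enforceable is a lookup that refuses to supply defaults. The sketch below assumes the YAML file has already been parsed into nested dictionaries; the `require` helper and the keys shown are hypothetical, not part of Paper2Code.

```python
def require(config, dotted_key):
    """Look up a nested setting such as 'training.learning_rate'.

    Raises KeyError when the key is absent instead of falling back to a default,
    so a value the paper never stated cannot be silently invented.
    """
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"config.yaml has no setting '{dotted_key}'")
        node = node[part]
    return node


# Hypothetical contents of config.yaml after parsing (keys and values are illustrative).
config = {"training": {"learning_rate": 0.001, "batch_size": 32}}

print(require(config, "training.batch_size"))  # prints 32
```

A missing key then surfaces as an error during analysis or coding, which is a natural place to record that the paper left the value unspecified.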

Stage 3: Coding

For each file in dependency order:

Generate code with access to all previously generated files
Follow the design's data structures and interfaces exactly
Reference config.yaml — never fabricate configuration values
Write complete code — no TODOs or placeholders
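
The coding loop described above can be sketched as follows. The function name `generate_repository`, the prompt wording, and the `generate_file` callable (standing in for a model call) are all assumptions for illustration; the actual prompts live in the referenced prompts file.

```python
from typing import Callable, Dict, List


def generate_repository(
    task_list: List[str],
    plan: str,
    generate_file: Callable[[str], str],
) -> Dict[str, str]:
    """Generate files one by one, showing each call all earlier files."""
    generated: Dict[str, str] = {}
    for filename in task_list:
        # Earlier files are included verbatim so interfaces stay consistent.
        context = "\n\n".join(
            f"# File: {name}\n{code}" for name, code in generated.items()
        )
        prompt = (
            f"{plan}\n\nPreviously generated files:\n{context}\n\n"
            f"Now write the complete code for {filename}."
        )
        generated[filename] = generate_file(prompt)
    return generated


# Stand-in for a model call (hypothetical); it only reports how much context it saw.
def fake_model(prompt: str) -> str:
    return f"# saw {prompt.count('# File: ')} earlier file(s)\n"


repo = generate_repository(
    ["dataset_loader.py", "model.py", "trainer.py"], "PLAN", fake_model
)
print(repo["trainer.py"])
```

Because the context grows with every file, the task list order matters: a file generated too early cannot see the interfaces it is supposed to call.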

Stage 4: Debugging (if needed)

If execution fails:

Collect error messages
Identify the root cause
Apply minimal fixes in SEARCH/REPLACE diff format, preserving original intent
Re-run until successful
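
To make the SEARCH/REPLACE step concrete, here is a minimal sketch of applying such a patch to a source file. The marker layout (`<<<<<<< SEARCH`, `=======`, `>>>>>>> REPLACE`) is an assumption based on the common conflict-marker style and may differ from what the referenced prompts specify; `apply_search_replace` is a hypothetical helper.

```python
import re

# Assumed block layout (conflict-marker style); the exact markers used by the
# referenced prompts may differ.
BLOCK = re.compile(
    r"<<<<<<< SEARCH\n(.*?)\n=======\n(.*?)\n>>>>>>> REPLACE",
    re.DOTALL,
)


def apply_search_replace(source: str, patch: str) -> str:
    """Apply each block in order; fail loudly unless SEARCH matches exactly once."""
    for search, replace in BLOCK.findall(patch):
        if source.count(search) != 1:
            raise ValueError(f"SEARCH text must match exactly once:\n{search}")
        source = source.replace(search, replace, 1)
    return source


broken = "def accuracy(correct, total):\n    return correct / totl\n"
patch = """<<<<<<< SEARCH
    return correct / totl
=======
    return correct / total
>>>>>>> REPLACE"""

fixed = apply_search_replace(broken, patch)
print(fixed)
```

Requiring an exact, unique match keeps each fix small and local, which fits the goal of preserving the original intent of the generated code.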

Output Structure

reproduced_code/
├── config.yaml        # Training configuration
├── main.py            # Entry point
├── model.py           # Model architecture
├── dataset_loader.py  # Data loading
├── trainer.py         # Training loop
├── evaluation.py      # Metrics and evaluation
├── reproduce.sh       # Run script
└── requirements.txt   # Dependencies

Key Constraints

Dependency order: Each file is generated with access to all previously generated files
Interface contracts: Mermaid diagrams serve as rigid interface definitions across all stages
No fabrication: Only use configurations explicitly stated in the paper
Complete code: Every function must be fully implemented
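
The "complete code" constraint can be checked mechanically, at least roughly. The sketch below is a heuristic, not part of Paper2Code: `find_placeholders` is a hypothetical helper that flags functions whose bodies contain nothing but a docstring, `pass`, `...`, or `raise NotImplementedError`, plus any TODO marker in the text. Intentionally abstract methods would also be flagged.

```python
import ast


def _is_stub_statement(stmt):
    if isinstance(stmt, ast.Pass):
        return True
    # A bare constant expression covers both docstrings and `...`.
    if isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Constant):
        return True
    # `raise NotImplementedError` or `raise NotImplementedError(...)`
    if isinstance(stmt, ast.Raise) and stmt.exc is not None:
        target = stmt.exc.func if isinstance(stmt.exc, ast.Call) else stmt.exc
        return isinstance(target, ast.Name) and target.id == "NotImplementedError"
    return False


def find_placeholders(source):
    """Return names of stub-only functions, plus a note if a TODO marker appears."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if all(_is_stub_statement(s) for s in node.body):
                problems.append(node.name)
    if "TODO" in source:
        problems.append("TODO marker")
    return problems


sample = '''
def forward(x):
    return x * 2

def loss(pred, target):
    """Compute the loss."""
    pass
'''
print(find_placeholders(sample))  # prints ['loss']
```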

Rules

Follow the paper's methodology exactly — do not invent improvements
Generate code in dependency order (data loading → model → training → evaluation → main)
Use config.yaml for all hyperparameters and settings
Every class/method in the UML diagram must exist in the code
Generate a reproduce.sh script for one-command execution
If paper details are ambiguous, note them explicitly
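
The rule that every class/method in the UML diagram must exist in the code can be verified after generation. The following is a rough sketch under stated assumptions: it understands only the `class Name { ... }` block form of a Mermaid classDiagram, the helper names are hypothetical, and the diagram and code samples are invented for illustration.

```python
import ast
import re


def expected_interface(mermaid: str):
    """Collect {class: {methods}} from `class Name { ... }` blocks.

    Other Mermaid forms (for example `Name : +method()`) are ignored here.
    """
    interface, current = {}, None
    for line in mermaid.splitlines():
        line = line.strip()
        opened = re.match(r"class\s+(\w+)\s*\{", line)
        if opened:
            current = opened.group(1)
            interface[current] = set()
        elif line == "}":
            current = None
        elif current:
            # A member counts as a method when its name is followed by "(".
            method = re.match(r"[+\-#~]?\s*(\w+)\s*\(", line)
            if method:
                interface[current].add(method.group(1))
    return interface


def missing_members(mermaid: str, source: str):
    """Return sorted 'Class' or 'Class.method' names declared but not implemented."""
    defined = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            defined[node.name] = {
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            }
    missing = []
    for cls, methods in expected_interface(mermaid).items():
        if cls not in defined:
            missing.append(cls)
            continue
        missing.extend(f"{cls}.{m}" for m in sorted(methods - defined[cls]))
    return sorted(missing)


diagram = """
classDiagram
    class Model {
        +__init__(params: dict)
        +forward(x: Tensor) Tensor
    }
    class Trainer {
        +train() None
    }
"""
code = """
class Model:
    def __init__(self, params):
        self.params = params
"""
print(missing_members(diagram, code))  # prints ['Model.forward', 'Trainer']
```

An empty result means the generated code covers the diagram's classes and methods by name; it does not check signatures or behavior.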
