paper-to-code

Convert an ML research paper into a complete, runnable code repository. Follows the three-stage Paper2Code pipeline: Planning (UML + dependency graph) → Analysis (per-file logic) → Coding (dependency-ordered generation), with an optional debugging stage if execution fails. Use for reproducing paper methods.

NPX Install

npx skill4agent add lingzhi227/agent-research-skills paper-to-code

SKILL.md Content

Paper to Code

Convert a research paper into a complete, runnable code repository.

Input

  • $0: Paper PDF path, paper text, or paper URL

References

  • Paper2Code prompts (planning, analysis, coding stages):
    ~/.claude/skills/paper-to-code/references/paper-to-code-prompts.md

Workflow (from Paper2Code)

Stage 1: Planning

Four-turn conversation to create a comprehensive plan:
  1. Overall Plan: Extract methodology, experiments, datasets, hyperparameters, evaluation metrics
  2. Architecture Design: Generate file list, Mermaid classDiagram, sequenceDiagram
  3. Task Breakdown: Logic analysis per file, dependency-ordered task list, required packages
  4. Configuration: Extract training details into config.yaml
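
The dependency-ordered task list from turn 3 can be derived mechanically once the dependency graph is known. The following is a minimal sketch using Python's standard `graphlib`; the `DEPENDENCIES` mapping is a hypothetical example built from the file names in the output structure, not something the paper or Paper2Code prescribes.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each file maps to the files it imports.
DEPENDENCIES = {
    "dataset_loader.py": set(),
    "model.py": set(),
    "trainer.py": {"dataset_loader.py", "model.py"},
    "evaluation.py": {"dataset_loader.py", "model.py"},
    "main.py": {"trainer.py", "evaluation.py"},
}


def task_order(dependencies):
    """Return files so that each one appears after everything it depends on."""
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(dependencies).static_order())


order = task_order(DEPENDENCIES)
print(order)
```

A cycle in the graph raises `graphlib.CycleError`, which is a useful early signal that the architecture design needs to be revised before any code is generated.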

Stage 2: Analysis

For each file in the task list (dependency order):
  1. Conduct detailed logic analysis
  2. Map paper methodology to code structure
  3. Reference config.yaml for all settings
  4. Follow the UML class diagram interfaces strictly

Stage 3: Coding

For each file in dependency order:
  1. Generate code with access to all previously generated files
  2. Follow the design's data structures and interfaces exactly
  3. Reference config.yaml — never fabricate configuration values
  4. Write complete code — no TODOs or placeholders
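
The coding loop above can be sketched as follows. This is an illustration under stated assumptions, not the Paper2Code implementation: `generate_file` is a hypothetical callable standing in for whatever model call produces a file, and `fake_generate` exists only to make the sketch runnable.

```python
from typing import Callable, Dict, List

# Hypothetical signature: (filename, config_text, previously_generated) -> code
Generator = Callable[[str, str, Dict[str, str]], str]


def generate_repository(
    task_list: List[str], config_text: str, generate_file: Generator
) -> Dict[str, str]:
    """Generate files in order, giving each one all earlier files as context."""
    generated: Dict[str, str] = {}
    for filename in task_list:
        # Pass a copy so the generator cannot mutate the accumulated results.
        generated[filename] = generate_file(filename, config_text, dict(generated))
    return generated


def fake_generate(filename: str, config_text: str, previous: Dict[str, str]) -> str:
    """Stand-in generator that only records which files it could see."""
    return f"# {filename}; sees: {sorted(previous)}"


repo = generate_repository(
    ["dataset_loader.py", "model.py", "trainer.py"], "lr: 0.001", fake_generate
)
print(repo["trainer.py"])
```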

Stage 4: Debugging (if needed)

If execution fails:
  1. Collect error messages
  2. Identify the root cause and express the fix in SEARCH/REPLACE diff format
  3. Apply minimal fixes that preserve the original intent
  4. Re-run until successful
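
A minimal sketch of applying a SEARCH/REPLACE patch is shown below. The marker layout (`<<<<<<< SEARCH`, `=======`, `>>>>>>> REPLACE`) is an assumption based on a commonly used convention; the exact format expected by the prompts in the references file may differ. Requiring each SEARCH text to match exactly once is a design choice of this sketch that keeps fixes minimal and unambiguous.

```python
import re

# Assumed block layout; adjust the markers to match the prompts actually used.
BLOCK = re.compile(
    r"<<<<<<< SEARCH\n(.*?)\n=======\n(.*?)\n>>>>>>> REPLACE",
    re.DOTALL,
)


def apply_patch(source: str, patch: str) -> str:
    """Apply every SEARCH/REPLACE block in patch to source, in order."""
    for search, replace in BLOCK.findall(patch):
        # Refuse ambiguous or stale patches instead of guessing.
        if source.count(search) != 1:
            raise ValueError(f"SEARCH text must match exactly once: {search!r}")
        source = source.replace(search, replace)
    return source


broken = "loss = criterion(output)\nloss.backward()\n"
patch = (
    "<<<<<<< SEARCH\n"
    "loss = criterion(output)\n"
    "=======\n"
    "loss = criterion(output, target)\n"
    ">>>>>>> REPLACE"
)
fixed = apply_patch(broken, patch)
print(fixed)
```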

Output Structure

reproduced_code/
├── config.yaml        # Training configuration
├── main.py            # Entry point
├── model.py           # Model architecture
├── dataset_loader.py  # Data loading
├── trainer.py         # Training loop
├── evaluation.py      # Metrics and evaluation
├── reproduce.sh       # Run script
└── requirements.txt   # Dependencies
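
Before running reproduce.sh, it can help to confirm that the generated repository matches this layout. The check below is a small sketch; `missing_files` is a hypothetical helper, and the demo runs against a temporary directory rather than a real `reproduced_code/` folder.

```python
import tempfile
from pathlib import Path

EXPECTED_FILES = [
    "config.yaml",
    "main.py",
    "model.py",
    "dataset_loader.py",
    "trainer.py",
    "evaluation.py",
    "reproduce.sh",
    "requirements.txt",
]


def missing_files(repo_dir):
    """Return the expected files that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demo on a throwaway directory that contains only two of the files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("config.yaml", "main.py"):
        Path(tmp, name).write_text("")
    missing = missing_files(tmp)

print(missing)
```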

Key Constraints

  • Dependency order: Each file is generated with access to all previously generated files
  • Interface contracts: Mermaid diagrams serve as rigid interface definitions across all stages
  • No fabrication: Only use configurations explicitly stated in the paper
  • Complete code: Every function must be fully implemented

Rules

  • Follow the paper's methodology exactly — do not invent improvements
  • Generate code in dependency order (data loading → model → training → evaluation → main)
  • Use config.yaml for all hyperparameters and settings
  • Every class and method in the UML diagram must exist in the code
  • Generate a reproduce.sh script for one-command execution
  • If paper details are ambiguous, note them explicitly
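
The rule that every UML class and method must exist in the code can be checked automatically. The sketch below uses Python's standard `ast` module; the `interface` mapping is a hypothetical stand-in for whatever is extracted from the Mermaid classDiagram, and parsing the diagram itself is not shown.

```python
import ast


def missing_members(source, interface):
    """List classes and methods named in interface but not defined in source."""
    defined = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            # Only methods defined directly in the class body are counted.
            defined[node.name] = {
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            }
    missing = []
    for cls, methods in interface.items():
        if cls not in defined:
            missing.append(cls)
            continue
        missing.extend(f"{cls}.{m}" for m in methods if m not in defined[cls])
    return missing


code = "class Trainer:\n    def train(self):\n        pass\n"
interface = {"Trainer": ["train", "evaluate"], "Model": ["forward"]}
gaps = missing_members(code, interface)
print(gaps)
```

Inherited methods are not detected by this check, so a class that satisfies the diagram through a base class would be reported as incomplete.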

Related Skills

  • Upstream: literature-search
  • Downstream: experiment-code
  • See also: code-debugging, algorithm-design