ai-paper-reproduction

Use when

The user wants Codex to reproduce an AI paper repository.
The target is a code repository with a README, scripts, configs, or documented commands.
The goal is a minimal trustworthy run, not unlimited experimentation.
The user needs standardized outputs that another human or model can audit quickly.
The task spans more than one stage, such as intake plus setup, or setup plus execution plus reporting.

Do not use when

The task is a general literature review or paper summary.
The task is to design a new model, benchmark suite, or training pipeline from scratch.
The repository is not centered on AI or does not expose a documented reproduction path.
The user primarily wants a deep code refactor rather than README-first reproduction.
The user is explicitly asking for only one narrow phase that a sub-skill already covers cleanly.

Success criteria

README is treated as the primary source of reproduction intent.
A minimum trustworthy target is selected and justified.
Documented inference is preferred over evaluation, and evaluation is preferred over training.
Any repo edits remain conservative, explicit, and auditable.
`repro_outputs/` is generated with consistent structure and stable machine-readable fields.
Final user-facing explanation is short and follows the user's language when practical.

Interaction and usability policy

Keep the workflow simple enough for a new user to understand quickly.
Prefer short, concrete plans over exhaustive research.
Expose commands, assumptions, blockers, and evidence.
Avoid turning the skill into an opaque automation layer.
Preserve a low learning cost for both humans and downstream agents.

Language policy

Human-readable Markdown outputs should follow the user's language when it is clear.
If the user's language is unclear, default to concise English.
Machine-readable fields, filenames, keys, and enum values stay in stable English.
Paths, package names, CLI commands, config keys, and code identifiers remain unchanged.

See `references/language-policy.md`.

Reproduction policy

Core priority order:

documented inference
documented evaluation
documented training startup or partial verification
full training only when the user explicitly asks later

Rules:

README-first: use repository files to clarify, not casually override, the README.
Aim for minimal trustworthy reproduction rather than maximum task coverage.
Treat smoke tests, startup verification, and early-step checks as valid training evidence when full training is not appropriate.
Record unresolved gaps rather than fabricating confidence.
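
The priority order above can be sketched as a small selection function. This is only an illustration: the `DocumentedCommand` type, the `kind` labels, and `select_target` are hypothetical names, not part of the skill or its sub-skills.

```python
from dataclasses import dataclass

# Lower rank means more preferred. The order mirrors the core priority list;
# the string labels themselves are hypothetical.
PRIORITY = {"inference": 0, "evaluation": 1, "training_startup": 2}


@dataclass(frozen=True)
class DocumentedCommand:
    kind: str     # "inference", "evaluation", "training_startup", or "full_training"
    command: str  # command text as documented in the repository
    source: str   # where the command was documented, e.g. "README"


def select_target(commands, allow_full_training=False):
    """Return the most preferred documented command, or None if none qualifies."""
    ranks = dict(PRIORITY)
    if allow_full_training:
        # Full training is considered only after an explicit user request.
        ranks["full_training"] = 3
    candidates = [c for c in commands if c.kind in ranks]
    if not candidates:
        return None
    return min(candidates, key=lambda c: ranks[c.kind])
```

Returning `None` instead of falling back to full training keeps the "record unresolved gaps" rule visible to the caller.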

Patch policy

Prefer no code changes.
Prefer safer adjustments first:
- command-line arguments
- environment variables
- path fixes
- dependency version fixes
- dependency file fixes such as `requirements.txt` or `environment.yml`
Avoid changing:
- model architecture
- core inference semantics
- core training logic
- loss functions
- experiment meaning
If repository files must change:
- create a patch branch first using `repro/YYYY-MM-DD-short-task`
- apply low-risk changes before medium-risk changes
- avoid high-risk changes by default
- commit only verified groups of changes
- keep verified patch commits sparse, usually `0-2`
- use commit messages in the form `repro: <scope> for documented <command>`

See `references/patch-policy.md`.
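
As a rough sketch, the branch and commit-message formats above could be produced by helpers like the following. The function names and the slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) are assumptions made for illustration; the source only fixes the overall shape of the two strings.

```python
import re
from datetime import date


def patch_branch_name(short_task, today=None):
    """Build a branch name of the form repro/YYYY-MM-DD-short-task."""
    today = today or date.today()
    # Assumed slug rule: lowercase, collapse anything else into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", short_task.lower()).strip("-")
    return f"repro/{today.isoformat()}-{slug}"


def patch_commit_message(scope, command):
    """Build a message of the form 'repro: <scope> for documented <command>'."""
    return f"repro: {scope} for documented {command}"
```

For example, `patch_branch_name("Fix CUDA path", date(2024, 1, 5))` yields `repro/2024-01-05-fix-cuda-path`.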

Workflow

Read README and repo signals.
Call `repo-intake-and-plan` to scan the repository and extract documented commands.
Select the smallest trustworthy reproduction target.
Call `env-and-assets-bootstrap` to prepare environment assumptions and asset paths.
Run a conservative smoke check or documented command with `minimal-run-and-audit`.
Use `paper-context-resolver` only if README and repo files leave a narrow reproduction-critical gap that blocks the current target.
Write the standardized outputs.
Give the user a short final note in the user's language.

Required outputs

Always target:

```text
repro_outputs/
  SUMMARY.md
  COMMANDS.md
  LOG.md
  status.json
  PATCHES.md   # only if patches were applied
```

Use the templates under `assets/` and the field rules in `references/output-spec.md`.
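
A minimal sketch of creating that layout is shown below. It writes empty placeholder files rather than the real templates from `assets/`, and the `status` payload is whatever dictionary the caller passes in; the actual field names are defined in `references/output-spec.md` and are not reproduced here. The function name is hypothetical.

```python
import json
from pathlib import Path


def write_outputs_skeleton(root, status, patches_applied=False):
    """Create repro_outputs/ under root and return the sorted file names in it."""
    out = Path(root) / "repro_outputs"
    out.mkdir(parents=True, exist_ok=True)

    # Placeholder Markdown files; a real run would fill these from templates.
    for name in ("SUMMARY.md", "COMMANDS.md", "LOG.md"):
        (out / name).touch()

    # Sorted keys keep the machine-readable file stable across runs.
    payload = json.dumps(status, indent=2, sort_keys=True) + "\n"
    (out / "status.json").write_text(payload, encoding="utf-8")

    # PATCHES.md exists only when patches were actually applied.
    if patches_applied:
        (out / "PATCHES.md").touch()

    return sorted(p.name for p in out.iterdir())
```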

Reporting policy

Put the shortest high-value summary in `SUMMARY.md`.
Put copyable commands in `COMMANDS.md`.
Put process evidence, assumptions, failures, and decisions in `LOG.md`.
Put durable machine-readable state in `status.json`.
Put branch, commit, validation, and README-fidelity impact in `PATCHES.md` when needed.
Distinguish verified facts from inferred guesses.
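
One way to keep verified facts and inferred guesses apart is to tag each log entry explicitly. The sketch below assumes a hypothetical `Finding` record and a `[verified]` / `[inferred]` tag convention; neither comes from the skill's templates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    text: str
    verified: bool      # True only when backed by something that was actually run or read
    evidence: str = ""  # e.g. the command or file that supports the claim


def render_log_lines(findings):
    """Render findings as Markdown bullets with an explicit verified/inferred tag."""
    lines = []
    for f in findings:
        tag = "verified" if f.verified else "inferred"
        suffix = f" (evidence: {f.evidence})" if f.evidence else ""
        lines.append(f"- [{tag}] {f.text}{suffix}")
    return lines
```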

Maintainability notes

Keep this skill narrow: README-first AI repo reproduction only.
Push specialized logic into sub-skills or helper scripts.
Prefer stable templates and simple schemas over ad hoc prose.
Keep machine-readable outputs backward compatible when possible.
Add new evidence sources only when they improve auditability without raising learning cost.
