arxiv-paper-translator
Translate academic papers from arXiv to Chinese. Use when users want to (1) translate arXiv papers from English to Chinese, or (2) create technical reports summarizing academic papers. Works with arXiv paper IDs like "2206.04655".
NPX Install
npx skill4agent add yrom/arxiv-paper-translator arxiv-paper-translator
arXiv Paper Translator
Translate academic papers from arXiv by downloading LaTeX source, translating content while preserving structure, and generating translated PDFs with technical reports.
Workflow Overview
- Download & Extract - Get LaTeX source from arXiv
- Translate - Translate English narrative content to Chinese following LaTeX-specific rules
- REVIEW PHASE - MUST COMPLETE before compiling
- CJK Support & Localize Labels - Add xeCJK, localize labels
- Compile .tex Files - Generate translated PDF using XeLaTeX
- Report - Create technical summary document
Prerequisites
Check local xelatex installation:
bash
xelatex --version
If it is not installed, make sure Docker is installed and available:
bash
docker --version
This skill requires XeLaTeX to compile translated PDFs. If XeLaTeX is not installed locally, Docker is used instead.
The xu-cheng/latex-docker Docker images are recommended.
For example, the full TeX Live distribution (linux/amd64 only):
bash
# NOTE: ghcr.1ms.run is a mirror of ghcr.io.
docker pull --platform linux/amd64 ghcr.1ms.run/xu-cheng/texlive-debian:20260101
# Equivalent pull from ghcr.io directly:
# docker pull --platform linux/amd64 ghcr.io/xu-cheng/texlive-debian:20260101
If neither local XeLaTeX nor Docker is installed, STOP; do not continue running this skill.
Instead, ask the user: "XeLaTeX or Docker is required to compile translated PDFs. Which one do you want to use? I'll help you set it up."
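The prerequisite check above could be scripted roughly as follows. This is a minimal sketch, not part of the skill itself: `pick_tex_backend` is a hypothetical helper name, and it only checks that the binaries are on `PATH` (it does not verify that the Docker daemon is running).

```bash
# Hypothetical helper: report which compile backend is available.
# Prints "local", "docker", or "none"; returns non-zero only for "none".
pick_tex_backend() {
  if command -v xelatex >/dev/null 2>&1; then
    echo "local"
  elif command -v docker >/dev/null 2>&1; then
    echo "docker"
  else
    echo "none"
    return 1
  fi
}

# Example: decide once, then stop early if nothing is available.
backend=$(pick_tex_backend) || echo "Neither XeLaTeX nor Docker found; ask the user which to set up." >&2
echo "backend: $backend"
```

Preferring the local installation first mirrors the order described above, where Docker is the fallback.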
Step 1: Download LaTeX Source
Extract ARXIV_ID from user input.
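Users may paste a full URL or a versioned ID rather than a bare ID. The sketch below shows one way to normalize such input; `normalize_arxiv_id` is a hypothetical helper, and it assumes new-style IDs only (four digits, a dot, then four or five digits), so old-style IDs such as `hep-th/9901001` are not handled and any version suffix is dropped.

```bash
# Hypothetical helper: reduce user input to a bare new-style arXiv ID.
normalize_arxiv_id() {
  local input="$1" id
  # Take the first new-style ID (YYMM.NNNN or YYMM.NNNNN) found in the input.
  id=$(printf '%s\n' "$input" | grep -Eo '[0-9]{4}\.[0-9]{4,5}' | head -n 1)
  if [ -z "$id" ]; then
    echo "no arXiv ID found in: $input" >&2
    return 1
  fi
  printf '%s\n' "$id"
}

ARXIV_ID=$(normalize_arxiv_id "https://arxiv.org/abs/2206.04655v2")
echo "$ARXIV_ID"   # prints 2206.04655
```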
Download and extract source code from arXiv:
bash
# Download LaTeX source (replace ARXIV_ID with user-specified paper ID)
ARXIV_ID="2206.04655"
mkdir -p arXiv_${ARXIV_ID}
wget -q https://arxiv.org/e-print/${ARXIV_ID} -O arXiv_${ARXIV_ID}/paper_source.tar.gz
mkdir -p arXiv_${ARXIV_ID}/paper_source
tar -xzf arXiv_${ARXIV_ID}/paper_source.tar.gz -C arXiv_${ARXIV_ID}/paper_source
Verify extraction:
bash
# List files to understand structure
tree arXiv_${ARXIV_ID}/paper_source
Step 2: Translate LaTeX Files
IMPORTANT: Before translating, read references/translation_guidelines.md for detailed rules.
Translation Workflow
Step 2.1. Copy all files from paper_source/ to paper_cn/:
Option 1 - Using cp (standard):
bash
cd arXiv_${ARXIV_ID}
mkdir -p paper_cn
cp -r paper_source/* paper_cn/
Option 2 - Using rsync (better for incremental sync):
bash
cd arXiv_${ARXIV_ID}
mkdir -p paper_cn
rsync -av paper_source/ paper_cn/
All .tex files in paper_cn/ will be translated in-place later.
Step 2.2. Gather Context (MANDATORY):
Before ANY translation, you MUST extract:
- Paper Title: From \title{...} in main file
- Abstract: From \begin{abstract}...\end{abstract} or \abstract{...} in main file
- Paper Structure: List all sections and which .tex file contains each
- Key Terminologies: Build terminology table from paper content
For any term you don't know how to translate, ASK the user for a definition.
This information is REQUIRED for translation tasks.
Read references/translation_prompt.md for the prompt template.
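Part of the context gathering above can be approximated with `grep`. The helpers below are a rough sketch under stated assumptions: `paper_title` and `list_sections` are hypothetical names, and both assume that `\title{...}` and `\section{...}` each fit on one line without nested braces. Titles or headings that break those assumptions still need to be read manually.

```bash
# Hypothetical helpers for the context-gathering step.
# Assumes \title{...} and \section{...} sit on one line without nested braces.

# Print the paper title from the main .tex file.
paper_title() {
  grep -Eo '\\title\{[^}]*\}' "$1" | head -n 1 | sed -E 's/^\\title\{(.*)\}$/\1/'
}

# Print "file:\section{...}" for every section heading under a directory,
# which shows which .tex file contains each section.
list_sections() {
  grep -rEo --include='*.tex' '\\section\*?\{[^}]*\}' "$1"
}

# Example usage (paths follow the layout used in this skill):
#   paper_title paper_cn/main.tex
#   list_sections paper_cn
```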
Step 2.3. Dispatch Translation Tasks
Identify files to translate:
- Find main file (contains \documentclass{...}, usually main.tex, paper.tex, template.tex, etc.)
- Filter .tex files that need translation (skip macro-only files if any, or user specified files)
- Create list of files to translate
Translation Strategy:
- Translate main file first (sequential)
  - Builds shared terminology context
  - Ensures consistency for other files
- Translate other files:
  - If 3+ files: Dispatch in parallel
  - If 1-2 files: Sequential translation
Each translation Task:
- Task type: general-purpose subagent
- Input: File path in paper_cn/ directory
- Action: Read file → Translate → Edit file (Update file content with translated text)
- Must follow references/translation_prompt.md
- Must use gathered context (title, abstract, structure, terminologies)
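The dispatch rule above (main file first, then parallel for three or more remaining files, otherwise sequential) can be sketched as a small planning step. `plan_translation` is a hypothetical helper; it assumes exactly one .tex file contains `\documentclass` and does not filter out macro-only or user-excluded files, which still has to be done by hand.

```bash
# Hypothetical helper: print the main file, then the dispatch mode for the rest.
# Assumes exactly one .tex file under the directory contains \documentclass.
plan_translation() {
  local dir="$1" main others count
  main=$(grep -rl --include='*.tex' '\\documentclass' "$dir" | sort | head -n 1)
  if [ -z "$main" ]; then
    echo "no main file found under $dir" >&2
    return 1
  fi
  others=$(find "$dir" -name '*.tex' ! -path "$main" | sort)
  # Count non-empty lines; grep -c exits non-zero on zero matches, hence "|| true".
  count=$(printf '%s\n' "$others" | grep -c . || true)
  echo "main: $main"
  if [ "$count" -ge 3 ]; then
    echo "mode: parallel"
  else
    echo "mode: sequential"
  fi
  # List the remaining files, one per line, if there are any.
  [ -z "$others" ] || printf '%s\n' "$others"
}

# Example usage: plan_translation paper_cn
```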
Example command to find main .tex file:
bash
find paper_cn/ -name "*.tex" -exec grep -l '\\documentclass' {} \; | head -1
Step 3: Review Translation
After all translation Tasks are completed, you MUST review the translated content following references/review_checklist.md to verify:
- File Completeness Check
- LaTeX Command Spelling
- CJK Catcode Issues
- Translation Quality Check
- Content Spot-Check
Perform fixes as needed based on review findings.
CRITICAL: Before proceeding to Step 4, you must confirm:
- All review checks completed
- Any issues identified and fixed
- Translation quality verified
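For the LaTeX Command Spelling check, one possible approach is to compare the set of control words used in the original and translated trees. The sketch below is an assumption about how such a comparison might look, not the procedure from references/review_checklist.md; `new_commands` is a hypothetical helper. Commands added on purpose later (for example the xeCJK setup in Step 4) would also appear in its output, so it is most useful before Step 4.

```bash
# Hypothetical helper: list LaTeX control words that appear in the translated
# tree but not in the original, which often indicates a typo introduced
# during translation.
new_commands() {
  local src="$1" dst="$2" tmp
  tmp=$(mktemp -d)
  # Collect the set of \command names used in each tree (-a: treat files as text).
  grep -r -a -E -o -h --include='*.tex' '\\[A-Za-z]+' "$src" | LC_ALL=C sort -u > "$tmp/src"
  grep -r -a -E -o -h --include='*.tex' '\\[A-Za-z]+' "$dst" | LC_ALL=C sort -u > "$tmp/dst"
  # comm -13 keeps only lines unique to the second file (the translated tree).
  LC_ALL=C comm -13 "$tmp/src" "$tmp/dst"
  rm -rf "$tmp"
}

# Example usage: new_commands paper_source paper_cn
```

An empty result means every control word in the translation also occurs in the original source.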
Step 4: Add Chinese Support
IMPORTANT: Follow references/chinese_support.md to configure CJK fonts and localize labels.
Modify main .tex file to include xeCJK package and set CJK fonts.
For example, with the Fandol fonts (included in the TeX Live Docker image):
latex
\usepackage{xeCJK}
\setCJKmainfont{FandolSong}[ItalicFont=FandolKai] % Song (宋体): body text; \emph uses Kai (楷体)
\setCJKsansfont{FandolHei} % Hei (黑体): headings, \textsf
\setCJKmonofont{FandolFang} % Fang Song (仿宋): code, \texttt
If running locally, ask the user for their font preference before configuring. Check available fonts with fc-list :lang=zh family.
Step 5: Compile Translated PDF
Option 1: Local XeLaTeX
bash
# Basic compilation
xelatex main.tex
# If paper has bibliography (recommended approach)
xelatex main.tex
bibtex main
xelatex main.tex
xelatex main.tex
Or use latexmk for automated compilation:
bash
latexmk -xelatex main.tex
Option 2: Docker with TeX Live
bash
# change working directory to arXiv_${ARXIV_ID}
cd /path/to/arXiv_${ARXIV_ID}
docker run --rm \
-v "$(pwd)/paper_cn":/workspace \
-w /workspace \
ghcr.1ms.run/xu-cheng/texlive-debian:20260101 \
latexmk -xelatex main.tex
Step 6: Generate Technical Report
If user requests a technical summary, spawn a subagent following references/summary_prompt.md to create a technical summary using assets/report_template.md.
Save report: arXiv_${ARXIV_ID}/technical_report.md
Final Deliverables
- Translated PDF: paper_cn/<main-file>.pdf
- Technical report: arXiv_${ARXIV_ID}/technical_report.md
- TeX Source: paper_cn/ directory with all translated LaTeX files
Common Issues & Solutions
| Issue | Solution |
|---|---|
| Downloaded file is single .tex, not .tar.gz | |
| Main file not named main.tex | |
| Compilation fails with encoding error | |
| Command misspelling (e.g. | See review checklist step 2: diff command sets to find typos |
| Undefined control sequence - \xmax概率 | xeCJK catcode issue: insert |
| Undefined control sequence - \chinese{弋} | Original uses CJK package's \chinese macro; add \newcommand{\chinese}[1]{#1} after xeCJK config to prevent catcode issue |
| Custom .sty/.cls files | Copy to |
| Mixed CJK/Latin characters may cause xeCJK font switching errors (e.g. | |
| Undefined references (e.g., | Ensure ALL referenced files are present in |
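For the first issue in the table (a download that is a single .tex file rather than a tarball), one possible handling is sketched below. It is an illustration only, not the solution from the original table: `extract_source` is a hypothetical helper, and it assumes that a single-file submission arrives as a gzip-compressed .tex file, which is then saved as main.tex. Any other format, such as a PDF-only submission, is reported as an error.

```bash
# Hypothetical helper: unpack an arXiv e-print download into a directory.
# Assumption: the download is either a gzipped tarball or a single gzipped
# .tex file; anything else is reported as an error.
extract_source() {
  local archive="$1" dest="$2"
  mkdir -p "$dest"
  if tar -tzf "$archive" >/dev/null 2>&1; then
    # Regular case: a gzipped tarball with the full LaTeX project.
    tar -xzf "$archive" -C "$dest"
  elif gzip -t "$archive" 2>/dev/null; then
    # Single-file submission: decompress it as the main file.
    gzip -dc "$archive" > "$dest/main.tex"
  else
    echo "unrecognized source format: $archive" >&2
    return 1
  fi
}

# Example usage:
#   extract_source arXiv_${ARXIV_ID}/paper_source.tar.gz arXiv_${ARXIV_ID}/paper_source
```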
References
- Translation rules: references/translation_guidelines.md
- Translation prompt: references/translation_prompt.md
- Review checklist: references/review_checklist.md
- Chinese support: references/chinese_support.md
- Report template: assets/report_template.md
- Makefile template (optional): assets/Makefile.template