arxiv-package


arXiv Submission Packager

Purpose

Assemble a TeX/LaTeX project into a clean, correctly structured `.tar.gz` or `.zip` archive ready for arXiv upload. Handles file selection, artifact exclusion, directory structure, 00README generation, and validation of the final package.
Companion skills:
  • arxiv-preflight: validate before packaging
  • arxiv-figures: optimize figures before packaging

Workflow

1. Inventory

Scan the project directory. Classify every file:
INCLUDE: Required
  • Main `.tex` file(s)
  • All `\input`/`\include` targets (trace recursively)
  • All `\includegraphics` targets
  • `.bbl` files (pre-processed bibliography)
  • `.ind` files (pre-processed index)
  • `.gls`/`.nls` files (glossary/nomenclature)
  • Custom `.sty`, `.cls`, `.bst` files not in TeX Live
  • `00README.XXX` (if present)
  • `anc/` directory contents (ancillary files)
EXCLUDE: Build Artifacts
  • `.aux`, `.log`, `.toc`, `.lot`, `.lof`, `.out`, `.nav`, `.snm`, `.vrb`
  • `.dvi`, `.ps`, `.pdf` (except figure PDFs and ancillary PDFs)
  • `.synctex`, `.synctex.gz`, `.fls`, `.fdb_latexmk`
  • `.blg`, `.ilg`, `.glg`, `.nlg` (log files from bib/index/glossary processing)
  • `*.bak`, `*~`, `*.swp`, `*.swo` (editor backups)
  • `.DS_Store`, `Thumbs.db`, `desktop.ini` (OS artifacts)
EXCLUDE: Should Not Submit
  • Journal templates and style guides (unless a custom `.sty` is needed for compilation)
  • Referee letters, cover letters
  • Build scripts (`Makefile`, `latexmkrc`), which arXiv does not need
  • `.git/`, `.svn/`, and other version control directories
  • IDE configs (`.vscode/`, `.idea/`, `.texpadtmp/`)
FLAG: Needs Decision
  • `.bib` files (include alongside `.bbl` or omit; arXiv processes both)
  • Multiple `.tex` files with `\documentclass` (needs `00README.XXX` with `toplevelfile`)
  • Large data files (ancillary or exclude)
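
The classification above could be sketched as a lookup on file name and extension. This is only an illustration under stated assumptions, not the skill's actual implementation: the function name `classify`, the `referenced` parameter (paths already found by tracing `\input`, `\include`, and `\includegraphics`), and the returned labels are all hypothetical, and every `.sty`/`.cls`/`.bst` file is assumed to be custom.

```python
from pathlib import PurePosixPath

# Extension and name sets copied from the inventory lists above.
EXCLUDED_EXTS = {
    ".aux", ".log", ".toc", ".lot", ".lof", ".out", ".nav", ".snm", ".vrb",
    ".dvi", ".ps", ".synctex", ".fls", ".fdb_latexmk",
    ".blg", ".ilg", ".glg", ".nlg", ".bak", ".swp", ".swo",
}
EXCLUDED_NAMES = {".DS_Store", "Thumbs.db", "desktop.ini", "Makefile", "latexmkrc"}
EXCLUDED_DIRS = {".git", ".svn", ".vscode", ".idea", ".texpadtmp"}
# Assumption: any .sty/.cls/.bst found in the project is custom (not in TeX Live).
SOURCE_EXTS = {".tex", ".bbl", ".ind", ".gls", ".nls", ".sty", ".cls", ".bst"}


def classify(path, referenced=frozenset()):
    """Return 'include', 'exclude', or 'flag' for a project-relative path."""
    p = PurePosixPath(path)
    name, suffix = p.name, p.suffix
    if any(part in EXCLUDED_DIRS for part in p.parts):
        return "exclude"
    if name in EXCLUDED_NAMES or name.endswith(("~", ".synctex.gz")):
        return "exclude"
    if suffix in EXCLUDED_EXTS:
        return "exclude"
    if p.parts[:1] == ("anc",) or path in referenced:
        return "include"
    if suffix == ".pdf":
        # An unreferenced PDF outside anc/ is treated as build output.
        return "exclude"
    if suffix == ".bib":
        return "flag"
    if suffix in SOURCE_EXTS or name.startswith("00README"):
        return "include"
    # Anything unrecognized is left for a human decision.
    return "flag"
```

Unknown files fall through to `flag` rather than `include`, so nothing enters the package without either matching a known source type or being referenced from the TeX sources.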

2. Generate 00README.XXX

Create or update the `00README.XXX` file when needed:

    # Auto-generated by arxiv-package

    # Main TeX file (only if ambiguous: multiple \documentclass files)
    main.tex toplevelfile

    # Files to exclude from processing
    cover-letter.pdf ignore
    notes.txt ignore

    # Optional directives
    # nostamp
    # nohypertex

Only generate if:
- Multiple `.tex` files contain `\documentclass` (declare `toplevelfile`)
- Files need explicit `ignore` directives
- User requests `nostamp` or other directives
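
The conditions above could be checked along these lines. The helper names `find_toplevel_candidates` and `build_readme` are hypothetical, and the sketch assumes the TeX sources have already been read into a dict mapping file name to text. It emits only directive lines, since whether arXiv tolerates comment lines in `00README.XXX` is not confirmed here.

```python
import re

# Matches \documentclass at the start of a line, so commented-out uses are skipped.
DOCUMENTCLASS = re.compile(r"^\s*\\documentclass\b", re.MULTILINE)


def find_toplevel_candidates(sources):
    """Return the sorted names of .tex sources that declare a document class."""
    return sorted(
        name for name, text in sources.items()
        if name.endswith(".tex") and DOCUMENTCLASS.search(text)
    )


def build_readme(toplevel=None, ignore=(), directives=()):
    """Return 00README.XXX text, or None when no directive is needed."""
    lines = []
    if toplevel:
        lines.append(f"{toplevel} toplevelfile")
    lines.extend(f"{name} ignore" for name in ignore)
    lines.extend(directives)
    if not lines:
        return None
    return "\n".join(lines) + "\n"
```

If `find_toplevel_candidates` returns more than one name, the choice of main file is ambiguous and a `toplevelfile` line is warranted; with a single candidate and no other directives, `build_readme()` returns `None` and no file is written.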


3. Validate Structure

Before creating the archive:
  1. All `\input`/`\include` targets resolve to files in the package
  2. All `\includegraphics` targets resolve to files in the package
  3. All `\bibliography` targets have corresponding `.bbl` files
  4. No absolute paths in any TeX source
  5. No filenames with spaces or special characters
  6. `anc/` directory contains no `.tex` files
  7. No hidden files (will be deleted by arXiv upon announcement)
  8. Total uncompressed size is reasonable (flag anything over 50 MB, which may need an exception request)
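
Checks 4 through 8 are mechanical enough to sketch. The function name `validate`, the set of characters treated as safe in filenames, and the input shapes (a dict of path to size in bytes, and a dict of path to TeX source text) are assumptions made for illustration. Checks 1 through 3 need reference tracing and are not covered here.

```python
import re
from pathlib import PurePosixPath

# Assumption: letters, digits, and a few punctuation marks count as "safe".
SAFE_NAME = re.compile(r"^[A-Za-z0-9_.\-+=,/]+$")
# An \input, \include, or \includegraphics argument starting with / or a drive letter.
ABSOLUTE_REF = re.compile(
    r"\\(?:input|include|includegraphics)\s*(?:\[[^\]]*\])?\s*\{\s*(?:/|[A-Za-z]:[\\/])"
)
SIZE_LIMIT = 50 * 1024 * 1024  # the 50 MB threshold from check 8


def validate(files, tex_sources):
    """Return a list of human-readable problems; an empty list means all clear."""
    problems = []
    for path in sorted(files):
        p = PurePosixPath(path)
        if not SAFE_NAME.match(path):
            problems.append(f"unsafe filename: {path}")
        if any(part.startswith(".") for part in p.parts):
            problems.append(f"hidden file: {path}")
        if p.parts[:1] == ("anc",) and p.suffix == ".tex":
            problems.append(f"TeX file in anc/: {path}")
    for path in sorted(tex_sources):
        if ABSOLUTE_REF.search(tex_sources[path]):
            problems.append(f"absolute path in: {path}")
    total = sum(files.values())
    if total > SIZE_LIMIT:
        problems.append(f"total size {total} bytes exceeds 50 MB")
    return problems
```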

4. Create Archive

    # From project root, create tar.gz excluding unwanted files
    tar czf submission.tar.gz \
      --exclude='*.aux' --exclude='*.log' --exclude='*.toc' \
      --exclude='*.lot' --exclude='*.lof' --exclude='*.out' \
      --exclude='*.nav' --exclude='*.snm' --exclude='*.vrb' \
      --exclude='*.dvi' --exclude='*.synctex*' \
      --exclude='*.fls' --exclude='*.fdb_latexmk' \
      --exclude='*.blg' --exclude='*.ilg' --exclude='*.glg' \
      --exclude='*.bak' --exclude='*~' --exclude='*.swp' \
      --exclude='.git' --exclude='.svn' \
      --exclude='.DS_Store' --exclude='Thumbs.db' \
      --exclude='.vscode' --exclude='.idea' \
      [list of files/directories to include]

    # Or create zip
    zip -r submission.zip [files] \
      -x '*.aux' -x '*.log' -x '*.toc' ...

Prefer explicit file inclusion over directory-wide inclusion with exclusions. This prevents accidental inclusion of sensitive or unnecessary files.
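
As one way to realize explicit inclusion, the archive can be built from a file list instead of a directory walk. The sketch below uses Python's standard `tarfile` module; the function name `create_archive` and its parameters are hypothetical.

```python
import tarfile
from pathlib import Path


def create_archive(project_root, include, archive_path):
    """Write a .tar.gz containing only the listed project-relative paths."""
    root = Path(project_root)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for rel in sorted(include):
            # arcname stores the project-relative path, so no wrapper
            # directory is added; recursive=False keeps a listed directory
            # from pulling in files that were never listed.
            tar.add(str(root / rel), arcname=rel, recursive=False)
    return archive_path
```

Because only listed paths are added, a stray `main.aux` or `.env` in the project directory never reaches the archive, whether or not an exclusion pattern exists for it.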

5. Verify Archive

After creation:
  1. List archive contents — verify no unexpected files
  2. Extract to temporary directory
  3. Attempt compilation in the extracted directory (if TeX toolchain available)
  4. Report archive size (compressed and uncompressed)
  5. Compare file list against the include/exclude inventory
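
Steps 1, 2, 4, and 5 above might look like the following for a `.tar.gz` archive. The function name `verify_archive` and the shape of the returned dict are hypothetical, and the compilation attempt in step 3 is left out because it depends on the local TeX toolchain. Only extract archives you created yourself this way.

```python
import tarfile
import tempfile
from pathlib import Path


def verify_archive(archive_path, expected):
    """Compare archive members against the expected file list and report sizes."""
    expected = set(expected)
    with tarfile.open(str(archive_path), "r:gz") as tar:
        members = [m for m in tar.getmembers() if m.isfile()]
        names = {m.name for m in members}
        uncompressed = sum(m.size for m in members)
        with tempfile.TemporaryDirectory() as tmp:
            tar.extractall(tmp)  # raises if the archive cannot be extracted
            extracted_ok = all((Path(tmp) / n).is_file() for n in names)
    return {
        "unexpected": sorted(names - expected),
        "missing": sorted(expected - names),
        "compressed_bytes": Path(archive_path).stat().st_size,
        "uncompressed_bytes": uncompressed,
        "extracted_ok": extracted_ok,
    }
```

A clean package has empty `unexpected` and `missing` lists; anything else means the archive and the inventory disagree.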

6. Report

    ## arXiv Package Report

    Archive: submission.tar.gz
    Compressed size: [size]
    Uncompressed size: [size]
    File count: [count]

    ### Contents

    | File           | Size   | Type         |
    |----------------|--------|--------------|
    | main.tex       | 45 KB  | Source       |
    | references.bbl | 12 KB  | Bibliography |
    | fig1.pdf       | 380 KB | Figure       |
    | anc/data.csv   | 1.2 MB | Ancillary    |

    ### Excluded

    | File                 | Reason             |
    |----------------------|--------------------|
    | main.aux             | Build artifact     |
    | main.log             | Build artifact     |
    | referee-response.pdf | Not for submission |

    ### 00README.XXX

    [Contents if generated]

    ### Verification

    - Archive extracts cleanly
    - All source references resolve
    - Compilation succeeds (if tested)
    - No sensitive files included

Archive Format Notes

  • `.tar.gz` is preferred over `.zip` (standard in academic TeX workflows)
  • arXiv accepts both `.tar` (gzipped) and `.zip`
  • Do not nest archives (no `.tar.gz` inside a `.zip`)
  • Files should be at root level or in logical subdirectories; do not add a wrapper directory unless the project uses subdirectories for organization
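
The last two notes can be checked against a list of archive member names. This is a rough sketch: the name `layout_problems` is hypothetical, `.tgz` is an added assumption alongside the formats named above, and a lone top-level directory is treated as a wrapper only when no file sits at the archive root.

```python
from pathlib import PurePosixPath

# .tgz is an assumption; the notes above name only .tar, .tar.gz, and .zip.
ARCHIVE_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".zip")


def layout_problems(names):
    """Flag nested archives and a single wrapper directory in a member list."""
    names = list(names)
    problems = [
        f"nested archive: {n}"
        for n in sorted(names)
        if n.lower().endswith(ARCHIVE_SUFFIXES)
    ]
    top_level = {PurePosixPath(n).parts[0] for n in names}
    has_root_file = any(len(PurePosixPath(n).parts) == 1 for n in names)
    if names and not has_root_file and len(top_level) == 1:
        problems.append(f"wrapper directory: {top_level.pop()}/")
    return problems
```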

Core Principles

  • Explicit inclusion over blanket exclusion. Build the file list from what's needed, not from everything minus what's not needed. This catches files that shouldn't be submitted but aren't in the exclusion list.
  • Verify after packaging. The archive is the submission. Test it, not the source directory.
  • Preserve directory structure. If the project uses subdirectories for figures or sections, maintain that structure. TeX paths depend on it.
  • No sensitive files. Check for `.env`, credentials, personal notes, and referee correspondence before packaging.
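
A name-based scan can serve as a first pass for the last principle. The pattern list below is a guess at what "sensitive" might look like in a typical project, not a list defined by this skill, and `find_sensitive` is a hypothetical name. It inspects file names only, so it complements a manual look at the file list and does not replace it.

```python
import fnmatch
from pathlib import PurePosixPath

# Hypothetical patterns, matched against the lowercased file name.
SENSITIVE_PATTERNS = (
    ".env", ".env.*", "*credential*", "*referee*", "*cover-letter*", "notes.*",
)


def find_sensitive(names):
    """Return the sorted paths whose file name matches a sensitive pattern."""
    return sorted(
        n for n in names
        if any(
            fnmatch.fnmatchcase(PurePosixPath(n).name.lower(), pattern)
            for pattern in SENSITIVE_PATTERNS
        )
    )
```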