arxiv-figures
arXiv Figure Optimizer
Purpose
Analyze, optimize, and convert figures in a TeX/LaTeX project to meet arXiv
requirements and size constraints. Produces correctly formatted, efficiently
compressed figures that compile without errors.
Companion skills:
- `arxiv-preflight`: full submission validation
- `arxiv-package`: tarball packaging
Format Rules
By Processor
| Processor | Accepted | Rejected |
|---|---|---|
| LaTeX (DVI mode) | EPS, PS | PDF, PNG, JPEG |
| PDFLaTeX | PDF, PNG, JPEG | EPS, PS |
By Content Type
| Content | Optimal Format | Rationale |
|---|---|---|
| Photographs | JPEG | Lossy compression suits continuous tone |
| Line drawings / diagrams | PDF (vector) | Scalable, sharp at any resolution |
| Plots with text labels | PDF (vector) | Text remains crisp and searchable |
| Screenshots / raster art | PNG | Lossless compression for sharp edges |
| Mixed photo + text | PNG or PDF | Depends on dominant content |
Workflow
1. Inventory
Scan the project for all figures:
- Parse `\includegraphics` calls from all `.tex` files
- Identify the TeX processor (DVI vs PDFLaTeX) from the document preamble or build config
- For each figure: record path, format, file size, dimensions (pixels or vector bounds)
- Flag missing figures, wrong-format figures, oversized figures
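
The inventory step could be sketched in shell roughly as follows. This is a minimal sketch, not a full TeX parser: the function names `list_figures` and `detect_processor` are hypothetical, the pattern only handles the plain `\includegraphics[...]{path}` form on a single line, and the processor check assumes a `\pdfoutput=1` line in the main file signals PDFLaTeX.

```shell
# List every figure path referenced via \includegraphics in the given .tex files.
# Assumption: each call sits on one line; lines that start with % are ignored.
list_figures() {
    cat "$@" \
        | grep -v '^[[:space:]]*%' \
        | grep -oE '\\includegraphics(\[[^]]*])?\{[^}]*\}' \
        | sed 's/.*{\([^}]*\)}$/\1/' \
        | sort -u
}

# Heuristic processor detection (assumption: \pdfoutput=1 means PDFLaTeX).
# Prints "unknown" when the marker is absent, so the caller can ask the user.
detect_processor() {
    if grep -q '\\pdfoutput[[:space:]]*=[[:space:]]*1' "$1"; then
        echo pdflatex
    else
        echo unknown
    fi
}
```

Running `list_figures *.tex` prints one referenced path per line, which can then be checked against the files on disk to flag missing figures.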
2. Analyze
For each figure, determine:
- Format compliance — does the format match the processor?
- File size — flag individual figures >2MB, total >15MB
- Resolution — PNG/JPEG: flag >34 Megapixels (arXiv warning threshold since Feb 2026)
- Content type — photograph vs diagram vs plot (determines optimal format)
- Redundant metadata — PNG: ICC profiles, alpha channels, EXIF, interlacing
- EPS efficiency — verbose PostScript from plotting programs (common with matplotlib, R, MATLAB)
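
Several of these checks need nothing beyond standard shell tools. The sketch below is illustrative only: all function names are hypothetical, "2 MB" is assumed to mean 2 × 1024 × 1024 bytes, the processor labels `dvi` and `pdflatex` are made up for this example, and the pixel count is read directly from the PNG header rather than through an image library.

```shell
# Size of a file in bytes (avoids GNU/BSD stat differences).
file_bytes() { wc -c < "$1" | tr -d ' '; }

# Print "bytes<TAB>path" for each figure above a limit (default: 2 MB, assumed binary).
flag_oversized() {
    dir=$1; limit=${2:-2097152}
    find "$dir" -type f \( -name '*.png' -o -name '*.jpg' -o -name '*.jpeg' \
        -o -name '*.pdf' -o -name '*.eps' \) | sort | while IFS= read -r f; do
        size=$(file_bytes "$f")
        if [ "$size" -gt "$limit" ]; then printf '%s\t%s\n' "$size" "$f"; fi
    done
}

# Sum the sizes of the given files, for comparison against the 15 MB total.
total_bytes() {
    total=0
    for f in "$@"; do total=$((total + $(file_bytes "$f"))); done
    echo "$total"
}

# Succeed if the file extension suits the processor ("dvi" or "pdflatex").
format_ok() {
    case "$1:${2##*.}" in
        dvi:eps|dvi:ps) return 0 ;;
        pdflatex:pdf|pdflatex:png|pdflatex:jpg|pdflatex:jpeg) return 0 ;;
        *) return 1 ;;
    esac
}

# Pixel count of a PNG: IHDR width and height are big-endian 32-bit
# integers at byte offsets 16 and 20 of the file.
png_pixels() {
    set -- $(od -An -tu1 -j16 -N8 "$1")
    w=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    h=$(( ($5 << 24) | ($6 << 16) | ($7 << 8) | $8 ))
    echo $(( w * h ))
}
```

A figure would then be flagged for resolution when `png_pixels fig.png` exceeds 34000000.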
3. Optimize
Apply transformations in order of impact:
**Format Conversion** (when format violates processor requirements)
```bash
# EPS → PDF (for PDFLaTeX)
epstopdf figure.eps
# or
ps2pdf -dEPSCrop figure.eps figure.pdf

# PDF/PNG/JPG → EPS (for DVI mode)
convert figure.png figure.eps
```
**Size Reduction: Vector Figures**
```bash
# Distill verbose EPS
eps2eps input.eps output.eps
# or convert to PDF
ps2pdf -dEPSCrop input.eps output.pdf
```
**Size Reduction: Raster Figures**
```bash
# Strip PNG metadata, remove alpha, optimize compression
convert input.png -strip -alpha remove -define png:compression-level=9 output.png

# Reduce oversized PNG resolution (keep ≤300 DPI at print size)
# The geometry is quoted so the shell does not treat ">" as a redirect
convert input.png -resize '3000x3000>' -strip output.png

# JPEG quality optimization (80-90 is visually lossless for most content)
convert input.jpg -quality 85 -strip output.jpg

# Downsample oversized JPEG
convert input.jpg -resize '3000x3000>' -quality 85 -strip output.jpg
```
**PNG Optimization** (avoid arXiv warnings)
- Remove palette indexing if unnecessary
- Remove alpha channel if background is solid
- Strip ICC color profiles
- Remove metadata chunks
- Disable interlacing
**EPS BoundingBox Fix** (prevents `Missing number, treated as zero`)
- Verify `%%BoundingBox` appears near the top of the file, not only at the end
- If only `%%BoundingBox: (atend)` is present, extract the actual values and place them at the top
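
One way the BoundingBox fix might be automated is sketched below. It assumes the real values appear later in the file as a numeric `%%BoundingBox:` line (normally after `%%Trailer`); the function name `fix_atend_bbox` is hypothetical, and the trailer copy of the comment is left in place.

```shell
# Copy the numeric %%BoundingBox from the trailer over the (atend) placeholder.
# Usage: fix_atend_bbox input.eps output.eps
fix_atend_bbox() {
    in=$1; out=$2
    # Take the last numeric occurrence, which is the one in the trailer.
    bbox=$(grep '^%%BoundingBox:[[:space:]]*[-0-9]' "$in" | tail -n 1)
    if [ -z "$bbox" ]; then
        echo "no numeric %%BoundingBox found in $in" >&2
        return 1
    fi
    # Replace only the first (atend) placeholder; pass every other line through.
    awk -v bbox="$bbox" '
        /^%%BoundingBox: *\(atend\)/ && !replaced { print bbox; replaced = 1; next }
        { print }
    ' "$in" > "$out"
}
```

Writing to a separate output file keeps the original EPS intact, so the result can be compared before it replaces the source figure.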
4. Update TeX Source
If figures were renamed or reformatted:
- Update `\includegraphics` paths
- Remove explicit extensions where possible (allows processor flexibility)
- Verify `\graphicspath` settings if used
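
The path updates can be previewed with `sed` before touching the source. The following is a sketch under the same assumptions as a simple line-based scan: both function names are hypothetical, only lowercase extensions are handled, and the output goes to stdout so nothing is overwritten.

```shell
# Print the file with known figure extensions removed from \includegraphics paths.
strip_graphics_extensions() {
    sed -E 's/(\\includegraphics(\[[^]]*])?\{[^}]*)\.(pdf|eps|png|jpe?g)\}/\1}/g' "$1"
}

# Print the file with one extension swapped for another.
# Usage: retarget_extension main.tex eps pdf
retarget_extension() {
    sed -E "s/(\\\\includegraphics(\[[^]]*])?\{[^}]*)\.$2\}/\1.$3}/g" "$1"
}
```

Stripping an extension is only safe when a single file with that base name remains in the project; otherwise the processor may pick up a different format than intended.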
5. Verify
After optimization:
- Attempt local compilation to verify all figures render
- Compare visual output of optimized vs original figures
- Report size reduction per figure and total
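
The per-figure size comparison could be produced with a small helper like the one below. It is a sketch only: `report_reduction` is a hypothetical name, and the output format is an illustration rather than a fixed convention.

```shell
# Print the byte sizes of an original and an optimized figure and the percentage saved.
# Usage: report_reduction original.png optimized.png
report_reduction() {
    before=$(wc -c < "$1" | tr -d ' ')
    after=$(wc -c < "$2" | tr -d ' ')
    awk -v b="$before" -v a="$after" -v name="$2" 'BEGIN {
        pct = (b > 0) ? 100 * (b - a) / b : 0
        printf "%s: %d -> %d bytes (%.1f%% smaller)\n", name, b, a, pct
    }'
}
```

A negative percentage means the "optimized" file grew, which is a signal to keep the original.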
6. Report
```markdown
Figure Optimization Report

Processor: [detected]
Total figures: [count]
Size before: [total MB]
Size after: [total MB]
Reduction: [percentage]

Changes Made

| Figure | Original | Optimized | Size Before | Size After | Action |
|---|---|---|---|---|---|
| fig1 | fig1.eps | fig1.pdf | 12.3 MB | 0.4 MB | EPS→PDF conversion |
| fig2 | fig2.png | fig2.png | 8.1 MB | 1.2 MB | Strip metadata, downsample |

Warnings

[Any remaining issues, e.g., figures still above thresholds]
```
Tools Reference
| Tool | Install | Use Case |
|---|---|---|
| ImageMagick (`convert`) | System package | Format conversion, resizing, stripping |
| Ghostscript (`ps2pdf`, `eps2eps`) | System package | EPS/PS optimization and conversion |
| `epstopdf` | TeX Live | EPS → PDF conversion |
| `pdfcrop` | TeX Live | Trim PDF whitespace |
| PNG optimizer (e.g. `optipng`) | System package | PNG lossless optimization |
| PNG quantizer (e.g. `pngquant`) | System package | PNG lossy size reduction |
| JPEG optimizer (e.g. `jpegoptim`) | System package | JPEG lossless optimization |
Core Principles
- Never degrade visual quality below print-readable. Optimization means removing waste (metadata, unnecessary resolution, verbose encoding), not destroying information.
- Match format to processor. A figure in the wrong format blocks compilation. This is the highest priority fix.
- Preserve vector where possible. Converting vector to raster is a one-way quality loss. Only do this when the vector version is pathologically large (>10MB) and cannot be distilled.
- Report everything. The user decides which optimizations to accept. Show before/after sizes and explain each transformation.