pandoc

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Pandoc Document Conversion Skill

Pandoc文档转换Skill

Convert documents between formats using pandoc, the universal document converter.
使用通用文档转换器pandoc实现不同格式文档之间的转换。

Prerequisites

前置要求

bash
undefined
bash
undefined

Check if pandoc is installed

检查pandoc是否已安装

pandoc --version
pandoc --version

Install via Homebrew if needed

如需安装,可通过Homebrew进行

brew install pandoc
undefined
brew install pandoc
undefined

Common Conversions

常见转换场景

Markdown to Word (.docx)

Markdown转Word(.docx)

bash
undefined
bash
undefined

Basic conversion

基础转换

pandoc input.md -o output.docx
pandoc input.md -o output.docx

With table of contents

带目录

pandoc input.md --toc -o output.docx
pandoc input.md --toc -o output.docx

With custom reference doc (for styling)

使用自定义参考文档(用于样式设置)

pandoc input.md --reference-doc=template.docx -o output.docx
pandoc input.md --reference-doc=template.docx -o output.docx

Standalone with metadata

包含元数据的独立文档

pandoc input.md -s --metadata title="Document Title" -o output.docx
undefined
pandoc input.md -s --metadata title="Document Title" -o output.docx
undefined

Markdown to PDF

Markdown转PDF

bash
undefined
bash
undefined

Requires LaTeX - install one of:

需要LaTeX环境 - 可安装以下任意一种:

brew install --cask basictex # Smaller (~100MB)

brew install --cask basictex # 轻量版(约100MB)

brew install --cask mactex-no-gui # Full (~4GB)

brew install --cask mactex-no-gui # 完整版(约4GB)

After install: eval "$(/usr/libexec/path_helper)" or new terminal

安装完成后:执行 eval "$(/usr/libexec/path_helper)" 或重启终端

Basic conversion (uses pdflatex)

基础转换(使用pdflatex)

pandoc input.md -o output.pdf
pandoc input.md -o output.pdf

With table of contents and custom margins

带目录和自定义边距

pandoc input.md -s --toc --toc-depth=2 -V geometry:margin=1in -o output.pdf
pandoc input.md -s --toc --toc-depth=2 -V geometry:margin=1in -o output.pdf

Using xelatex (better Unicode support - box drawings, arrows, etc.)

使用xelatex(更好的Unicode支持 - 如方框绘制、箭头等)

export PATH="/Library/TeX/texbin:$PATH" pandoc input.md --pdf-engine=xelatex -V geometry:margin=1in -o output.pdf

**PDF Engine Selection:**

| Engine | Use When |
|--------|----------|
| `pdflatex` | Default, ASCII content only |
| `xelatex` | Unicode characters (arrows, box-drawing, emojis) |
| `lualatex` | Complex typography, OpenType fonts |
export PATH="/Library/TeX/texbin:$PATH" pandoc input.md --pdf-engine=xelatex -V geometry:margin=1in -o output.pdf

**PDF引擎选择:**

| 引擎 | 适用场景 |
|--------|----------|
| `pdflatex` | 默认选项,仅适用于ASCII内容 |
| `xelatex` | 处理Unicode字符(箭头、方框绘制、表情符号等) |
| `lualatex` | 复杂排版、OpenType字体 |

Markdown to HTML

Markdown转HTML

Critical: Always use
-f gfm
(GitHub Flavored Markdown) for proper line break and list handling. Standard markdown collapses consecutive lines into paragraphs.
bash
undefined
重要提示: 为了正确处理换行和列表,始终使用
-f gfm
(GitHub风格Markdown)。标准Markdown会将连续行合并为段落。
bash
undefined

RECOMMENDED: GitHub Flavored Markdown with full styling

推荐方案:带完整样式的GitHub风格Markdown

pandoc -f gfm -s -H <(cat << 'STYLE'
<style> body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,sans-serif;max-width:800px;margin:0 auto;padding:2em;line-height:1.6} h1{border-bottom:2px solid #333;padding-bottom:0.3em} h2{border-bottom:1px solid #ccc;padding-bottom:0.2em;margin-top:1.5em} h3{margin-top:1.2em} ul,ol{margin:0.5em 0 0.5em 1.5em;padding-left:1em} ul{list-style-type:disc}ol{list-style-type:decimal} li{margin:0.3em 0}ul ul,ol ul{list-style-type:circle;margin:0.2em 0 0.2em 1em} table{border-collapse:collapse;width:100%;margin:1em 0} th,td{border:1px solid #ddd;padding:8px;text-align:left} th{background-color:#f5f5f5} code{background-color:#f4f4f4;padding:2px 6px;border-radius:3px} pre{background-color:#f4f4f4;padding:1em;overflow-x:auto;border-radius:5px} blockquote{border-left:4px solid #ddd;margin:1em 0;padding-left:1em;color:#666} </style>
STYLE ) input.md -o output.html
pandoc -f gfm -s -H <(cat << 'STYLE'
<style> body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,sans-serif;max-width:800px;margin:0 auto;padding:2em;line-height:1.6} h1{border-bottom:2px solid #333;padding-bottom:0.3em} h2{border-bottom:1px solid #ccc;padding-bottom:0.2em;margin-top:1.5em} h3{margin-top:1.2em} ul,ol{margin:0.5em 0 0.5em 1.5em;padding-left:1em} ul{list-style-type:disc}ol{list-style-type:decimal} li{margin:0.3em 0}ul ul,ol ul{list-style-type:circle;margin:0.2em 0 0.2em 1em} table{border-collapse:collapse;width:100%;margin:1em 0} th,td{border:1px solid #ddd;padding:8px;text-align:left} th{background-color:#f5f5f5} code{background-color:#f4f4f4;padding:2px 6px;border-radius:3px} pre{background-color:#f4f4f4;padding:1em;overflow-x:auto;border-radius:5px} blockquote{border-left:4px solid #ddd;margin:1em 0;padding-left:1em;color:#666} </style>
STYLE ) input.md -o output.html

Quick version (minimal styling)

快速版本(极简样式)

pandoc -f gfm -s input.md -o output.html
pandoc -f gfm -s input.md -o output.html

With hard line breaks (newlines become <br>)

强制硬换行(换行符转为<br>

pandoc -f markdown+hard_line_breaks -s input.md -o output.html
pandoc -f markdown+hard_line_breaks -s input.md -o output.html

Self-contained (embeds images/CSS)

自包含版本(嵌入图片/CSS)

pandoc -f gfm -s --embed-resources --standalone input.md -o output.html

**Format Options:**

| Option | Use When |
|--------|----------|
| `-f gfm` | Default choice - handles lists, line breaks, tables correctly |
| `-f markdown+hard_line_breaks` | Force all newlines to become `<br>` |
| `-f commonmark` | Strict CommonMark compliance |
pandoc -f gfm -s --embed-resources --standalone input.md -o output.html

**格式选项:**

| 选项 | 适用场景 |
|--------|----------|
| `-f gfm` | 默认选择 - 正确处理列表、换行、表格 |
| `-f markdown+hard_line_breaks` | 强制所有换行符转为`<br>` |
| `-f commonmark` | 严格遵循CommonMark标准 |

HTML for Print-to-PDF (No LaTeX Required)

HTML转PDF(无需LaTeX)

When LaTeX isn't available, create styled HTML and print to PDF from browser:
bash
undefined
当没有LaTeX环境时,可创建带样式的HTML,再通过浏览器打印为PDF:
bash
undefined

Create inline CSS file

创建内联CSS文件

cat > /tmp/print-style.css << 'EOF' body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif; max-width: 800px; margin: 0 auto; padding: 2em; line-height: 1.6; } h1 { border-bottom: 2px solid #333; padding-bottom: 0.3em; } h2 { border-bottom: 1px solid #ccc; padding-bottom: 0.2em; margin-top: 1.5em; } h3 { margin-top: 1.2em; } /* Lists - critical for proper rendering / ul, ol { margin: 0.5em 0 0.5em 1.5em; padding-left: 1em; } ul { list-style-type: disc; } ol { list-style-type: decimal; } li { margin: 0.3em 0; } ul ul, ol ul { list-style-type: circle; margin: 0.2em 0 0.2em 1em; } ul ol, ol ol { margin: 0.2em 0 0.2em 1em; } ul ul ul, ol ul ul { list-style-type: square; } / Tables / table { border-collapse: collapse; width: 100%; margin: 1em 0; } th, td { border: 1px solid #ddd; padding: 8px; text-align: left; } th { background-color: #f5f5f5; } / Code / code { background-color: #f4f4f4; padding: 2px 6px; border-radius: 3px; } pre { background-color: #f4f4f4; padding: 1em; overflow-x: auto; border-radius: 5px; } / Other */ blockquote { border-left: 4px solid #ddd; margin: 1em 0; padding-left: 1em; color: #666; } @media print { body { max-width: none; } } EOF
cat > /tmp/print-style.css << 'EOF' body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif; max-width: 800px; margin: 0 auto; padding: 2em; line-height: 1.6; } h1 { border-bottom: 2px solid #333; padding-bottom: 0.3em; } h2 { border-bottom: 1px solid #ccc; padding-bottom: 0.2em; margin-top: 1.5em; } h3 { margin-top: 1.2em; } /* 列表 - 确保正确渲染的关键 / ul, ol { margin: 0.5em 0 0.5em 1.5em; padding-left: 1em; } ul { list-style-type: disc; } ol { list-style-type: decimal; } li { margin: 0.3em 0; } ul ul, ol ul { list-style-type: circle; margin: 0.2em 0 0.2em 1em; } ul ol, ol ol { margin: 0.2em 0 0.2em 1em; } ul ul ul, ol ul ul { list-style-type: square; } / 表格 / table { border-collapse: collapse; width: 100%; margin: 1em 0; } th, td { border: 1px solid #ddd; padding: 8px; text-align: left; } th { background-color: #f5f5f5; } / 代码 / code { background-color: #f4f4f4; padding: 2px 6px; border-radius: 3px; } pre { background-color: #f4f4f4; padding: 1em; overflow-x: auto; border-radius: 5px; } / 其他 */ blockquote { border-left: 4px solid #ddd; margin: 1em 0; padding-left: 1em; color: #666; } @media print { body { max-width: none; } } EOF

Convert with embedded styles (ALWAYS use -f gfm)

转换并嵌入样式(始终使用-f gfm)

pandoc -f gfm input.md -s --toc --toc-depth=2 -c /tmp/print-style.css --embed-resources --standalone -o output.html
pandoc -f gfm input.md -s --toc --toc-depth=2 -c /tmp/print-style.css --embed-resources --standalone -o output.html

Open and print to PDF (Cmd+P > Save as PDF)

打开并打印为PDF(Cmd+P > 存储为PDF)

open output.html
undefined
open output.html
undefined

Word to Markdown

Word转Markdown

bash
undefined
bash
undefined

Extract markdown from docx

从docx提取Markdown

pandoc input.docx -o output.md
pandoc input.docx -o output.md

With ATX-style headers

生成ATX风格标题

pandoc input.docx --atx-headers -o output.md
undefined
pandoc input.docx --atx-headers -o output.md
undefined

Useful Options

实用选项

OptionDescription
-s
/
--standalone
Produce standalone document with header/footer
--toc
Generate table of contents
--toc-depth=N
TOC depth (default: 3)
-V key=value
Set template variable
--metadata key=value
Set metadata field
--reference-doc=FILE
Use FILE for styling (docx/odt)
--template=FILE
Use custom template
--highlight-style=STYLE
Syntax highlighting (pygments, tango, etc.)
--number-sections
Number section headings
-f FORMAT
Input format (if not auto-detected)
-t FORMAT
Output format (if not auto-detected)
选项说明
-s
/
--standalone
生成带页眉/页脚的独立文档
--toc
生成目录
--toc-depth=N
目录层级(默认:3)
-V key=value
设置模板变量
--metadata key=value
设置元数据字段
--reference-doc=FILE
使用指定文件作为样式参考(docx/odt)
--template=FILE
使用自定义模板
--highlight-style=STYLE
语法高亮样式(pygments、tango等)
--number-sections
为章节标题编号
-f FORMAT
指定输入格式(若未自动检测)
-t FORMAT
指定输出格式(若未自动检测)

Format Identifiers

格式标识符

FormatIdentifier
Markdown
markdown
,
gfm
(GitHub),
commonmark
Word
docx
PDF
pdf
HTML
html
,
html5
LaTeX
latex
RST
rst
EPUB
epub
ODT
odt
RTF
rtf
格式标识符
Markdown
markdown
,
gfm
(GitHub风格),
commonmark
Word
docx
PDF
pdf
HTML
html
,
html5
LaTeX
latex
RST
rst
EPUB
epub
ODT
odt
RTF
rtf

Google Docs Workflow

Google Docs工作流

To get markdown into Google Docs with formatting preserved:
bash
undefined
如需将Markdown导入Google Docs并保留格式:
bash
undefined

1. Convert to docx

1. 转换为docx

pandoc document.md -o document.docx
pandoc document.md -o document.docx

2. Upload to Google Drive

2. 上传至Google Drive

3. Right-click > Open with > Google Docs

3. 右键 > 打开方式 > Google Docs


Google Docs imports .docx files well and preserves:
- Headings
- Bold/italic
- Lists (bulleted and numbered)
- Tables
- Links
- Code blocks (as monospace)

Google Docs对.docx文件的导入效果良好,可保留以下内容:
- 标题
- 粗体/斜体
- 列表(项目符号和编号)
- 表格
- 链接
- 代码块(以等宽字体显示)

PSI Document Conversion

PSI文档转换

For PSI documents with tables and complex formatting:
bash
undefined
针对包含表格和复杂格式的PSI文档:
bash
undefined

Convert PSI markdown to Word

将PSI Markdown转换为Word

pandoc PSI-document.md
--standalone
--toc
--toc-depth=2
-o PSI-document.docx
pandoc PSI-document.md
--standalone
--toc
--toc-depth=2
-o PSI-document.docx

Open for review

打开进行审阅

open PSI-document.docx
undefined
open PSI-document.docx
undefined

Troubleshooting

故障排除

Lists/Lines Running Together (HTML)

列表/内容行合并(HTML)

If bullet points, numbered lists, or consecutive lines merge into one paragraph:
Cause: Standard markdown treats consecutive lines as one paragraph. Lists need blank lines before them.
Fix: Use GitHub Flavored Markdown (
-f gfm
):
bash
undefined
若项目符号、编号列表或连续行合并为单个段落:
原因: 标准Markdown将连续行视为单个段落,列表前需要空行。
解决方法: 使用GitHub风格Markdown(
-f gfm
):
bash
undefined

Always use -f gfm for reliable formatting

始终使用-f gfm以确保格式可靠

pandoc -f gfm -s input.md -o output.html
pandoc -f gfm -s input.md -o output.html

For documents where newlines should be <br> tags

如需将换行符转为<br>标签

pandoc -f markdown+hard_line_breaks -s input.md -o output.html

**Why this happens:**
- Standard markdown: `Line 1\nLine 2` → `<p>Line 1 Line 2</p>` (merged)
- GFM: Better list detection, handles edge cases
- `+hard_line_breaks`: `Line 1\nLine 2` → `Line 1<br>Line 2` (preserved)
pandoc -f markdown+hard_line_breaks -s input.md -o output.html

**原因说明:**
- 标准Markdown:`Line 1\nLine 2` → `<p>Line 1 Line 2</p>`(合并)
- GFM:更好的列表检测,处理边缘情况
- `+hard_line_breaks`:`Line 1\nLine 2` → `Line 1<br>Line 2`(保留换行)

Tables Not Rendering

表格未渲染

Pandoc requires proper markdown table syntax:
markdown
| Header 1 | Header 2 |
|----------|----------|
| Cell 1   | Cell 2   |
Pandoc要求使用正确的Markdown表格语法:
markdown
| 表头1 | 表头2 |
|----------|----------|
| 单元格1   | 单元格2   |

Code Blocks Missing Highlighting

代码块缺少高亮

Use fenced code blocks with language identifier:
markdown
```python
def example():
    pass
undefined
使用带语言标识的围栏代码块:
markdown
```python
def example():
    pass
undefined

PDF Generation Fails

PDF生成失败

"pdflatex not found" - Install LaTeX:
bash
undefined
"pdflatex not found" - 安装LaTeX:
bash
undefined

Smaller option (~100MB)

轻量选项(约100MB)

brew install --cask basictex
brew install --cask basictex

Full option (~4GB)

完整选项(约4GB)

brew install --cask mactex-no-gui
brew install --cask mactex-no-gui

After install, update PATH

安装完成后,更新PATH

eval "$(/usr/libexec/path_helper)"
eval "$(/usr/libexec/path_helper)"

Or open a new terminal

或重启终端


**Unicode character errors (box-drawing, arrows, emojis):**
```bash

**Unicode字符错误(方框绘制、箭头、表情符号):**
```bash

Use xelatex instead of pdflatex

使用xelatex替代pdflatex

export PATH="/Library/TeX/texbin:$PATH" pandoc input.md --pdf-engine=xelatex -o output.pdf

**No LaTeX available** - Use HTML print-to-PDF workflow:
```bash
pandoc input.md -s --toc -o output.html
open output.html
export PATH="/Library/TeX/texbin:$PATH" pandoc input.md --pdf-engine=xelatex -o output.pdf

**无LaTeX环境** - 使用HTML转PDF工作流:
```bash
pandoc input.md -s --toc -o output.html
open output.html

Then Cmd+P > Save as PDF

然后按Cmd+P > 存储为PDF

undefined
undefined

Self-Test

自测

bash
undefined
bash
undefined

Verify pandoc installation

验证pandoc安装

pandoc --version | head -1
pandoc --version | head -1

Test basic conversion

测试基础转换

echo "# Test\n\nHello world" | pandoc -f markdown -t html
undefined
echo "# Test\n\nHello world" | pandoc -f markdown -t html
undefined