pandoc
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChinesePandoc Document Conversion Skill
Pandoc文档转换Skill
Convert documents between formats using pandoc, the universal document converter.
使用通用文档转换器pandoc实现不同格式文档之间的转换。
Prerequisites
前置要求
bash
undefinedbash
undefinedCheck if pandoc is installed
检查pandoc是否已安装
pandoc --version
pandoc --version
Install via Homebrew if needed
如需安装,可通过Homebrew进行
brew install pandoc
undefinedbrew install pandoc
undefinedCommon Conversions
常见转换场景
Markdown to Word (.docx)
Markdown转Word(.docx)
bash
undefinedbash
undefinedBasic conversion
基础转换
pandoc input.md -o output.docx
pandoc input.md -o output.docx
With table of contents
带目录
pandoc input.md --toc -o output.docx
pandoc input.md --toc -o output.docx
With custom reference doc (for styling)
使用自定义参考文档(用于样式设置)
pandoc input.md --reference-doc=template.docx -o output.docx
pandoc input.md --reference-doc=template.docx -o output.docx
Standalone with metadata
包含元数据的独立文档
pandoc input.md -s --metadata title="Document Title" -o output.docx
undefinedpandoc input.md -s --metadata title="Document Title" -o output.docx
undefinedMarkdown to PDF
Markdown转PDF
bash
undefinedbash
undefinedRequires LaTeX - install one of:
需要LaTeX环境 - 可安装以下任意一种:
brew install --cask basictex # Smaller (~100MB)
brew install --cask basictex # 轻量版(约100MB)
brew install --cask mactex-no-gui # Full (~4GB)
brew install --cask mactex-no-gui # 完整版(约4GB)
After install: eval "$(/usr/libexec/path_helper)" or new terminal
安装完成后:执行 eval "$(/usr/libexec/path_helper)" 或重启终端
Basic conversion (uses pdflatex)
基础转换(使用pdflatex)
pandoc input.md -o output.pdf
pandoc input.md -o output.pdf
With table of contents and custom margins
带目录和自定义边距
pandoc input.md -s --toc --toc-depth=2 -V geometry:margin=1in -o output.pdf
pandoc input.md -s --toc --toc-depth=2 -V geometry:margin=1in -o output.pdf
Using xelatex (better Unicode support - box drawings, arrows, etc.)
使用xelatex(更好的Unicode支持 - 如方框绘制、箭头等)
export PATH="/Library/TeX/texbin:$PATH"
pandoc input.md --pdf-engine=xelatex -V geometry:margin=1in -o output.pdf
**PDF Engine Selection:**
| Engine | Use When |
|--------|----------|
| `pdflatex` | Default, ASCII content only |
| `xelatex` | Unicode characters (arrows, box-drawing, emojis) |
| `lualatex` | Complex typography, OpenType fonts |export PATH="/Library/TeX/texbin:$PATH"
pandoc input.md --pdf-engine=xelatex -V geometry:margin=1in -o output.pdf
**PDF引擎选择:**
| 引擎 | 适用场景 |
|--------|----------|
| `pdflatex` | 默认选项,仅适用于ASCII内容 |
| `xelatex` | 处理Unicode字符(箭头、方框绘制、表情符号等) |
| `lualatex` | 复杂排版、OpenType字体 |Markdown to HTML
Markdown转HTML
Critical: Always use (GitHub Flavored Markdown) for proper line break and list handling. Standard markdown collapses consecutive lines into paragraphs.
-f gfmbash
undefined重要提示: 为了正确处理换行和列表,始终使用 (GitHub风格Markdown)。标准Markdown会将连续行合并为段落。
-f gfmbash
undefinedRECOMMENDED: GitHub Flavored Markdown with full styling
推荐方案:带完整样式的GitHub风格Markdown
pandoc -f gfm -s -H <(cat << 'STYLE'
<style>
body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,sans-serif;max-width:800px;margin:0 auto;padding:2em;line-height:1.6}
h1{border-bottom:2px solid #333;padding-bottom:0.3em}
h2{border-bottom:1px solid #ccc;padding-bottom:0.2em;margin-top:1.5em}
h3{margin-top:1.2em}
ul,ol{margin:0.5em 0 0.5em 1.5em;padding-left:1em}
ul{list-style-type:disc}ol{list-style-type:decimal}
li{margin:0.3em 0}ul ul,ol ul{list-style-type:circle;margin:0.2em 0 0.2em 1em}
table{border-collapse:collapse;width:100%;margin:1em 0}
th,td{border:1px solid #ddd;padding:8px;text-align:left}
th{background-color:#f5f5f5}
code{background-color:#f4f4f4;padding:2px 6px;border-radius:3px}
pre{background-color:#f4f4f4;padding:1em;overflow-x:auto;border-radius:5px}
blockquote{border-left:4px solid #ddd;margin:1em 0;padding-left:1em;color:#666}
</style>
STYLE
) input.md -o output.html
pandoc -f gfm -s -H <(cat << 'STYLE'
<style>
body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,sans-serif;max-width:800px;margin:0 auto;padding:2em;line-height:1.6}
h1{border-bottom:2px solid #333;padding-bottom:0.3em}
h2{border-bottom:1px solid #ccc;padding-bottom:0.2em;margin-top:1.5em}
h3{margin-top:1.2em}
ul,ol{margin:0.5em 0 0.5em 1.5em;padding-left:1em}
ul{list-style-type:disc}ol{list-style-type:decimal}
li{margin:0.3em 0}ul ul,ol ul{list-style-type:circle;margin:0.2em 0 0.2em 1em}
table{border-collapse:collapse;width:100%;margin:1em 0}
th,td{border:1px solid #ddd;padding:8px;text-align:left}
th{background-color:#f5f5f5}
code{background-color:#f4f4f4;padding:2px 6px;border-radius:3px}
pre{background-color:#f4f4f4;padding:1em;overflow-x:auto;border-radius:5px}
blockquote{border-left:4px solid #ddd;margin:1em 0;padding-left:1em;color:#666}
</style>
STYLE
) input.md -o output.html
Quick version (minimal styling)
快速版本(极简样式)
pandoc -f gfm -s input.md -o output.html
pandoc -f gfm -s input.md -o output.html
With hard line breaks (newlines become <br>)
强制硬换行(换行符转为<br>)
pandoc -f markdown+hard_line_breaks -s input.md -o output.html
pandoc -f markdown+hard_line_breaks -s input.md -o output.html
Self-contained (embeds images/CSS)
自包含版本(嵌入图片/CSS)
pandoc -f gfm -s --embed-resources --standalone input.md -o output.html
**Format Options:**
| Option | Use When |
|--------|----------|
| `-f gfm` | Default choice - handles lists, line breaks, tables correctly |
| `-f markdown+hard_line_breaks` | Force all newlines to become `<br>` |
| `-f commonmark` | Strict CommonMark compliance |pandoc -f gfm -s --embed-resources --standalone input.md -o output.html
**格式选项:**
| 选项 | 适用场景 |
|--------|----------|
| `-f gfm` | 默认选择 - 正确处理列表、换行、表格 |
| `-f markdown+hard_line_breaks` | 强制所有换行符转为`<br>` |
| `-f commonmark` | 严格遵循CommonMark标准 |HTML for Print-to-PDF (No LaTeX Required)
HTML转PDF(无需LaTeX)
When LaTeX isn't available, create styled HTML and print to PDF from browser:
bash
undefined当没有LaTeX环境时,可创建带样式的HTML,再通过浏览器打印为PDF:
bash
undefinedCreate inline CSS file
创建内联CSS文件
cat > /tmp/print-style.css << 'EOF'
body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
max-width: 800px; margin: 0 auto; padding: 2em; line-height: 1.6; }
h1 { border-bottom: 2px solid #333; padding-bottom: 0.3em; }
h2 { border-bottom: 1px solid #ccc; padding-bottom: 0.2em; margin-top: 1.5em; }
h3 { margin-top: 1.2em; }
/* Lists - critical for proper rendering /
ul, ol { margin: 0.5em 0 0.5em 1.5em; padding-left: 1em; }
ul { list-style-type: disc; }
ol { list-style-type: decimal; }
li { margin: 0.3em 0; }
ul ul, ol ul { list-style-type: circle; margin: 0.2em 0 0.2em 1em; }
ul ol, ol ol { margin: 0.2em 0 0.2em 1em; }
ul ul ul, ol ul ul { list-style-type: square; }
/ Tables /
table { border-collapse: collapse; width: 100%; margin: 1em 0; }
th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }
th { background-color: #f5f5f5; }
/ Code /
code { background-color: #f4f4f4; padding: 2px 6px; border-radius: 3px; }
pre { background-color: #f4f4f4; padding: 1em; overflow-x: auto; border-radius: 5px; }
/ Other */
blockquote { border-left: 4px solid #ddd; margin: 1em 0; padding-left: 1em; color: #666; }
@media print { body { max-width: none; } }
EOF
cat > /tmp/print-style.css << 'EOF'
body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
max-width: 800px; margin: 0 auto; padding: 2em; line-height: 1.6; }
h1 { border-bottom: 2px solid #333; padding-bottom: 0.3em; }
h2 { border-bottom: 1px solid #ccc; padding-bottom: 0.2em; margin-top: 1.5em; }
h3 { margin-top: 1.2em; }
/* 列表 - 确保正确渲染的关键 /
ul, ol { margin: 0.5em 0 0.5em 1.5em; padding-left: 1em; }
ul { list-style-type: disc; }
ol { list-style-type: decimal; }
li { margin: 0.3em 0; }
ul ul, ol ul { list-style-type: circle; margin: 0.2em 0 0.2em 1em; }
ul ol, ol ol { margin: 0.2em 0 0.2em 1em; }
ul ul ul, ol ul ul { list-style-type: square; }
/ 表格 /
table { border-collapse: collapse; width: 100%; margin: 1em 0; }
th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }
th { background-color: #f5f5f5; }
/ 代码 /
code { background-color: #f4f4f4; padding: 2px 6px; border-radius: 3px; }
pre { background-color: #f4f4f4; padding: 1em; overflow-x: auto; border-radius: 5px; }
/ 其他 */
blockquote { border-left: 4px solid #ddd; margin: 1em 0; padding-left: 1em; color: #666; }
@media print { body { max-width: none; } }
EOF
Convert with embedded styles (ALWAYS use -f gfm)
转换并嵌入样式(始终使用-f gfm)
pandoc -f gfm input.md -s --toc --toc-depth=2 -c /tmp/print-style.css --embed-resources --standalone -o output.html
pandoc -f gfm input.md -s --toc --toc-depth=2 -c /tmp/print-style.css --embed-resources --standalone -o output.html
Open and print to PDF (Cmd+P > Save as PDF)
打开并打印为PDF(Cmd+P > 存储为PDF)
open output.html
undefinedopen output.html
undefinedWord to Markdown
Word转Markdown
bash
undefinedbash
undefinedExtract markdown from docx
从docx提取Markdown
pandoc input.docx -o output.md
pandoc input.docx -o output.md
With ATX-style headers
生成ATX风格标题
pandoc input.docx --atx-headers -o output.md
undefinedpandoc input.docx --atx-headers -o output.md
undefinedUseful Options
实用选项
| Option | Description |
|---|---|
| Produce standalone document with header/footer |
| Generate table of contents |
| TOC depth (default: 3) |
| Set template variable |
| Set metadata field |
| Use FILE for styling (docx/odt) |
| Use custom template |
| Syntax highlighting (pygments, tango, etc.) |
| Number section headings |
| Input format (if not auto-detected) |
| Output format (if not auto-detected) |
| 选项 | 说明 |
|---|---|
| 生成带页眉/页脚的独立文档 |
| 生成目录 |
| 目录层级(默认:3) |
| 设置模板变量 |
| 设置元数据字段 |
| 使用指定文件作为样式参考(docx/odt) |
| 使用自定义模板 |
| 语法高亮样式(pygments、tango等) |
| 为章节标题编号 |
| 指定输入格式(若未自动检测) |
| 指定输出格式(若未自动检测) |
Format Identifiers
格式标识符
| Format | Identifier |
|---|---|
| Markdown | |
| Word | |
| |
| HTML | |
| LaTeX | |
| RST | |
| EPUB | |
| ODT | |
| RTF | |
| 格式 | 标识符 |
|---|---|
| Markdown | |
| Word | |
| |
| HTML | |
| LaTeX | |
| RST | |
| EPUB | |
| ODT | |
| RTF | |
Google Docs Workflow
Google Docs工作流
To get markdown into Google Docs with formatting preserved:
bash
undefined如需将Markdown导入Google Docs并保留格式:
bash
undefined1. Convert to docx
1. 转换为docx
pandoc document.md -o document.docx
pandoc document.md -o document.docx
2. Upload to Google Drive
2. 上传至Google Drive
3. Right-click > Open with > Google Docs
3. 右键 > 打开方式 > Google Docs
Google Docs imports .docx files well and preserves:
- Headings
- Bold/italic
- Lists (bulleted and numbered)
- Tables
- Links
- Code blocks (as monospace)
Google Docs对.docx文件的导入效果良好,可保留以下内容:
- 标题
- 粗体/斜体
- 列表(项目符号和编号)
- 表格
- 链接
- 代码块(以等宽字体显示)PSI Document Conversion
PSI文档转换
For PSI documents with tables and complex formatting:
bash
undefined针对包含表格和复杂格式的PSI文档:
bash
undefinedConvert PSI markdown to Word
将PSI Markdown转换为Word
pandoc PSI-document.md
--standalone
--toc
--toc-depth=2
-o PSI-document.docx
--standalone
--toc
--toc-depth=2
-o PSI-document.docx
pandoc PSI-document.md
--standalone
--toc
--toc-depth=2
-o PSI-document.docx
--standalone
--toc
--toc-depth=2
-o PSI-document.docx
Open for review
打开进行审阅
open PSI-document.docx
undefinedopen PSI-document.docx
undefinedTroubleshooting
故障排除
Lists/Lines Running Together (HTML)
列表/内容行合并(HTML)
If bullet points, numbered lists, or consecutive lines merge into one paragraph:
Cause: Standard markdown treats consecutive lines as one paragraph. Lists need blank lines before them.
Fix: Use GitHub Flavored Markdown ():
-f gfmbash
undefined若项目符号、编号列表或连续行合并为单个段落:
原因: 标准Markdown将连续行视为单个段落,列表前需要空行。
解决方法: 使用GitHub风格Markdown():
-f gfmbash
undefinedAlways use -f gfm for reliable formatting
始终使用-f gfm以确保格式可靠
pandoc -f gfm -s input.md -o output.html
pandoc -f gfm -s input.md -o output.html
For documents where newlines should be <br> tags
如需将换行符转为<br>标签
pandoc -f markdown+hard_line_breaks -s input.md -o output.html
**Why this happens:**
- Standard markdown: `Line 1\nLine 2` → `<p>Line 1 Line 2</p>` (merged)
- GFM: Better list detection, handles edge cases
- `+hard_line_breaks`: `Line 1\nLine 2` → `Line 1<br>Line 2` (preserved)pandoc -f markdown+hard_line_breaks -s input.md -o output.html
**原因说明:**
- 标准Markdown:`Line 1\nLine 2` → `<p>Line 1 Line 2</p>`(合并)
- GFM:更好的列表检测,处理边缘情况
- `+hard_line_breaks`:`Line 1\nLine 2` → `Line 1<br>Line 2`(保留换行)Tables Not Rendering
表格未渲染
Pandoc requires proper markdown table syntax:
markdown
| Header 1 | Header 2 |
|----------|----------|
| Cell 1 | Cell 2 |Pandoc要求使用正确的Markdown表格语法:
markdown
| 表头1 | 表头2 |
|----------|----------|
| 单元格1 | 单元格2 |Code Blocks Missing Highlighting
代码块缺少高亮
Use fenced code blocks with language identifier:
markdown
```python
def example():
passundefined使用带语言标识的围栏代码块:
markdown
```python
def example():
passundefinedPDF Generation Fails
PDF生成失败
"pdflatex not found" - Install LaTeX:
bash
undefined"pdflatex not found" - 安装LaTeX:
bash
undefinedSmaller option (~100MB)
轻量选项(约100MB)
brew install --cask basictex
brew install --cask basictex
Full option (~4GB)
完整选项(约4GB)
brew install --cask mactex-no-gui
brew install --cask mactex-no-gui
After install, update PATH
安装完成后,更新PATH
eval "$(/usr/libexec/path_helper)"
eval "$(/usr/libexec/path_helper)"
Or open a new terminal
或重启终端
**Unicode character errors (box-drawing, arrows, emojis):**
```bash
**Unicode字符错误(方框绘制、箭头、表情符号):**
```bashUse xelatex instead of pdflatex
使用xelatex替代pdflatex
export PATH="/Library/TeX/texbin:$PATH"
pandoc input.md --pdf-engine=xelatex -o output.pdf
**No LaTeX available** - Use HTML print-to-PDF workflow:
```bash
pandoc input.md -s --toc -o output.html
open output.htmlexport PATH="/Library/TeX/texbin:$PATH"
pandoc input.md --pdf-engine=xelatex -o output.pdf
**无LaTeX环境** - 使用HTML转PDF工作流:
```bash
pandoc input.md -s --toc -o output.html
open output.htmlThen Cmd+P > Save as PDF
然后按Cmd+P > 存储为PDF
undefinedundefinedSelf-Test
自测
bash
undefinedbash
undefinedVerify pandoc installation
验证pandoc安装
pandoc --version | head -1
pandoc --version | head -1
Test basic conversion
测试基础转换
echo "# Test\n\nHello world" | pandoc -f markdown -t html
undefinedecho "# Test\n\nHello world" | pandoc -f markdown -t html
undefined