course-ebook-publishing

Course Ebook Publishing

Schema authority: this skill reads the live `window.COURSE` object, whose shape is defined in `_shared/domain-primitives.md`. Quiz, pre-test, and post-test items are filtered out of the ebook (per the §10 quiz item rules and the ebook content policy).
Filename convention (English-first): outputs land in `dist/{name}.pdf` / `dist/{name}.docx`. Source markdown is composed under `dist/master.md`.

This skill turns a finished teaching website into a book: PDF (primary) and optionally DOCX (for editorial review or further authoring). It is a post-site step: it consumes `window.COURSE` from the live site and never re-authors content.

When to Invoke

Site is feature-complete (Stages 1–5 done) and content is stable.
Stakeholders ask for a printable hand-out, an archivable record, or a deliverable for non-web channels.
Sometimes invoked after `course-corporate-edition` to produce a client deliverable bundle.

Do NOT invoke when the site is mid-development. The ebook pipeline reads `window.COURSE`; if the site changes daily, the ebook drifts daily.

Architecture: Single Source → Two Outputs

```
window.COURSE ─┐
               ├─→ compose-ebook.mjs ─→ master.md ─→ pandoc ─→ ebook.docx
asset folders ─┤                                 └─→ Playwright page.pdf() ─→ ebook.pdf
materials/    ─┘
```

Why a single composed markdown intermediate: keeps PDF and DOCX content identical. Pandoc handles markdown → DOCX cleanly; Playwright renders the same markdown via an HTML wrapper → PDF. Two outputs, one source of truth.

File Layout (Standard)

```
scripts/
├── build-ebook.mjs              ← CLI entry point (--md-only, --no-docx, --no-pdf, --output, --keep-html)
├── render-pdf.mjs               ← markdown → HTML → PDF via Playwright
└── lib/
    ├── compose-ebook.mjs        ← loadSources + composeXxx functions (cover, TOC, chapters, appendix)
    └── reference.docx           ← pandoc style template (generated by gen-reference-docx.mjs)

style-ebook.css                   ← @page rules, cover, page-number footer, print-only styles
dist/                             ← output: master.md, ebook.pdf, ebook.docx
```

Loading Source Data: `window.COURSE` from index.html

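When the site uses an external `course-data.js`, just import it. When it uses an inlined COURSE (corporate edition), extract it with a regex plus a vm sandbox:

```js
import fs from 'node:fs/promises';
import vm from 'node:vm';

// Extract the inlined `window.COURSE = {...};` script from index.html and
// evaluate it against a throwaway global object.
async function loadCourseFromIndexHtml(htmlPath) {
  const html = await fs.readFile(htmlPath, 'utf8');
  // Matches only a bare <script> tag (no attributes) whose body is the assignment.
  const match = html.match(/<script>\s*window\.COURSE\s*=\s*\{[\s\S]*?\};\s*<\/script>/);
  if (!match) throw new Error('window.COURSE block not found');
  const code = match[0].replace(/<\/?script>/g, '');
  const sandbox = { window: {}, console };
  vm.createContext(sandbox);
  vm.runInContext(code, sandbox);
  return sandbox.window.COURSE;
}
```

Why a vm sandbox and not `eval`: vm runs the script against its own global object, so an accidental side effect cannot leak into the build process. Note that `node:vm` is not a security boundary; only run it on HTML you control.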

PDF Rendering: Playwright `page.pdf()` (not Chrome CLI)

Original implementation: `pandoc → HTML → msedge --headless --print-to-pdf`. Switch to Playwright: the Chrome CLI doesn't support `footerTemplate`, so it cannot print page numbers.

```js
import path from 'node:path';
import { pathToFileURL } from 'node:url';
import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();
// pathToFileURL builds a valid file:// URL on every platform, including Windows.
await page.goto(pathToFileURL(path.resolve(htmlPath)).href, { waitUntil: 'networkidle' });
await page.pdf({
  path: outputPath,
  format: 'A4',
  printBackground: true,
  displayHeaderFooter: true,
  headerTemplate: '<div></div>',  // empty header; only the footer is used
  footerTemplate: `
    <div style="font-size:9pt; width:100%; text-align:center; color:#666;">
      <span class="pageNumber"></span> / <span class="totalPages"></span>
    </div>`,
  margin: { top: '20mm', right: '20mm', bottom: '20mm', left: '20mm' },
});
await browser.close();
```
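
The `page.goto` call above needs a complete HTML file to load. A minimal sketch of the wrapper step, assuming the markdown has already been converted to an HTML fragment by whatever converter `render-pdf.mjs` uses. The names `wrapHtml` and `escapeAttr` and the option names are hypothetical, not taken from the source scripts:

```javascript
// Hypothetical helper: escape a value for use in an attribute or <title>.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
}

// Hypothetical helper: wrap an already-rendered HTML fragment in a full
// document that links the print stylesheet (style-ebook.css in this layout).
function wrapHtml(bodyHtml, { title = 'ebook', cssHref = 'style-ebook.css', lang = 'en' } = {}) {
  return [
    '<!DOCTYPE html>',
    `<html lang="${escapeAttr(lang)}">`,
    '<head>',
    '<meta charset="utf-8">',
    `<title>${escapeAttr(title)}</title>`,
    `<link rel="stylesheet" href="${escapeAttr(cssHref)}">`,
    '</head>',
    '<body>',
    bodyHtml,
    '</body>',
    '</html>',
  ].join('\n');
}
```

Writing this file next to the composed markdown keeps relative image paths and the stylesheet link resolvable under `file://`.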

Cover page without page number

The cover should carry no page number. CSS:

```css
@page { margin: 20mm; }
@page :first { margin: 0; }    /* full-bleed cover */
```

Playwright respects `@page :first`, so the footer doesn't print on the cover.

Book-Style Layout (style-ebook.css)

The example workshop's "book-grade typography" commit was driven by aesthetics, not just formality. Key rules worth porting:

```css
/* Start each chapter on a new page */
h1.chapter { page-break-before: always; }

/* Figures side by side (grid layout) */
.figure-grid {
  display: grid;
  grid-template-columns: repeat(2, 1fr);
  gap: 12mm;
}
/* three-up is a modifier on the grid container itself, so no descendant space */
.figure-grid.three-up { grid-template-columns: repeat(3, 1fr); }

/* Prompt block: orange PROMPT badge on a dark code background */
.prompt-block {
  background: #1a1a1a; color: #e0e0e0;
  border-left: 4px solid #ff8c00;
  padding: 12pt; margin: 8pt 0;
}
.prompt-block::before {
  content: 'PROMPT'; display: inline-block;
  background: #ff8c00; color: white;
  padding: 2pt 8pt; border-radius: 3pt;
  font-size: 8pt; margin-bottom: 6pt;
}

/* Task checkbox (print only) */
.task::before {
  content: '☐'; margin-right: 8pt;
  font-size: 14pt; color: #888;
}

/* Learning goal: green check badge */
.goal::before {
  content: '✓'; margin-right: 6pt;
  color: #2e7d32; font-weight: bold;
}
```

DOCX via pandoc + reference template

```js
import { spawn } from 'node:child_process';
import path from 'node:path';

const args = [
  mdPath,
  '-f', 'gfm+attributes+raw_html',
  '-t', 'docx',
  '--standalone',
  '--toc', '--toc-depth=2',
  '--resource-path', path.dirname(mdPath),
  '--reference-doc', 'scripts/lib/reference.docx',
  '-o', outPath,
];
// spawn returns immediately; wait for 'close' before reading outPath.
const pandoc = spawn('pandoc', args, { stdio: 'inherit' });
pandoc.on('close', (code) => {
  if (code !== 0) console.error(`pandoc exited with code ${code}`);
});
```

`reference.docx` is a template generated once with your chosen fonts (e.g. Microsoft JhengHei UI for Chinese), heading colors, and paragraph spacing. Generate it via `scripts/gen-reference-docx.mjs`; pandoc copies its styles into the output.

Caveat: raw HTML support in DOCX is partial. `.figure-grid` blocks won't render as grids in Word; they degrade to a vertical stack. Use markdown `![alt](path)` syntax for images so they survive both formats.
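
One way to follow that advice is to emit every figure as a plain markdown image at compose time. The sketch below is illustrative only: `renderFigures` and the `alt` / `src` field names are hypothetical, not part of the source module.

```javascript
// Hypothetical helper: emit figures as plain markdown images, one per
// paragraph, so both the PDF path and the DOCX path can render them.
function renderFigures(images) {
  return images
    .map(({ alt = '', src }) => {
      // Unescaped square brackets would end the alt label early.
      const safeAlt = alt.replace(/([\[\]])/g, '\\$1');
      // CommonMark allows spaces in a destination wrapped in angle brackets.
      const dest = /\s/.test(src) ? `<${src}>` : src;
      return `![${safeAlt}](${dest})`;
    })
    .join('\n\n');
}
```

Any grid layout then has to come from the stylesheet on the PDF side, since Word only sees a sequence of images.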

Compose Module Anatomy

`compose-ebook.mjs` exports a loader plus one function per chapter type:

```js
export async function loadSources() { /* read COURSE + materials + assets */ }
export async function composeCover(meta) { /* big title, day list, meta card */ }
export async function composeOverview(meta) { /* TOC + course overview */ }
export async function composeChapter(day, assetIndex) { /* hero image + units */ }
export async function composeAppendixMaterials(materials) { /* full-text material appendix */ }
export async function composeAppendixSkills(skills) { /* QR-coded skill links */ }
```

Each compose function resolves to a markdown string. `build-ebook.mjs` concatenates them with `\n\n---\n\n`.
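
The concatenation step can be sketched as a small pure function. This is an illustration under assumptions, not the workshop's code: `joinSections` is a hypothetical name, and skipping empty sections is a design choice added here so that a filtered-out chapter does not leave two rules in a row.

```javascript
const SECTION_BREAK = '\n\n---\n\n';

// Hypothetical helper: join composed sections with a horizontal rule,
// skipping any section that came back empty, null, or undefined.
function joinSections(sections) {
  return sections
    .map((section) => (section ?? '').trim())
    .filter((section) => section.length > 0)
    .join(SECTION_BREAK) + '\n';
}
```

With the compose functions above, the caller would await each section first and then pass the resulting strings in reading order.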

Asset Path Strategy

Inside the composed markdown, images use paths relative to the markdown file (`./assets/illustrations/foo.png`). Pandoc's `--resource-path` resolves them; Playwright's `file://` HTML loader does too. No absolute paths.

For corporate editions with asset fallback, resolve the actual file location at compose time and embed the resolved path:

```js
import fs from 'node:fs/promises';
import path from 'node:path';

const mdDir = path.resolve('dist');  // directory that holds master.md

async function pickAsset(name, roots) {
  for (const root of roots) {
    const full = path.join(root, name);
    // Return a forward-slash path relative to the markdown file, even on Windows.
    try { await fs.access(full); return path.relative(mdDir, full).split(path.sep).join('/'); }
    catch { continue; }
  }
  return null;  // caller decides whether to drop the image or fall back to placeholder
}
```

What to Filter Out

Some content lives on the website but should not be in the ebook:

Quiz items: leaking exam questions defeats the assessment. The example workshop filters out the entire quiz chapter.
Pre-test / post-test sections: same reason.
Interactive-only features: "Click to copy" (點擊複製) buttons make no sense in print. Render the raw prompt text, not the button.
Dynamic illustrations: anything generated client-side via canvas/JS won't render.

Add filter conditions to the compose functions; don't try to render-then-strip.
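
A filter of this kind might look like the sketch below. The field names are assumptions made for illustration: the real unit shape is defined in `_shared/domain-primitives.md` and may use different keys, and `isPrintable` / `printableUnits` are hypothetical names.

```javascript
// ASSUMPTION: each unit carries a `type` string and an optional
// `interactiveOnly` flag. Check _shared/domain-primitives.md for the
// actual field names before reusing this.
const EXCLUDED_TYPES = new Set(['quiz', 'pre-test', 'post-test']);

function isPrintable(unit) {
  if (EXCLUDED_TYPES.has(unit.type)) return false; // assessment items stay off the page
  if (unit.interactiveOnly) return false;          // canvas/JS-only content cannot print
  return true;
}

// Apply the filter before composing, so excluded units never reach the markdown.
function printableUnits(units) {
  return units.filter(isPrintable);
}
```

Filtering the data before composing means excluded units never produce markdown, so there is nothing to strip afterwards.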

Verification

After generating, verify:

```js
// scripts/verify-ebook.mjs
// - PDF file exists and is > 100KB
// - First page (cover) has no page number
// - TOC pages have entries
// - Every image reference resolved (no broken-image placeholders)
```

For DOCX, open in Word once and visually check: TOC linked, fonts applied, no missing-image squares.
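
The image-reference check from the list above can run against the composed markdown before any rendering happens. A minimal sketch, with hypothetical function names and an injected `exists` predicate so that the caller decides how paths are resolved:

```javascript
// Collect the destination of every markdown image:
// ![alt](dest), ![alt](dest "title"), or ![alt](<dest with spaces>).
function imageRefs(markdown) {
  const pattern = /!\[(?:\\.|[^\]\\])*\]\(\s*(?:<([^>]+)>|([^)\s]+))/g;
  const refs = [];
  for (const match of markdown.matchAll(pattern)) {
    refs.push(match[1] ?? match[2]);
  }
  return refs;
}

// `exists` is injected: pass a filesystem check in the real script,
// or a stub when testing. Returns the references that do not resolve.
function brokenImages(markdown, exists) {
  return imageRefs(markdown).filter((ref) => !exists(ref));
}
```

In a real verify script, `exists` would typically resolve each reference against the directory of the markdown file and test it with `fs.existsSync`. Reference-style images (`![alt][id]`) are not covered by this pattern.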

CLI Conventions

The example workshop's `build:ebook` script supports:

| Flag | Behavior |
| --- | --- |
| (no flag) | Build both PDF + DOCX |
| `--md-only` | Stop after composing markdown (debug content) |
| `--no-docx` | Skip DOCX (faster iteration on PDF) |
| `--no-pdf` | Skip PDF (faster iteration on DOCX) |
| `--keep-html` | Don't delete intermediate HTML (debug CSS) |
| `--output PATH` | Custom PDF output path |

Provide these from day one; you'll iterate on the layout many times.
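
A hand-rolled parser for these flags fits in a few lines. This is a sketch, not the workshop's implementation: `parseEbookArgs` and the option names it returns are hypothetical, and rejecting unknown flags is a choice made here.

```javascript
// Hypothetical parser for the build flags; returns a plain options object.
function parseEbookArgs(argv) {
  const opts = { mdOnly: false, docx: true, pdf: true, keepHtml: false, output: null };
  for (let i = 0; i < argv.length; i += 1) {
    const arg = argv[i];
    if (arg === '--md-only') opts.mdOnly = true;
    else if (arg === '--no-docx') opts.docx = false;
    else if (arg === '--no-pdf') opts.pdf = false;
    else if (arg === '--keep-html') opts.keepHtml = true;
    else if (arg === '--output') {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith('--')) {
        throw new Error('--output requires a path');
      }
      opts.output = value;
      i += 1; // skip the value that was just consumed
    } else {
      throw new Error(`Unknown flag: ${arg}`);
    }
  }
  // --md-only stops before either renderer runs.
  if (opts.mdOnly) {
    opts.docx = false;
    opts.pdf = false;
  }
  return opts;
}
```

The entry point would call it as `parseEbookArgs(process.argv.slice(2))` and branch on the returned object.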

Anti-Patterns

Re-authoring content in the ebook builder: the ebook is a derivative. If you find yourself adding new prose in the compose functions, that prose belongs in `course-content-authoring` instead.
Using Chrome CLI for PDF: switch to Playwright early; you'll want footers eventually.
Hardcoding image paths: use `pickAsset()` with a list of roots, especially when supporting corporate edition fallbacks.
Forgetting `@page :first { margin: 0 }`: the cover ends up with a stray page number in the footer.
Re-running the ebook build every time the site changes: only build at stable checkpoints; the intermediate `dist/` folder is large.

Hand-off

When this skill finishes:

`dist/{course-name}.pdf` and optionally `.docx` are produced.
Verification scripts pass.
If for a client, the file goes into the deliverable bundle.

Tell the user: "ebook ready. The site and ebook are now in sync; from now on, treat the site as canonical and rebuild the ebook only at release milestones — don't edit the ebook directly, it'll get overwritten."
