course-ebook-publishing
Course Ebook Publishing
Schema authority: this skill reads the live `window.COURSE` object, whose shape is defined in `_shared/domain-primitives.md`. Quiz, pre-test, and post-test items are filtered OUT of the ebook (per the §10 quiz item rules and the ebook content policy).

Filename convention (English-first): outputs land in `dist/{name}.pdf` / `dist/{name}.docx`. Source markdown is composed under `dist/master.md`.

This skill turns a finished teaching website into a book: PDF (primary) and optionally DOCX (for editorial review or further authoring). It is a post-site step: it consumes `window.COURSE` from the live site and never re-authors content.
When to Invoke
- Site is feature-complete (Stages 1–5 done) and content is stable.
- Stakeholders ask for a printable hand-out, an archivable record, or a deliverable for non-web channels.
- Sometimes invoked after `course-corporate-edition` to produce a client deliverable bundle.

Do NOT invoke when the site is mid-development. The ebook pipeline reads `window.COURSE`; if the site changes daily, the ebook drifts daily.
Architecture: Single Source → Two Outputs
```
window.COURSE ─┐
               ├─→ compose-ebook.mjs ─→ master.md ─→ pandoc ─→ ebook.docx
asset folders ─┤                                  └─→ Playwright page.pdf() ─→ ebook.pdf
materials/    ─┘
```

Why a single composed markdown intermediate: it keeps PDF and DOCX content identical. Pandoc handles markdown → DOCX cleanly; Playwright renders the same markdown via an HTML wrapper → PDF. Two outputs, one source of truth.
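The "HTML wrapper" step can be pictured with a minimal sketch. It assumes the markdown has already been converted to an HTML fragment by some renderer; the `wrapHtml` and `escapeHtml` names and the default stylesheet path are illustrative and not taken from the original scripts.

```javascript
// Escape text that is interpolated into HTML attributes or elements.
function escapeHtml(text) {
  return text
    .replaceAll('&', '&amp;')
    .replaceAll('<', '&lt;')
    .replaceAll('>', '&gt;')
    .replaceAll('"', '&quot;');
}

// Wrap an already-rendered HTML fragment so the print stylesheet applies to it.
// bodyHtml is trusted output of the markdown renderer and is inserted as-is.
function wrapHtml({ title, bodyHtml, cssHref = 'style-ebook.css' }) {
  return [
    '<!doctype html>',
    '<html lang="en">',
    '<head>',
    '  <meta charset="utf-8">',
    `  <title>${escapeHtml(title)}</title>`,
    `  <link rel="stylesheet" href="${escapeHtml(cssHref)}">`,
    '</head>',
    `<body>${bodyHtml}</body>`,
    '</html>',
  ].join('\n');
}

const html = wrapHtml({
  title: 'Course A & B',
  bodyHtml: '<h1 class="chapter">Day 1</h1>',
});
console.log(html);
```

Keeping the wrapper this thin means all layout decisions stay in the stylesheet, so the PDF and DOCX paths continue to share one markdown source.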
File Layout (Standard)
```
scripts/
├── build-ebook.mjs          ← CLI entry point (--md-only, --no-docx, --no-pdf, --output, --keep-html)
├── render-pdf.mjs           ← markdown → HTML → PDF via Playwright
└── lib/
    ├── compose-ebook.mjs    ← loadSources + composeXxx functions (cover, TOC, chapters, appendix)
    └── reference.docx       ← pandoc style template (generated by gen-reference-docx.mjs)
style-ebook.css              ← @page rules, cover, page-number footer, print-only styles
dist/                        ← output: ebook.md, ebook.pdf, ebook.docx
```

Loading Source Data: window.COURSE from index.html
When the site uses an external `course-data.js`, just import it. When it uses an inlined COURSE (corporate edition), extract it via regex + vm sandbox:

```js
import fs from 'node:fs/promises';
import vm from 'node:vm';

async function loadCourseFromIndexHtml(htmlPath) {
  const html = await fs.readFile(htmlPath, 'utf8');
  // Grab the inline <script> whose body is a single `window.COURSE = {...};` assignment.
  const match = html.match(/<script>\s*window\.COURSE\s*=\s*\{[\s\S]*?\};\s*<\/script>/);
  if (!match) throw new Error('window.COURSE block not found');
  // Strip the <script> tags so only the JavaScript remains.
  const code = match[0].replace(/<\/?script>/g, '');
  // Run the assignment against a stand-in `window` inside an isolated context.
  const sandbox = { window: {}, console };
  vm.createContext(sandbox);
  vm.runInContext(code, sandbox);
  return sandbox.window.COURSE;
}
```

Why vm sandbox, not `eval`: vm isolates the script's globals from the build process, so an accidental side-effect can't leak.
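The isolation claim can be checked with a small sketch. The inline script string and the stray `leaked` variable are made up for the demonstration; only the `node:vm` calls are real API.

```javascript
import vm from 'node:vm';

// Hypothetical inline script: assigns COURSE and also sets an accidental global.
const code = `
  window.COURSE = { title: 'Demo course', days: [] };
  leaked = 'side effect';
`;

const sandbox = { window: {} };
vm.createContext(sandbox);
vm.runInContext(code, sandbox);

const course = sandbox.window.COURSE;
console.log(course.title);             // the data we wanted: 'Demo course'
console.log(sandbox.leaked);           // the stray global stays in the sandbox: 'side effect'
console.log(typeof globalThis.leaked); // the build process never sees it: 'undefined'
```

This protects against accidents only. `node:vm` is not a security boundary, so the extracted script should still come from your own site.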
PDF Rendering: Playwright page.pdf() (not Chrome CLI)

Original implementation: `pandoc → HTML → msedge --headless --print-to-pdf`. Switch to Playwright: the Chrome CLI doesn't support `footerTemplate`, so there are no page numbers.

```js
import path from 'node:path';
import { pathToFileURL } from 'node:url';
import { chromium } from 'playwright';

// htmlPath and outputPath are assumed to be defined by the surrounding script.
const browser = await chromium.launch();
const page = await browser.newPage();
// pathToFileURL builds a valid file:// URL even with spaces or Windows drive letters.
await page.goto(pathToFileURL(path.resolve(htmlPath)).href, { waitUntil: 'networkidle' });
await page.pdf({
  path: outputPath,
  format: 'A4',
  printBackground: true,
  displayHeaderFooter: true,
  headerTemplate: '<div></div>', // empty header, footer only
  footerTemplate: `
    <div style="font-size:9pt; width:100%; text-align:center; color:#666;">
      <span class="pageNumber"></span> / <span class="totalPages"></span>
    </div>`,
  margin: { top: '20mm', right: '20mm', bottom: '20mm', left: '20mm' },
});
await browser.close();
```

Cover page without page number
The cover should be number-less. CSS:

```css
@page { margin: 20mm; }
@page :first { margin: 0; } /* full bleed cover */
```

Playwright respects `@page :first`, so the footer doesn't print on the cover.

Book-Style Layout (style-ebook.css)
The example workshop's "book-grade typography" commit was driven by aesthetics, not just formality. Key rules worth porting:

```css
/* Start each chapter on a new page */
h1.chapter { page-break-before: always; }

/* Side-by-side figures (grid layout) */
.figure-grid {
  display: grid;
  grid-template-columns: repeat(2, 1fr);
  gap: 12mm;
}
/* Three-column variant: modifier class on the grid container itself */
.figure-grid.three-up { grid-template-columns: repeat(3, 1fr); }

/* Prompt block: orange PROMPT badge + dark code background */
.prompt-block {
  background: #1a1a1a; color: #e0e0e0;
  border-left: 4px solid #ff8c00;
  padding: 12pt; margin: 8pt 0;
}
.prompt-block::before {
  content: 'PROMPT'; display: inline-block;
  background: #ff8c00; color: white;
  padding: 2pt 8pt; border-radius: 3pt;
  font-size: 8pt; margin-bottom: 6pt;
}

/* Task checkbox (print only) */
.task::before {
  content: '☐'; margin-right: 8pt;
  font-size: 14pt; color: #888;
}

/* Learning goal: green check badge */
.goal::before {
  content: '✓'; margin-right: 6pt;
  color: #2e7d32; font-weight: bold;
}
```

DOCX via pandoc + reference template
```js
import { spawn } from 'node:child_process';
import path from 'node:path';

// mdPath and outPath are assumed to be defined by the surrounding build script.
const args = [
  mdPath,
  '-f', 'gfm+attributes+raw_html',
  '-t', 'docx',
  '--standalone',
  '--toc', '--toc-depth=2',
  '--resource-path', path.dirname(mdPath), // resolve relative image paths
  '--reference-doc', 'scripts/lib/reference.docx', // style template
  '-o', outPath,
];
spawn('pandoc', args);
```

`reference.docx` is a one-time generated template that carries your chosen fonts (for example Microsoft JhengHei for Chinese), heading colors, and paragraph spacing. Generate it via `scripts/gen-reference-docx.mjs`; pandoc copies its styles.

Caveat: raw HTML support in DOCX is partial. `<figure-grid>` blocks won't render as grids in Word; they degrade to a vertical stack. Use markdown image syntax for images so they survive both formats.
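`spawn` returns immediately, so a build script has to wait for pandoc to exit before it verifies the output or reports success. A minimal sketch of one way to do that follows; the `run` helper is hypothetical and not part of the original scripts.

```javascript
import { spawn } from 'node:child_process';

// Run a command and resolve once it exits with code 0; reject otherwise.
function run(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: 'inherit' });
    child.on('error', reject); // e.g. the binary is not on PATH
    child.on('close', (code) => {
      if (code === 0) resolve();
      else reject(new Error(`${command} exited with code ${code}`));
    });
  });
}

// Usage, assuming `args` is the pandoc argument array built earlier:
// await run('pandoc', args);
```

Rejecting on a non-zero exit code lets the caller stop the pipeline instead of shipping a missing or stale DOCX.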
Compose Module Anatomy

`compose-ebook.mjs`:

```js
export async function loadSources() { /* read COURSE + materials + assets */ }
export async function composeCover(meta) { /* big title, day list, meta card */ }
export async function composeOverview(meta) { /* TOC + course overview */ }
export async function composeChapter(day, assetIndex) { /* hero image + units */ }
export async function composeAppendixMaterials(materials) { /* full-text material appendix */ }
export async function composeAppendixSkills(skills) { /* QR-coded skill links */ }
```

Each returns a markdown string. `build-ebook.mjs` concatenates them with `\n\n---\n\n`.
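The concatenation step might look like the sketch below. The `joinSections` helper and the decision to skip empty sections are assumptions added for illustration; the source only specifies the separator.

```javascript
// Separator named in the text: a horizontal rule between sections.
const SECTION_SEPARATOR = '\n\n---\n\n';

// Join composed sections, skipping empty ones so no doubled rules appear.
function joinSections(sections) {
  return sections
    .map((section) => (section ?? '').trim())
    .filter((section) => section.length > 0)
    .join(SECTION_SEPARATOR) + '\n';
}

const master = joinSections([
  '# Cover',
  '## Overview',
  '', // e.g. an appendix that produced nothing
  '# Day 1',
]);
console.log(master);
```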
Asset Path Strategy
Inside the composed markdown, images use paths relative to the markdown file (`./assets/illustrations/foo.png`). Pandoc's `--resource-path` resolves them; Playwright's `file://` HTML loader does too. No absolute paths.

For corporate editions with asset fallback, resolve the actual file location at compose time and embed the resolved path:

```js
import fs from 'node:fs/promises';
import path from 'node:path';

// roots: candidate asset folders in priority order.
// mdDir: folder that contains the composed markdown file.
async function pickAsset(name, roots, mdDir) {
  for (const root of roots) {
    const full = path.join(root, name);
    try { await fs.access(full); return path.relative(mdDir, full); }
    catch { continue; }
  }
  return null; // caller decides whether to drop the image or fall back to placeholder
}
```
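The caller's side of that decision can be sketched as follows. `imageMarkdown`, the placeholder wording, and the throwaway folders are all hypothetical; `pickAsset` is repeated here, with `mdDir` passed as an argument, only so the example runs on its own.

```javascript
import fs from 'node:fs/promises';
import os from 'node:os';
import path from 'node:path';

// Same lookup idea as pickAsset in the text, with mdDir passed explicitly.
async function pickAsset(name, roots, mdDir) {
  for (const root of roots) {
    const full = path.join(root, name);
    try {
      await fs.access(full);
      // Markdown wants forward slashes, even on Windows.
      return path.relative(mdDir, full).split(path.sep).join('/');
    } catch {
      continue;
    }
  }
  return null;
}

// Hypothetical caller: emit an image, or a visible text placeholder when nothing resolves.
async function imageMarkdown(alt, name, roots, mdDir) {
  const rel = await pickAsset(name, roots, mdDir);
  return rel ? `![${alt}](${rel})` : `*[missing image: ${name}]*`;
}

// Demo with throwaway folders: the corporate root lacks the file, the base root has it.
const tmp = await fs.mkdtemp(path.join(os.tmpdir(), 'ebook-'));
const corporate = path.join(tmp, 'corporate');
const base = path.join(tmp, 'assets');
await fs.mkdir(corporate);
await fs.mkdir(base);
await fs.writeFile(path.join(base, 'foo.png'), '');

const found = await imageMarkdown('Foo', 'foo.png', [corporate, base], tmp);
const missing = await imageMarkdown('Bar', 'bar.png', [corporate, base], tmp);
console.log(found);   // ![Foo](assets/foo.png)
console.log(missing); // *[missing image: bar.png]*
```

A visible placeholder makes a missing asset easy to spot during review; dropping the image silently is the other reasonable choice.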
What to Filter Out
Some content lives on the website but should not be in the ebook:
- Quiz items: leaking exam questions defeats the assessment. The example workshop filters out the entire quiz chapter.
- Pre-test / post-test sections: same reason.
- Interactive-only features: "Click to copy" (點擊複製) buttons make no sense in print. Render the raw prompt text, not the button.
- Dynamic illustrations: anything generated client-side via canvas/JS won't render.
Add filter conditions to the compose functions; don't try to render-then-strip.
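Filtering at compose time could look like this sketch. The `kind` and `clientRendered` fields are hypothetical stand-ins; the real field names come from the `window.COURSE` schema, which is not shown here.

```javascript
// Assumed unit kinds that must never reach the ebook.
const EXCLUDED_KINDS = new Set(['quiz', 'pre-test', 'post-test']);

// Keep only units that belong in print: no assessments, no client-side-only visuals.
function printableUnits(units) {
  return units.filter((unit) => !EXCLUDED_KINDS.has(unit.kind) && !unit.clientRendered);
}

const day = {
  title: 'Day 1',
  units: [
    { kind: 'lesson', title: 'Intro' },
    { kind: 'quiz', title: 'Check yourself' },
    { kind: 'lesson', title: 'Live chart', clientRendered: true },
    { kind: 'post-test', title: 'Final test' },
  ],
};

const kept = printableUnits(day.units).map((unit) => unit.title);
console.log(kept); // [ 'Intro' ]
```

Because the excluded units never enter the markdown, nothing downstream (PDF, DOCX, TOC) has to know they existed.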
Verification
After generating, verify:

```js
// scripts/verify-ebook.mjs
// - PDF file exists and is > 100KB
// - First page (cover) has no page number
// - TOC pages have entries
// - Every image reference resolved (no broken-image placeholders)
```

For DOCX, open in Word once and visually check: TOC linked, fonts applied, no missing-image squares.
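Two of those checks, the PDF size and the image references, can be automated without a PDF parser. The sketch below is one possible shape, not the workshop's actual `verify-ebook.mjs`; the function names are hypothetical, and it checks image targets in the composed markdown rather than in the rendered PDF. The cover and TOC checks would need PDF text extraction and are not attempted here.

```javascript
import fs from 'node:fs/promises';
import path from 'node:path';

const MIN_PDF_BYTES = 100 * 1024;

// Collect local image targets from markdown image syntax: ![alt](target "optional title")
function imageTargets(markdown) {
  const targets = [];
  for (const m of markdown.matchAll(/!\[[^\]]*\]\(([^)\s]+)[^)]*\)/g)) {
    // Skip anything with a URL scheme (https:, data:, ...); only local files are checked.
    if (!/^[a-z][a-z0-9+.-]*:/i.test(m[1])) targets.push(m[1]);
  }
  return targets;
}

// Return a list of human-readable problems; an empty list means these checks passed.
async function verifyEbook(pdfPath, mdPath) {
  const problems = [];
  const stat = await fs.stat(pdfPath).catch(() => null);
  if (!stat) problems.push(`PDF missing: ${pdfPath}`);
  else if (stat.size <= MIN_PDF_BYTES) problems.push(`PDF suspiciously small: ${stat.size} bytes`);

  const markdown = await fs.readFile(mdPath, 'utf8');
  for (const target of imageTargets(markdown)) {
    const full = path.resolve(path.dirname(mdPath), target);
    const exists = await fs.access(full).then(() => true, () => false);
    if (!exists) problems.push(`Unresolved image: ${target}`);
  }
  return problems;
}
```

A caller would typically print the returned list and exit non-zero when it is not empty.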
CLI Conventions
The example workshop's `build:ebook` script supports:

| Flag | Behavior |
|---|---|
| (no flag) | Build both PDF + DOCX |
| `--md-only` | Stop after composing markdown (debug content) |
| `--no-docx` | Skip DOCX (faster iteration on PDF) |
| `--no-pdf` | Skip PDF (faster iteration on DOCX) |
| `--keep-html` | Don't delete intermediate HTML (debug CSS) |
| `--output` | Custom PDF output path |
Provide these from day one — you'll iterate the layout many times.
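One way to parse the flags listed for `build-ebook.mjs` is Node's built-in `parseArgs`. This is a sketch under assumptions: the `parseBuildFlags` name, the shape of the returned plan, and the default PDF path are illustrative, not taken from the workshop's script.

```javascript
import { parseArgs } from 'node:util';

// Turn CLI flags into a plain plan object that the build steps can consult.
function parseBuildFlags(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      'md-only': { type: 'boolean' },
      'no-docx': { type: 'boolean' },
      'no-pdf': { type: 'boolean' },
      'keep-html': { type: 'boolean' },
      output: { type: 'string' },
    },
  });
  const mdOnly = Boolean(values['md-only']);
  return {
    buildPdf: !mdOnly && !values['no-pdf'],
    buildDocx: !mdOnly && !values['no-docx'],
    keepHtml: Boolean(values['keep-html']),
    pdfPath: values.output ?? 'dist/ebook.pdf', // default path is an assumption
  };
}

const plan = parseBuildFlags(['--no-docx', '--output', 'dist/handout.pdf']);
console.log(plan);
```

In the real entry point the argument would be `process.argv.slice(2)`. `parseArgs` is strict by default, so a mistyped flag fails loudly instead of being ignored.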
Anti-Patterns
- Re-authoring content in the ebook builder: the ebook is a derivative. If you find yourself adding new prose in the compose functions, that prose belongs in `course-content-authoring` instead.
- Using Chrome CLI for PDF: switch to Playwright early; you'll want footers eventually.
- Hardcoding image paths: use `pickAsset()` with a list of roots, especially when supporting corporate edition fallbacks.
- Forgetting `@page :first { margin: 0 }`: the cover ends up with a stray page number in the footer.
- Re-running the ebook build every time the site changes: only build at stable checkpoints; the intermediate `dist/` folder is large.
Hand-off
When this skill finishes:
- `dist/{course-name}.pdf` and optionally `.docx` are produced.
- Verify scripts pass.
- If for a client, the file goes into the deliverable bundle.
Tell the user: "ebook ready. The site and ebook are now in sync; from now on, treat the site as canonical and rebuild the ebook only at release milestones — don't edit the ebook directly, it'll get overwritten."