web-content-audit

Web Content Audit

Schema authority: cross-file consistency rules (3-place sync for materials, task ID stability, quiz renumber trap) are codified in `_shared/domain-primitives.md` §13. Audit scripts in this skill operationalise those rules.
Filename convention (English-first): audit reports land under `data/audit-*.md`; scripts under `scripts/audit-*.mjs`.
This skill produces audit scripts that read source files and deployed data, compare them, and emit a human-readable report. The goal is not to gatekeep deployment (that's verification's job); it is to surface drift that humans should review.

When to Invoke

  • Before a major content release (catch outline ↔ course-data ↔ visuals drift before students see it).
  • After adding/removing units / materials / quiz items (re-check the three-place sync rule).
  • Before handing off a corporate edition to a client (confirm the condensed COURSE matches its source).
  • When the user senses "something's off" but can't see a runtime bug — audit is for the invisible drift.
  • When planning the next iteration (e.g. "which sections need new illustrations?").
Do NOT invoke when the question is "does the page render correctly?"; that is the job of `web-visual-verification`.

Audit vs Verify — Operating Mode Difference

| Trait | Audit | Verify |
| --- | --- | --- |
| Output | Markdown report (human reads) | Pass/fail (CI gates) |
| Tools | File reads, regex, JSON diffs, optionally Playwright | Playwright + asserts |
| Failure mode | Always exits 0; report tells human what's worth fixing | Exits non-zero on bad state |
| When run | Manually, at milestones | On every PR / pre-deploy |
| Mental model | "Show me what's drifted" | "Don't let this regress" |

If your script ends with `assert.*` or `process.exit(1)`, it's not an audit; move it to `web-visual-verification`.

The Five Audit Categories

Audit 1: Cross-Artifact Reference Resolution

The classic teaching-site bug: `course-data.js:materials[]` references a file the disk doesn't have, or `getMaterialUrl()` is missing a rule.
js
// scripts/audit-material-references.mjs
import fs from 'node:fs/promises';

// loadCourse, checkRouterCoverage, checkFileExists and findReferencingUnits
// are project-specific helpers (not shown here).
// Read course-data.js (via vm or regex extract)
const COURSE = await loadCourse('./course-data.js');

const report = ['# Material reference audit\n'];
for (const m of COURSE.materials) {
  // 1. Does the file router know about it?
  const routerCovers = await checkRouterCoverage(m.name);
  // 2. Does the actual file exist?
  const fileExists = await checkFileExists(m);
  // 3. Is it referenced in any unit's materials[]?
  const referencedInUnit = findReferencingUnits(COURSE, m.id);

  if (!routerCovers || !fileExists || referencedInUnit.length === 0) {
    report.push(`## ⚠️ ${m.id}: ${m.name}`);
    report.push(`- Router: ${routerCovers ? '✅' : '❌'}`);
    report.push(`- File: ${fileExists ? '✅' : '❌'}`);
    report.push(`- Referenced in units: ${referencedInUnit.join(', ') || '❌ NONE'}`);
  }
}
await fs.writeFile('data/audit-materials.md', report.join('\n'));
The output goes to `data/audit-*.md`; the human opens it and decides what to fix.
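The `findReferencingUnits` helper used in the script above is not shown in the skill. A minimal sketch follows, assuming `COURSE` holds `day1` / `day2` objects with a `units` array each (the shape Audit 3 iterates over) and that a unit's `materials[]` contains either ID strings or objects with an `id` field. Both points are assumptions about the data shape, so adjust them to the real `course-data.js`.

```javascript
// Hypothetical helper: returns the IDs of all units whose materials[] mentions materialId.
// Assumes course.<day>.units[] exists and unit.materials holds ID strings
// or objects with an `id` field.
function findReferencingUnits(course, materialId, days = ['day1', 'day2']) {
  const hits = [];
  for (const day of days) {
    for (const unit of course[day]?.units ?? []) {
      const ids = (unit.materials ?? []).map(m => (typeof m === 'string' ? m : m.id));
      if (ids.includes(materialId)) hits.push(unit.id);
    }
  }
  return hits;
}

// Usage with made-up sample data
const sampleCourse = {
  day1: { units: [{ id: 'd1-u1', materials: ['m1'] }, { id: 'd1-u2', materials: [{ id: 'm2' }] }] },
  day2: { units: [{ id: 'd2-u1', materials: ['m1', 'm2'] }] },
};
console.log(findReferencingUnits(sampleCourse, 'm1')); // [ 'd1-u1', 'd2-u1' ]
console.log(findReferencingUnits(sampleCourse, 'm9')); // []
```

An empty result is what drives the "❌ NONE" line in the report: the material exists in the data but no unit links to it.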

Audit 2: Asset Coverage / Gap Analysis

"Which sections still have no illustration?" — guide future work, don't fail anything.
js
// scripts/audit-illustrations.mjs
// Walks every .chapter / .accordion-content; counts text length, image count, list density
// For each section: if (textLength > 500 && imageCount === 0) → candidate for illustration
// Output: data/audit-illustrations.md (a markdown list with section titles + screenshots)
This may use Playwright for screenshots (capture mode, no assertions) but the product is the gap list, not the screenshots. Don't conflate this with `web-visual-verification`'s `capture-*.mjs`, which has no analytical output.
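The candidate rule in the comments above (long text, no image) can be sketched as a pure function. This is a minimal sketch, assuming the per-section measurements were already collected into records with `title`, `textLength`, and `imageCount` fields. That record shape and the name `illustrationGapReport` are illustrative, not part of the skill.

```javascript
// Hypothetical input: one record per section, already measured
// (for example by a Playwright page.evaluate pass, which is not shown here).
function illustrationGapReport(sections, minTextLength = 500) {
  const candidates = sections.filter(s => s.textLength > minTextLength && s.imageCount === 0);
  const lines = ['# Illustration gap audit', ''];
  lines.push(`- Sections scanned: ${sections.length}`);
  lines.push(`- Candidates for illustration: ${candidates.length}`, '');
  for (const s of candidates) {
    lines.push(`- ${s.title} (${s.textLength} chars, 0 images)`);
  }
  return lines.join('\n');
}

// Made-up sample data
const sampleSections = [
  { title: 'Intro', textLength: 320, imageCount: 0 },
  { title: 'Prompt patterns', textLength: 1840, imageCount: 0 },
  { title: 'Tooling', textLength: 900, imageCount: 2 },
];
console.log(illustrationGapReport(sampleSections));
```

Keeping the rule separate from the page crawl makes the threshold easy to tune without re-running the browser.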

Audit 3: Inlined Data ↔ Source Drift (Corporate Edition)

When `window.COURSE` is inlined in `index.html` (corporate edition), it can drift from the source files it was condensed from. Audit the diff:
js
// scripts/audit-corp-content.mjs
import fs from 'node:fs/promises';

// loadCourseFromIndexHtml, loadSourceMarkdown and diff are project helpers (not shown here).
const inlined = await loadCourseFromIndexHtml('corporate-editions/client_6h/index.html');
const source = await loadSourceMarkdown('course-package/');

const report = ['# Corporate edition content audit\n'];

// Compare meta
report.push(`## Meta`);
report.push(diff(inlined.meta, source.meta));

// Per-unit: prompts / tasks / materials count
for (const day of ['day1', 'day2']) {
  for (const unit of inlined[day].units) {
    report.push(`## ${unit.id}: ${unit.title}`);
    report.push(`- Prompts: ${unit.prompts?.length || 0}`);
    report.push(`- Tasks: ${unit.tasks?.length || 0}`);
    report.push(`- Materials: ${unit.materials?.length || 0}`);
  }
}
await fs.writeFile('data/audit-corp-content.md', report.join('\n'));
The example workshop's `audit-corp-6h-content.mjs` does exactly this: it produces a markdown dump for the instructor to skim before each corporate session.
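The `diff(inlined.meta, source.meta)` call above relies on a helper the skill does not show. One possible shape is a shallow, key-by-key comparison that returns markdown bullets. The sketch below is an assumption about what such a helper could look like, not the workshop's actual implementation, and the sample field names are made up.

```javascript
// Hypothetical shallow diff: compares top-level keys of two plain objects
// and returns markdown bullet lines. Nested values are compared via
// JSON.stringify, so key order inside nested objects matters.
function diff(inlined, source) {
  const keys = [...new Set([...Object.keys(inlined), ...Object.keys(source)])].sort();
  const lines = [];
  for (const key of keys) {
    const a = JSON.stringify(inlined[key]);
    const b = JSON.stringify(source[key]);
    if (a === b) continue;
    if (a === undefined) lines.push(`- ❌ ${key}: missing from inlined (source has ${b})`);
    else if (b === undefined) lines.push(`- ⚠️ ${key}: only in inlined (${a})`);
    else lines.push(`- ⚠️ ${key}: inlined ${a} vs source ${b}`);
  }
  return lines.length ? lines.join('\n') : '- ✅ No differences';
}

// Made-up sample data
console.log(diff(
  { title: 'AI Workshop (6h)', hours: 6 },
  { title: 'AI Workshop', hours: 6, author: 'TBD' },
));
```

Returning text instead of throwing keeps the helper in audit mode: every difference lands in the report for a human to judge.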

Audit 4: ID Stability & Reuse

`task.id`, `quiz.id`, `unit.id` are stable contracts (localStorage keys, internal references). Audit them:
js
// scripts/audit-id-stability.mjs
// 1. Collect all task IDs across course-data.js — check no duplicates
// 2. Read git log for course-data.js — flag any task ID that was renamed (not just deleted)
// 3. Cross-check quiz[N].sourceUnit references actual unit IDs
// 4. Cross-check materials[].id appears in at least one unit's materials[]
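Checks 1 and 2 in the outline above can be sketched as two pure functions. This is a minimal sketch under stated assumptions: task IDs follow the `d2-u3-t1` pattern with the first two segments naming the unit, and the "before" and "after" ID lists were already extracted (reading them from git history is not shown). The function names are illustrative.

```javascript
// Returns IDs that occur more than once.
function findDuplicateIds(ids) {
  const seen = new Set();
  const dupes = new Set();
  for (const id of ids) {
    if (seen.has(id)) dupes.add(id);
    seen.add(id);
  }
  return [...dupes];
}

// Compares two snapshots of task IDs. An ID that vanished while a new ID
// appeared under the same unit prefix is flagged as a possible rename.
// Assumes IDs look like "d2-u3-t1", where "d2-u3" is the unit prefix.
function findRenameSuspects(oldIds, newIds) {
  const unitPrefix = id => id.split('-').slice(0, 2).join('-');
  const removed = oldIds.filter(id => !newIds.includes(id));
  const added = newIds.filter(id => !oldIds.includes(id));
  return removed
    .map(oldId => ({ oldId, candidates: added.filter(n => unitPrefix(n) === unitPrefix(oldId)) }))
    .filter(suspect => suspect.candidates.length > 0);
}

console.log(findDuplicateIds(['d1-u1-t1', 'd1-u1-t2', 'd1-u1-t1'])); // [ 'd1-u1-t1' ]
console.log(findRenameSuspects(['d2-u3-t1', 'd2-u3-t2'], ['d2-u3-task1', 'd2-u3-t2']));
```

A suspect is only a hint: a deleted task plus an unrelated new task in the same unit looks identical, so the report should ask the human rather than decide.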
This is the audit that catches "we accidentally renamed `d2-u3-t1` to `d2-u3-task1`", a silent localStorage progress wipe across all students.

Audit 5: Produced Artifact Inspection

When the pipeline produces a binary deliverable (DOCX, PDF, zip), audit its inner structure to catch broken builds early.
js
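// scripts/inspect-docx-images.mjs
import { readFile } from 'node:fs/promises';
import JSZip from 'jszip';

const zip = await JSZip.loadAsync(await readFile('dist/ebook.docx'));
// Keep only real files under word/media/ (skip directory entries)
const images = Object.keys(zip.files).filter(f => f.startsWith('word/media/') && !zip.files[f].dir);
console.log(`📷 Embedded images: ${images.length}`);
for (const f of images) {
  // Read the entry through the public API rather than the private _data field
  const bytes = await zip.files[f].async('uint8array');
  console.log(`   ${f}  ${(bytes.length / 1024).toFixed(1)} KB`);
}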
If you expect ~30 images and the DOCX has 3, something in the build went wrong silently. This is faster than opening Word and scrolling.
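The "expected about 30, found 3" comparison can be written into the inspection output instead of being left to memory. The helper below is a small sketch. The name `imageCountNote` and the idea of passing a minimum expected count are assumptions, not something the skill prescribes.

```javascript
// Hypothetical helper: turns an image count into a report line instead of a failure.
function imageCountNote(actual, minExpected) {
  const mark = actual >= minExpected ? '✅' : '❌';
  return `${mark} Embedded images: ${actual} (expected at least ${minExpected})`;
}

console.log(imageCountNote(29, 25));
console.log(imageCountNote(3, 25));
```

The line is informational only: the script still exits 0 and the human decides whether the build is broken.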

Output Convention

data/
├── audit-materials.md            ← Audit 1
├── audit-illustrations.md         ← Audit 2 (gap analysis)
├── audit/section-*.png            ←   ↳ supporting screenshots
├── audit-corp-content.md          ← Audit 3
├── audit-id-stability.md          ← Audit 4
└── audit-ebook-inspection.txt     ← Audit 5
All audit outputs are versionable markdown (or JSON). Commit them at release milestones; the diff between releases tells you "what content actually changed".

Format the Report for Human Skimming

Auditors will skim, not read. Structure for skimming:
markdown
# {Audit name} - {timestamp}

## Summary
- ✅ 24 of 30 materials fully wired
- ⚠️ 4 materials missing router rules
- ❌ 2 materials referenced but file missing

## ⚠️ Warnings (need attention but not blocking)
...

## ❌ Errors (likely broken in production)
...

## ℹ️ Notes (FYI)
...
Lead with the summary and emoji-prefix every section. The audit's job is to be the world's most efficient code review.
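A report in this layout can be produced by one small renderer shared by all audit scripts. The sketch below is illustrative: the finding shape (`level`, `text`) and the function name `renderReport` are assumptions, and it uses a plain hyphen between the audit name and the timestamp.

```javascript
// Hypothetical finding shape: { level: 'warning' | 'error' | 'note', text: string }
const REPORT_SECTIONS = [
  ['warning', '## ⚠️ Warnings (need attention but not blocking)'],
  ['error', '## ❌ Errors (likely broken in production)'],
  ['note', '## ℹ️ Notes (FYI)'],
];

function renderReport(name, summaryLines, findings, timestamp = new Date().toISOString()) {
  const out = [`# ${name} - ${timestamp}`, '', '## Summary'];
  for (const line of summaryLines) out.push(`- ${line}`);
  for (const [level, heading] of REPORT_SECTIONS) {
    const items = findings.filter(f => f.level === level);
    if (items.length === 0) continue; // skip empty sections to keep the report short
    out.push('', heading);
    for (const item of items) out.push(`- ${item.text}`);
  }
  return out.join('\n');
}

// Made-up sample findings
console.log(renderReport(
  'Material reference audit',
  ['✅ 24 of 30 materials fully wired', '⚠️ 4 materials missing router rules'],
  [
    { level: 'warning', text: 'm07: no router rule' },
    { level: 'error', text: 'm12: file missing on disk' },
  ],
  '2025-01-01T00:00:00.000Z',
));
```

Passing the timestamp in as a parameter keeps the output reproducible, which helps when diffing two committed reports.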

The "Three-Place Sync" Audit (Critical for Teaching Sites)

The pattern from `static-spa-conversion`: every material must appear in (1) filesystem, (2) `course-data.js:materials[]`, (3) `getMaterialUrl()` router. Audit script:
js
const fsFiles = new Set(await fs.readdir('course-package/materials/'));
const dataMaterials = new Set(COURSE.materials.map(m => routerNameToFile(m.name)));
const routerCoverage = extractRouterRules(indexHtmlContent);

const inFsNotInData = [...fsFiles].filter(f => !dataMaterials.has(f));
const inDataNotInFs = [...dataMaterials].filter(f => !fsFiles.has(f));
const inDataNotInRouter = [...dataMaterials].filter(f => !routerCoverage.has(f));

// Emit a 3-column markdown table showing each file's coverage in each of the three places.
This single script catches 90% of "the material link is broken" bugs.
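The table mentioned in the script's last comment could be rendered as follows. This is a sketch under the assumption that all three sources have already been reduced to plain lists of file names. The function name `threePlaceTable` and the sample file names are made up.

```javascript
// Sketch: render three-place coverage as a markdown table.
// Inputs are plain iterables of file names; how they are collected
// (readdir, COURSE.materials, router parsing) is outside this sketch.
function threePlaceTable(fsFiles, dataFiles, routerFiles) {
  const fsSet = new Set(fsFiles);
  const dataSet = new Set(dataFiles);
  const routerSet = new Set(routerFiles);
  const all = [...new Set([...fsSet, ...dataSet, ...routerSet])].sort();
  const mark = ok => (ok ? '✅' : '❌');
  const rows = all.map(
    f => `| ${f} | ${mark(fsSet.has(f))} | ${mark(dataSet.has(f))} | ${mark(routerSet.has(f))} |`,
  );
  return [
    '| File | Filesystem | course-data.js | Router |',
    '| --- | --- | --- | --- |',
    ...rows,
  ].join('\n');
}

console.log(threePlaceTable(
  ['a.pdf', 'b.pdf', 'orphan.pdf'],
  ['a.pdf', 'b.pdf', 'missing.pdf'],
  ['a.pdf'],
));
```

One row per file makes the failure pattern visible at a glance: a row with a single ✅ in the first column is an orphan file, a row with ❌ in the first column is a dangling reference.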

Audit Doesn't Block — It Informs

The strongest discipline: audit scripts always exit 0, even when they find problems. Why:
  • Audits run regularly; you don't want CI red because of a known "we'll fix in next release" issue.
  • Audits surface judgement calls (e.g. "section has no illustration — is that intentional?") that automation shouldn't decide.
  • A failing audit feels like a verify; humans then treat audits as gates and the report falls into "didn't read it" hell.
If an issue MUST block release, promote it to a verify script (`web-visual-verification`) with `assert.*`. Audits inform; verifies gate.
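One way to make "always exits 0" hard to break is a shared entry-point wrapper. The sketch below is an assumption about how that could look: `auditMain`, `runAudit`, and `writeReport` are hypothetical names, and turning a crash into report text is a design choice of this sketch rather than a rule from the skill.

```javascript
// Sketch: run an audit function and never let it fail the process.
// runAudit is a hypothetical async function returning the report text;
// writeReport is a hypothetical async function that persists it.
async function auditMain(runAudit, writeReport) {
  let report;
  try {
    report = await runAudit();
  } catch (err) {
    // A crash becomes report content instead of a non-zero exit.
    report = `# Audit crashed\n\n❌ ${err.message}`;
  }
  await writeReport(report);
  return 0; // audits always report exit code 0
}

// Usage: even a throwing audit ends with exit code 0 and a readable report.
auditMain(
  async () => { throw new Error('course-data.js not found'); },
  async text => console.log(text),
).then(code => { process.exitCode = code; });
```

If a crash like this should stop a release, that check belongs in a verify script, not here.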

When to Audit (Cadence)

| Trigger | Audits to run |
| --- | --- |
| Adding/removing a unit | id-stability, material-references |
| Adding a material file | material-references (esp. the three-place sync) |
| Renumbering quiz | id-stability + manual check of hardcoded `(N題)` ("N questions") strings |
| Before corporate edition handoff | corp-content (inline ↔ source) + material-references |
| Before publishing ebook | inspect-docx-images / inspect-pdf-pagecount |
| Quarterly maintenance | illustrations (gap analysis) → plan next content cycle |

Anti-Patterns

  • Audit scripts with `assert.*` / non-zero exits: that's a verify, not an audit. Move it.
  • Audit reports that nobody reads — keep them short, lead with summary counts, use emojis aggressively for scanability.
  • One audit doing five jobs: split it. `audit-materials.mjs` and `audit-illustrations.mjs` produce different reports in different files for different decisions.
  • Running audits in CI as if they were verifies — they become noise, get muted, become useless. Run on demand or at release milestones.
  • No timestamp / version in the report — when comparing two audit runs, you need to know which is newer.

Hand-off

When this skill finishes:
  • A set of `audit-*.mjs` (and supporting `inspect-*.mjs`) scripts is in `scripts/`.
  • Their outputs land in `data/audit-*.md` (or `.json` for tooling consumption).
  • A README section in the project explains "how to read an audit report" (link to the convention above).
  • The user knows to run audits at release milestones, not on every commit.