wiki-lint

Wiki Lint — Health Audit

You are performing a health check on an Obsidian wiki. Your goal is to find and fix structural issues that degrade the wiki's value over time.
Before scanning anything: follow the Retrieval Primitives table in llm-wiki/SKILL.md. Prefer frontmatter-scoped greps and section-anchored reads over full-page reads. On a large vault, blindly reading every page to lint it is exactly what this framework is built to avoid.

Before You Start

  1. Read .env to get OBSIDIAN_VAULT_PATH
  2. Read index.md for the full page inventory
  3. Read log.md for recent activity context
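
As a rough illustration of step 1, the sketch below pulls OBSIDIAN_VAULT_PATH out of a .env file. It assumes plain KEY=VALUE lines with optional quotes and no variable expansion; the function name read_vault_path is hypothetical and not part of the skill.

```python
from pathlib import Path


def read_vault_path(env_file: str = ".env") -> Path:
    """Return OBSIDIAN_VAULT_PATH from a simple KEY=VALUE .env file (assumed format)."""
    for raw in Path(env_file).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "OBSIDIAN_VAULT_PATH":
            # Strip optional surrounding quotes; no variable expansion is attempted.
            return Path(value.strip().strip("'\""))
    raise KeyError("OBSIDIAN_VAULT_PATH not found in " + env_file)
```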

Lint Checks

Run these checks in order. Report findings as you go.

1. Orphaned Pages

Find pages with zero incoming wikilinks. These are knowledge islands that nothing connects to.
How to check:
  • Glob all .md files in the vault
  • For each page, Grep the rest of the vault for [[page-name]] references
  • Pages with zero incoming links (except index.md and log.md) are orphans
How to fix:
  • Identify which existing pages should link to the orphan
  • Add wikilinks in appropriate sections
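
The check above could be sketched as follows. This is a minimal illustration, not the skill's actual implementation: it collects link targets in one pass instead of grepping once per page, and it assumes page names are unique file stems and that links may carry a |alias or #heading suffix. The function name find_orphans is hypothetical.

```python
import re
from pathlib import Path

# Capture only the link target; stops at "|" (alias) or "#" (heading anchor).
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def find_orphans(vault: Path, exempt=("index", "log")) -> list[str]:
    """Return page stems that no other page links to (assumes unique stems)."""
    pages = {p.stem: p for p in vault.rglob("*.md")}
    linked: set[str] = set()
    for name, path in pages.items():
        for target in WIKILINK.findall(path.read_text(encoding="utf-8")):
            target = target.strip().split("/")[-1]  # tolerate path-style targets
            if target != name:  # a self-link does not count as incoming
                linked.add(target)
    return sorted(n for n in pages if n not in linked and n not in exempt)
```

On a large vault this still reads every page body once, so it is best run as a single batch pass rather than repeated per check.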

2. Broken Wikilinks

Find [[wikilinks]] that point to pages that don't exist.
How to check:
  • Grep for \[\[.*?\]\] across all pages
  • Extract the link targets
  • Check if a corresponding .md file exists
How to fix:
  • If the target was renamed, update the link
  • If the target should exist, create it
  • If the link is wrong, remove or correct it
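
A minimal sketch of this check, under a few assumptions: targets resolve by file stem, any |alias or #heading suffix is ignored, and links to non-markdown attachments would need extra handling that is not shown. The name find_broken_links is hypothetical.

```python
import re
from pathlib import Path

WIKILINK = re.compile(r"\[\[(.*?)\]\]")  # same pattern as the grep above


def find_broken_links(vault: Path) -> list[tuple[str, int, str]]:
    """Return (file, line number, target) for links with no matching .md file."""
    known = {p.stem for p in vault.rglob("*.md")}
    broken = []
    for path in sorted(vault.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for inner in WIKILINK.findall(line):
                # Drop the alias and heading parts to get the bare target.
                target = inner.split("|")[0].split("#")[0].strip()
                if target and target.split("/")[-1] not in known:
                    broken.append((path.relative_to(vault).as_posix(), lineno, target))
    return broken
```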

3. Missing Frontmatter

Every page should have: title, category, tags, sources, created, updated.
How to check:
  • Grep frontmatter blocks (scope to ^--- at file heads) instead of reading every page in full
  • Flag pages missing required fields
How to fix:
  • Add missing fields with reasonable defaults
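
One way to keep this check scoped to the file head is sketched below. It stops reading at the closing --- line and only looks at top-level keys, without a full YAML parser. The helper names frontmatter_keys and missing_fields are hypothetical.

```python
from pathlib import Path

REQUIRED = ("title", "category", "tags", "sources", "created", "updated")


def frontmatter_keys(path: Path) -> set[str]:
    """Collect top-level keys from the leading --- block without reading the body."""
    keys: set[str] = set()
    with path.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return keys  # no frontmatter at all
        for line in fh:
            if line.strip() == "---":
                break  # end of frontmatter; stop before the body
            # Ignore indented lines, list items, and comments.
            if line[:1] not in (" ", "\t", "-", "#") and ":" in line:
                keys.add(line.split(":", 1)[0].strip())
    return keys


def missing_fields(path: Path) -> list[str]:
    """Return required fields absent from the page's frontmatter."""
    present = frontmatter_keys(path)
    return [f for f in REQUIRED if f not in present]
```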

3a. Missing Summary (soft warning)

Every page should have a summary: frontmatter field of 1–2 sentences, ≤200 chars. This is what cheap retrieval (e.g. wiki-query's index-only mode) reads to avoid opening page bodies.
How to check:
  • Grep frontmatter for ^summary: across the vault
  • Flag pages without it, but as a soft warning, not an error. Older pages predating this field are fine; the check exists to nudge ingest skills into filling it on new writes.
  • Also flag pages whose summary exceeds 200 chars.
How to fix:
  • Re-ingest the page, or manually write a short summary (1–2 sentences of the page's content).
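
The two soft checks could look roughly like this. The sketch assumes the summary is written on a single line (YAML block scalars are not handled) and reads only the frontmatter block. The name summary_warning is hypothetical.

```python
from pathlib import Path
from typing import Optional

MAX_SUMMARY_CHARS = 200


def summary_warning(path: Path) -> Optional[str]:
    """Return a soft-warning message, or None if the summary looks fine."""
    with path.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return "no summary: field"
        for line in fh:
            if line.strip() == "---":
                break  # end of frontmatter; never read the body
            if line.startswith("summary:"):
                text = line[len("summary:"):].strip().strip("'\"")
                if not text:
                    return "no summary: field"
                if len(text) > MAX_SUMMARY_CHARS:
                    return "summary exceeds 200 chars"
                return None
    return "no summary: field"
```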

4. Stale Content

Pages whose updated timestamp is old relative to their sources.
How to check:
  • Compare page updated timestamps to source file modification times
  • Flag pages where sources have been modified after the page was last updated
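
A sketch of the comparison, assuming updated is an ISO date (YYYY-MM-DD, optionally followed by a time) and that each source has already been resolved to a local file path. Neither assumption is confirmed by the text above, and the name stale_sources is hypothetical.

```python
from datetime import date, datetime
from pathlib import Path


def stale_sources(updated: str, sources: list[Path]) -> list[Path]:
    """Return sources modified on a later day than the page's `updated` date."""
    page_date = date.fromisoformat(updated[:10])
    stale = []
    for src in sources:
        if not src.exists():
            continue  # unresolved sources are skipped, not flagged
        modified = datetime.fromtimestamp(src.stat().st_mtime).date()
        if modified > page_date:
            stale.append(src)
    return stale
```

Comparing at day granularity avoids false positives when a page and its source are touched on the same day.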

5. Contradictions

Claims that conflict across pages.
How to check:
  • This requires reading related pages and comparing claims
  • Focus on pages that share tags or are heavily cross-referenced
  • Look for phrases like "however", "in contrast", or "despite". They often mark a contradiction the page already acknowledges, which helps separate acknowledged contradictions from unacknowledged ones
How to fix:
  • Add an "Open Questions" section noting the contradiction
  • Reference both sources and their claims

6. Index Consistency

Verify index.md matches the actual page inventory.
How to check:
  • Compare pages listed in index.md to actual files on disk
  • Check that summaries in index.md still match page content
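
The first bullet can be approximated with a set difference. The sketch assumes index.md lists pages as [[wikilinks]] and that stems are unique; if the index uses another format, the extraction step would differ. It does not cover the summary comparison in the second bullet. The name index_diff is hypothetical.

```python
import re
from pathlib import Path


def index_diff(vault: Path, exempt=("index.md", "log.md")):
    """Return (pages on disk but not indexed, pages indexed but not on disk)."""
    index_text = (vault / "index.md").read_text(encoding="utf-8")
    listed = {
        m.split("|")[0].split("#")[0].strip().split("/")[-1]
        for m in re.findall(r"\[\[(.*?)\]\]", index_text)
    }
    on_disk = {p.stem for p in vault.rglob("*.md") if p.name not in exempt}
    return sorted(on_disk - listed), sorted(listed - on_disk)
```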

7. Provenance Drift

Check whether pages are being honest about how much of their content is inferred vs extracted. See the Provenance Markers section in llm-wiki for the convention.
How to check:
  • For each page with a provenance: block or any ^[inferred] / ^[ambiguous] markers, count sentences/bullets and how many end with each marker
  • Compute rough fractions (extracted, inferred, ambiguous)
  • Apply these thresholds:
    • AMBIGUOUS > 15%: flag as "speculation-heavy". Even 1-in-7 claims being genuinely uncertain is a signal the page needs tighter sourcing or should be moved to synthesis/
    • INFERRED > 40% with no sources: in frontmatter: flag as "unsourced synthesis". The page is making connections but has nothing to cite
    • Hub pages (top 10 by incoming wikilink count) with INFERRED > 20%: flag as "high-traffic page with questionable provenance". Errors on hub pages propagate to every page that links to them
    • Drift: if the page has a provenance: frontmatter block, flag it when any field is more than 0.20 off from the recomputed value
  • Skip pages with no provenance: frontmatter and no markers; they are treated as fully extracted by convention
How to fix:
  • For ambiguous-heavy: re-ingest from sources, resolve the uncertain claims, or split speculative content into a synthesis/ page
  • For unsourced synthesis: add sources: to frontmatter or clearly label the page as synthesis
  • For hub pages with INFERRED > 20%: prioritize for re-ingestion, since errors here have the widest blast radius
  • For drift: update the provenance: frontmatter to match the recomputed values
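
The counting and thresholds above could be sketched as follows. As a simplification, each non-blank, non-heading line of the page body is treated as one claim, which is coarser than counting sentences; the body is assumed to be passed in without its frontmatter. The function names are hypothetical.

```python
import re

# A claim counts as marked when the line ends with the marker (optional trailing period).
MARKER = re.compile(r"\^\[(inferred|ambiguous)\][\s.]*$")


def provenance_fractions(body: str) -> dict[str, float]:
    """Rough fractions of extracted / inferred / ambiguous claims, one claim per line."""
    counts = {"extracted": 0, "inferred": 0, "ambiguous": 0}
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and headings
        m = MARKER.search(line)
        counts[m.group(1) if m else "extracted"] += 1
    total = sum(counts.values()) or 1
    return {k: round(v / total, 2) for k, v in counts.items()}


def provenance_flags(fractions, has_sources, is_hub, declared=None):
    """Apply the four thresholds; `declared` is the page's provenance: block, if any."""
    flags = []
    if fractions["ambiguous"] > 0.15:
        flags.append("speculation-heavy")
    if fractions["inferred"] > 0.40 and not has_sources:
        flags.append("unsourced synthesis")
    if is_hub and fractions["inferred"] > 0.20:
        flags.append("high-traffic page with questionable provenance")
    if declared and any(
        abs(declared[k] - fractions[k]) > 0.20 for k in declared if k in fractions
    ):
        flags.append("drift")
    return flags
```

Whether a page is a hub (top 10 by incoming wikilinks) is left to the caller, since that ranking needs the vault-wide link counts.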

8. Fragmented Tag Clusters

Checks whether pages that share a tag are actually linked to each other. Tags imply a topic cluster; if those pages don't reference each other, the cluster is fragmented — knowledge islands that should be woven together.
How to check:
  • For each tag that appears on ≥ 5 pages:
    • n = count of pages with this tag
    • actual_links = count of wikilinks between any two pages in this tag group (check both directions)
    • cohesion = actual_links / (n × (n−1) / 2)
  • Flag any tag group where cohesion < 0.15 and n ≥ 5
How to fix:
  • Run the cross-linker skill targeted at the fragmented tag; it will surface and insert the missing links
  • If a tag group is large (n > 15) and still fragmented, consider splitting it into more specific sub-tags
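
A minimal sketch of the cohesion computation. It assumes actual_links counts each pair of pages once when a link exists in either direction, which keeps cohesion between 0 and 1; the text above does not state this explicitly. The inputs (a tag-to-pages map and a page-to-outgoing-links map) and the function names are hypothetical.

```python
from itertools import combinations


def tag_cohesion(pages: list[str], links: dict[str, set[str]]) -> float:
    """Linked pairs divided by possible pairs, n * (n - 1) / 2."""
    n = len(pages)
    if n < 2:
        return 0.0
    linked_pairs = sum(
        1
        for a, b in combinations(pages, 2)
        # Check both directions, but count the pair only once.
        if b in links.get(a, set()) or a in links.get(b, set())
    )
    return linked_pairs / (n * (n - 1) / 2)


def fragmented_tags(tag_pages, links, min_pages=5, threshold=0.15):
    """Return tags with at least min_pages pages and cohesion below threshold."""
    return sorted(
        tag
        for tag, pages in tag_pages.items()
        if len(pages) >= min_pages and tag_cohesion(sorted(pages), links) < threshold
    )
```

For example, a tag on 5 pages has 10 possible pairs, so a single linked pair gives cohesion 0.10 and the tag is flagged.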

Output Format

Report findings as a structured list:

Wiki Health Report

Orphaned Pages (N found)

  • concepts/foo.md — no incoming links

Broken Wikilinks (N found)

  • entities/bar.md:15 — links to [[nonexistent-page]]

Missing Frontmatter (N found)

  • skills/baz.md — missing: tags, sources

Stale Content (N found)

  • references/paper-x.md — source modified 2024-03-10, page last updated 2024-01-05

Contradictions (N found)

  • concepts/scaling.md claims "X" but synthesis/efficiency.md claims "not X"

Index Issues (N found)

  • concepts/new-page.md exists on disk but not in index.md

Missing Summary (N found — soft)

  • concepts/foo.md — no summary: field
  • entities/bar.md — summary exceeds 200 chars

Provenance Issues (N found)

  • concepts/scaling.md — AMBIGUOUS > 15%: 22% of claims are ambiguous (re-source or move to synthesis/)
  • entities/some-tool.md — drift: frontmatter says inferred=0.10, recomputed=0.45
  • concepts/transformers.md — hub page (31 incoming links) with INFERRED=28%: errors here propagate widely
  • synthesis/speculation.md — unsourced synthesis: no sources: field, 55% inferred

Fragmented Tag Clusters (N found)

  • #systems — 7 pages, cohesion=0.06 ⚠️ — run cross-linker on this tag
  • #databases — 5 pages, cohesion=0.10 ⚠️

After Linting

Append to log.md:
- [TIMESTAMP] LINT issues_found=N orphans=X broken_links=Y stale=Z contradictions=W prov_issues=P missing_summary=S fragmented_clusters=F
Offer to fix issues automatically or let the user decide which to address.
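
Building and appending the log entry might look like the sketch below. The TIMESTAMP format is not specified above, so the caller supplies it as a string; issues_found is also passed in explicitly because the text does not say how it is totalled. The function names are hypothetical.

```python
from pathlib import Path

# Field order follows the log line shown above.
FIELDS = ("orphans", "broken_links", "stale", "contradictions",
          "prov_issues", "missing_summary", "fragmented_clusters")


def lint_log_line(issues_found: int, counts: dict[str, int], timestamp: str) -> str:
    """Format one LINT entry; missing counts default to 0."""
    parts = " ".join(f"{name}={counts.get(name, 0)}" for name in FIELDS)
    return f"- [{timestamp}] LINT issues_found={issues_found} {parts}"


def append_to_log(vault: Path, line: str) -> None:
    """Append the entry to log.md in the vault root (creates the file if absent)."""
    with (vault / "log.md").open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```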