cross-linker
Cross-Linker: Automated Wiki Cross-Referencing
You are weaving the wiki's knowledge graph tighter by finding and inserting missing `[[wikilinks]]` between pages that should reference each other but currently don't.
Follow the Retrieval Primitives table in `llm-wiki/SKILL.md`. Build the registry in Step 1 by grepping frontmatter only (not full pages). Reserve full `Read` for the unlinked-mention detection pass, and even there, only read pages whose summaries/titles make them plausible link targets. Blind full-vault reads are what this framework exists to avoid.
Before You Start
- Read `.env` to get `OBSIDIAN_VAULT_PATH`
- Read `index.md` to get the full inventory of pages and their one-line descriptions
- Skim `log.md` to see what was recently ingested (focus linking effort on new pages)
Step 1: Build the Page Registry
Glob all `.md` files in the vault (excluding `_archives/`, `.obsidian/`). For each page, extract:
- Filename (without `.md`): this is the wikilink target
- Title from frontmatter
- Aliases from frontmatter (if any)
- Tags from frontmatter
- Category from frontmatter or directory inference
- One-line summary: first sentence or `title` field
Build a lookup table:
`page_name → { path, title, aliases, tags, summary }`
This is your "vocabulary": every entry in this table is a valid wikilink target.
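The registry described above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: `read_frontmatter`, `as_list`, and `build_registry` are hypothetical helper names, the parser handles only flat `key: value` and `key: [a, b]` frontmatter rather than full YAML, and the summary simply falls back to the title so that no page body is read.

```python
from pathlib import Path

EXCLUDED_DIRS = {"_archives", ".obsidian"}


def read_frontmatter(path):
    """Read only the leading `---` block, stopping at its closing fence."""
    fields = {}
    with open(path, encoding="utf-8") as handle:
        if handle.readline().strip() != "---":
            return fields  # page has no frontmatter
        for line in handle:
            if line.strip() == "---":
                break  # end of frontmatter; the body is never read
            if line[:1] in (" ", "\t", "-"):
                continue  # nested or multi-line YAML is out of scope here
            key, sep, value = line.partition(":")
            if not sep:
                continue
            value = value.strip()
            if value.startswith("[") and value.endswith("]"):
                items = value[1:-1].split(",")
                fields[key.strip()] = [i.strip().strip("\"'") for i in items if i.strip()]
            else:
                fields[key.strip()] = value.strip("\"'")
    return fields


def as_list(value):
    """Normalize a missing, scalar, or list field to a list."""
    if not value:
        return []
    return value if isinstance(value, list) else [value]


def build_registry(vault_root):
    """Map page_name -> {path, title, aliases, tags, summary}."""
    root = Path(vault_root)
    registry = {}
    for path in sorted(root.rglob("*.md")):
        rel = path.relative_to(root)
        if EXCLUDED_DIRS.intersection(rel.parts):
            continue
        meta = read_frontmatter(path)
        title = meta.get("title") or path.stem
        # Assumption: pages with the same filename in different folders
        # overwrite each other here; a real registry would need to handle that.
        registry[path.stem] = {
            "path": rel.as_posix(),
            "title": title,
            "aliases": as_list(meta.get("aliases")),
            "tags": as_list(meta.get("tags")),
            "summary": title,
        }
    return registry
```

Reading line by line and stopping at the closing `---` keeps this pass within the frontmatter-only rule stated earlier.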
Step 2: Scan for Missing Links
For each page in the vault:
- Read the full content
- Extract existing wikilinks: find all `[[...]]` references already present
- Search for unlinked mentions: check if the page's text contains any of these, without being wrapped in `[[...]]`:
  - Page filenames (e.g., the word "MyProject" appears but `[[projects/my-project/my-project]]` is missing)
  - Page titles from frontmatter
  - Aliases from frontmatter
  - Entity names, project names, concept names from the registry
- Check for semantic connections: pages that share multiple tags or are in the same project directory but don't link to each other
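Extracting the wikilinks a page already contains could look like the sketch below. The function name `existing_link_targets` is hypothetical, and reducing each link to its lowercased final path segment is an assumption made so that `[[concepts/knowledge-graphs|knowledge graphs]]` and `[[knowledge-graphs]]` count as the same target.

```python
import re

# Captures the target of [[target]], [[target#heading]], and [[target|display]].
WIKILINK_RE = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def existing_link_targets(text):
    """Return the page names already linked, ignoring path, heading, and display text."""
    targets = set()
    for match in WIKILINK_RE.finditer(text):
        target = match.group(1).strip()
        # Keep only the last path segment so full-path and short links compare equal.
        targets.add(target.rsplit("/", 1)[-1].lower())
    return targets
```

The resulting set is what the unlinked-mention search needs in order to avoid proposing links that are already present.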
Matching Rules
- Case-insensitive matching for names (e.g., "my-project" matches page `MyProject`)
- Skip self-references: a page shouldn't link to itself
- Skip common words: don't link "the", "and", or generic terms. Only match on distinctive names
- Prefer the shortest unambiguous wikilink path: use `[[page-name]]` not `[[full/path/to/page-name]]` when the name is unique across the vault
- Don't link inside code blocks or frontmatter
- Don't double-link: if `[[foo]]` already appears on the page, don't add another
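The matching rules above can be combined into one search pass, sketched here under some assumptions: `find_unlinked_mentions` and `mask` are hypothetical names, `names_by_page` is assumed to list every name a page may be mentioned by (filename, title, aliases), and the stopword set and three-character minimum are placeholder stand-ins for "distinctive names". Only fenced and inline code are treated as code blocks.

```python
import re

FRONTMATTER_RE = re.compile(r"\A---\n.*?\n---\n", re.DOTALL)
FENCE_RE = re.compile(r"```.*?```", re.DOTALL)
INLINE_CODE_RE = re.compile(r"`[^`\n]*`")
WIKILINK_RE = re.compile(r"\[\[[^\[\]]*\]\]")
STOPWORDS = {"the", "and", "for", "with"}  # assumption: extend per vault


def mask(text, pattern):
    """Blank out matches with spaces so offsets into the text stay valid."""
    return pattern.sub(lambda m: " " * len(m.group(0)), text)


def find_unlinked_mentions(page_name, text, names_by_page):
    """Return {target_page: offset of its first unlinked mention}."""
    linked = set()
    for m in WIKILINK_RE.finditer(text):
        inner = m.group(0)[2:-2].split("|")[0].split("#")[0]
        linked.add(inner.rsplit("/", 1)[-1].strip().lower())
    searchable = text
    for pattern in (FRONTMATTER_RE, FENCE_RE, INLINE_CODE_RE, WIKILINK_RE):
        searchable = mask(searchable, pattern)
    hits = {}
    for target, names in names_by_page.items():
        if target == page_name or target.lower() in linked:
            continue  # skip self-references and double links
        for name in names:
            if len(name) < 3 or name.lower() in STOPWORDS:
                continue  # skip common or very short words
            found = re.search(
                r"(?<!\w)" + re.escape(name) + r"(?!\w)", searchable, re.IGNORECASE
            )
            if found:
                hits[target] = found.start()
                break
    return hits
```

Masking with spaces rather than deleting text means the returned offsets can be used directly against the original page when the link is inserted.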
Step 3: Score and Rank Suggestions
Not every possible link is worth adding. Score each candidate using a composite signal, then tag it with a confidence label.
Scoring
| Signal | Points | Example |
|---|---|---|
| Exact name match in text | +4 | "MyProject" appears in body text → link to my-project.md |
| Shared tags (2+) | +2 | Both tagged |
| Same project, no link | +2 | Both under |
| Mentioned entity/concept | +2 | Page mentions "knowledge graphs" → link to |
| Cross-category connection | +2 | Source is in |
| Peripheral→hub reach | +2 | Source page has ≤ 2 total links (peripheral) but target has ≥ 8 (hub) — connecting a loose page to a load-bearing concept |
| Partial name match | +1 | "graph" appears but page is |
Confidence labels
Tag each candidate with a confidence label based on its score:
| Score | Label | Action |
|---|---|---|
| ≥ 6 | EXTRACTED | Link is effectively certain — exact mention or very strong match. Apply inline. |
| 3–5 | INFERRED | Link is a reasonable inference — shared context, cross-category, peripheral→hub. Apply inline or as Related section. |
| 1–2 | AMBIGUOUS | Weak or partial match. Skip unless user specifically asks to connect loose pages. |
Only act on EXTRACTED and INFERRED candidates. Include the confidence label in the Cross-Link Report so the user can review INFERRED links before trusting them.
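As a rough illustration of the scoring and labeling tables, the sketch below sums the points for whichever signals fired and maps the total to a label. The signal key names are hypothetical identifiers for the table rows. Counting each signal at most once per candidate is also an assumption.

```python
# Points per signal, taken from the Scoring table; key names are illustrative.
SIGNAL_POINTS = {
    "exact_name_match": 4,
    "shared_tags": 2,
    "same_project_no_link": 2,
    "mentioned_entity": 2,
    "cross_category": 2,
    "peripheral_to_hub": 2,
    "partial_name_match": 1,
}


def score_candidate(signals):
    """Sum the points of the distinct signals that fired for one candidate link."""
    return sum(SIGNAL_POINTS[name] for name in set(signals))


def confidence_label(score):
    """Map a score to the label from the Confidence labels table."""
    if score >= 6:
        return "EXTRACTED"
    if score >= 3:
        return "INFERRED"
    if score >= 1:
        return "AMBIGUOUS"
    return None  # no signal fired; not a candidate


def should_apply(score):
    """Only EXTRACTED and INFERRED candidates are acted on."""
    return confidence_label(score) in ("EXTRACTED", "INFERRED")
```

Under these point values an exact name match on its own scores 4 and is labeled INFERRED. It needs at least one supporting signal, such as shared tags, to reach EXTRACTED.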
Step 4: Apply Links
For each page with missing links:
4a: Inline linking (preferred)
Find the first natural mention of the term in the body text and wrap it in wikilinks:
Before:

```markdown
This project uses knowledge graphs to connect entities.
```

After:

```markdown
This project uses [[concepts/knowledge-graphs|knowledge graphs]] to connect entities.
```

Use the `[[path|display text]]` format when the wikilink path differs from the display text.
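A minimal sketch of the inline rewrite is shown below. `link_first_mention` is a hypothetical helper, and it assumes the caller has already confirmed the mention is not inside a code block, frontmatter, or an existing link.

```python
import re


def link_first_mention(text, term, target):
    """Wrap only the first whole-word occurrence of `term` in a wikilink to `target`."""
    pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)
    match = pattern.search(text)
    if not match:
        return text  # nothing to link; leave the page untouched
    shown = match.group(0)  # preserve the author's original casing
    # Use the [[path|display text]] form only when path and display text differ.
    link = f"[[{target}]]" if shown == target else f"[[{target}|{shown}]]"
    return text[:match.start()] + link + text[match.end():]
```

Later occurrences of the term are deliberately left as plain text, so only the first natural mention is linked.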
4b: Related section (fallback)
If the term isn't mentioned naturally in the body but the pages are semantically related (shared tags, same project), add a `## Related` section at the bottom of the page:

```markdown
## Related

- [[projects/my-project/my-project]] - Also uses AI agents for research automation
- [[concepts/knowledge-graphs]] - Core technique used in this project
```

If a `## Related` section already exists, append to it. Don't duplicate existing entries.
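The append-or-create behavior for the fallback section might be sketched as follows. `add_related` is a hypothetical helper, and the `heading` and `separator` parameters are assumptions that let the same code target a different curated section or entry style.

```python
import re

HEADING_RE = re.compile(r"#{1,6}\s")
TARGET_RE = re.compile(r"\[\[([^\[\]|#]+)")


def add_related(text, entries, heading="## Related", separator=" - "):
    """Append (target, note) entries under `heading`, creating it at the end if absent."""
    lines = text.rstrip("\n").split("\n")
    start = next((i for i, line in enumerate(lines) if line.strip() == heading), None)
    if start is None:
        lines += ["", heading, ""]
        start = len(lines) - 2
    # The section runs until the next heading or the end of the page.
    end = next(
        (i for i in range(start + 1, len(lines)) if HEADING_RE.match(lines[i])),
        len(lines),
    )
    section = lines[start + 1:end]
    while section and not section[-1].strip():
        section.pop()  # drop trailing blank lines before appending
    present = {m.group(1).strip() for line in section for m in TARGET_RE.finditer(line)}
    for target, note in entries:
        if target not in present:  # don't duplicate existing entries
            section.append(f"- [[{target}]]{separator}{note}")
            present.add(target)
    tail = lines[end:]
    rebuilt = lines[:start + 1] + section + ([""] + tail if tail else [])
    return "\n".join(rebuilt) + "\n"
```

Passing a different `heading` would let the same routine extend an existing curated section instead of creating a new one.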
Step 5: Report
Present a summary:

```markdown
Cross-Link Report

Links Added: 23 across 12 pages

| Page | Links Added | Confidence | Type |
|---|---|---|---|
|  | 3 | EXTRACTED | 2 inline, 1 related |
|  | 5 | INFERRED | 3 inline, 2 related |
| ... |

Orphan Pages Remaining: 2
- `references/foo.md`: no incoming or outgoing links found
- `concepts/bar.md`: could not find related pages

Pages Skipped: 3
- `index.md`, `log.md`: special files
- `_archives/*`: archived content
```
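The orphan list in the report can be derived from the link graph. The sketch below assumes a hypothetical `outgoing` mapping from each scanned page to the set of pages it links to, and treats a page as an orphan when it has neither incoming nor outgoing links.

```python
def find_orphans(outgoing):
    """Return pages with no outgoing links that no other page links to."""
    has_incoming = {target for targets in outgoing.values() for target in targets}
    return sorted(
        page
        for page, targets in outgoing.items()
        if not targets and page not in has_incoming
    )
```

Running this after links have been applied gives the "Orphan Pages Remaining" count directly as the length of the returned list.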
Step 6: Update Log
Append to `log.md`:
`- [TIMESTAMP] CROSS_LINK pages_scanned=N links_added=M pages_modified=P orphans_remaining=Q`
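Formatting that log line could be sketched as below. `format_log_entry` is a hypothetical helper, and the UTC ISO 8601 timestamp is an assumption, since the source only specifies a `TIMESTAMP` placeholder.

```python
from datetime import datetime, timezone


def format_log_entry(pages_scanned, links_added, pages_modified, orphans_remaining, now=None):
    """Build one CROSS_LINK line for log.md; `now` should be a UTC datetime if given."""
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"- [{stamp}] CROSS_LINK pages_scanned={pages_scanned} "
        f"links_added={links_added} pages_modified={pages_modified} "
        f"orphans_remaining={orphans_remaining}"
    )
```

The returned string is meant to be appended as a single line, so existing log entries are never rewritten.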
Tips
- Run after every ingest. New pages are almost always poorly connected. This is the fix.
- Be conservative with inline links. Only link the first natural mention, not every occurrence.
- Don't touch pages in `_archives/`. Those are frozen snapshots.
- Respect existing structure. If a page carefully curates its links in a `## Key Concepts` section, add to that section rather than creating a separate `## Related`.
- Entity pages are link magnets. An entity like `jane-doe` should be linked from almost every project page. Prioritize these.