llm-wiki

Karpathy's LLM Wiki

Build and maintain a persistent, compounding knowledge base as interlinked markdown files. Based on Andrej Karpathy's LLM Wiki pattern.

Unlike traditional RAG (which rediscovers knowledge from scratch per query), the wiki compiles knowledge once and keeps it current. Cross-references are already there. Contradictions have already been flagged. Synthesis reflects everything ingested.

Division of labor: The human curates sources and directs analysis. The agent summarizes, cross-references, files, and maintains consistency.

When This Skill Activates

Use this skill when the user:

- Asks to create, build, or start a wiki or knowledge base
- Asks to ingest, add, or process a source into their wiki
- Asks a question and an existing wiki is present at the configured path
- Asks to lint, audit, or health-check their wiki
- References their wiki, knowledge base, or "notes" in a research context

Wiki Location

Location: Set via the `WIKI_PATH` environment variable (e.g. in `~/.hermes/.env`). If unset, defaults to `~/wiki`.

```bash
WIKI="${WIKI_PATH:-$HOME/wiki}"
```

The wiki is just a directory of markdown files — open it in Obsidian, VS Code, or any editor. No database, no special tooling required.

Architecture: Three Layers

wiki/
├── SCHEMA.md           # Conventions, structure rules, domain config
├── index.md            # Sectioned content catalog with one-line summaries
├── log.md              # Chronological action log (append-only, rotated yearly)
├── raw/                # Layer 1: Immutable source material
│   ├── articles/       # Web articles, clippings
│   ├── papers/         # PDFs, arxiv papers
│   ├── transcripts/    # Meeting notes, interviews
│   └── assets/         # Images, diagrams referenced by sources
├── entities/           # Layer 2: Entity pages (people, orgs, products, models)
├── concepts/           # Layer 2: Concept/topic pages
├── comparisons/        # Layer 2: Side-by-side analyses
└── queries/            # Layer 2: Filed query results worth keeping

Layer 1 (Raw Sources): Immutable. The agent reads but never modifies these.
Layer 2 (The Wiki): Agent-owned markdown files. Created, updated, and cross-referenced by the agent.
Layer 3 (The Schema): `SCHEMA.md` defines structure, conventions, and tag taxonomy.

Resuming an Existing Wiki (CRITICAL — do this every session)

When the user has an existing wiki, always orient yourself before doing anything:

① Read `SCHEMA.md`: understand the domain, conventions, and tag taxonomy.
② Read `index.md`: learn what pages exist and their summaries.
③ Scan recent `log.md`: read the last 20-30 entries to understand recent activity.

```bash
WIKI="${WIKI_PATH:-$HOME/wiki}"
# Orientation reads at session start
read_file "$WIKI/SCHEMA.md"
read_file "$WIKI/index.md"
read_file "$WIKI/log.md" offset=<last 30 lines>
```


Only after orientation should you ingest, query, or lint. This prevents:
- Creating duplicate pages for entities that already exist
- Missing cross-references to existing content
- Contradicting the schema's conventions
- Repeating work already logged

For large wikis (100+ pages), also run a quick `search_files` for the topic
at hand before creating anything new.

Initializing a New Wiki

When the user asks to create or start a wiki:

1. Determine the wiki path (from the `$WIKI_PATH` env var, or ask the user; default `~/wiki`)
2. Create the directory structure above
3. Ask the user what domain the wiki covers, and be specific
4. Write `SCHEMA.md` customized to the domain (see template below)
5. Write initial `index.md` with sectioned header
6. Write initial `log.md` with creation entry
7. Confirm the wiki is ready and suggest first sources to ingest
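
The initialization steps can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the function name `init_wiki`, the `#`-style headings in the seed files, and the exact seed text are assumptions made for the example. The directory names come from the architecture tree above.

```python
from datetime import date
from pathlib import Path

# Directory names follow the architecture tree shown earlier.
SUBDIRS = [
    "raw/articles", "raw/papers", "raw/transcripts", "raw/assets",
    "entities", "concepts", "comparisons", "queries",
]


def init_wiki(root: Path, domain: str) -> list[Path]:
    """Create the directory tree and seed files; return the files written."""
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    today = date.today().isoformat()
    # Seed contents are illustrative placeholders, not the full templates.
    seeds = {
        "SCHEMA.md": f"# Wiki Schema\n\n## Domain\n\n{domain}\n",
        "index.md": (
            f"# Wiki Index\n\nLast updated: {today} | Total pages: 0\n\n"
            "## Entities\n\n## Concepts\n\n## Comparisons\n\n## Queries\n"
        ),
        "log.md": (
            f"# Wiki Log\n\n## [{today}] create | Wiki initialized\n"
            f"- Domain: {domain}\n"
        ),
    }
    written = []
    for name, text in seeds.items():
        path = root / name
        if not path.exists():  # never overwrite files of an existing wiki
            path.write_text(text, encoding="utf-8")
            written.append(path)
    return written
```

Calling `init_wiki(Path.home() / "wiki", "AI/ML research")` a second time writes nothing, which keeps an accidental re-run from clobbering an existing schema or log.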

SCHEMA.md Template

Adapt to the user's domain. The schema constrains agent behavior and ensures consistency:

Wiki Schema

Domain

[What this wiki covers — e.g., "AI/ML research", "personal health", "startup intelligence"]

Conventions

- File names: lowercase, hyphens, no spaces (e.g., `transformer-architecture.md`)
- Every wiki page starts with YAML frontmatter (see below)
- Use `[[wikilinks]]` to link between pages (minimum 2 outbound links per page)
- When updating a page, always bump the `updated` date
- Every new page must be added to `index.md` under the correct section
- Every action must be appended to `log.md`
- Provenance markers: On pages that synthesize 3+ sources, append `^[raw/articles/source-file.md]` at the end of paragraphs whose claims come from a specific source. This lets a reader trace each claim back without re-reading the whole raw file. Optional on single-source pages where the `sources:` frontmatter is enough.

Frontmatter

yaml

  ---
  title: Page Title
  created: YYYY-MM-DD
  updated: YYYY-MM-DD
  type: entity | concept | comparison | query | summary
  tags: [from taxonomy below]
  sources: [raw/articles/source-name.md]
  # Optional quality signals:
  confidence: high | medium | low        # how well-supported the claims are
  contested: true                        # set when the page has unresolved contradictions
  contradictions: [other-page-slug]      # pages this one conflicts with
  ---

`confidence` and `contested` are optional but recommended for opinion-heavy or fast-moving topics. Lint surfaces `contested: true` and `confidence: low` pages for review so weak claims don't silently harden into accepted wiki fact.
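
A frontmatter check along these lines could be written with the standard library alone. The sketch below is an assumption-laden illustration: `parse_frontmatter` and `validate_page` are hypothetical helper names, and the parser only understands flat `key: value` lines and inline `[a, b]` lists, which is enough for the block shown above but is not a full YAML parser. The required-field list is the one the lint section names (title, created, updated, type, tags, sources).

```python
import re

REQUIRED = ("title", "created", "updated", "type", "tags", "sources")


def parse_frontmatter(text: str) -> dict:
    """Parse a flat frontmatter block delimited by '---' lines.

    Deliberately minimal: scalars and inline [a, b] lists only.
    """
    lines = text.lstrip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        line = line.strip()
        if line == "---":
            break
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        # Drop a trailing "  # comment" (needs whitespace before the '#').
        value = re.sub(r"\s+#.*$", "", value).strip()
        if value.startswith("[") and value.endswith("]"):
            value = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        fields[key.strip()] = value
    return fields


def validate_page(text: str, taxonomy: set) -> list:
    """Return human-readable problems; an empty list means the page passes."""
    fm = parse_frontmatter(text)
    problems = [f"missing field: {k}" for k in REQUIRED if k not in fm]
    tags = fm.get("tags", [])
    if isinstance(tags, list):
        problems += [f"tag not in taxonomy: {t}" for t in tags if t not in taxonomy]
    return problems
```

A wiki with nested or multi-line YAML would need a real YAML library instead; the point here is only to show which checks the frontmatter makes possible.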

raw/ Frontmatter

Raw sources ALSO get a small frontmatter block so re-ingests can detect drift:

yaml

---
source_url: https://example.com/article   # original URL, if applicable
ingested: YYYY-MM-DD
sha256: <hex digest of the raw content below the frontmatter>
---

The `sha256:` field lets a future re-ingest of the same URL skip processing when content is unchanged, and flag drift when it has changed. Compute over the body only (everything after the closing `---`), not the frontmatter itself.
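
One way to compute and compare the body hash is sketched here with Python's `hashlib`. The helper names (`split_frontmatter`, `body_sha256`, `stored_sha256`, `has_drifted`) are hypothetical, and the sketch assumes the body starts immediately after the closing `---` line and is hashed as UTF-8 bytes. Whatever convention is chosen for leading blank lines has to be applied consistently at ingest and at re-check, otherwise every file will appear to have drifted.

```python
import hashlib
from typing import Optional, Tuple


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Split raw file text into (frontmatter, body).

    If the file does not open with a '---' line, everything is body.
    """
    lines = text.splitlines(keepends=True)
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return "".join(lines[: i + 1]), "".join(lines[i + 1:])
    return "", text


def body_sha256(text: str) -> str:
    """Hex digest of the body only, excluding the frontmatter block."""
    _, body = split_frontmatter(text)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def stored_sha256(text: str) -> Optional[str]:
    """Read the sha256 value recorded in the frontmatter, if any."""
    front, _ = split_frontmatter(text)
    for line in front.splitlines():
        if line.startswith("sha256:"):
            return line.split(":", 1)[1].strip()
    return None


def has_drifted(text: str) -> bool:
    """True when a stored hash exists and no longer matches the body."""
    stored = stored_sha256(text)
    return stored is not None and stored != body_sha256(text)
```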

Tag Taxonomy

[Define 10-20 top-level tags for the domain. Add new tags here BEFORE using them.]

Example for AI/ML:

Models: model, architecture, benchmark, training
People/Orgs: person, company, lab, open-source
Techniques: optimization, fine-tuning, inference, alignment, data
Meta: comparison, timeline, controversy, prediction

Rule: every tag on a page must appear in this taxonomy. If a new tag is needed, add it here first, then use it. This prevents tag sprawl.

Page Thresholds

- Create a page when an entity/concept appears in 2+ sources OR is central to one source
- Add to existing page when a source mentions something already covered
- DON'T create a page for passing mentions, minor details, or things outside the domain
- Split a page when it exceeds ~200 lines: break into sub-topics with cross-links
- Archive a page when its content is fully superseded: move to `_archive/`, remove from index
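
The threshold rules above reduce to a small decision function. The sketch below is only an illustration of the logic; the `Mention` record and the function names are hypothetical, and how "central to one source" is judged is left to the agent.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    source_count: int          # distinct sources that mention the entity/concept
    central_to_a_source: bool  # judged by the agent while reading
    in_domain: bool            # falls inside the wiki's declared domain
    page_exists: bool          # a wiki page already covers it


def page_action(m: Mention) -> str:
    """Return 'create', 'update', or 'skip' per the page thresholds."""
    if not m.in_domain:
        return "skip"
    if m.page_exists:
        return "update"
    if m.source_count >= 2 or m.central_to_a_source:
        return "create"
    return "skip"  # passing mention or minor detail


def needs_split(line_count: int, limit: int = 200) -> bool:
    """Pages longer than the limit are candidates for splitting."""
    return line_count > limit
```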

Entity Pages

One page per notable entity. Include:

Overview / what it is
Key facts and dates
Relationships to other entities ([[wikilinks]])
Source references

Concept Pages

One page per concept or topic. Include:

Definition / explanation
Current state of knowledge
Open questions or debates
Related concepts ([[wikilinks]])

Comparison Pages

Side-by-side analyses. Include:

What is being compared and why
Dimensions of comparison (table format preferred)
Verdict or synthesis
Sources

Update Policy

When new information conflicts with existing content:

1. Check the dates: newer sources generally supersede older ones
2. If genuinely contradictory, note both positions with dates and sources
3. Mark the contradiction in frontmatter: `contradictions: [page-name]`
4. Flag for user review in the lint report

index.md Template

The index is sectioned by type. Each entry is one line: wikilink + summary.

Wiki Index

Content catalog. Every wiki page listed under its type with a one-line summary. Read this first to find relevant pages for any query.
Last updated: YYYY-MM-DD | Total pages: N

Entities

Concepts

Comparisons

Queries


**Scaling rule:** When any section exceeds 50 entries, split it into sub-sections
by first letter or sub-domain. When the index exceeds 200 entries total, create
a `_meta/topic-map.md` that groups pages by theme for faster navigation.


log.md Template

Wiki Log

Chronological record of all wiki actions. Append-only. Format: `## [YYYY-MM-DD] action | subject`
Actions: ingest, update, query, lint, create, archive, delete
When this file exceeds 500 entries, rotate: rename to log-YYYY.md, start fresh.

## [YYYY-MM-DD] create | Wiki initialized

- Domain: [domain]
- Structure created with SCHEMA.md, index.md, log.md
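
The entry format and the rotation rule can be captured in one small helper. This is a sketch under stated assumptions: `append_log` is a hypothetical name, entries are counted by matching the `## [YYYY-MM-DD]` heading pattern, and the archive is named after the current year. It does not handle the case where `log-YYYY.md` already exists from an earlier rotation in the same year.

```python
import re
from datetime import date
from pathlib import Path

ACTIONS = {"ingest", "update", "query", "lint", "create", "archive", "delete"}
ENTRY_RE = re.compile(r"^## \[\d{4}-\d{2}-\d{2}\] ", re.MULTILINE)
MAX_ENTRIES = 500  # rotation threshold stated in the template


def append_log(wiki, action, subject, details=(), today=None):
    """Append one entry to log.md, rotating first if it is over the limit."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    today = today or date.today()
    log = Path(wiki) / "log.md"
    if log.exists():
        entries = len(ENTRY_RE.findall(log.read_text(encoding="utf-8")))
        if entries > MAX_ENTRIES:
            # Assumption: the archive takes the current year as its suffix.
            log.rename(log.with_name(f"log-{today.year}.md"))
    if not log.exists():
        log.write_text("# Wiki Log\n", encoding="utf-8")
    # Open in append mode so earlier entries are never rewritten.
    with log.open("a", encoding="utf-8") as fh:
        fh.write(f"\n## [{today.isoformat()}] {action} | {subject}\n")
        fh.writelines(f"- {d}\n" for d in details)
    return log
```

For example, `append_log(wiki, "ingest", "Source Title", ["raw/articles/a.md", "concepts/b.md"])` records the ingest together with the files it touched.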

Core Operations

1. Ingest

When the user provides a source (URL, file, paste), integrate it into the wiki:

① Capture the raw source:

- URL → use `web_extract` to get markdown, save to `raw/articles/`
- PDF → use `web_extract` (handles PDFs), save to `raw/papers/`
- Pasted text → save to appropriate `raw/` subdirectory
- Name the file descriptively: `raw/articles/karpathy-llm-wiki-2026.md`
- Add raw frontmatter (`source_url`, `ingested`, `sha256` of the body). On re-ingest of the same URL: recompute the sha256 and compare it to the stored value; skip if identical, flag drift and update if different. This is cheap enough to do on every re-ingest and catches silent source changes.

② Discuss takeaways with the user: what's interesting, what matters for the domain. (Skip this in automated/cron contexts and proceed directly.)

③ Check what already exists: search index.md and use `search_files` to find existing pages for mentioned entities/concepts. This is the difference between a growing wiki and a pile of duplicates.

④ Write or update wiki pages:

- New entities/concepts: Create pages only if they meet the Page Thresholds in SCHEMA.md (2+ source mentions, or central to one source)
- Existing pages: Add new information, update facts, bump `updated` date. When new info contradicts existing content, follow the Update Policy.
- Cross-reference: Every new or updated page must link to at least 2 other pages via `[[wikilinks]]`. Check that existing pages link back.
- Tags: Only use tags from the taxonomy in SCHEMA.md
- Provenance: On pages synthesizing 3+ sources, append `^[raw/articles/source.md]` markers to paragraphs whose claims trace to a specific source.
- Confidence: For opinion-heavy, fast-moving, or single-source claims, set `confidence: medium` or `low` in frontmatter. Don't mark `high` unless the claim is well-supported across multiple sources.

⑤ Update navigation:

- Add new pages to `index.md` under the correct section, alphabetically
- Update the "Total pages" count and "Last updated" date in index header
- Append to `log.md`: `## [YYYY-MM-DD] ingest | Source Title`
- List every file created or updated in the log entry

⑥ Report what changed — list every file created or updated to the user.

A single source can trigger updates across 5-15 wiki pages. This is normal and desired — it's the compounding effect.

2. Query

When the user asks a question about the wiki's domain:

① Read `index.md` to identify relevant pages.
② For wikis with 100+ pages, also `search_files` across all `.md` files for key terms; the index alone may miss relevant content.
③ Read the relevant pages using `read_file`.
④ Synthesize an answer from the compiled knowledge. Cite the wiki pages you drew from: "Based on [[page-a]] and [[page-b]]..."
⑤ File valuable answers back: if the answer is a substantial comparison, deep dive, or novel synthesis, create a page in `queries/` or `comparisons/`. Don't file trivial lookups, only answers that would be painful to re-derive.
⑥ Update log.md with the query and whether it was filed.

3. Lint

When the user asks to lint, health-check, or audit the wiki:

① **Orphan pages:** Find pages with no inbound `[[wikilinks]]` from other pages.

```python
# Use execute_code for this: programmatic scan across all wiki pages
import os, re
from collections import defaultdict
wiki = "<WIKI_PATH>"
# Scan all .md files in entities/, concepts/, comparisons/, queries/
# Extract all [[wikilinks]] and build the inbound link map
# Pages with zero inbound links are orphans
```


② **Broken wikilinks:** Find `[[links]]` that point to pages that don't exist.

③ **Index completeness:** Every wiki page should appear in `index.md`. Compare
   the filesystem against index entries.

④ **Frontmatter validation:** Every wiki page must have all required fields
   (title, created, updated, type, tags, sources). Tags must be in the taxonomy.

⑤ **Stale content:** Pages whose `updated` date is >90 days older than the most
   recent source that mentions the same entities.

⑥ **Contradictions:** Pages on the same topic with conflicting claims. Look for
   pages that share tags/entities but state different facts. Surface all pages
   with `contested: true` or `contradictions:` frontmatter for user review.

⑦ **Quality signals:** List pages with `confidence: low` and any page that cites
   only a single source but has no confidence field set — these are candidates
   for either finding corroboration or demoting to `confidence: medium`.

⑧ **Source drift:** For each file in `raw/` with a `sha256:` frontmatter, recompute
   the hash and flag mismatches. Mismatches indicate the raw file was edited
   (shouldn't happen — raw/ is immutable) or ingested from a URL that has since
   changed. Not a hard error, but worth reporting.

⑨ **Page size:** Flag pages over 200 lines — candidates for splitting.

⑩ **Tag audit:** List all tags in use, flag any not in the SCHEMA.md taxonomy.

⑪ **Log rotation:** If log.md exceeds 500 entries, rotate it.

⑫ **Report findings** with specific file paths and suggested actions, grouped by
   severity (broken links > orphans > source drift > contested pages > stale content > style issues).

⑬ **Append to log.md:** `## [YYYY-MM-DD] lint | N issues found`
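
Checks ① and ② share one pass over the pages, so they can be sketched together. The code below is an illustrative implementation under a few assumptions: `scan_links` is a hypothetical name, a link target is resolved by file stem (as Obsidian does), stems are unique across the four page directories, and `![[...]]` embeds are ignored because they point at assets rather than pages.

```python
import re
from collections import defaultdict
from pathlib import Path

PAGE_DIRS = ("entities", "concepts", "comparisons", "queries")
# Captures the target of [[target]], [[target|alias]] and [[target#heading]];
# the lookbehind skips ![[image.png]] embeds.
LINK_RE = re.compile(r"(?<!!)\[\[([^\]|#]+)")


def scan_links(wiki: Path):
    """Return (orphans, broken); broken maps a page stem to missing targets."""
    pages = {p.stem: p for d in PAGE_DIRS for p in (wiki / d).glob("*.md")}
    inbound = defaultdict(set)
    broken = defaultdict(set)
    for stem, path in pages.items():
        for target in LINK_RE.findall(path.read_text(encoding="utf-8")):
            target = target.strip()
            if target == stem:
                continue  # a self-link does not count as inbound
            if target in pages:
                inbound[target].add(stem)
            else:
                broken[stem].add(target)
    orphans = sorted(s for s in pages if not inbound[s])
    return orphans, {k: sorted(v) for k, v in broken.items()}
```

The two return values map directly onto the top of the severity ordering in the report step: broken links first, then orphans.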


Working with the Wiki

Searching

```bash
# Find pages by content
search_files "transformer" path="$WIKI" file_glob="*.md"

# Find pages by filename
search_files "*.md" target="files" path="$WIKI"

# Find pages by tag
search_files "tags:.*alignment" path="$WIKI" file_glob="*.md"

# Recent activity
read_file "$WIKI/log.md" offset=<last 20 lines>
```

Bulk Ingest

When ingesting multiple sources at once, batch the updates:

Read all sources first
Identify all entities and concepts across all sources
Check existing pages for all of them (one search pass, not N)
Create/update pages in one pass (avoids redundant updates)
Update index.md once at the end
Write a single log entry covering the batch

Archiving

Obsidian Integration

The wiki directory works as an Obsidian vault out of the box:

- `[[wikilinks]]` render as clickable links
- Graph View visualizes the knowledge network
- YAML frontmatter powers Dataview queries
- The `raw/assets/` folder holds images referenced via `![[image.png]]`

For best results:

- Set Obsidian's attachment folder to `raw/assets/`
- Enable "Wikilinks" in Obsidian settings (usually on by default)
- Install Dataview plugin for queries like `TABLE tags FROM "entities" WHERE contains(tags, "company")`

If using the Obsidian skill alongside this one, set `OBSIDIAN_VAULT_PATH` to the same directory as the wiki path.

Obsidian Headless (servers and headless machines)

On machines without a display, use `obsidian-headless` instead of the desktop app. It syncs vaults via Obsidian Sync without a GUI, which suits agents running on servers that write to the wiki while Obsidian desktop reads it on another device.

Setup:

```bash
# Requires Node.js 22+
npm install -g obsidian-headless

# Login (requires Obsidian account with Sync subscription)
ob login --email <email> --password '<password>'

# Create a remote vault for the wiki
ob sync-create-remote --name "LLM Wiki"

# Connect the wiki directory to the vault
cd ~/wiki
ob sync-setup --vault "<vault-id>"

# Initial sync
ob sync

# Continuous sync (foreground; use systemd for background)
ob sync --continuous
```


**Continuous background sync via systemd:**
```ini
# ~/.config/systemd/user/obsidian-wiki-sync.service
[Unit]
Description=Obsidian LLM Wiki Sync
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/path/to/ob sync --continuous
WorkingDirectory=/home/user/wiki
Restart=on-failure
RestartSec=10

[Install]
WantedBy=default.target
```

```bash
systemctl --user daemon-reload
systemctl --user enable --now obsidian-wiki-sync

# Enable linger so sync survives logout:
sudo loginctl enable-linger $USER
```

This lets the agent write to `~/wiki` on a server while you browse the same
vault in Obsidian on your laptop/phone; changes appear within seconds.

Pitfalls

Never modify files in `raw/`: sources are immutable. Corrections go in wiki pages.
Always orient first — read SCHEMA + index + recent log before any operation in a new session. Skipping this causes duplicates and missed cross-references.
Always update index.md and log.md — skipping this makes the wiki degrade. These are the navigational backbone.
Don't create pages for passing mentions — follow the Page Thresholds in SCHEMA.md. A name appearing once in a footnote doesn't warrant an entity page.
Don't create pages without cross-references — isolated pages are invisible. Every page must link to at least 2 other pages.
Frontmatter is required — it enables search, filtering, and staleness detection.
Tags must come from the taxonomy — freeform tags decay into noise. Add new tags to SCHEMA.md first, then use them.
Keep pages scannable — a wiki page should be readable in 30 seconds. Split pages over 200 lines. Move detailed analysis to dedicated deep-dive pages.
Ask before mass-updating — if an ingest would touch 10+ existing pages, confirm the scope with the user first.
Rotate the log: when log.md exceeds 500 entries, rename it `log-YYYY.md` and start fresh. The agent should check log size during lint.
Handle contradictions explicitly — don't silently overwrite. Note both claims with dates, mark in frontmatter, flag for user review.
