write-notes

Writing Lecture Notes from Slide PDFs

Turn lecture PDFs into a linked concept graph inside an Obsidian vault: one topic folder per lecture, one markdown file per concept, plus a positioned `.canvas` view.

Core principles

  • The PDF is the only source of concepts. If a concept is not explicitly named or clearly introduced in the `pdftotext` output, it does NOT become a note, no matter how canonical, standard, or "obviously related" it is to the lecture topic. See Do not invent concepts below.
  • General knowledge is for explaining, not extending. You may use your own knowledge to flesh out an explanation, add an illustrative example, or phrase a definition clearly, but only for concepts the PDF already introduces.
  • One concept per note. Never dump a whole lecture into one file.
  • Group notes by lecture topic. Every concept from a lecture goes inside a folder named after that lecture's hub concept, never at the course root. See the path convention below.
  • Link aggressively, within the verified set. Use `[[wikilinks]]` for every concept that has (or should have) its own note, but never wikilink to a concept the PDF doesn't mention. Forward-links to future lectures are fine only if the current PDF explicitly names the future concept.
  • Stay within the course. Do not link across course folders.
  • Merge, don't overwrite. If a note exists, integrate new material and preserve existing structure.
  • Cite the PDF, not slide numbers. E.g. `CS101 Lecture — 03_DesignByContract.pdf`.
  • Ask before you assume inputs. If the user didn't specify a PDF path, a vault location, or a canvas file, ask; do not invent defaults. See Inputs below.

Do not invent concepts

Completeness means what the lecture covered, not what the topic canonically includes.
Lectures are deliberately narrower than textbooks. If a microservices lecture skips Circuit Breaker / Saga / Bulkhead, those concepts are not examinable and must not appear in the notes. Adding them pollutes the vault with material the student hasn't been taught and may contradict the lecturer's framing.

Rationalizations to reject

| Excuse | Reality |
| --- | --- |
| "This concept is standard for this topic." | PDF is the source of truth, not the field's canon. |
| "Adding it makes the graph more complete." | Completeness = the lecture's scope, not the topic's scope. |
| "The concept is implied even if not named." | Implied ≠ covered. Do not create the note. |
| "It's only one extra concept." | One hallucinated concept invites ten more. Zero tolerance. |
| "The PDF mentions it once in a diagram / example." | A passing mention is not an introduction. Only create a note if the PDF names and describes the concept. |
| "The student will want to know this eventually." | Then they'll get it in the lecture that covers it, or add it themselves. |
| "I can mark it as 'extension material'." | No. Unverified concepts are excluded entirely. |

Red flags — STOP if you catch yourself thinking any of these

  • "What are the standard sub-topics under X?"
  • "To be complete I should add…"
  • "This lecture briefly mentioned Y, and Y is usually explained alongside Z, so Z should have a note too."
  • Writing a note whose body relies on details the PDF doesn't contain.
  • Adding a wikilink to a concept name that never appeared in the `pdftotext` output.

When to use

  • User gives one or more PDF paths and asks for notes.
  • User references lecture slides in the vault and wants them processed.
  • User asks for "study notes from these slides" or similar.

Inputs

Gather these before any work:
| Input | Required? | Default |
| --- | --- | --- |
| PDF source | Yes | None (ask) |
| Vault root | Yes | None (ask) |
| Course name | No | Infer from PDF folder name or ask |
| Canvas path | No | `<Vault>/<Course>.canvas` (create if missing) |

PDF source

Accept any of:
  • A file path (one PDF).
  • A directory path: list every `.pdf` inside, show the user the list, and confirm the processing order before starting.
  • A glob (e.g. `~/slides/*.pdf`).
If the user invokes this skill without specifying any PDF, ask:
"Which lecture PDF(s) should I process? Give me a file path, a directory containing PDFs, or a glob."
Do not process "whatever's in the current directory" as a default. Do not silently skip PDFs that didn't parse.
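The three accepted forms can be expanded into one ordered list before anything is processed. The following is a minimal sketch, not part of the skill itself: the function name `resolve_pdf_source` is hypothetical, and it assumes that an empty input should raise rather than fall back to the current directory.

```python
import glob
from pathlib import Path


def resolve_pdf_source(source: str) -> list[Path]:
    """Expand a file path, directory, or glob into a sorted list of PDFs."""
    if not source.strip():
        # No input given: the caller should ask the user, not guess.
        raise ValueError("No PDF source given; ask the user for one.")
    path = Path(source).expanduser()
    if path.is_dir():
        candidates = sorted(path.glob("*.pdf"))
    elif path.is_file():
        candidates = [path]
    else:
        # Anything else is treated as a glob pattern (assumption).
        candidates = sorted(Path(p) for p in glob.glob(str(path)))
    pdfs = [p for p in candidates if p.suffix.lower() == ".pdf"]
    if not pdfs:
        raise FileNotFoundError(f"No PDFs matched {source!r}; ask the user.")
    return pdfs
```

The returned list is only a proposal: for a directory, it would still be shown to the user so the processing order can be confirmed.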

Vault root

The Obsidian vault's root. If unspecified:
  1. If the current working directory contains `.obsidian/`, offer it as a suggestion.
  2. Otherwise ask: "Which Obsidian vault should I add notes to?"
Never create a brand-new vault without explicit permission.
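The `.obsidian/` check is small enough to sketch directly. The helper name `suggest_vault` is hypothetical; note that it only returns a suggestion, and the user would still confirm it.

```python
from pathlib import Path
from typing import Optional


def suggest_vault(cwd: Path) -> Optional[Path]:
    """Return cwd as a vault suggestion if it contains .obsidian/, else None."""
    # A None result means: ask the user which vault to use.
    return cwd if (cwd / ".obsidian").is_dir() else None
```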

Canvas path

  • If the user provides an existing canvas (e.g. `architecture-map.canvas`), use that file: load, mutate, save.
  • Otherwise default to `<Vault>/<Course>.canvas`. Create if absent, as an empty `{"nodes":[],"edges":[]}`.
Do not write canvas data into a file you weren't told to touch.
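As an illustration of the load-or-create behaviour, here is a hedged sketch. The function name `load_or_create_canvas` is an assumption; the caller is expected to have already resolved `canvas_path` to either the user-provided file or the course default.

```python
import json
from pathlib import Path


def load_or_create_canvas(canvas_path: Path) -> dict:
    """Load an existing canvas, or create an empty one if the file is absent."""
    if canvas_path.exists():
        return json.loads(canvas_path.read_text(encoding="utf-8"))
    # Absent file: initialise with the empty structure and write it out.
    empty = {"nodes": [], "edges": []}
    canvas_path.write_text(json.dumps(empty), encoding="utf-8")
    return empty
```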

Never assume inputs you weren't given

If any required input is missing, ASK BEFORE PROCEEDING. Rationalizations to reject:
| Excuse | Reality |
| --- | --- |
| "The obvious PDF is in `~/Desktop`." | Ask; don't guess the user's filesystem. |
| "I'll just use the cwd as the vault." | Cwd ≠ vault. Confirm first. |
| "The course name is obvious from the folder." | Confirm; do not assume the folder name equals the course. |

Execution mode

Pick one based on batch size:
  • 1–3 PDFs → single-agent mode. The main agent does everything. Simpler; keeps full context for cross-lecture linking.
  • 4+ PDFs → subagent-per-PDF mode. For each PDF, dispatch a subagent (via the `Agent` tool) to do steps 1–4 only: survey, plan, extract visuals, draft note contents. The subagent returns a compact structured result; do NOT have it write files or touch the canvas. The main agent then performs the merge into existing notes (step 4 write) and the canvas update (step 5) serially, so shared state (existing notes, canvas positions) stays consistent.
Subagent output schema (JSON in the final message):
```json
{
  "course": "CS101",
  "topic": "Design by Contract",
  "concepts": [
    {
      "name": "Preconditions",
      "path": "CS101/Content/Design by Contract/Preconditions.md",
      "tier_hint": "leaf",
      "parents": ["Design by Contract"],
      "wikilinks": ["Design by Contract", "Assertions"],
      "pdf_evidence": "A pre-condition must be true before the method executes — the caller's obligation.",
      "body": "<full markdown body, including ## References footer>"
    }
  ],
  "pdf_filename": "03_DesignByContract.pdf"
}
```
`pdf_evidence` is mandatory. It must be a verbatim snippet from the `pdftotext` output that names and introduces the concept. No evidence = no note. The main agent rejects any concept without a plausible evidence quote.
The main agent then:
  1. Validates every returned concept has a `pdf_evidence` string.
  2. Spot-checks 2–3 of the evidence quotes against the `pdftotext` output.
  3. Rejects and sends back any concept that fails.
  4. Checks each concept's path.
  5. Merges or creates.
  6. Updates the canvas using the full set of verified concepts across all subagent returns.
If the subagent returns more than ~8 concepts from a single short lecture (≤25 slides), treat it as a hallucination warning sign and re-run grep-verification on every concept before proceeding.
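The evidence check can be automated rather than eyeballed. This sketch is a stricter variant of the spot-check described above: it tests every quote, not just 2–3. The helper names `normalise` and `split_by_evidence` are hypothetical, and collapsing whitespace before matching is an assumption made so that line wraps in the extracted text don't cause false rejections.

```python
import re


def normalise(text: str) -> str:
    """Collapse runs of whitespace so wrapped lines still match."""
    return re.sub(r"\s+", " ", text).strip()


def split_by_evidence(concepts: list[dict], pdf_text: str) -> tuple[list[dict], list[dict]]:
    """Split concepts into (accepted, rejected) by verbatim pdf_evidence lookup."""
    haystack = normalise(pdf_text)
    accepted, rejected = [], []
    for concept in concepts:
        evidence = normalise(concept.get("pdf_evidence") or "")
        # Missing or empty evidence is rejected outright: no evidence, no note.
        if evidence and evidence in haystack:
            accepted.append(concept)
        else:
            rejected.append(concept)
    return accepted, rejected
```

Rejected concepts would be sent back to the subagent rather than silently dropped.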

Workflow

Process lectures one PDF at a time (even in subagent mode — dispatch sequentially, merging each before the next, so later subagents can be told which notes already exist). Do not parallelise across PDFs — each one needs its own survey → plan → write cycle so concept maps stay coherent.

1. Survey (cheap text pass)

Run `pdftotext <path> -` to get the full lecture text. Save the output; you will grep against it in step 1.5. From the text, identify:
  • Course (confirmed with the user, or inferred from the PDF path, e.g. `CS101`)
  • Lecture topic (hub concept, e.g. "Design by Contract"). This becomes both the hub note name AND the topic folder name.
  • Candidate concepts: distinct ideas the PDF explicitly names or introduces. For each, record a short verbatim quote from the `pdftotext` output that introduces the concept. This becomes the `pdf_evidence` field.
  • Slide ranges where each concept lives (for the next step)
Do not add concepts you merely expect the topic to include. If the PDF doesn't mention it, it isn't a candidate.

1.5. Grep-verify the candidate list (mandatory)

For every candidate concept, run a case-insensitive grep against the `pdftotext` output:
```bash
pdftotext <path> - | grep -iE "concept name|obvious paraphrase"
```
Rules:
  • Zero hits and no obvious paraphrase present → delete the candidate. It is hallucinated.
  • One passing mention inside an example/diagram caption → not enough. The PDF must introduce the concept (define, describe, list as a bullet, or give it a section heading). Delete the candidate if it only appears inside a worked example's body.
  • Multiple references, clearly introduced → keep.
Record the grep results you relied on. The `pdf_evidence` in the subagent's JSON output must cite a specific hit from this step.
If you find yourself grepping for synonyms to "rescue" a candidate you wanted to include — stop. That is the rationalization described in Do not invent concepts. The only legitimate paraphrase is one the PDF itself uses (e.g. "pre-condition" vs "precondition").
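When the `pdftotext` output has already been saved as a string, the same case-insensitive check can be run without re-invoking the shell. A minimal sketch follows; the function name `grep_hits` is hypothetical, and it matches names literally (via `re.escape`) rather than as regular expressions, which is a narrower behaviour than `grep -E`.

```python
import re


def grep_hits(pdf_text: str, names: list[str]) -> list[str]:
    """Return the lines of pdf_text containing any of names, ignoring case."""
    if not names:
        raise ValueError("Give at least one name to search for.")
    pattern = re.compile("|".join(re.escape(n) for n in names), re.IGNORECASE)
    return [line.strip() for line in pdf_text.splitlines() if pattern.search(line)]
```

An empty result means the candidate is deleted. A non-empty result is only a starting point: whether the hits amount to an introduction or a passing mention still has to be judged from the lines themselves.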

2. Plan

Build a concept tree using only the grep-verified candidates: hub → direct sub-concepts → recursive sub-concepts.

Path convention (topic folder per lecture)

Every note goes inside a topic folder named after the lecture's hub concept:
`<Vault>/<Course>/Content/<Topic>/<Concept>.md`
Example: a "Design by Contract" lecture produces:
  • `<Vault>/CS101/Content/Design by Contract/Design by Contract.md` ← hub
  • `<Vault>/CS101/Content/Design by Contract/Preconditions.md`
  • `<Vault>/CS101/Content/Design by Contract/Postconditions.md`
  • `<Vault>/CS101/Content/Design by Contract/Class Invariants.md`
Do not flatten concepts directly into `<Course>/` or into `<Course>/Content/`. Topic folders scope each lecture's notes so the vault stays browsable as it grows.
If the user's vault already uses a different structure (e.g. flat), confirm the deviation explicitly before following it. Don't silently mirror what's there — ask.
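The path convention reduces to a single join. This sketch assumes a helper named `note_path` (hypothetical); the hub note is simply the case where the concept name equals the topic name.

```python
from pathlib import Path


def note_path(vault: Path, course: str, topic: str, concept: str) -> Path:
    """Build <Vault>/<Course>/Content/<Topic>/<Concept>.md."""
    return vault / course / "Content" / topic / f"{concept}.md"


# Hub note: the concept shares the topic's name.
hub = note_path(Path("/vault"), "CS101", "Design by Contract", "Design by Contract")
leaf = note_path(Path("/vault"), "CS101", "Design by Contract", "Preconditions")
```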

Existing-note check

For each concept, check whether a note already exists at the expected path. Also search the whole course folder in case the topic folder is named slightly differently (e.g. "DbC" vs "Design by Contract"). Mark each node create or merge.
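One way to sketch this check: look at the expected path first, then fall back to a search of the whole course folder by file name. The function name `plan_action` and the case-insensitive file-name comparison are assumptions, not something the skill prescribes.

```python
from pathlib import Path


def plan_action(course_dir: Path, expected: Path) -> tuple[str, Path]:
    """Return ("merge", existing_path) or ("create", expected)."""
    if expected.is_file():
        return "merge", expected
    # Fallback: the topic folder may be named differently (e.g. "DbC").
    wanted = expected.name.lower()
    matches = sorted(p for p in course_dir.rglob("*.md") if p.name.lower() == wanted)
    if matches:
        return "merge", matches[0]
    return "create", expected
```

If the fallback finds more than one match, taking the first is only a placeholder; asking the user which note is the right one would be safer.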

3. Targeted visual extraction

For slides containing diagrams, code, worked examples, or tables worth preserving, use the `Read` tool directly on the PDF with the `pages` parameter for just those pages. Don't read the whole PDF as images; it's wasteful.

4. Write notes

Follow the existing vault style exactly. Example structure:

````markdown
A **pre-condition** is a condition or predicate that must always be true just **prior** to the execution of some section of code. It is the caller's obligation in a [[Design by Contract]] specification.

## Key points

- If a precondition is violated, the effect becomes undefined.
- ...

## Example

```java
// code from the slide, cleaned up
```

## References

- CS101 Lecture — `03_DesignByContract.pdf`
````

Style rules:

- Opening sentence defines the concept and wikilinks to parent/related ideas.
- Use `**bold**` for emphasis on key terms.
- Section headings (`##`) are topical (`Key points`, `Example`, `In inheritance`, etc.) — do NOT include slide numbers.
- Code blocks use fenced syntax with the right language.
- Footer is a `## References` section naming the PDF file only.
- Every concept name that has (or could have) its own note — **and is present in the verified concept list for this or a previous lecture** — becomes a `[[wikilink]]`. Never wikilink to a concept that no PDF has introduced.

**Using general knowledge for explanation (allowed):**

- Fleshing out a definition the PDF states tersely.
- Providing an extra illustrative example that matches the lecture's framing.
- Rephrasing for clarity while preserving the PDF's meaning.

**Using general knowledge for extension (forbidden):**

- Introducing related sub-concepts the PDF doesn't name (even as a sub-bullet).
- Adding a `##` section about a topic the PDF doesn't cover.
- Citing mechanisms, patterns, or examples that pull in new terminology not in the PDF.
- Expanding a brief slide into a multi-section deep dive that goes beyond what was taught.

If a `##` section in your draft can't be traced back to specific PDF content, delete it.

**Merging:** when a note exists, read it first. Add new bullets/sections at the natural place, add missing wikilinks, extend the `## References` footer with the new PDF if different. Preserve the author's existing wording. The same no-extension rule applies to merges — new material must come from the current PDF.

5. Update canvas

Resolve the canvas path

  1. User-provided canvas path: use it directly. Read, mutate, save.
  2. Otherwise, use `<Vault>/<Course>.canvas`.
  3. If the file doesn't exist, create it, initialised as `{"nodes":[],"edges":[]}`.
Always read the current canvas before writing so you preserve existing nodes and edges.
Canvas JSON schema:
```json
{
  "nodes": [
    {"id": "<16-hex>", "type": "file", "file": "<Course>/Content/<Topic>/<Name>.md",
     "x": 0, "y": 0, "width": 400, "height": 400, "color": "5"}
  ],
  "edges": [
    {"id": "<16-hex>", "fromNode": "<id>", "fromSide": "top",
     "toNode": "<id>", "toSide": "bottom"}
  ]
}
```
Rules:
  • Preserve existing node positions. Only place/resize new nodes; never move existing ones.
  • Generate `id` as a random 16-char hex string.
  • Write valid JSON (Obsidian is strict — no trailing commas, no comments).
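A short sketch of these rules in code: a new node gets a random 16-character hex `id`, and a file that is already on the canvas is returned untouched rather than repositioned. The function name `add_file_node` is hypothetical, and `secrets.token_hex(8)` is just one convenient way to produce 16 hex characters.

```python
import secrets


def add_file_node(canvas: dict, file: str, x: int, y: int, width: int, height: int) -> dict:
    """Append a file node unless one for `file` exists; never move existing nodes."""
    for node in canvas["nodes"]:
        if node.get("file") == file:
            return node  # existing node: position stays as it is
    node = {"id": secrets.token_hex(8), "type": "file", "file": file,
            "x": x, "y": y, "width": width, "height": height}
    canvas["nodes"].append(node)
    return node
```

Serialising the result with `json.dumps` avoids the trailing-comma and comment problems by construction.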
Importance tiers — before placing, count each concept's degree (how many other concepts link to/from it). Assign a tier:
| Tier | When | Size | Color |
| --- | --- | --- | --- |
| Hub | Lecture topic, or degree ≥ 6 | 680×420 | `"5"` |
| Core | Degree 3–5 (dense branching) | 520×360 | `"4"` |
| Leaf | Degree 1–2 | 400×280 | (omit color) |
A concept with many outgoing arrows is a "core" node — make it visibly bigger and coloured even if it isn't the lecture's top-level hub. Recompute tiers each run so a node promoted by new lectures gets resized.
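The degree count and tier lookup might be sketched as follows. Both function names are hypothetical. Two assumptions are made: links are treated as undirected and counted once per distinct neighbour, and a concept with degree 0 falls through to the leaf tier, a case the tier table does not cover.

```python
def degrees(links: dict[str, set[str]]) -> dict[str, int]:
    """Count distinct neighbours per concept, treating links as undirected."""
    neighbours: dict[str, set[str]] = {name: set() for name in links}
    for source, targets in links.items():
        for target in targets:
            if target == source:
                continue  # ignore self-links
            neighbours[source].add(target)
            neighbours.setdefault(target, set()).add(source)
    return {name: len(peers) for name, peers in neighbours.items()}


def tier_for(degree: int, is_lecture_topic: bool = False) -> dict:
    """Map a degree to tier, size, and colour per the tier table."""
    if is_lecture_topic or degree >= 6:
        return {"tier": "hub", "width": 680, "height": 420, "color": "5"}
    if degree >= 3:
        return {"tier": "core", "width": 520, "height": 360, "color": "4"}
    return {"tier": "leaf", "width": 400, "height": 280}  # leaf: no color key
```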
Layout philosophy: aim for an organic, flowing graph — not a rigid ring or grid. Think of concepts drifting outward from their parent like branches of a tree, with some asymmetry. Perfect symmetry looks sterile; slight irregularity reads as natural.
Layout algorithm:
  1. Place the hub at its existing position, or ≥1200 px from the nearest existing hub if new.
  2. For a parent's children, pick a broad angular region (e.g. 120°–180° arc) on the side facing away from the rest of the graph. Within that arc, distribute children at varied radii (roughly 700–1000 px for core nodes, 450–650 px for leaves off a core) and uneven angular gaps — don't snap to equal spacing. Small jitter (±10–15%) is better than clockwork.
  3. Follow the natural branching: a core node's leaves should fan out beyond it, continuing the flow outward rather than curling back toward the hub.
  4. Minimum spacing: 120 px gap between any two node bounding boxes. If a candidate position overlaps, nudge outward or sideways until clear — keep the nudge small so the flow isn't disrupted.
  5. Arrow routing: pick `fromSide` / `toSide` based on relative position:
    • child above parent → `fromSide: "top"`, `toSide: "bottom"`
    • child below → `"bottom"` / `"top"`
    • child left/right → `"left"` / `"right"` symmetrically
    Choose whichever axis (horizontal vs vertical) has the greater displacement. This prevents arrows cutting diagonally across nodes.
  6. Add edges: parent hub → each direct sub-concept, plus edges between already-placed concepts wherever wikilinks exist between them.
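Two pieces of the layout algorithm are mechanical enough to sketch: the side selection for arrows and the bounding-box gap used for the 120 px minimum spacing. The names `edge_sides` and `gap` are hypothetical. The sketch assumes canvas `y` grows downward (so a smaller `y` means "above") and compares node centres; both are assumptions rather than something stated in the rules.

```python
def edge_sides(parent: dict, child: dict) -> tuple[str, str]:
    """Pick (fromSide, toSide) for a parent -> child edge from node centres."""
    dx = (child["x"] + child["width"] / 2) - (parent["x"] + parent["width"] / 2)
    dy = (child["y"] + child["height"] / 2) - (parent["y"] + parent["height"] / 2)
    if abs(dx) > abs(dy):
        # Horizontal displacement dominates.
        return ("right", "left") if dx > 0 else ("left", "right")
    # Assumption: y grows downward, so dy < 0 means the child is above.
    return ("top", "bottom") if dy < 0 else ("bottom", "top")


def gap(a: dict, b: dict) -> float:
    """Shortest distance between two node bounding boxes (0 if they overlap)."""
    gap_x = max(a["x"] - (b["x"] + b["width"]), b["x"] - (a["x"] + a["width"]), 0)
    gap_y = max(a["y"] - (b["y"] + b["height"]), b["y"] - (a["y"] + a["height"]), 0)
    return (gap_x ** 2 + gap_y ** 2) ** 0.5
```

A candidate position would be nudged outward or sideways while `gap(candidate, other) < 120` holds for any already-placed node.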

6. Next PDF

Only after the current PDF is fully written (notes + canvas updated), move to the next. Report progress to the user between PDFs.

Quick reference

| Task | Tool |
| --- | --- |
| Survey PDF text | `Bash: pdftotext <path> -` |
| Grep-verify a candidate concept | `Bash: pdftotext <path> - \| grep -iE "name\|paraphrase"` |
| Extract diagrams/code from specific slides | `Read` with `pages: "N-M"` |
| Read existing note before merging | `Read` |
| Create new note | `Write` |
| Merge into existing note | `Edit` |
| Update canvas | `Read` then `Write` (parse JSON, mutate, re-serialise) |

Common mistakes

  • Hallucinating canonical concepts not in the PDF. The #1 failure mode. If you're adding standard topic-X concepts because "lectures on X usually cover them", STOP — grep the pdftotext output, confirm zero hits, delete the concept. Applies to every topic.
  • Deep-diving a concept the PDF only names briefly. A one-line slide mention doesn't justify a multi-section note. Match the note's depth to the PDF's depth; use general knowledge only for phrasing and one short illustrative example.
  • Wikilinking to concepts the PDF doesn't mention. Forward-links to future-lecture concepts are allowed only if the current PDF explicitly names them (e.g. a slide listing upcoming lectures).
  • Dumping every concept at the course root. Every lecture's concepts belong inside a topic folder named after the lecture, never `<Course>/Concept.md`. See Step 2.
  • Processing PDFs you weren't given. If the user didn't specify which files to process, ask — don't glob the current directory.
  • Writing to a canvas you weren't asked to touch. Use the user-provided canvas if given; otherwise the course default. Don't pick a third file.
  • Writing one giant note per lecture. Split by concept.
  • Citing slide numbers. PDF filename only.
  • Linking across courses. Stay within the course folder.
  • Overwriting existing notes. Always merge.
  • Processing all PDFs in parallel. Go one at a time.
  • Reading the whole PDF as images. Use `pdftotext` first, then targeted page reads.
  • Dropping existing canvas nodes or moving them. Only add/position new ones.
  • Invalid canvas JSON (trailing commas, missing `id`). Obsidian silently fails to load.
  • Uniform node sizes. A concept that ten others link to should look different from a leaf — resize and recolour by degree.
  • Arrows slicing through nodes. Pick `fromSide` / `toSide` from relative position (step 5). If an arrow still crosses a node, nudge the new child further out.
  • Clustering new nodes on one side. Distribute around the parent — but in a flowing arc, not a strict ring.
  • Over-regular layout. Equal spacing and matched radii look mechanical. Vary distances and angles slightly so the graph reads as organic.

Baseline failure on record

In the development of this skill, a subagent was pointed at a 24-slide architecture lecture without these guardrails. It returned 25 notes; 13 were hallucinated: canonical topics from the wider literature that the slides never taught (resilience patterns, sagas, eventual consistency, service discovery, bounded contexts, and several service-infrastructure concepts). A `pdftotext … | grep -i` check confirmed zero hits for any of the distinctive terms. These concepts exist in the textbook the course references, but the lecturer chose not to teach them, so they polluted the vault with non-examinable material. The grep-verify step exists specifically to catch this pattern: pressure to "be complete" pulls the subagent toward topic-canonical completeness rather than lecture-specific fidelity.