frontmatter-guard
Frontmatter Guard Skill
Convention: see `skills/conventions/quality.md` for citation rules; this skill is structural validation, not citation auditing.
…, nested quotes, slug mismatch, null bytes, empty frontmatter, YAML parse failure). Wraps the `gbrain frontmatter` CLI for agent-driven workflows.
triggers:
  - "validate frontmatter"
  - "check frontmatter"
  - "fix frontmatter"
  - "frontmatter audit"
  - "brain lint"
tools:
  - exec
mutating: true
Contract
This skill guarantees:
- Every brain page is scanned against the seven canonical frontmatter validation classes
- Mechanical errors (nested quotes, missing closing `---`, null bytes, slug mismatch) are auto-repairable on demand with `.bak` backups
- Validation logic is shared with `gbrain doctor`'s `frontmatter_integrity` subcheck: a single source of truth
- Reports per source (gbrain is multi-source since v0.18.0); never silently audits the wrong root
Why This Exists
Brain pages pile up over months. Agents write them with malformed frontmatter:
- Missing closing `---` (entity detector bugs)
- Unstructured YAML in meeting pages (ingestion bugs)
- Slug mismatches (path renames not propagated)
- Null bytes (binary corruption from copy-paste accidents)
- Nested double quotes in titles (`title: "Phil "Nick" Last"`)
Without a guard, these accumulate silently until `gbrain sync` chokes or search returns garbage. The guard makes the failure visible at audit time and trivially fixable.
Validation classes
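| Code | Meaning | Auto-fixable? |
|---|---|---|
| `MISSING_OPEN` | File doesn't start with `---` | No (needs human) |
| `MISSING_CLOSE` | No closing `---` before the first heading | Yes |
| | YAML failed to parse | Sometimes (depends on cause) |
| `SLUG_MISMATCH` | Frontmatter `slug:` does not match the slug derived from the path | Yes (removes the field) |
| `NULL_BYTES` | Binary corruption (null bytes in the file) | Yes |
| `NESTED_QUOTES` | Nested double quotes, as in `title: "Phil "Nick" Last"` | Yes |
| `EMPTY_FRONTMATTER` | Open + close present but nothing between | No (needs human) |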
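To make the structural classes concrete, here is a minimal sketch of how four of them (`MISSING_OPEN`, `MISSING_CLOSE`, `NULL_BYTES`, `EMPTY_FRONTMATTER`) could be detected from a page's raw text. The function name `classifyFrontmatter` and its logic are assumptions for illustration, not gbrain's actual validator; in particular it looks for any later `---` line rather than one before the first heading.

```typescript
// Hypothetical sketch, not gbrain's implementation.
type StructuralCode =
  | "MISSING_OPEN"
  | "MISSING_CLOSE"
  | "NULL_BYTES"
  | "EMPTY_FRONTMATTER";

function classifyFrontmatter(content: string): StructuralCode[] {
  const codes: StructuralCode[] = [];

  // Null bytes can appear anywhere in the file.
  if (content.includes("\u0000")) codes.push("NULL_BYTES");

  const lines = content.split(/\r?\n/);
  const first = lines[0] ?? "";

  // The page must open with a `---` marker on its first line.
  if (first.trim() !== "---") {
    codes.push("MISSING_OPEN");
    return codes;
  }

  // Simplification: accept the first later `---` line as the closing marker.
  const closeIndex = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (closeIndex === -1) {
    codes.push("MISSING_CLOSE");
  } else if (lines.slice(1, closeIndex).every((line) => line.trim() === "")) {
    // Open and close are present but nothing sits between them.
    codes.push("EMPTY_FRONTMATTER");
  }
  return codes;
}
```

A page can carry more than one code at once, which is why the sketch returns a list instead of a single value.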
Phases
Phase 1: Audit
Run a read-only scan across all registered sources (or one with `--source <id>`).

    gbrain frontmatter audit --json

Reports:
- Per-source counts grouped by error code
- Sample of up to 20 affected pages per source
- Total count
- Scan timestamp
Output is JSON; agents parse `errors_by_code` and `per_source` to decide next steps.
Phase 2: Validate one path
Validate a single file or directory (does not require source registration):

    gbrain frontmatter validate <path> --json

Exit code 0 = clean; 1 = errors found. Use this in CI pipelines or pre-commit hooks.
Phase 3: Fix
When issues are found:

    gbrain frontmatter validate <path> --fix

`--fix` writes a `<file>.bak` backup for each file it modifies; see also `--dry-run`.
Phase 4: Pre-commit hook (optional)
For brain repos that ARE git repos, install the pre-commit hook to block malformed pages from being committed in the first place:

    gbrain frontmatter install-hook [--source <id>]

The hook runs `gbrain frontmatter validate` against staged `.md` / `.mdx` files. Bypass with `git commit --no-verify`.
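As an illustration of the "staged `.md` / `.mdx` files" scope, the sketch below filters a list of staged paths down to the ones a hook like this would care about. The helper name `selectGuardedPaths` is hypothetical, and the installed hook's real selection logic is not shown in this document.

```typescript
// Hypothetical helper: keep only the staged paths that end in .md or .mdx.
// The input would typically come from `git diff --cached --name-only`.
function selectGuardedPaths(stagedPaths: string[]): string[] {
  return stagedPaths.filter((path) => /\.mdx?$/.test(path));
}

const staged = ["people/jane.md", "notes/roadmap.mdx", "img/logo.png"];
const toValidate = selectGuardedPaths(staged);
```

Here `toValidate` holds only the two Markdown paths; the image is ignored.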
Trigger words
When the user says any of these, route here:
- "validate frontmatter"
- "check frontmatter"
- "fix frontmatter"
- "frontmatter audit"
- "brain lint"
Output rules
- Always run `gbrain frontmatter audit --json` first; never assume a brain is clean.
- Surface counts to the user in plain language; do not dump raw JSON.
- For `--fix` operations: state how many files will be modified BEFORE running, then confirm.
- `SLUG_MISMATCH` fixes remove the frontmatter `slug:` field, because gbrain derives slug from path. Mention this when the user's title is intentionally renamed.
- Never auto-fix `MISSING_OPEN` or `EMPTY_FRONTMATTER` without explicit user input; these usually mean a human author started a page and didn't finish.
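The rules above amount to a small decision step between the audit and any `--fix` run. The following is a minimal sketch of that step, assuming the agent already holds an `errors_by_code` object from the audit JSON. The names `planFix` and `FixPlan` are hypothetical, and the counts are issue counts, so they are only an upper bound on the number of files that would change.

```typescript
// Codes this document says must never be auto-fixed without user input.
const NEEDS_HUMAN = new Set(["MISSING_OPEN", "EMPTY_FRONTMATTER"]);

interface FixPlan {
  autoFixable: number; // issues that --fix may repair
  needsHuman: number;  // issues to raise with the user instead
  notes: string[];     // plain-language points to mention before confirming
}

// Hypothetical helper: split audit counts into fixable and human-only buckets.
function planFix(errorsByCode: Record<string, number>): FixPlan {
  const plan: FixPlan = { autoFixable: 0, needsHuman: 0, notes: [] };
  for (const [code, count] of Object.entries(errorsByCode)) {
    if (NEEDS_HUMAN.has(code)) {
      plan.needsHuman += count;
      plan.notes.push(`${code}: ${count} issue(s) need explicit user input`);
    } else {
      plan.autoFixable += count;
      if (code === "SLUG_MISMATCH") {
        plan.notes.push(`SLUG_MISMATCH: --fix removes the slug: field on ${count} page(s)`);
      }
    }
  }
  return plan;
}

const plan = planFix({ MISSING_OPEN: 2, SLUG_MISMATCH: 3, NULL_BYTES: 4 });
```

With these inputs, `plan.autoFixable` is 7 and `plan.needsHuman` is 2, so the agent would report both numbers and ask for confirmation before running `--fix`.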
Chains with
- `gbrain doctor`: the `frontmatter_integrity` subcheck reports the same counts as `audit`.
- `skills/maintain/SKILL.md`: broader brain health audit; chain after this skill if other classes of issue are suspected.
- `skills/lint/SKILL.md` (via `gbrain lint`): overlapping rules for skill-file lint; the `frontmatter-*` rule names in lint output come from this skill's validation surface.
Output Format
Audit summary (terse, agent-friendly):
Frontmatter audit — 17 issue(s) across 1 source(s)
[default] /Users/me/brain
17 issue(s)
MISSING_CLOSE: 8
NESTED_QUOTES: 5
NULL_BYTES: 4
sample:
people/jane.md — MISSING_CLOSE
companies/acme.md — NESTED_QUOTES
(+ 12 more)
Fix with: gbrain frontmatter validate /Users/me/brain --fix

JSON envelope (when `--json` is passed):
{
"ok": false,
"total": 17,
"errors_by_code": { "MISSING_CLOSE": 8, "NESTED_QUOTES": 5, "NULL_BYTES": 4 },
"per_source": [
{
"source_id": "default",
"source_path": "/Users/me/brain",
"total": 17,
"errors_by_code": { "MISSING_CLOSE": 8, "NESTED_QUOTES": 5, "NULL_BYTES": 4 },
"sample": [{ "path": "people/jane.md", "codes": ["MISSING_CLOSE"] }]
}
],
"scanned_at": "2026-04-25T22:30:00.000Z"
}

Related: `gbrain frontmatter validate <path> --json` (see Phase 2).
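The JSON envelope can be turned into the plain-language summary that the output rules ask for. This is a minimal sketch: the interfaces mirror only the fields visible in the sample envelope, the function name `summarizeAudit` is hypothetical, and treating `ok: true` as "nothing to report" is an assumption.

```typescript
// Field names taken from the sample envelope in this document.
interface SourceReport {
  source_id: string;
  source_path: string;
  total: number;
  errors_by_code: Record<string, number>;
  sample: { path: string; codes: string[] }[];
}

interface AuditEnvelope {
  ok: boolean;
  total: number;
  errors_by_code: Record<string, number>;
  per_source: SourceReport[];
  scanned_at: string;
}

// Hypothetical helper: build a short, human-readable summary from raw JSON.
function summarizeAudit(raw: string): string {
  const audit = JSON.parse(raw) as AuditEnvelope;
  if (audit.ok) return "No frontmatter issues found.";

  const lines = [
    `${audit.total} frontmatter issue(s) across ${audit.per_source.length} source(s).`,
  ];
  for (const source of audit.per_source) {
    // Largest counts first, so the dominant problem leads the line.
    const breakdown = Object.entries(source.errors_by_code)
      .sort((a, b) => b[1] - a[1])
      .map(([code, count]) => `${count} ${code}`)
      .join(", ");
    lines.push(`${source.source_id} (${source.source_path}): ${breakdown}`);
  }
  return lines.join("\n");
}
```

Keeping the per-source line separate from the total makes it harder to act on the wrong root when several sources are registered.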
Prevention — Writing Valid Frontmatter
This is the most important section. Fixing broken frontmatter is good. Not writing broken frontmatter in the first place is better.
YAML arrays (the historical #1 error source)

    # Correct: single-quoted YAML flow (canonical form gbrain emits)
    tags: ['yc', 'w2025', 'ai']

    # Correct: unquoted scalars (fine when values have no special chars)
    tags: [yc, w2025, ai]

    # Correct: block style
    tags:
      - yc
      - w2025

    # Tolerated post-v0.37.5.0 but non-canonical: JSON-style double quotes
    tags: ["yc", "w2025"]

    # Broken: mixed JSON objects and strings (parses as YAML, but is not a list of strings)
    tags: [{"name": "sports"}, "posterous"]

**Why this used to break:** before v0.37.5.0, the validator counted unescaped `"` characters and flagged any line with 3+. A flow sequence like `tags: ["yc", "w2025"]` has 4 unescaped `"` by design — it's valid YAML, but the dumb counter flagged it anyway. One brain saw 6,981 of these on a single doctor run. v0.37.5.0 parses suspicious values with `js-yaml.safeLoad` before flagging, so JSON-style arrays no longer trigger NESTED_QUOTES.
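The arithmetic behind that false positive is easy to reproduce. Below is a reconstruction of the counting heuristic as described in the paragraph above, not the actual pre-v0.37.5.0 code; the names `countUnescapedQuotes` and `legacyFlag` are hypothetical.

```typescript
// Count double quotes that are not preceded by a backslash.
function countUnescapedQuotes(line: string): number {
  let count = 0;
  for (let i = 0; i < line.length; i++) {
    if (line[i] === '"' && line[i - 1] !== "\\") count++;
  }
  return count;
}

// The described rule: flag any line with three or more unescaped quotes.
function legacyFlag(line: string): boolean {
  return countUnescapedQuotes(line) >= 3;
}

const flowArray = 'tags: ["yc", "w2025"]';         // valid YAML, 4 quotes
const nestedTitle = 'title: "Phil "Nick" Last"';   // genuinely broken, 4 quotes
const singleQuoted = "title: 'My \"Quoted\" Title'"; // valid YAML, 2 quotes
```

Both `flowArray` and `nestedTitle` contain four unescaped quotes, so a pure counter cannot tell the valid line from the broken one; `singleQuoted` has only two and passes.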
**Why you should still write the canonical form:** the auto-fix engine (`gbrain frontmatter validate --fix`) and the inferred-frontmatter serializer both emit single-quoted YAML for `tags:` / `aliases:`. Writing the canonical form in new content keeps the source files stylistically consistent and makes diffs against `--fix` runs empty.
**The classic LLM trap:** code like `tags: [${items.map(t => JSON.stringify(t)).join(', ')}]` produces `tags: ["yc", "w2025"]`. Use single quotes with an apostrophe fallback: `tags: [${items.map(t => t.includes("'") ? JSON.stringify(t) : "'" + t + "'").join(', ')}]`. Or use a YAML library that knows how to emit canonical YAML.
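The apostrophe-fallback expression quoted above can be wrapped in a small function so it is testable. This is a sketch of that same expression; the name `formatTags` is hypothetical, and it assumes every tag is a plain string.

```typescript
// Emit the single-quoted flow form, falling back to a double-quoted
// (JSON-escaped) scalar only for tags that contain an apostrophe.
function formatTags(items: string[]): string {
  const rendered = items.map((tag) =>
    tag.includes("'") ? JSON.stringify(tag) : `'${tag}'`
  );
  return `tags: [${rendered.join(", ")}]`;
}

const canonical = formatTags(["yc", "w2025", "ai"]);
const withApostrophe = formatTags(["men's", "style"]);
```

`canonical` is `tags: ['yc', 'w2025', 'ai']`, and `withApostrophe` is `tags: ["men's", 'style']`, so double quotes appear only where a single-quoted scalar would break.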
Quoted scalars

    # Correct: single quotes for values with special chars
    title: 'My "Quoted" Title'

    # Correct: double quotes when value has apostrophes
    title: "Men's Fashion Guide"

    # Broken: double quotes wrapping inner double quotes
    title: "My "Quoted" Title"

When to quote at all
- Unquoted is fine for simple values: `type: person`, `batch: w2025`
- Quote when the value contains `: " ' # [ ] { } | > & * ! ? ,` or starts with `@`
- Single quotes are the default safe choice
- Double quotes only when the value itself contains apostrophes
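These quoting rules can be sketched as one function. The name `quoteScalar` is hypothetical, and the sketch covers only the cases listed above; it does not handle empty strings, leading or trailing whitespace, or values such as `true` and `null` that YAML would read as non-strings.

```typescript
// Characters from the list above that force quoting when present.
const SPECIAL_CHARS = /[:"'#\[\]{}|>&*!?,]/;

// Hypothetical helper applying the rules in order:
// unquoted if simple, single quotes by default, double quotes for apostrophes.
function quoteScalar(value: string): string {
  const needsQuotes = SPECIAL_CHARS.test(value) || value.startsWith("@");
  if (!needsQuotes) return value;
  if (!value.includes("'")) return `'${value}'`;
  // JSON.stringify yields a double-quoted string with inner quotes escaped.
  return JSON.stringify(value);
}

const plain = quoteScalar("person");
const quotedTitle = quoteScalar('My "Quoted" Title');
const apostropheTitle = quoteScalar("Men's Fashion Guide");
```

`plain` stays `person`, `quotedTitle` becomes `'My "Quoted" Title'`, and `apostropheTitle` becomes `"Men's Fashion Guide"`, matching the three correct forms shown in the examples.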
Anti-Patterns
**Don't auto-fix `MISSING_OPEN` or `EMPTY_FRONTMATTER` without user input.** These usually mean a human author started a page and didn't finish; silently inserting `---` markers around an unfinished draft is wrong.
**Don't use `--fix` to "make doctor green" without reading the audit first.** SLUG_MISMATCH cases are surfaced for manual review specifically because gbrain derives the slug from path. A mismatch usually means the user renamed a file intentionally; auto-removing the slug field is the right outcome only when you've confirmed the rename was deliberate.
**Don't skip the `.bak` backups.** The `.bak` is the safety contract for non-git brain repos. If `.bak` files accumulate after a fix run, that's a feature, not a bug: the user can review the diffs and delete the backups when satisfied.
**Don't run `audit` on a brain where sources aren't registered.** The CLI returns "no registered sources to audit" gracefully, but the migration emits a `skipped: no_sources` phase result. Don't paper over this with a manual path-walk; the right fix is to register the source via `gbrain sources add`.
**Don't install the pre-commit hook on non-git brain dirs.** The install-hook command skips them automatically with a one-line note. If you see "skipped — not a git repo" and want validation at write time anyway, use the `audit` command on a cron schedule.