solo-audit

/audit


Audit the knowledge base for quality issues: missing frontmatter, broken links, tag inconsistencies, orphaned files, and coverage gaps. Works on any markdown-heavy project.

Steps


  1. Parse focus area from `$ARGUMENTS` (optional). If provided, focus on that area (e.g., "tags", "frontmatter", "links"). If empty, run full audit.
  2. Find all markdown files: Use Glob to find all .md files, excluding common non-content directories: `.venv/`, `node_modules/`, `.git/`, `.embeddings/`, `archive/`, `.archive_old/`.
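The file discovery in step 2 could look roughly like the following outside of a Glob tool. This is a minimal sketch using only the standard library; the function name `find_markdown_files` and the constant `EXCLUDED_DIRS` are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Directory names excluded in step 2 (taken from the list above).
EXCLUDED_DIRS = {".venv", "node_modules", ".git", ".embeddings", "archive", ".archive_old"}


def find_markdown_files(root: Path) -> list:
    """Return all .md files under root, skipping the excluded directories."""
    results = []
    for path in root.rglob("*.md"):
        if not path.is_file():
            continue
        # Look only at directory components below root, so the name of
        # root itself (or of its parents) can never trigger an exclusion.
        parent_parts = path.relative_to(root).parts[:-1]
        if not EXCLUDED_DIRS.intersection(parent_parts):
            results.append(path)
    return sorted(results)
```

Matching on path components rather than on substrings avoids excluding a file such as `notes/archive-plan.md` by accident.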
  3. Frontmatter audit: For each markdown file, read it and check:
    • Has YAML frontmatter (starts with `---` and has closing `---`)
    • Required fields present: `type`, `status`, `title`, `tags`
    • `type` is valid: one of `principle`, `methodology`, `agent`, `opportunity`, `capture`, `research`
    • `status` is valid: one of `active`, `draft`, `validated`, `archived`
    • `tags` is a non-empty list
    Track files missing frontmatter and files with incomplete/invalid frontmatter.
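The frontmatter checks above can be sketched as follows. This is an illustration under simplifying assumptions, not a full YAML parser: it only reads top-level `key: value` lines and treats indented or `-` lines as continuations of the previous key. The function names `split_frontmatter` and `audit_frontmatter` and the issue strings are hypothetical.

```python
from typing import Optional

REQUIRED_FIELDS = ("type", "status", "title", "tags")
VALID_TYPES = {"principle", "methodology", "agent", "opportunity", "capture", "research"}
VALID_STATUSES = {"active", "draft", "validated", "archived"}


def split_frontmatter(text: str) -> Optional[dict]:
    """Return top-level frontmatter keys, or None if no ---/--- block exists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return None  # opening --- without a closing ---
    fields = {}
    current = None
    for line in lines[1:end]:
        if line[:1] in (" ", "\t", "-") and current:
            # Continuation line, e.g. an item of a block-style list.
            fields[current] += "\n" + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            current = key.strip()
            fields[current] = value.strip().strip("\"'")
    return fields


def audit_frontmatter(text: str) -> list:
    """Return a list of human-readable issues for one markdown document."""
    fields = split_frontmatter(text)
    if fields is None:
        return ["Missing frontmatter"]
    issues = [f"Missing field: {name}" for name in REQUIRED_FIELDS if name not in fields]
    if "type" in fields and fields["type"] not in VALID_TYPES:
        issues.append(f"Invalid type: {fields['type']}")
    if "status" in fields and fields["status"] not in VALID_STATUSES:
        issues.append(f"Invalid status: {fields['status']}")
    if "tags" in fields and fields["tags"].strip() in ("", "[]"):
        issues.append("Empty tags")
    return issues
```

For knowledge bases with nested or multi-line frontmatter, a real YAML parser would be the safer choice.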
  4. Link check: Look for broken internal links:
    • If `scripts/check_links.py` exists, run it: `uv run python scripts/check_links.py`
    • Otherwise: Grep for markdown links `\[.*\]\(.*\.md\)` and verify each target exists
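The fallback branch of the link check might be sketched like this. It is not the contents of `scripts/check_links.py`, which the source does not show. Two assumptions are made here: the regex is a stricter variant of the pattern above, so that two links on one line are matched separately, and link targets are resolved relative to the directory of the file that contains them. The name `find_broken_links` is hypothetical.

```python
import re
from pathlib import Path

# Captures the .md target of a markdown link and ignores an optional #anchor.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+\.md)(?:#[^)]*)?\)")


def find_broken_links(md_file: Path) -> list:
    """Return link targets in md_file that do not exist on disk."""
    broken = []
    text = md_file.read_text(encoding="utf-8")
    for target in LINK_RE.findall(text):
        if "://" in target:
            continue  # external URL, not an internal link
        if not (md_file.parent / target).exists():
            broken.append(target)
    return broken
```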
  5. Tag consistency audit: Use Grep to find all `tags:` sections across .md files. Look for:
    • Near-duplicate tags (e.g., "ai" vs "AI" vs "artificial-intelligence")
    • Tags used only once (potential typos)
    • Very common tags that might be too broad
    List all unique tags with counts.
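Once the tags of each file have been collected, the counting and near-duplicate detection could be sketched as below. The function name `analyze_tags` is hypothetical, and the normalization rule (lowercase, with hyphens, underscores and spaces removed) is an assumption. It catches "ai" vs "AI" but not synonyms such as "artificial-intelligence", which still need a manual look.

```python
from collections import Counter, defaultdict


def analyze_tags(tag_lists):
    """Return (counts, single_use, near_duplicates) for per-file tag lists."""
    counts = Counter(tag for tags in tag_lists for tag in tags)
    single_use = sorted(tag for tag, n in counts.items() if n == 1)

    # Group tags that collapse to the same normalized spelling.
    groups = defaultdict(set)
    for tag in counts:
        key = tag.lower().replace("-", "").replace("_", "").replace(" ", "")
        groups[key].add(tag)
    near_duplicates = sorted(sorted(group) for group in groups.values() if len(group) > 1)
    return counts, single_use, near_duplicates
```

`counts.most_common()` then gives the list of unique tags with counts, with the most frequent tags (candidates for "too broad") first.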
  6. Orphaned files: Check which files are NOT referenced in any other file's `related:` field. Files that exist but are never cross-referenced may be orphaned.
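The orphan check reduces to a set difference once each file's `related:` entries are known. A minimal sketch, assuming the entries have already been normalized to the same path form as the file keys; the name `find_orphans` is hypothetical.

```python
def find_orphans(related):
    """Return files that no other file lists in its related: field.

    `related` maps each file path to the list of paths in its related: field.
    """
    referenced = {
        target
        for source, targets in related.items()
        for target in targets
        if target != source  # a self-reference does not count
    }
    return sorted(set(related) - referenced)
```

For example, `find_orphans({"a.md": ["b.md"], "b.md": [], "c.md": []})` reports `a.md` and `c.md`, since only `b.md` is referenced by another file.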
  7. Opportunity quality: Find all documents with `type: opportunity` and check:
    • Missing `opportunity_score` field
    • `evidence_sources` = 0 or missing
    • Status still `draft` for more than 30 days
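The three opportunity checks can be sketched on top of already parsed frontmatter. The source does not say where the draft age comes from, so the `draft_since` parameter below is an assumption: it could be a creation date from the frontmatter or the file's modification time. The function name and issue strings are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def opportunity_issues(fields: dict, today: date, draft_since: Optional[date] = None) -> list:
    """Return quality issues for one document; empty for non-opportunity types."""
    if fields.get("type") != "opportunity":
        return []
    issues = []
    if fields.get("opportunity_score") is None:
        issues.append("Missing opportunity_score")
    # Covers a missing field, a count of 0, and an empty list.
    if not fields.get("evidence_sources"):
        issues.append("No evidence_sources")
    if (
        fields.get("status") == "draft"
        and draft_since is not None
        and today - draft_since > timedelta(days=30)
    ):
        issues.append("Draft for more than 30 days")
    return issues
```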
  8. Coverage gaps: Check each directory for content:
    • Flag any empty or near-empty directories
    • Look for directories with only 1-2 files (may need more content)
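A coverage check along these lines could count the markdown files directly inside each directory. This is a sketch only: the name `sparse_directories`, the `excluded` parameter and the default threshold of 2 files are illustrative choices, with the threshold taken from the "1-2 files" wording above.

```python
from pathlib import Path


def sparse_directories(root: Path, excluded=frozenset(), max_files: int = 2) -> dict:
    """Map each directory with at most max_files .md files to its file count."""
    sparse = {}
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        rel = directory.relative_to(root)
        if excluded.intersection(rel.parts):
            continue  # inside a non-content directory such as .git/
        # Count only files directly in this directory, not in subdirectories.
        count = sum(1 for f in directory.glob("*.md") if f.is_file())
        if count <= max_files:
            sparse[rel.as_posix()] = count
    return sparse
```

A count of 0 corresponds to an empty directory, and 1 or 2 to a directory that may need more content.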
  9. Output report:
    ## KB Audit Report
    
    **Date:** [today]
    
    ### Summary
    - Total .md files: X
    - With frontmatter: X (X%)
    - Without frontmatter: X
    
    ### Frontmatter Issues
    | File | Issue |
    |------|-------|
    | path | Missing field: type |
    
    ### Broken Links
    [list of broken references]
    
    ### Tag Analysis
    - Total unique tags: X
    - Single-use tags: [list]
    - Potential duplicates: [list]
    
    ### Orphaned Files
    [files not referenced anywhere]
    
    ### Opportunity Quality
    - Without opportunity_score: [list]
    - Without evidence_sources: [list]
    
    ### Coverage
    [directory analysis]
    
    ### Recommendations
    1. [specific action]
    2. [specific action]
    3. [specific action]