index-knowledge

Generate hierarchical AGENTS.md files. Root + complexity-scored subdirectories.

Usage

--create-new   # Read existing → remove all → regenerate from scratch
--max-depth=2  # Limit directory depth (default: 5)

Default: Update mode (modify existing + create new where warranted)

Workflow (High-Level)

1. Discovery + Analysis (concurrent)
   - Launch parallel explore agents (multiple Task calls in one message)
   - Main session: bash structure + LSP codemap + read existing AGENTS.md
2. Score & Decide - Determine AGENTS.md locations from merged findings
3. Generate - Root first, then subdirs in parallel
4. Review - Deduplicate, trim, validate

<critical> **TodoWrite ALL phases. Mark in_progress → completed in real-time.**

TodoWrite([
  { id: "discovery", content: "Fire explore agents + LSP codemap + read existing", status: "pending", priority: "high" },
  { id: "scoring", content: "Score directories, determine locations", status: "pending", priority: "high" },
  { id: "generate", content: "Generate AGENTS.md files (root + subdirs)", status: "pending", priority: "high" },
  { id: "review", content: "Deduplicate, validate, trim", status: "pending", priority: "medium" }
])

</critical>

Phase 1: Discovery + Analysis (Concurrent)

Mark "discovery" as in_progress.

Launch Parallel Explore Agents

Multiple Task calls in a single message execute in parallel. Results return directly.

// All Task calls in ONE message = parallel execution

Task(
  description="project structure",
  subagent_type="explore",
  prompt="Project structure: PREDICT standard patterns for detected language → REPORT deviations only"
)

Task(
  description="entry points",
  subagent_type="explore",
  prompt="Entry points: FIND main files → REPORT non-standard organization"
)

Task(
  description="conventions",
  subagent_type="explore",
  prompt="Conventions: FIND config files (.eslintrc, pyproject.toml, .editorconfig) → REPORT project-specific rules"
)

Task(
  description="anti-patterns",
  subagent_type="explore",
  prompt="Anti-patterns: FIND 'DO NOT', 'NEVER', 'ALWAYS', 'DEPRECATED' comments → LIST forbidden patterns"
)

Task(
  description="build/ci",
  subagent_type="explore",
  prompt="Build/CI: FIND .github/workflows, Makefile → REPORT non-standard patterns"
)

Task(
  description="test patterns",
  subagent_type="explore",
  prompt="Test patterns: FIND test configs, test structure → REPORT unique conventions"
)

<dynamic-agents> **DYNAMIC AGENT SPAWNING**: After bash analysis, spawn ADDITIONAL explore agents based on project scale:

| Factor | Threshold | Additional Agents |
|---|---|---|
| Total files | >100 | +1 per 100 files |
| Total lines | >10k | +1 per 10k lines |
| Directory depth | ≥4 | +2 for deep exploration |
| Large files (>500 lines) | >10 files | +1 for complexity hotspots |
| Monorepo | detected | +1 per package/workspace |
| Multiple languages | >1 | +1 per language |

# Measure project scale first
total_files=$(find . -type f -not -path '*/node_modules/*' -not -path '*/.git/*' | wc -l)
total_lines=$(find . -type f \( -name "*.ts" -o -name "*.py" -o -name "*.go" \) -not -path '*/node_modules/*' -exec wc -l {} + 2>/dev/null | tail -1 | awk '{print $1}')
# Skip the "total" summary row that wc prints, so it is not counted as a large file
large_files=$(find . -type f \( -name "*.ts" -o -name "*.py" \) -not -path '*/node_modules/*' -exec wc -l {} + 2>/dev/null | awk '$2 != "total" && $1 > 500 {count++} END {print count+0}')
max_depth=$(find . -type d -not -path '*/node_modules/*' -not -path '*/.git/*' | awk -F/ '{print NF}' | sort -rn | head -1)

Example spawning (all in ONE message for parallel execution):

// 500 files, 50k lines, depth 6, 15 large files → spawn additional agents
Task(
  description="large files",
  subagent_type="explore",
  prompt="Large file analysis: FIND files >500 lines, REPORT complexity hotspots"
)

Task(
  description="deep modules",
  subagent_type="explore",
  prompt="Deep modules at depth 4+: FIND hidden patterns, internal conventions"
)

Task(
  description="cross-cutting",
  subagent_type="explore",
  prompt="Cross-cutting concerns: FIND shared utilities across directories"
)
// ... more based on calculation

</dynamic-agents>
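
The threshold table can be turned into a small calculation. The following is a minimal Python sketch, not part of the original skill: `ProjectScale` and `additional_agents` are hypothetical names, and reading "+1 per 100 files" and "+1 per 10k lines" as floor division is an assumption the source does not spell out.

```python
from dataclasses import dataclass


@dataclass
class ProjectScale:
    """Measurements gathered by the bash scale check (hypothetical container)."""
    total_files: int
    total_lines: int
    max_depth: int
    large_files: int      # number of files over 500 lines
    packages: int = 0     # monorepo packages/workspaces; 0 if not a monorepo
    languages: int = 1


def additional_agents(scale: ProjectScale) -> int:
    """Count extra explore agents, one term per row of the threshold table."""
    extra = 0
    if scale.total_files > 100:
        extra += scale.total_files // 100       # +1 per 100 files (assumed floor)
    if scale.total_lines > 10_000:
        extra += scale.total_lines // 10_000    # +1 per 10k lines (assumed floor)
    if scale.max_depth >= 4:
        extra += 2                              # deep exploration
    if scale.large_files > 10:
        extra += 1                              # complexity hotspots
    extra += scale.packages                     # +1 per package/workspace
    if scale.languages > 1:
        extra += scale.languages                # +1 per language
    return extra


# The example project above: 500 files, 50k lines, depth 6, 15 large files
print(additional_agents(ProjectScale(500, 50_000, 6, 15)))  # 5 + 5 + 2 + 1 = 13
```

Under this reading the example project would get 13 extra agents; the three Task calls shown above are only the start of that list.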

Main Session: Concurrent Analysis

While Task agents execute, main session does:

1. Bash Structural Analysis

# Directory depth + file counts
find . -type d -not -path '*/.*' -not -path '*/node_modules/*' -not -path '*/venv/*' -not -path '*/dist/*' -not -path '*/build/*' | awk -F/ '{print NF-1}' | sort -n | uniq -c

# Files per directory (top 30)
find . -type f -not -path '*/.*' -not -path '*/node_modules/*' | sed 's|/[^/]*$||' | sort | uniq -c | sort -rn | head -30

# Code concentration by extension
find . -type f \( -name "*.py" -o -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.go" -o -name "*.rs" \) -not -path '*/node_modules/*' | sed 's|/[^/]*$||' | sort | uniq -c | sort -rn | head -20

# Existing AGENTS.md / CLAUDE.md
find . -type f \( -name "AGENTS.md" -o -name "CLAUDE.md" \) -not -path '*/node_modules/*' 2>/dev/null

2. Read Existing AGENTS.md

2. 读取现有AGENTS.md文件

For each existing file found:
  Read(filePath=file)
  Extract: key insights, conventions, anti-patterns
  Store in EXISTING_AGENTS map

If --create-new: Read all existing first (preserve context) → then delete all → regenerate.
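
The read step can be sketched in Python as follows. This is an illustration under assumptions, not the skill's implementation: `collect_existing` is a hypothetical helper, it stores raw file text rather than extracted insights, and it skips only `node_modules` and `.git`.

```python
import os
from pathlib import Path

SKIP_DIRS = {"node_modules", ".git"}
TARGET_NAMES = {"AGENTS.md", "CLAUDE.md"}


def collect_existing(root: str) -> dict[str, str]:
    """Map each existing AGENTS.md / CLAUDE.md (path relative to root) to its text."""
    existing: dict[str, str] = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if name in TARGET_NAMES:
                path = Path(dirpath) / name
                rel = path.relative_to(root).as_posix()
                existing[rel] = path.read_text(encoding="utf-8", errors="replace")
    return existing


if __name__ == "__main__":
    # Read everything BEFORE any deletion, even in --create-new mode
    EXISTING_AGENTS = collect_existing(".")
    print(sorted(EXISTING_AGENTS))
```

Building the map first means the old content stays available as context after the files themselves are deleted.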

3. LSP Codemap (if available)

lsp_servers()  # Check availability

# Entry points (parallel)
lsp_document_symbols(filePath="src/index.ts")
lsp_document_symbols(filePath="main.py")

# Key symbols (parallel)
lsp_workspace_symbols(filePath=".", query="class")
lsp_workspace_symbols(filePath=".", query="interface")
lsp_workspace_symbols(filePath=".", query="function")

# Centrality for top exports
lsp_find_references(filePath="...", line=X, character=Y)


**LSP Fallback**: If unavailable, rely on explore agents + AST-grep.

**Merge: bash + LSP + existing + Task agent results. Mark "discovery" as completed.**

---

Phase 2: Scoring & Location Decision

Mark "scoring" as in_progress.

Scoring Matrix

| Factor | Weight | High Threshold | Source |
|---|---|---|---|
| File count | 3x | >20 | bash |
| Subdir count | 2x | >5 | bash |
| Code ratio | 2x | >70% | bash |
| Unique patterns | 1x | Has own config | explore |
| Module boundary | 2x | Has index.ts/__init__.py | bash |
| Symbol density | 2x | >30 symbols | LSP |
| Export count | 2x | >10 exports | LSP |
| Reference centrality | 3x | >20 refs | LSP |

Decision Rules

| Score | Action |
|---|---|
| Root (.) | ALWAYS create |
| >15 | Create AGENTS.md |
| 8-15 | Create if distinct domain |
| <8 | Skip (parent covers) |

Output

AGENTS_LOCATIONS = [
  { path: ".", type: "root" },
  { path: "src/hooks", score: 18, reason: "high complexity" },
  { path: "src/api", score: 12, reason: "distinct domain" }
]

Mark "scoring" as completed.
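
The matrix and decision rules above can be sketched in Python. The source does not say how weights combine into a score, so this sketch assumes the simplest reading: a factor contributes its full weight when its high threshold is met and nothing otherwise. Under that assumption the maximum is 17, so the example score of 18 in the output above suggests the real scoring is graded in some way not specified here. All names (`WEIGHTS`, `score_directory`, `should_create`, the metric keys) are hypothetical.

```python
WEIGHTS = {
    "file_count": 3,
    "subdir_count": 2,
    "code_ratio": 2,
    "unique_patterns": 1,
    "module_boundary": 2,
    "symbol_density": 2,
    "export_count": 2,
    "reference_centrality": 3,
}


def score_directory(metrics: dict) -> int:
    """Sum the weights of every factor whose high threshold is met (assumed rule)."""
    hits = {
        "file_count": metrics.get("file_count", 0) > 20,
        "subdir_count": metrics.get("subdir_count", 0) > 5,
        "code_ratio": metrics.get("code_ratio", 0.0) > 0.70,
        "unique_patterns": metrics.get("has_own_config", False),
        "module_boundary": metrics.get("has_module_index", False),
        "symbol_density": metrics.get("symbols", 0) > 30,
        "export_count": metrics.get("exports", 0) > 10,
        "reference_centrality": metrics.get("refs", 0) > 20,
    }
    return sum(WEIGHTS[name] for name, hit in hits.items() if hit)


def should_create(path: str, score: int, distinct_domain: bool = False) -> bool:
    """Apply the decision rules: root always, >15 always, 8-15 only if distinct."""
    if path == ".":
        return True
    if score > 15:
        return True
    if score >= 8:
        return distinct_domain
    return False


api_metrics = {"file_count": 25, "code_ratio": 0.8, "has_module_index": True, "exports": 12}
api_score = score_directory(api_metrics)  # 3 + 2 + 2 + 2 = 9
print(api_score, should_create("src/api", api_score, distinct_domain=True))
```

Whether a directory counts as a "distinct domain" is a judgment the explore agents supply; here it is passed in as a flag.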

Phase 3: Generate AGENTS.md

Mark "generate" as in_progress.

Root AGENTS.md (Full Treatment)

PROJECT KNOWLEDGE BASE

Generated: {TIMESTAMP} Commit: {SHORT_SHA} Branch: {BRANCH}

OVERVIEW

{1-2 sentences: what + core stack}

STRUCTURE

```
{root}/
├── {dir}/    # {non-obvious purpose only}
└── {entry}
```

WHERE TO LOOK

| Task | Location | Notes |

CODE MAP

{From LSP - skip if unavailable or project <10 files}

| Symbol | Type | Location | Refs | Role |

CONVENTIONS

{ONLY deviations from standard}

ANTI-PATTERNS (THIS PROJECT)

{Explicitly forbidden here}

UNIQUE STYLES

{Project-specific}

COMMANDS

```bash
{dev/test/build}
```

NOTES

{Gotchas}

**Quality gates**: 50-150 lines, no generic advice, no obvious info.

Subdirectory AGENTS.md (Parallel)

Launch general agents for each location in ONE message (parallel execution):

// All in single message = parallel
Task(
  description="AGENTS.md for src/hooks",
  subagent_type="general",
  prompt="Generate AGENTS.md for: src/hooks
    - Reason: high complexity
    - 30-80 lines max
    - NEVER repeat parent content
    - Sections: OVERVIEW (1 line), STRUCTURE (if >5 subdirs), WHERE TO LOOK, CONVENTIONS (if different), ANTI-PATTERNS
    - Write directly to src/hooks/AGENTS.md"
)

Task(
  description="AGENTS.md for src/api",
  subagent_type="general",
  prompt="Generate AGENTS.md for: src/api
    - Reason: distinct domain
    - 30-80 lines max
    - NEVER repeat parent content
    - Sections: OVERVIEW (1 line), STRUCTURE (if >5 subdirs), WHERE TO LOOK, CONVENTIONS (if different), ANTI-PATTERNS
    - Write directly to src/api/AGENTS.md"
)
// ... one Task per AGENTS_LOCATIONS entry

Results return directly. Mark "generate" as completed.
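
Since every subdirectory prompt follows the same shape, the Task arguments can be derived from AGENTS_LOCATIONS. A minimal Python sketch, assuming each non-root entry carries `path` and `reason` keys as in the Phase 2 output; `build_subdir_prompt` and `build_subdir_tasks` are hypothetical helpers, and the result is plain data, not a call into any real Task API.

```python
SECTIONS = (
    "OVERVIEW (1 line), STRUCTURE (if >5 subdirs), WHERE TO LOOK, "
    "CONVENTIONS (if different), ANTI-PATTERNS"
)


def build_subdir_prompt(location: dict) -> str:
    """Render the prompt text for one subdirectory entry."""
    path = location["path"]
    return "\n".join([
        f"Generate AGENTS.md for: {path}",
        f"- Reason: {location['reason']}",
        "- 30-80 lines max",
        "- NEVER repeat parent content",
        f"- Sections: {SECTIONS}",
        f"- Write directly to {path}/AGENTS.md",
    ])


def build_subdir_tasks(locations: list[dict]) -> list[dict]:
    """One Task argument set per non-root location; the root is generated first, separately."""
    return [
        {
            "description": f"AGENTS.md for {loc['path']}",
            "subagent_type": "general",
            "prompt": build_subdir_prompt(loc),
        }
        for loc in locations
        if loc.get("type") != "root"
    ]


AGENTS_LOCATIONS = [
    {"path": ".", "type": "root"},
    {"path": "src/hooks", "score": 18, "reason": "high complexity"},
    {"path": "src/api", "score": 12, "reason": "distinct domain"},
]

for task in build_subdir_tasks(AGENTS_LOCATIONS):
    print(task["description"])
```

All resulting Task calls still have to be issued in ONE message to run in parallel.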

Phase 4: Review & Deduplicate

Mark "review" as in_progress.

For each generated file:

- Remove generic advice
- Remove parent duplicates
- Trim to size limits
- Verify telegraphic style

Mark "review" as completed.
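
Two of these checks are mechanical and can be sketched in Python: lines copied verbatim from the parent, and the size limit. This is an illustrative helper, not the skill's method. `review_issues` is a hypothetical name, the 80-line default comes from the subdirectory limit above, and exempting lines that start with `#` assumes headings are markdown-style and may legitimately repeat. Generic advice and telegraphic style still need judgment.

```python
def review_issues(child_text: str, parent_text: str, max_lines: int = 80) -> list[str]:
    """Report parent duplicates and size-limit violations in a child AGENTS.md."""
    # Non-blank parent lines, ignoring headings (assumed to start with '#')
    parent_lines = {
        line.strip()
        for line in parent_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    issues = []
    for line in child_text.splitlines():
        if line.strip() in parent_lines:
            issues.append(f"duplicate of parent: {line.strip()}")
    line_count = len(child_text.splitlines())
    if line_count > max_lines:
        issues.append(f"too long: {line_count} lines (limit {max_lines})")
    return issues


parent = "# ROOT\nUse pnpm, never npm.\n"
child = "# HOOKS\nUse pnpm, never npm.\nHooks return tuples.\n"
print(review_issues(child, parent))  # ['duplicate of parent: Use pnpm, never npm.']
```

For the root file, the same function can be called with an empty parent and `max_lines=150`.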

Final Report

=== index-knowledge Complete ===

Mode: {update | create-new}

Files:
  ✓ ./AGENTS.md (root, {N} lines)
  ✓ ./src/hooks/AGENTS.md ({N} lines)

Dirs Analyzed: {N}
AGENTS.md Created: {N}
AGENTS.md Updated: {N}

Hierarchy:
  ./AGENTS.md
  └── src/hooks/AGENTS.md

Anti-Patterns

- Static agent count: MUST vary agents based on project size/depth
- Sequential execution: MUST parallel (multiple Task calls in one message)
- Ignoring existing: ALWAYS read existing first, even with --create-new
- Over-documenting: Not every dir needs AGENTS.md
- Redundancy: Child never repeats parent
- Generic content: Remove anything that applies to ALL projects
- Verbose style: Telegraphic or die
