index-knowledge
Generate hierarchical AGENTS.md files. Root + complexity-scored subdirectories.
Usage
--create-new # Read existing → remove all → regenerate from scratch
--max-depth=2 # Limit directory depth (default: 5)
Default: Update mode (modify existing + create new where warranted)
Workflow (High-Level)
- Discovery + Analysis (concurrent)
  - Launch parallel explore agents (multiple Task calls in one message)
  - Main session: bash structure + LSP codemap + read existing AGENTS.md
- Score & Decide - Determine AGENTS.md locations from merged findings
- Generate - Root first, then subdirs in parallel
- Review - Deduplicate, trim, validate
TodoWrite([
{ id: "discovery", content: "Fire explore agents + LSP codemap + read existing", status: "pending", priority: "high" },
{ id: "scoring", content: "Score directories, determine locations", status: "pending", priority: "high" },
{ id: "generate", content: "Generate AGENTS.md files (root + subdirs)", status: "pending", priority: "high" },
{ id: "review", content: "Deduplicate, validate, trim", status: "pending", priority: "medium" }
])
Phase 1: Discovery + Analysis (Concurrent)
Mark "discovery" as in_progress.
Launch Parallel Explore Agents
Multiple Task calls in a single message execute in parallel. Results return directly.
// All Task calls in ONE message = parallel execution
Task(
description="project structure",
subagent_type="explore",
prompt="Project structure: PREDICT standard patterns for detected language → REPORT deviations only"
)
Task(
description="entry points",
subagent_type="explore",
prompt="Entry points: FIND main files → REPORT non-standard organization"
)
Task(
description="conventions",
subagent_type="explore",
prompt="Conventions: FIND config files (.eslintrc, pyproject.toml, .editorconfig) → REPORT project-specific rules"
)
Task(
description="anti-patterns",
subagent_type="explore",
prompt="Anti-patterns: FIND 'DO NOT', 'NEVER', 'ALWAYS', 'DEPRECATED' comments → LIST forbidden patterns"
)
Task(
description="build/ci",
subagent_type="explore",
prompt="Build/CI: FIND .github/workflows, Makefile → REPORT non-standard patterns"
)
Task(
description="test patterns",
subagent_type="explore",
prompt="Test patterns: FIND test configs, test structure → REPORT unique conventions"
)
<dynamic-agents>
| Factor | Threshold | Additional Agents |
|---|---|---|
| Total files | >100 | +1 per 100 files |
| Total lines | >10k | +1 per 10k lines |
| Directory depth | ≥4 | +2 for deep exploration |
| Large files (>500 lines) | >10 files | +1 for complexity hotspots |
| Monorepo | detected | +1 per package/workspace |
| Multiple languages | >1 | +1 per language |
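The table above can be read as a small calculation. The following is a minimal sketch, not the skill's actual implementation: the function name `extra_explore_agents` and its parameters are hypothetical, and it assumes "+1 per 100 files" and "+1 per 10k lines" mean integer division once the threshold is exceeded.

```python
def extra_explore_agents(total_files, total_lines, max_depth, large_files,
                         packages=0, languages=1):
    """Hypothetical helper: number of additional explore agents per the table."""
    extra = 0
    if total_files > 100:
        extra += total_files // 100      # +1 per 100 files (assumed floor division)
    if total_lines > 10_000:
        extra += total_lines // 10_000   # +1 per 10k lines (assumed floor division)
    if max_depth >= 4:
        extra += 2                       # +2 for deep exploration
    if large_files > 10:
        extra += 1                       # +1 for complexity hotspots
    extra += packages                    # monorepo: +1 per package/workspace (0 if none)
    if languages > 1:
        extra += languages               # +1 per language
    return extra


# 500 files, 50k lines, depth 6, 15 large files
print(extra_explore_agents(500, 50_000, 6, 15))  # 5 + 5 + 2 + 1 = 13
```

Under these assumptions a project at or below every threshold spawns no additional agents beyond the base explore set.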
# Measure project scale first
total_files=$(find . -type f -not -path '*/node_modules/*' -not -path '*/.git/*' | wc -l)
total_lines=$(find . -type f \( -name "*.ts" -o -name "*.py" -o -name "*.go" \) -not -path '*/node_modules/*' -exec wc -l {} + 2>/dev/null | tail -1 | awk '{print $1}')
# Skip the "total" summary line that wc prints so it is not counted as a large file
large_files=$(find . -type f \( -name "*.ts" -o -name "*.py" \) -not -path '*/node_modules/*' -exec wc -l {} + 2>/dev/null | awk '$2 != "total" && $1 > 500 {count++} END {print count+0}')
# NF-1 so that "." counts as depth 0
max_depth=$(find . -type d -not -path '*/node_modules/*' -not -path '*/.git/*' | awk -F/ '{print NF-1}' | sort -rn | head -1)
Example spawning (all in ONE message for parallel execution):
// 500 files, 50k lines, depth 6, 15 large files → spawn additional agents
Task(
description="large files",
subagent_type="explore",
prompt="Large file analysis: FIND files >500 lines, REPORT complexity hotspots"
)
Task(
description="deep modules",
subagent_type="explore",
prompt="Deep modules at depth 4+: FIND hidden patterns, internal conventions"
)
Task(
description="cross-cutting",
subagent_type="explore",
prompt="Cross-cutting concerns: FIND shared utilities across directories"
)
// ... more based on calculation
</dynamic-agents>
Main Session: Concurrent Analysis
While Task agents execute, main session does:
1. Bash Structural Analysis
# Directory depth + file counts
find . -type d -not -path '*/.*' -not -path '*/node_modules/*' -not -path '*/venv/*' -not -path '*/dist/*' -not -path '*/build/*' | awk -F/ '{print NF-1}' | sort -n | uniq -c
# Files per directory (top 30)
find . -type f -not -path '*/.*' -not -path '*/node_modules/*' | sed 's|/[^/]*$||' | sort | uniq -c | sort -rn | head -30
# Code concentration by extension
find . -type f \( -name "*.py" -o -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.go" -o -name "*.rs" \) -not -path '*/node_modules/*' | sed 's|/[^/]*$||' | sort | uniq -c | sort -rn | head -20
# Existing AGENTS.md / CLAUDE.md
find . -type f \( -name "AGENTS.md" -o -name "CLAUDE.md" \) -not -path '*/node_modules/*' 2>/dev/null
2. Read Existing AGENTS.md
For each existing file found:
Read(filePath=file)
Extract: key insights, conventions, anti-patterns
Store in EXISTING_AGENTS map
If --create-new: Read all existing first (preserve context) → then delete all → regenerate.
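The read-before-delete ordering can be sketched in a few lines. This is only an illustration under stated assumptions: `collect_existing` is a hypothetical helper, the returned dict stands in for the EXISTING_AGENTS map, and it assumes that only AGENTS.md files (not CLAUDE.md) are removed in --create-new mode.

```python
from pathlib import Path


def collect_existing(root, create_new=False):
    """Hypothetical helper: read existing knowledge files, then optionally delete."""
    root = Path(root)
    found = [
        p for p in root.rglob("*.md")
        if p.name in ("AGENTS.md", "CLAUDE.md") and "node_modules" not in p.parts
    ]
    # Read everything first so context survives the deletion step below.
    existing = {
        p.relative_to(root).as_posix(): p.read_text(encoding="utf-8") for p in found
    }
    if create_new:
        for p in found:
            if p.name == "AGENTS.md":  # assumption: CLAUDE.md is left untouched
                p.unlink()
    return existing
```

The point of the ordering is that insights from the old files remain available to the regeneration step even after the files themselves are gone.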
3. LSP Codemap (if available)
lsp_servers() # Check availability
# Entry points (parallel)
lsp_document_symbols(filePath="src/index.ts")
lsp_document_symbols(filePath="main.py")
# Key symbols (parallel)
lsp_workspace_symbols(filePath=".", query="class")
lsp_workspace_symbols(filePath=".", query="interface")
lsp_workspace_symbols(filePath=".", query="function")
# Centrality for top exports
lsp_find_references(filePath="...", line=X, character=Y)
**LSP Fallback**: If unavailable, rely on explore agents + AST-grep.
**Merge: bash + LSP + existing + Task agent results. Mark "discovery" as completed.**
---
Phase 2: Scoring & Location Decision
Mark "scoring" as in_progress.
Scoring Matrix
| Factor | Weight | High Threshold | Source |
|---|---|---|---|
| File count | 3x | >20 | bash |
| Subdir count | 2x | >5 | bash |
| Code ratio | 2x | >70% | bash |
| Unique patterns | 1x | Has own config | explore |
| Module boundary | 2x | Has index.ts/__init__.py | bash |
| Symbol density | 2x | >30 symbols | LSP |
| Export count | 2x | >10 exports | LSP |
| Reference centrality | 3x | >20 refs | LSP |
Decision Rules
| Score | Action |
|---|---|
| Root (.) | ALWAYS create |
| >15 | Create AGENTS.md |
| 8-15 | Create if distinct domain |
| <8 | Skip (parent covers) |
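The decision rules above map directly onto a small predicate. The sketch below is illustrative only: `should_create` and the `distinct_domain` flag are hypothetical names, the scores are taken as already computed (how the matrix weights combine into a score is not specified here), and `src/utils` is an invented example directory.

```python
def should_create(path, score, distinct_domain=False):
    """Hypothetical predicate implementing the decision table."""
    if path == ".":
        return True               # root: ALWAYS create
    if score > 15:
        return True               # >15: create
    if 8 <= score <= 15:
        return distinct_domain    # 8-15: create only for a distinct domain
    return False                  # <8: skip, parent covers it


# (path, score, distinct_domain); src/utils is a made-up low-scoring directory
scored = [("src/hooks", 18, False), ("src/api", 12, True), ("src/utils", 5, False)]
locations = [{"path": ".", "type": "root"}] + [
    {"path": p, "score": s} for p, s, d in scored if should_create(p, s, d)
]
print([loc["path"] for loc in locations])  # ['.', 'src/hooks', 'src/api']
```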
Output
AGENTS_LOCATIONS = [
{ path: ".", type: "root" },
{ path: "src/hooks", score: 18, reason: "high complexity" },
{ path: "src/api", score: 12, reason: "distinct domain" }
]
Mark "scoring" as completed.
Phase 3: Generate AGENTS.md
Mark "generate" as in_progress.
Root AGENTS.md (Full Treatment)
PROJECT KNOWLEDGE BASE
Generated: {TIMESTAMP}
Commit: {SHORT_SHA}
Branch: {BRANCH}
OVERVIEW
{1-2 sentences: what + core stack}
STRUCTURE
```
{root}/
├── {dir}/ # {non-obvious purpose only}
└── {entry}
```
WHERE TO LOOK
| Task | Location | Notes |
|---|---|---|
CODE MAP
{From LSP - skip if unavailable or project <10 files}
| Symbol | Type | Location | Refs | Role |
|---|---|---|---|---|
CONVENTIONS
{ONLY deviations from standard}
ANTI-PATTERNS (THIS PROJECT)
{Explicitly forbidden here}
UNIQUE STYLES
{Project-specific}
COMMANDS
```bash
{dev/test/build}
```
NOTES
{Gotchas}
**Quality gates**: 50-150 lines, no generic advice, no obvious info.
Subdirectory AGENTS.md (Parallel)
Launch general agents for each location in ONE message (parallel execution):
// All in single message = parallel
Task(
description="AGENTS.md for src/hooks",
subagent_type="general",
prompt="Generate AGENTS.md for: src/hooks
- Reason: high complexity
- 30-80 lines max
- NEVER repeat parent content
- Sections: OVERVIEW (1 line), STRUCTURE (if >5 subdirs), WHERE TO LOOK, CONVENTIONS (if different), ANTI-PATTERNS
- Write directly to src/hooks/AGENTS.md"
)
Task(
description="AGENTS.md for src/api",
subagent_type="general",
prompt="Generate AGENTS.md for: src/api
- Reason: distinct domain
- 30-80 lines max
- NEVER repeat parent content
- Sections: OVERVIEW (1 line), STRUCTURE (if >5 subdirs), WHERE TO LOOK, CONVENTIONS (if different), ANTI-PATTERNS
- Write directly to src/api/AGENTS.md"
)
// ... one Task per AGENTS_LOCATIONS entry
Results return directly. Mark "generate" as completed.
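Since every subdirectory Task uses the same prompt shape, the per-location calls can be derived from AGENTS_LOCATIONS. The following is a rough sketch rather than the skill's real mechanism: `build_prompt` and the `tasks` list of plain dicts are hypothetical stand-ins for the Task calls shown above.

```python
def build_prompt(path, reason):
    """Hypothetical helper: fill the subdirectory prompt template for one location."""
    return (
        f"Generate AGENTS.md for: {path}\n"
        f"- Reason: {reason}\n"
        "- 30-80 lines max\n"
        "- NEVER repeat parent content\n"
        "- Sections: OVERVIEW (1 line), STRUCTURE (if >5 subdirs), WHERE TO LOOK, "
        "CONVENTIONS (if different), ANTI-PATTERNS\n"
        f"- Write directly to {path}/AGENTS.md"
    )


locations = [
    {"path": ".", "type": "root"},
    {"path": "src/hooks", "score": 18, "reason": "high complexity"},
    {"path": "src/api", "score": 12, "reason": "distinct domain"},
]

# Root is generated first and separately; every other entry becomes one task.
tasks = [
    {
        "description": f"AGENTS.md for {loc['path']}",
        "subagent_type": "general",
        "prompt": build_prompt(loc["path"], loc["reason"]),
    }
    for loc in locations
    if loc.get("type") != "root"
]
print(len(tasks))  # 2
```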
Phase 4: Review & Deduplicate
Mark "review" as in_progress.
For each generated file:
- Remove generic advice
- Remove parent duplicates
- Trim to size limits
- Verify telegraphic style
Mark "review" as completed.
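Two of the review steps, removing parent duplicates and trimming to the size limit, are mechanical enough to sketch. This is a simplified illustration, assuming line-level exact matching is an acceptable proxy for "duplicate" and that trimming means keeping the first `max_lines` lines; `review` is a hypothetical name, and removing generic advice or checking telegraphic style would still need judgment.

```python
def review(child_text, parent_text, max_lines=80):
    """Hypothetical helper: drop lines already in the parent, then trim to size."""
    parent_lines = {ln.strip() for ln in parent_text.splitlines() if ln.strip()}
    # Keep blank lines and anything the parent does not already state verbatim.
    kept = [ln for ln in child_text.splitlines() if ln.strip() not in parent_lines]
    return "\n".join(kept[:max_lines])


parent = "Use pnpm\nNo default exports"
child = "Use pnpm\nHooks return tuples\nNo default exports\nPrefix with use"
print(review(child, parent))
```

With the sample input, only the two hook-specific lines remain in the child file.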
Final Report
=== index-knowledge Complete ===
Mode: {update | create-new}
Files:
✓ ./AGENTS.md (root, {N} lines)
✓ ./src/hooks/AGENTS.md ({N} lines)
Dirs Analyzed: {N}
AGENTS.md Created: {N}
AGENTS.md Updated: {N}
Hierarchy:
./AGENTS.md
└── src/hooks/AGENTS.md
Anti-Patterns
- Static agent count: MUST vary agents based on project size/depth
- Sequential execution: MUST parallel (multiple Task calls in one message)
- Ignoring existing: ALWAYS read existing first, even with --create-new
- Over-documenting: Not every dir needs AGENTS.md
- Redundancy: Child never repeats parent
- Generic content: Remove anything that applies to ALL projects
- Verbose style: Telegraphic or die