project-profiler
Generate an LLM-optimized project profile — a judgment-rich document that lets any future LLM answer within 60 seconds:
- What are the core abstractions?
- Which modules to modify for feature X?
- What is the biggest risk/debt?
- When should / shouldn't you use this?
This is NOT a codebase map (directory + module navigation) or a diff schematic. This is architectural judgment: design tradeoffs, usage patterns, and when NOT to use.
Model Strategy
- Opus: Orchestrator — runs all phases, writes the final profile. Does NOT read source code directly (except in direct mode).
- Sonnet: Subagents — read source code files, analyze patterns, report structured findings.
- All subagents launch in a single message (parallel, never sequential).
Phase 0: Preflight
0.1 Target & Project Name
Determine the target directory (use the argument if provided, else `.`).
Extract the project name from the first available source:
- `package.json` → `name`
- `pyproject.toml` → `[project] name`
- `Cargo.toml` → `[package] name`
- `go.mod` → module path (last segment)
- Directory name as fallback
0.2 Run Scanner
```bash
uv run {SKILL_DIR}/scripts/scan-project.py {TARGET_DIR} --format summary
```
Capture the summary output. This provides:
- Project metadata (name, version, license, deps count)
- Tech stack (languages, frameworks, package manager)
- Language distribution (top 5 by tokens)
- Entry points (CLI, API, library)
- Project features (dockerfile, CI, tests, codebase_map)
- Detected conditional sections (Storage, Embedding, Infrastructure, etc.)
- Workspaces (monorepo packages, if any)
- Top 20 largest files
- Directory structure (depth 3)
For debugging or when full file details are needed, use `--format json` instead.
0.3 Git Metadata
Run these commands (use the Bash tool):
```bash
# Recent commits
git -C {TARGET_DIR} log --oneline -20

# Contributors
git -C {TARGET_DIR} log --format="%aN" | sort -u | head -20

# Version tags
git -C {TARGET_DIR} tag --sort=-v:refname | head -5

# First commit date
git -C {TARGET_DIR} log --format="%aI" --reverse | head -1
```
0.4 Check Existing CODEBASE_MAP
If `docs/CODEBASE_MAP.md` exists, note its presence. The profile will reference it rather than duplicating directory structure.
0.5 Token Budget → Execution Mode
Based on `total_tokens` from the scanner, choose an execution mode:
| Total Tokens | Mode | Strategy |
|---|---|---|
| ≤ 80k | Direct | Skip subagents. Opus reads all files directly and performs all analysis in a single context. |
| 80k – 200k | 2 agents | Agent AB (Core + Architecture + Design), Agent C (Usage + Patterns + Deployment) |
| 200k – 400k | 3 agents | Agent A (Core + Design), Agent B (Architecture + Patterns), Agent C (Usage + Deployment) |
| > 400k | 3 agents | Agent A, Agent B, Agent C — each ≤150k tokens, with overflow files assigned to lightest agent |
Why 80k threshold: Opus has 200k context. At ≤80k source tokens, loading all files + scanner output + git metadata + writing the profile all fit comfortably. Subagent overhead (spawn + communication + wait) adds 2-3 minutes for zero benefit.
Direct mode workflow: Skip Phase 2 entirely. After Phase 0+1, proceed to Phase 3 (read the scanner's `detected_sections` output directly), then Phase 4, then Phase 5. Read files on-demand during synthesis — do NOT pre-read all files; read only what's needed for each section.
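The threshold table can be sketched as a small helper (the mode and agent labels mirror the table above; the function name is hypothetical):

```python
def choose_mode(total_tokens: int) -> tuple[str, int]:
    """Map scanner total_tokens to (mode, subagent_count) per the table."""
    if total_tokens <= 80_000:
        return ("direct", 0)     # Opus reads everything itself
    if total_tokens <= 200_000:
        return ("subagents", 2)  # Agent AB + Agent C
    if total_tokens <= 400_000:
        return ("subagents", 3)  # Agents A, B, C
    return ("subagents", 3)      # Agents A, B, C, each capped near 150k tokens
```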
Phase 1: Community & External Data
Run in parallel with Phase 2 subagent launches (or with Phase 3 in direct mode).
1.1 GitHub Stats
Parse owner/repo from the `.git/config` remote origin URL:
```bash
git -C {TARGET_DIR} remote get-url origin
```
Extract `owner/repo` from the URL. Then:
```bash
gh api repos/{owner}/{repo} --jq '{stars: .stargazers_count, forks: .forks_count, open_issues: .open_issues_count}'
```
If `gh` is unavailable or not a GitHub repo → fill with `N/A`. Do not fail.
1.2 Package Downloads
npm (if `package.json` exists):
```
WebFetch https://api.npmjs.org/downloads/point/last-month/{package_name}
```
PyPI (if `pyproject.toml` exists):
```
WebFetch https://pypistats.org/api/packages/{package_name}/recent
```
If fetch fails → fill with `N/A`.
1.3 License
Read from (in order): LICENSE file → package metadata field → `N/A`.
1.4 Maturity Assessment
Calculate from:
- Git history length: first commit date → now
- Release count: number of version tags
- Contributor count: unique authors
| Criteria | Score |
|---|---|
| < 3 months, < 3 releases, 1-2 contributors | experimental |
| 3-12 months, 3-10 releases, 2-5 contributors | growing |
| 1-3 years, 10-50 releases, 5-20 contributors | stable |
| > 3 years, > 50 releases, > 20 contributors | mature |
Use the lowest matching tier (conservative estimate).
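One way to read "use the lowest matching tier": score each criterion independently against the table above and take the minimum (a sketch; thresholds from the table, function names hypothetical):

```python
def maturity(age_months: float, releases: int, contributors: int) -> str:
    """Conservative maturity: each criterion votes a tier; the lowest wins."""
    tiers = ["experimental", "growing", "stable", "mature"]

    def age_tier(m):
        if m < 3: return 0
        if m <= 12: return 1
        if m <= 36: return 2
        return 3

    def release_tier(r):
        if r < 3: return 0
        if r <= 10: return 1
        if r <= 50: return 2
        return 3

    def contrib_tier(c):
        if c <= 2: return 0
        if c <= 5: return 1
        if c <= 20: return 2
        return 3

    score = min(age_tier(age_months), release_tier(releases),
                contrib_tier(contributors))
    return tiers[score]
```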
Phase 2: Parallel Deep Exploration
Direct mode (≤80k tokens): SKIP this entire phase. Proceed to Phase 3. Opus reads files directly during synthesis.
Launch Sonnet subagents using the Task tool. All subagents must be launched in a single message.
Assign files to each agent based on the token budget from Phase 0.5. Use the scanner output to determine which files go to which agent.
File Assignment Strategy
If workspaces detected (monorepo):
- Group files by workspace package
- Assign complete packages to agents (never split a package across agents)
- Agent A gets packages with core business logic
- Agent B gets packages with infrastructure/shared libraries
- Agent C gets packages with CLI/API/SDK surface + docs
If no workspaces (single project):
- Sort all files by path
- Group by top-level directory
- Assign groups to agents based on their responsibility:
- Agent A gets: core source files (src/lib, core/, models/, types/) + README, CHANGELOG
- Agent B gets: architecture files (routes/, middleware/, config/, entry points) + tests/
- Agent C gets: integration files (API, CLI, SDK, examples/, docs/) + .github/
- If files don't fit neatly, distribute remaining to agents under budget
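The single-project branch above could be sketched like this (the directory-to-agent mapping is illustrative, not exhaustive, and the function name is hypothetical):

```python
from collections import defaultdict

# Illustrative mapping from top-level path to agent, per the list above.
AGENT_FOR_DIR = {
    "src": "A", "lib": "A", "core": "A", "models": "A", "types": "A",
    "README.md": "A", "CHANGELOG.md": "A",
    "routes": "B", "middleware": "B", "config": "B", "tests": "B",
    "api": "C", "cli": "C", "sdk": "C", "examples": "C", "docs": "C",
    ".github": "C",
}

def assign_files(files: list[tuple[str, int]]) -> dict[str, list[str]]:
    """files: (path, token_count) pairs. Returns agent -> assigned paths."""
    assignment = defaultdict(list)
    budget = defaultdict(int)
    leftovers = []
    for path, tokens in sorted(files):
        top = path.split("/", 1)[0]
        agent = AGENT_FOR_DIR.get(top)
        if agent:
            assignment[agent].append(path)
            budget[agent] += tokens
        else:
            leftovers.append((path, tokens))
    # Files that don't fit neatly go to whichever agent is under budget (lightest)
    for path, tokens in leftovers:
        agent = min("ABC", key=lambda a: budget[a])
        assignment[agent].append(path)
        budget[agent] += tokens
    return dict(assignment)
```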
Agent A: Core Abstractions + Design Decisions
Task prompt for Agent A — subagent_type: "general-purpose", model: "sonnet"
Mission
Identify the most architecturally significant abstractions AND key design decisions in this codebase.
Files to Read
{LIST_OF_ASSIGNED_FILES}
Also read: README.md, CHANGELOG.md (if they exist and not already assigned)
Output Format
Part 1: Core Abstractions
Report the TOP 10-15 most architecturally significant abstractions, ranked by fan-in (how many other files reference them). If the project has fewer than 15 meaningful abstractions, report all.
For EACH abstraction:
{Name}
- Purpose: {≤15 words}
- Defined in: {file_path}:ClassName or {file_path}:function_name
- Type: {class / interface / type / trait / struct / protocol}
- Public methods/fields: {exact_count}
- Adapters/implementations: {count} — {names with file paths}
- Imported by: {count} files
- Key pattern: {factory / singleton / strategy / observer / none}
Part 2: Design Decisions
For EACH decision (identify 3-5):
{Decision Title}
- Problem: {what needed solving}
- Choice made: {what was chosen}
- Evidence: {file_path}:ClassName or {file_path}:function_name — {relevant code pattern}
- Alternatives NOT chosen: {what else could have been done}
- Why not: {concrete reason — performance / complexity / ecosystem / team preference}
- Tradeoff: {what is gained} vs. {what is lost}
Part 3: Architecture Risks
For EACH risk (identify 2-4):
- Risk: {specific description}
- Location: {file_path}:SymbolName
- Impact: {what breaks if this goes wrong}
- Mitigation: {how to fix or reduce risk}
Part 4: Recommendations
For EACH recommendation (identify 2-4):
- Current state: {file_path} — {what exists now}
- Problem: {specific issue — not "could be better"}
- Fix: {concrete action — not "consider refactoring"}
- Effect: {measurable outcome}
Rules
- Every number must come from actual code (count imports, count methods)
- No subjective language (no "well-designed", "elegant", "robust", "clean", "優雅", "完美", "強大")
- Every claim needs a file:SymbolName reference (NOT line numbers — they break on next commit)
- Each decision must have a "why NOT the alternative" answer
- Report the TOTAL count of abstractions found
Agent B: Architecture + Code Quality Patterns
Task prompt for Agent B — subagent_type: "general-purpose", model: "sonnet"
Mission
Map the system topology, layer boundaries, data flow paths, AND code quality patterns.
Files to Read
{LIST_OF_ASSIGNED_FILES}
Output Format
Part 1: Topology
- Architecture style: {monolith / microservices / serverless / library / CLI tool / plugin system}
- Entry points: {list with file paths}
- Layer count: {N}
Part 2: Layers (table)
| Layer | Modules | Files | Responsibility |
|---|---|---|---|
Part 3: Data Flow Paths
For each major user-facing operation:
- {Operation name}: {step1_module} → {step2_module} → ... → {result}
- Evidence: {file:SymbolName} for each step
Part 4: Mermaid Diagram Elements
Provide raw data for Mermaid diagrams:
- Nodes: {module_name} — {file_path}
- Edges: {from} → {to} — {relationship_type: imports/calls/extends}
Part 5: Module Dependencies (structured)
For each module:
- {module_name} ({path}): imports [{dep1}, {dep2}, ...]
Part 6: Boundary Violations
List any cases where a lower layer imports from a higher layer.
Part 7: Code Quality Patterns
- Error handling: {strategy and consistency — e.g., "try/catch at controller layer, custom AppError class"}
- Logging: {framework and coverage — e.g., "winston, structured JSON, covers all API routes"}
- Testing: {framework, coverage level, patterns — e.g., "vitest, 47 test files, unit + integration"}
- Type safety: {strict / partial / none — e.g., "strict TypeScript with no `any` casts"}
Rules
- Every number must come from actual code
- No subjective language (no "well-designed", "elegant", "robust", "clean", "優雅", "完美", "強大")
- Every claim needs a file:SymbolName reference (NOT line numbers)
- Focus on HOW data moves, not WHAT the code does
Agent C: Usage + Deployment + Security
Task prompt for Agent C — subagent_type: "general-purpose", model: "sonnet"
Mission
Document all consumption interfaces, deployment modes, security surface, and AI agent integration points.
Files to Read
{LIST_OF_ASSIGNED_FILES}
Output Format
Part 1: Consumption Interfaces
For each interface found:
- Type: {Python SDK / TS SDK / REST API / MCP / CLI / Vercel AI SDK / Library import}
- Entry point: {file_path}:ClassName or {file_path}:function_name
- Public surface: {N} exported functions/classes/endpoints
- Example usage: {minimal code snippet from docs/examples or inferred from exports}
Part 2: Configuration
| Source | Path | Key Settings |
|---|---|---|
Part 3: Deployment Modes
| Mode | Evidence | Prerequisites |
|---|---|---|
Part 4: AI Agent Integration
- MCP tools: {count and names, if any}
- Function calling schemas: {count, if any}
- Tool definitions: {count, if any}
- SDK integration: {Vercel AI SDK / LangChain / LlamaIndex / custom}
Part 5: Security Surface
- API key handling: {how and where}
- Auth mechanism: {type and file}
- CORS config: {if applicable}
- Data at rest: {encrypted / plaintext / N/A}
- PII handling: {anonymized / logged / none detected}
Part 6: Performance & Cost Indicators
| Metric | Value | Source |
|---|---|---|
| {LLM calls per request} | {N} | |
| {Cache strategy} | {type} | |
| {Rate limiting} | {config} | |
Rules
- Every number must come from actual code
- No subjective language (no "well-designed", "elegant", "robust", "clean", "優雅", "完美", "強大")
- Every claim needs a file:SymbolName reference (NOT line numbers)
- Include BOTH documented and undocumented interfaces
---
Phase 3: Conditional Section Detection
Read the scanner's `detected_sections` output from Phase 0.2. This is the primary detection source — the scanner checks dependency manifests and file presence automatically.
Cross-reference with subagent reports (skip in direct mode) for additional evidence richness. If a subagent reports a pattern not caught by the scanner (e.g., concurrency via raw `Promise.all` without a library dependency), add it.
Refer to `references/section-detection-rules.md` for the full pattern reference.
Record results as a checklist:
- [x] Storage Layer — scanner detected: prisma in dependencies
- [ ] Embedding Pipeline — not detected
- [x] Infrastructure Layer — scanner detected: Dockerfile present
- [ ] Knowledge Graph — not detected
- [ ] Scalability — not detected
- [x] Concurrency — Agent B reported: Promise.all pattern in src/worker.ts
Phase 4: Synthesis & Draft
4.1 Merge Reports
Subagent mode: Combine all subagent outputs into a working document.
Direct mode: Read key files on-demand as you write each section. Do NOT pre-read all files. For each section, read only the files relevant to that section's analysis.
Cross-validate:
- Core abstractions ↔ Architecture layers: each abstraction belongs to a layer
- Architecture data flow ↔ Usage interfaces: flows end at documented interfaces
- Design decisions ↔ Code evidence: decisions are backed by found patterns
4.2 Generate Mermaid Diagrams + Structured Dependencies
Using Agent B's raw data (or direct file analysis in direct mode), create:
Architecture Topology (`graph TB`):
- Each node = actual module/directory
- Each edge = import/dependency relationship
- Label edges with relationship type
- Group nodes by layer using subgraph
Data Flow (`sequenceDiagram`):
- Each participant = actual module
- Each arrow = actual function call or event
- Cover the primary user-facing operation
Structured Module Dependencies (text, below each Mermaid diagram):
- Provide a machine-parseable dependency list as fallback for LLM readers
- Format: **{module_name}** (`{path}`): imports [{dep1}, {dep2}, ...]
4.3 Fill Output Template
Follow `references/output-template.md` exactly. Fill each section:
| Section | Primary Source | Secondary Source |
|---|---|---|
| 1. Project Identity | Scanner metadata + Phase 1 | Git metadata |
| 2. Architecture | Agent B (Parts 1-6) | Agent A (abstractions per layer) |
| 3. Core Abstractions | Agent A (Part 1) | Agent B (layer context) |
| 4. Conditional | Phase 3 detection + relevant agents | — |
| 5. Usage Guide | Agent C (Parts 1-4) | Scanner entry_points |
| 6. Performance & Cost | Agent C (Part 6) + Agent B | — |
| 7. Security & Privacy | Agent C (Part 5) | — |
| 8. Design Decisions | Agent A (Part 2) | Agent B (architecture context) |
| 8.5 Code Quality & Patterns | Agent B (Part 7) | Agent A (supporting observations) |
| 9. Recommendations | Agent A (Part 4) | Agents B/C (supporting evidence) |
4.4 Write Output
Write the profile to `docs/{project-name}.md` using the Write tool.
Phase 5: Quality Gate
Read `references/quality-checklist.md` and verify the output.
5.1 Banned Language Scan
Search the written file for any word from the banned list:
English:
well-designed, elegant, elegantly, robust, clean, impressive,
state-of-the-art, cutting-edge, best-in-class, beautifully,
carefully crafted, thoughtfully, well-thought-out, well-architected,
nicely, cleverly, sophisticated, powerful, seamless, seamlessly,
intuitive, intuitively
Chinese:
優雅、完美、強大、直觀、無縫、精心、巧妙、出色、卓越、先進、高效、靈活、穩健、簡潔
If found → replace with verifiable descriptions and re-write.
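The scan can be sketched as a word-boundary search (the word lists are the ones above; multi-word English phrases and Chinese words are checked as written, and the function name is hypothetical):

```python
import re

BANNED_EN = [
    "well-designed", "elegant", "elegantly", "robust", "clean", "impressive",
    "state-of-the-art", "cutting-edge", "best-in-class", "beautifully",
    "carefully crafted", "thoughtfully", "well-thought-out", "well-architected",
    "nicely", "cleverly", "sophisticated", "powerful", "seamless", "seamlessly",
    "intuitive", "intuitively",
]
BANNED_ZH = ["優雅", "完美", "強大", "直觀", "無縫", "精心", "巧妙",
             "出色", "卓越", "先進", "高效", "靈活", "穩健", "簡潔"]

def banned_hits(text: str) -> list[str]:
    """Return banned words found in text (English whole-word, Chinese substring)."""
    hits = []
    lowered = text.lower()
    for word in BANNED_EN:
        # Guard both sides so "clean" does not match inside "cleanup"
        if re.search(r"(?<![\w-])" + re.escape(word) + r"(?![\w-])", lowered):
            hits.append(word)
    for word in BANNED_ZH:
        if word in text:
            hits.append(word)
    return hits
```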
5.2 Number Audit
Scan for all numeric claims. Each must have a traceable source.
Remove or fix any "approximately", "around", "roughly", "several", "many", "numerous".
5.3 Structure Verification
- Every `##` section starts with a blockquote summary (`>`)
- No directory tree duplicated from CODEBASE_MAP.md
- No file extension enumeration (use percentages)
- No generic concluding paragraph
- At least one Mermaid diagram in Architecture section
- Structured module dependency list below each Mermaid diagram
- All Mermaid nodes reference actual modules
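The first check above (every section opens with a blockquote summary) can be mechanized; a sketch over the written markdown, with a hypothetical helper name:

```python
def sections_missing_summary(markdown: str) -> list[str]:
    """Return ## section titles whose first non-blank line is not a > blockquote."""
    lines = markdown.splitlines()
    missing = []
    for i, line in enumerate(lines):
        if line.startswith("## "):
            # Find the first non-blank line after the heading
            for nxt in lines[i + 1:]:
                if nxt.strip():
                    if not nxt.lstrip().startswith(">"):
                        missing.append(line[3:].strip())
                    break
    return missing
```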
5.4 Core Question Test
For each of the 4 core questions, locate the specific answer in the output:
- Core abstractions → Section 3
- Module to modify → Section 2 Layer Boundaries table
- Biggest risk → Section 9 first recommendation
- When to use/not use → Section 1 positioning line
5.5 Evidence Audit
- Section 3: every abstraction has a file:SymbolName reference
- Section 8: every decision has file:SymbolName + alternative + tradeoff
- Section 8.5: code quality patterns have framework names + coverage facts
- Section 9: every recommendation has file_path + specific problem + concrete fix
If any check fails → fix the issue in the file and re-verify.
Output
After all phases complete, report to the user:
Profile generated: docs/{project-name}.md
- {total_files} files scanned ({total_tokens} tokens)
- {N} core abstractions identified
- {N} design decisions documented
- {N} recommendations
- Conditional sections: {list of included sections or "none"}