# project-profiler

Generate an LLM-optimized project profile — a judgment-rich document that lets any future LLM answer within 60 seconds:
  1. What are the core abstractions?
  2. Which modules to modify for feature X?
  3. What is the biggest risk/debt?
  4. When should / shouldn't you use this?
This is NOT a codebase map (directory + module navigation) or a diff schematic. This is architectural judgment: design tradeoffs, usage patterns, and when NOT to use.


## Model Strategy

  • Opus: Orchestrator — runs all phases, writes the final profile. Does NOT read source code directly (except in direct mode).
  • Sonnet: Subagents — read source code files, analyze patterns, report structured findings.
  • All subagents launch in a single message (parallel, never sequential).


## Phase 0: Preflight

### 0.1 Target & Project Name

Determine the target directory (use the argument if provided, else `.`).
Extract the project name from the first available source:
  1. `package.json` → `name`
  2. `pyproject.toml` → `[project] name`
  3. `Cargo.toml` → `[package] name`
  4. `go.mod` → module path (last segment)
  5. Directory name as fallback
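The extraction order above can be sketched as a small helper. This is an illustrative function (`project_name` is not part of scan-project.py), and the TOML handling is a minimal line scan rather than a real parser:

```python
import json

def project_name(files: dict[str, str], dir_name: str) -> str:
    """Pick the project name from the first available manifest.

    `files` maps manifest filename -> raw content; `dir_name` is the
    target directory's basename, used as the final fallback.
    """
    if "package.json" in files:
        name = json.loads(files["package.json"]).get("name")
        if name:
            return name
    # pyproject.toml [project] name, then Cargo.toml [package] name.
    # A real implementation would parse with tomllib; a line scan suffices here.
    for manifest in ("pyproject.toml", "Cargo.toml"):
        for line in files.get(manifest, "").splitlines():
            if line.strip().startswith("name"):
                return line.partition("=")[2].strip().strip('"')
    if "go.mod" in files:
        first = files["go.mod"].splitlines()[0]  # e.g. "module github.com/org/repo"
        if first.startswith("module "):
            path = first.removeprefix("module ").strip()
            return path.split("/")[-1]  # last segment of the module path
    return dir_name
```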

### 0.2 Run Scanner

```bash
uv run {SKILL_DIR}/scripts/scan-project.py {TARGET_DIR} --format summary
```
Capture the summary output. This provides:
  • Project metadata (name, version, license, deps count)
  • Tech stack (languages, frameworks, package manager)
  • Language distribution (top 5 by tokens)
  • Entry points (CLI, API, library)
  • Project features (dockerfile, CI, tests, codebase_map)
  • Detected conditional sections (Storage, Embedding, Infrastructure, etc.)
  • Workspaces (monorepo packages, if any)
  • Top 20 largest files
  • Directory structure (depth 3)
For debugging or when full file details are needed, use `--format json` instead.

### 0.3 Git Metadata

Run these commands (use the Bash tool):

**Recent commits**

```bash
git -C {TARGET_DIR} log --oneline -20
```

**Contributors**

```bash
git -C {TARGET_DIR} log --format="%aN" | sort -u | head -20
```

**Version tags**

```bash
git -C {TARGET_DIR} tag --sort=-v:refname | head -5
```

**First commit date**

```bash
git -C {TARGET_DIR} log --format="%aI" --reverse | head -1
```

### 0.4 Check Existing CODEBASE_MAP

If `docs/CODEBASE_MAP.md` exists, note its presence. The profile will reference it rather than duplicating its directory structure.

### 0.5 Token Budget → Execution Mode

Based on `total_tokens` from the scanner, choose an execution mode:

| Total Tokens | Mode | Strategy |
|---|---|---|
| ≤ 80k | Direct | Skip subagents. Opus reads all files directly and performs all analysis in a single context. |
| 80k – 200k | 2 agents | Agent AB (Core + Architecture + Design), Agent C (Usage + Patterns + Deployment) |
| 200k – 400k | 3 agents | Agent A (Core + Design), Agent B (Architecture + Patterns), Agent C (Usage + Deployment) |
| > 400k | 3 agents | Agent A, Agent B, Agent C, each ≤150k tokens, with overflow files assigned to the lightest agent |

Why the 80k threshold: Opus has a 200k context. At ≤80k source tokens, loading all files + scanner output + git metadata + writing the profile all fit comfortably. Subagent overhead (spawn + communication + wait) adds 2-3 minutes for zero benefit.

Direct mode workflow: Skip Phase 2 entirely. After Phases 0+1, proceed to Phase 3 (read the scanner's `detected_sections` directly), then Phase 4, then Phase 5. Read files on demand during synthesis — do NOT pre-read all files; read only what's needed for each section.

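The thresholds above reduce to a small lookup. A minimal sketch, assuming the table's boundaries; the function name and mode labels are illustrative:

```python
def execution_mode(total_tokens: int) -> str:
    """Map the scanner's total_tokens to an execution mode per the table above."""
    if total_tokens <= 80_000:
        return "direct"      # Opus reads everything itself; no subagent overhead
    if total_tokens <= 200_000:
        return "2-agents"    # Agent AB + Agent C
    return "3-agents"        # Agent A/B/C; above 400k each agent is also capped at 150k
```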

## Phase 1: Community & External Data

Run in parallel with Phase 2 subagent launches (or with Phase 3 in direct mode).

### 1.1 GitHub Stats

Parse owner/repo from the `.git/config` remote origin URL:

```bash
git -C {TARGET_DIR} remote get-url origin
```

Extract `owner/repo` from the URL. Then:

```bash
gh api repos/{owner}/{repo} --jq '{stars: .stargazers_count, forks: .forks_count, open_issues: .open_issues_count}'
```

If `gh` is unavailable or this is not a GitHub repo → fill with `N/A`. Do not fail.

### 1.2 Package Downloads

npm (if `package.json` exists):

```
WebFetch https://api.npmjs.org/downloads/point/last-month/{package_name}
```

PyPI (if `pyproject.toml` exists):

```
WebFetch https://pypistats.org/api/packages/{package_name}/recent
```

If the fetch fails → fill with `N/A`.

### 1.3 License

Read from (in order): LICENSE file → package metadata field → `N/A`.

### 1.4 Maturity Assessment

Calculate from:
  • Git history length: first commit date → now
  • Release count: number of version tags
  • Contributor count: unique authors

| Criteria | Score |
|---|---|
| < 3 months, < 3 releases, 1-2 contributors | experimental |
| 3-12 months, 3-10 releases, 2-5 contributors | growing |
| 1-3 years, 10-50 releases, 5-20 contributors | stable |
| > 3 years, > 50 releases, > 20 contributors | mature |

Use the lowest matching tier (conservative estimate).

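A minimal sketch of the conservative scoring, assuming the lower-bound thresholds from the table; the function name and exact boundary handling (≥ vs >) are illustrative:

```python
def maturity(months: float, releases: int, contributors: int) -> str:
    """Return the maturity tier; a project must clear ALL three thresholds
    of a tier, so a single weak dimension pulls it down (lowest matching tier)."""
    tiers = [
        ("mature", 36, 50, 20),   # > 3 years, > 50 releases, > 20 contributors
        ("stable", 12, 10, 5),    # 1-3 years, 10-50 releases, 5-20 contributors
        ("growing", 3, 3, 2),     # 3-12 months, 3-10 releases, 2-5 contributors
    ]
    for name, min_months, min_releases, min_contributors in tiers:
        if (months >= min_months and releases >= min_releases
                and contributors >= min_contributors):
            return name
    return "experimental"
```

Note the conservative behavior: a 5-year-old project with a single contributor still scores `experimental`, because it never clears all three thresholds of any higher tier.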

## Phase 2: Parallel Deep Exploration

Direct mode (≤80k tokens): SKIP this entire phase. Proceed to Phase 3; Opus reads files directly during synthesis.
Launch Sonnet subagents using the `Task` tool. All subagents must be launched in a single message.
Assign files to each agent based on the token budget from Phase 0.5, using the scanner output to determine which files go to which agent.

### File Assignment Strategy

If workspaces detected (monorepo):
  1. Group files by workspace package
  2. Assign complete packages to agents (never split a package across agents)
  3. Agent A gets packages with core business logic
  4. Agent B gets packages with infrastructure/shared libraries
  5. Agent C gets packages with CLI/API/SDK surface + docs
If no workspaces (single project):
  1. Sort all files by path
  2. Group by top-level directory
  3. Assign groups to agents based on their responsibility:
    • Agent A gets: core source files (src/lib, core/, models/, types/) + README, CHANGELOG
    • Agent B gets: architecture files (routes/, middleware/, config/, entry points) + tests/
    • Agent C gets: integration files (API, CLI, SDK, examples/, docs/) + .github/
  4. If files don't fit neatly, distribute the remainder to whichever agents are still under their token budget
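For the single-project case, steps 1-4 can be sketched as a greedy pass. The `assign_groups` helper is hypothetical; the responsibility-based routing (src/ vs routes/ vs docs/) is omitted here, and every group simply goes to the least-loaded agent still under budget:

```python
from collections import defaultdict

def assign_groups(file_tokens: dict[str, int],
                  budgets: dict[str, int]) -> dict[str, list[str]]:
    """Group files by top-level directory, then assign whole groups to agents."""
    group_tokens: dict[str, int] = defaultdict(int)
    group_files: dict[str, list[str]] = defaultdict(list)
    for path in sorted(file_tokens):                 # step 1: sort by path
        top = path.split("/", 1)[0]                  # step 2: group by top-level dir
        group_tokens[top] += file_tokens[path]
        group_files[top].append(path)
    load = {agent: 0 for agent in budgets}
    assigned: dict[str, list[str]] = {agent: [] for agent in budgets}
    # largest groups first; a group is never split across agents
    for top in sorted(group_tokens, key=group_tokens.get, reverse=True):
        under = [a for a in load if load[a] + group_tokens[top] <= budgets[a]]
        agent = min(under or load, key=load.get)     # step 4: lightest agent under budget
        load[agent] += group_tokens[top]
        assigned[agent].extend(group_files[top])
    return assigned
```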

### Agent A: Core Abstractions + Design Decisions

Task prompt for Agent A — subagent_type: "general-purpose", model: "sonnet"

#### Mission

Identify the most architecturally significant abstractions AND key design decisions in this codebase.

#### Files to Read

{LIST_OF_ASSIGNED_FILES} Also read: README.md, CHANGELOG.md (if they exist and not already assigned)

#### Output Format

##### Part 1: Core Abstractions

Report the TOP 10-15 most architecturally significant abstractions, ranked by fan-in (how many other files reference them). If the project has fewer than 15 meaningful abstractions, report all.
For EACH abstraction:

**{Name}**

  • Purpose: {≤15 words}
  • Defined in: `{file_path}:ClassName` or `{file_path}:function_name`
  • Type: {class / interface / type / trait / struct / protocol}
  • Public methods/fields: {exact_count}
  • Adapters/implementations: {count} — {names with file paths}
  • Imported by: {count} files
  • Key pattern: {factory / singleton / strategy / observer / none}

##### Part 2: Design Decisions

For EACH decision (identify 3-5):

**{Decision Title}**

  • Problem: {what needed solving}
  • Choice made: {what was chosen}
  • Evidence: `{file_path}:ClassName` or `{file_path}:function_name` — {relevant code pattern}
  • Alternatives NOT chosen: {what else could have been done}
  • Why not: {concrete reason — performance / complexity / ecosystem / team preference}
  • Tradeoff: {what is gained} vs. {what is lost}

##### Part 3: Architecture Risks

For EACH risk (identify 2-4):
  • Risk: {specific description}
  • Location: `{file_path}:SymbolName`
  • Impact: {what breaks if this goes wrong}
  • Mitigation: {how to fix or reduce risk}

##### Part 4: Recommendations

For EACH recommendation (identify 2-4):
  • Current state: `{file_path}` — {what exists now}
  • Problem: {specific issue — not "could be better"}
  • Fix: {concrete action — not "consider refactoring"}
  • Effect: {measurable outcome}

#### Rules

  • Every number must come from actual code (count imports, count methods)
  • No subjective language (no "well-designed", "elegant", "robust", "clean", "優雅", "完美", "強大")
  • Every claim needs a `file:SymbolName` reference (NOT line numbers — they break on the next commit)
  • Each decision must have a "why NOT the alternative" answer
  • Report the TOTAL count of abstractions found

### Agent B: Architecture + Code Quality Patterns

Task prompt for Agent B — subagent_type: "general-purpose", model: "sonnet"

#### Mission

Map the system topology, layer boundaries, data flow paths, AND code quality patterns.

#### Files to Read

{LIST_OF_ASSIGNED_FILES}

#### Output Format

##### Part 1: Topology

  • Architecture style: {monolith / microservices / serverless / library / CLI tool / plugin system}
  • Entry points: {list with file paths}
  • Layer count: {N}

##### Part 2: Layers (table)

| Layer | Modules | Files | Responsibility |
|---|---|---|---|

##### Part 3: Data Flow Paths

For each major user-facing operation:
  1. {Operation name}: {step1_module} → {step2_module} → ... → {result}
    • Evidence: `{file:SymbolName}` for each step

##### Part 4: Mermaid Diagram Elements

Provide raw data for Mermaid diagrams:
  • Nodes: {module_name} — {file_path}
  • Edges: {from} → {to} — {relationship_type: imports/calls/extends}

##### Part 5: Module Dependencies (structured)

For each module:
  • {module_name} (`{path}`): imports [{dep1}, {dep2}, ...]

##### Part 6: Boundary Violations

List any cases where a lower layer imports from a higher layer.

##### Part 7: Code Quality Patterns

  • Error handling: {strategy and consistency — e.g., "try/catch at controller layer, custom AppError class"}
  • Logging: {framework and coverage — e.g., "winston, structured JSON, covers all API routes"}
  • Testing: {framework, coverage level, patterns — e.g., "vitest, 47 test files, unit + integration"}
  • Type safety: {strict / partial / none — e.g., "strict TypeScript with no `any` casts"}

#### Rules

  • Every number must come from actual code
  • No subjective language (no "well-designed", "elegant", "robust", "clean", "優雅", "完美", "強大")
  • Every claim needs a `file:SymbolName` reference (NOT line numbers)
  • Focus on HOW data moves, not WHAT the code does

### Agent C: Usage + Deployment + Security

Task prompt for Agent C — subagent_type: "general-purpose", model: "sonnet"

#### Mission

Document all consumption interfaces, deployment modes, security surface, and AI agent integration points.

#### Files to Read

{LIST_OF_ASSIGNED_FILES}

#### Output Format

##### Part 1: Consumption Interfaces

For each interface found:
  • Type: {Python SDK / TS SDK / REST API / MCP / CLI / Vercel AI SDK / Library import}
  • Entry point: `{file_path}:ClassName` or `{file_path}:function_name`
  • Public surface: {N} exported functions/classes/endpoints
  • Example usage: {minimal code snippet from docs/examples or inferred from exports}

##### Part 2: Configuration

| Source | Path | Key Settings |
|---|---|---|

##### Part 3: Deployment Modes

| Mode | Evidence | Prerequisites |
|---|---|---|

##### Part 4: AI Agent Integration

  • MCP tools: {count and names, if any}
  • Function calling schemas: {count, if any}
  • Tool definitions: {count, if any}
  • SDK integration: {Vercel AI SDK / LangChain / LlamaIndex / custom}

##### Part 5: Security Surface

  • API key handling: {how and where}
  • Auth mechanism: {type and file}
  • CORS config: {if applicable}
  • Data at rest: {encrypted / plaintext / N/A}
  • PII handling: {anonymized / logged / none detected}

##### Part 6: Performance & Cost Indicators

| Metric | Value | Source |
|---|---|---|
| {LLM calls per request} | {N} | `{file:SymbolName}` |
| {Cache strategy} | {type} | `{file:SymbolName}` |
| {Rate limiting} | {config} | `{file:SymbolName}` |

#### Rules

  • Every number must come from actual code
  • No subjective language (no "well-designed", "elegant", "robust", "clean", "優雅", "完美", "強大")
  • Every claim needs a `file:SymbolName` reference (NOT line numbers)
  • Include BOTH documented and undocumented interfaces

---

## Phase 3: Conditional Section Detection

Read the scanner's `detected_sections` output from Phase 0.2. This is the primary detection source — the scanner checks dependency manifests and file presence automatically.
Cross-reference with subagent reports (skip in direct mode) for additional evidence richness. If a subagent reports a pattern not caught by the scanner (e.g., concurrency via raw `Promise.all` without a library dependency), add it.
Refer to `references/section-detection-rules.md` for the full pattern reference.
Record results as a checklist:
- [x] Storage Layer — scanner detected: prisma in dependencies
- [ ] Embedding Pipeline — not detected
- [x] Infrastructure Layer — scanner detected: Dockerfile present
- [ ] Knowledge Graph — not detected
- [ ] Scalability — not detected
- [x] Concurrency — Agent B reported: Promise.all pattern in src/worker.ts
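The cross-referencing step can be sketched as a dictionary union. `merge_detections` is a hypothetical helper; both inputs map section name → evidence string:

```python
def merge_detections(scanner: dict[str, str], agents: dict[str, str]) -> dict[str, str]:
    """Union of scanner-detected sections and subagent-reported patterns.

    The scanner is the primary source, so its evidence wins on overlap;
    agent-only findings (e.g. a raw Promise.all pattern) are added on top.
    """
    merged = dict(agents)      # start from agent-reported patterns
    merged.update(scanner)     # scanner evidence overrides on conflict
    return merged
```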


## Phase 4: Synthesis & Draft

### 4.1 Merge Reports

Subagent mode: Combine all subagent outputs into a working document.
Direct mode: Read key files on demand as you write each section. Do NOT pre-read all files; for each section, read only the files relevant to that section's analysis.
Cross-validate:
  • Core abstractions ↔ Architecture layers: each abstraction belongs to a layer
  • Architecture data flow ↔ Usage interfaces: flows end at documented interfaces
  • Design decisions ↔ Code evidence: decisions are backed by found patterns

### 4.2 Generate Mermaid Diagrams + Structured Dependencies

Using Agent B's raw data (or direct file analysis in direct mode), create:

Architecture Topology (`graph TB`):
  • Each node = actual module/directory
  • Each edge = import/dependency relationship
  • Label edges with the relationship type
  • Group nodes by layer using `subgraph`

Data Flow (`sequenceDiagram`):
  • Each participant = actual module
  • Each arrow = actual function call or event
  • Cover the primary user-facing operation

Structured Module Dependencies (text, below each Mermaid diagram):
  • Provide a machine-parseable dependency list as a fallback for LLM readers
  • Format: `- **{module_name}** ({path}): imports [{dep1}, {dep2}, ...]`

### 4.3 Fill Output Template

Follow `references/output-template.md` exactly. Fill each section:

| Section | Primary Source | Secondary Source |
|---|---|---|
| 1. Project Identity | Scanner metadata + Phase 1 | Git metadata |
| 2. Architecture | Agent B (Parts 1-6) | Agent A (abstractions per layer) |
| 3. Core Abstractions | Agent A (Part 1) | Agent B (layer context) |
| 4. Conditional | Phase 3 detection + relevant agents | |
| 5. Usage Guide | Agent C (Parts 1-4) | Scanner entry_points |
| 6. Performance & Cost | Agent C (Part 6) + Agent B | |
| 7. Security & Privacy | Agent C (Part 5) | |
| 8. Design Decisions | Agent A (Part 2) | Agent B (architecture context) |
| 8.5 Code Quality & Patterns | Agent B (Part 7) | Agent A (supporting observations) |
| 9. Recommendations | Agent A (Part 4) | Agents B/C (supporting evidence) |

### 4.4 Write Output

Write the profile to `docs/{project-name}.md` using the Write tool.

## Phase 5: Quality Gate

Read `references/quality-checklist.md` and verify the output.

### 5.1 Banned Language Scan

Search the written file for any word from the banned list:

English:

```
well-designed, elegant, elegantly, robust, clean, impressive,
state-of-the-art, cutting-edge, best-in-class, beautifully,
carefully crafted, thoughtfully, well-thought-out, well-architected,
nicely, cleverly, sophisticated, powerful, seamless, seamlessly,
intuitive, intuitively
```

Chinese:

```
優雅、完美、強大、直觀、無縫、精心、巧妙、出色、卓越、先進、高效、靈活、穩健、簡潔
```

If found → replace with verifiable descriptions and re-write.
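The scan itself is a plain substring check. A minimal sketch with a hypothetical `banned_hits` helper and an abbreviated word list (a real pass would use the full lists above, and a substring match will also flag words like "cleanup"):

```python
BANNED = ["well-designed", "elegant", "robust", "clean", "seamless",
          "優雅", "完美", "強大"]  # abbreviated; use the full lists above

def banned_hits(text: str) -> list[str]:
    """Return every banned word found in the profile text (case-insensitive)."""
    low = text.lower()
    return [w for w in BANNED if w.lower() in low]
```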

### 5.2 Number Audit

Scan for all numeric claims. Each must have a traceable source. Remove or fix any "approximately", "around", "roughly", "several", "many", "numerous".

### 5.3 Structure Verification

  • Every `##` section starts with a `>` blockquote summary
  • No directory tree duplicated from CODEBASE_MAP.md
  • No file-extension enumeration (use percentages)
  • No generic concluding paragraph
  • At least one Mermaid diagram in the Architecture section
  • A structured module dependency list below each Mermaid diagram
  • All Mermaid nodes reference actual modules

### 5.4 Core Question Test

For each of the 4 core questions, locate the specific answer in the output:
  1. Core abstractions → Section 3
  2. Module to modify → Section 2 Layer Boundaries table
  3. Biggest risk → Section 9 first recommendation
  4. When to use/not use → Section 1 positioning line

### 5.5 Evidence Audit

  • Section 3: every abstraction has a `file:SymbolName` reference
  • Section 8: every decision has `file:SymbolName` + alternative + tradeoff
  • Section 8.5: code quality patterns have framework names + coverage facts
  • Section 9: every recommendation has `file_path` + specific problem + concrete fix

If any check fails → fix the issue in the file and re-verify.


## Output

After all phases complete, report to the user:

```
Profile generated: docs/{project-name}.md
- {total_files} files scanned ({total_tokens} tokens)
- {N} core abstractions identified
- {N} design decisions documented
- {N} recommendations
- Conditional sections: {list of included sections or "none"}
```