Cartographer

Map and document codebases of any size using parallel AI subagents. Creates `docs/CODEBASE_MAP.md` with architecture diagrams, file purposes, dependencies, and navigation guides. Updates `CLAUDE.md` with a summary.

Triggers

Activate when user says: "map this codebase", "cartographer", "/cartographer", "create codebase map", "document the architecture", "understand this codebase", or when onboarding to a new project.

Critical Principle

"Opus orchestrates, Sonnet reads."
Never have Opus read codebase files directly. Always delegate file reading to Sonnet subagents—even for small codebases. Opus plans the work, spawns subagents, and synthesizes their reports.

Process

1. Check for Existing Map

First check if `docs/CODEBASE_MAP.md` already exists.

If map exists:
1. Read the `last_mapped` timestamp from the map's frontmatter
2. Check for changes since last map:
   - Run `git log --oneline --since="<last_mapped>"` if git available
   - If no git, run scanner and compare file counts/paths
3. If significant changes detected, proceed to update mode
4. If no changes, inform user the map is current

If map does not exist: Proceed to full mapping.
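
The check above can be sketched as follows. This is a minimal illustration, assuming the frontmatter is a plain `key: value` block delimited by `---` lines; the helper names `read_last_mapped` and `commits_since` are hypothetical and not part of Cartographer.

```python
import subprocess
from typing import Optional


def read_last_mapped(map_text: str) -> Optional[str]:
    """Return the last_mapped value from a leading '---' frontmatter block, if any."""
    lines = map_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "last_mapped":
            return value.strip()
    return None


def commits_since(timestamp: str, repo: str = ".") -> Optional[list[str]]:
    """One-line commit summaries newer than timestamp, or None if git is unusable."""
    try:
        result = subprocess.run(
            ["git", "log", "--oneline", f"--since={timestamp}"],
            cwd=repo, capture_output=True, text=True,
        )
    except FileNotFoundError:
        return None  # git is not installed: fall back to the scanner comparison
    if result.returncode != 0:
        return None  # not a git repository, or git reported an error
    return result.stdout.splitlines()
```

A `None` result from `commits_since` signals the no-git path, while an empty list means the map is current.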

2. Scan the Codebase

Run the scanner script to get an overview:

```bash
# Option 1: If uv is available (preferred)
uv run ~/.claude/skills/cartographer/scripts/scan-codebase.py . --format json

# Option 2: Direct execution
~/.claude/skills/cartographer/scripts/scan-codebase.py . --format json

# Option 3: Explicit python3
python3 ~/.claude/skills/cartographer/scripts/scan-codebase.py . --format json
```

**Install tiktoken if missing:**
```bash
pip install tiktoken
# or with uv:
uv pip install tiktoken
```

The output provides:
- Complete file tree with token counts per file
- Total token budget needed
- Skipped files (binary, too large)

3. Plan Subagent Assignments

Analyze the scan output to divide work among subagents.
Token budget per subagent: ~150,000 tokens (safe margin under Sonnet's 200k context limit)
Grouping strategy:
  1. Group files by directory/module (keeps related code together)
  2. Balance token counts across groups
  3. Aim for more subagents with smaller chunks (150k max each)
For small codebases (<100k tokens): Still use a single Sonnet subagent. Opus orchestrates, Sonnet reads—never have Opus read the codebase directly.
Example assignment:
Subagent 1: src/api/, src/middleware/ (~120k tokens)
Subagent 2: src/components/, src/hooks/ (~140k tokens)
Subagent 3: src/lib/, src/utils/ (~100k tokens)
Subagent 4: tests/, docs/ (~80k tokens)
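
One way to approximate the grouping strategy is a first-fit-decreasing pass over per-directory token totals. The sketch below assumes the scan has already been reduced to a `{path: token_count}` mapping (the scanner's actual JSON schema is not shown here), and `plan_groups` is an illustrative name rather than part of the tool.

```python
from collections import defaultdict
from pathlib import PurePosixPath

BUDGET = 150_000  # per-subagent token budget from the text above


def plan_groups(file_tokens: dict[str, int], budget: int = BUDGET) -> list[dict]:
    """Pack directories into groups whose token totals stay within budget."""
    # 1. Sum tokens per parent directory so related files stay together.
    per_dir: dict[str, int] = defaultdict(int)
    for path, tokens in file_tokens.items():
        per_dir[str(PurePosixPath(path).parent)] += tokens

    # 2. First-fit decreasing: place the largest directories first.
    groups: list[dict] = []
    for directory, tokens in sorted(per_dir.items(), key=lambda kv: -kv[1]):
        for group in groups:
            if group["tokens"] + tokens <= budget:
                group["dirs"].append(directory)
                group["tokens"] += tokens
                break
        else:
            # No existing group has room, so start a new one.
            groups.append({"dirs": [directory], "tokens": tokens})
    return groups
```

A directory that exceeds the budget on its own still ends up in a single oversized group here; in practice it would need to be split by file before assignment.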

4. Spawn Sonnet Subagents in Parallel

Use the Task tool with `subagent_type: "Explore"` and `model: "sonnet"` for each group.
CRITICAL: Spawn all subagents in a SINGLE message with multiple Task tool calls.
Each subagent prompt should:
  1. List the specific files/directories to read
  2. Request analysis of:
    • Purpose of each file/module
    • Key exports and public APIs
    • Dependencies (what it imports)
    • Dependents (what imports it, if discoverable)
    • Patterns and conventions used
    • Gotchas or non-obvious behavior
  3. Request output as structured markdown
Example subagent prompt:
You are mapping part of a codebase. Read and analyze these files:
- src/api/routes.ts
- src/api/middleware/auth.ts
- src/api/middleware/rateLimit.ts
[... list all files in this group]

For each file, document:
1. **Purpose**: One-line description
2. **Exports**: Key functions, classes, types exported
3. **Imports**: Notable dependencies
4. **Patterns**: Design patterns or conventions used
5. **Gotchas**: Non-obvious behavior, edge cases, warnings

Also identify:
- How these files connect to each other
- Entry points and data flow
- Any configuration or environment dependencies

Return your analysis as markdown with clear headers per file/module.
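
If the prompts are generated programmatically, the example prompt can be treated as a template with the file list filled in per group. This is only a sketch; `build_subagent_prompt` is a hypothetical helper, and the template text is copied from the example above.

```python
PROMPT_TEMPLATE = """You are mapping part of a codebase. Read and analyze these files:
{file_list}

For each file, document:
1. **Purpose**: One-line description
2. **Exports**: Key functions, classes, types exported
3. **Imports**: Notable dependencies
4. **Patterns**: Design patterns or conventions used
5. **Gotchas**: Non-obvious behavior, edge cases, warnings

Also identify:
- How these files connect to each other
- Entry points and data flow
- Any configuration or environment dependencies

Return your analysis as markdown with clear headers per file/module."""


def build_subagent_prompt(files: list[str]) -> str:
    """Fill the template with one bullet per file assigned to this subagent."""
    file_list = "\n".join(f"- {path}" for path in files)
    return PROMPT_TEMPLATE.format(file_list=file_list)
```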

5. Synthesize Reports

Once all subagents complete, synthesize their outputs:
  1. Merge all subagent reports
  2. Deduplicate any overlapping analysis
  3. Identify cross-cutting concerns (shared patterns, common gotchas)
  4. Build the architecture diagram showing module relationships
  5. Extract key navigation paths for common tasks

Diagram Rendering

For architecture diagrams, invoke `/beautiful-mermaid` to render Mermaid as production-quality SVG/PNG.

6. Write CODEBASE_MAP.md

Create `docs/CODEBASE_MAP.md` with this structure:

---
last_mapped: YYYY-MM-DDTHH:MM:SSZ
total_files: N
total_tokens: N
---

Codebase Map

Auto-generated by Cartographer. Last mapped: [date]

System Overview

[2-3 paragraph summary of what this codebase does]

Architecture

```mermaid
graph TB
    subgraph Client
        Web[Web App]
    end
    subgraph API
        Server[API Server]
        Auth[Auth Middleware]
    end
    subgraph Data
        DB[(Database)]
        Cache[(Cache)]
    end
    Web --> Server
    Server --> Auth
    Server --> DB
    Server --> Cache
```

[Adapt diagram to match actual architecture]

Directory Structure

[Tree with purpose annotations]

Module Guide

[Module Name]

Purpose: [description]

Entry point: [file]

Key files:

| File | Purpose | Tokens |
|------|---------|--------|

Exports: [key APIs]

Dependencies: [what it needs]

Dependents: [what needs it]

[Repeat for each module]

Data Flow

```mermaid
sequenceDiagram
    participant User
    participant Web
    participant API
    participant DB

    User->>Web: Action
    Web->>API: Request
    API->>DB: Query
    DB-->>API: Result
    API-->>Web: Response
    Web-->>User: Update UI
```

[Create diagrams for: auth flow, main data operations, etc.]

Conventions

[Naming patterns, code style, architectural rules]

Gotchas

[Non-obvious behaviors, warnings, things that trip people up]

Navigation Guide

- To add a new API endpoint: [files to touch]
- To add a new component: [files to touch]
- To modify auth: [files to touch]
- To add a database migration: [files to touch]
- [etc. based on codebase type]

7. Update CLAUDE.md

Add or update the codebase summary in CLAUDE.md:

Codebase Overview

[2-3 sentence summary]

Stack: [key technologies]

Structure: [high-level layout]

For detailed architecture, see docs/CODEBASE_MAP.md.

If `AGENTS.md` exists, update it similarly.

Update Mode

When updating an existing map:
  1. Identify changed files from git or scanner diff
  2. Spawn subagents only for changed modules
  3. Merge new analysis with existing map
  4. Update `last_mapped` timestamp
  5. Preserve unchanged sections
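
Step 2 depends on knowing which modules a set of changed files touches. A minimal sketch, assuming modules are identified by directory prefixes such as `src/api/` and that the changed-file list has already been obtained from git or a scanner diff; `changed_modules` is an illustrative name.

```python
def changed_modules(changed_files: list[str], module_dirs: list[str]) -> set[str]:
    """Return the module directories that contain at least one changed file."""
    touched: set[str] = set()
    for path in changed_files:
        for module in module_dirs:
            # Normalize to a trailing slash so "src/api" does not match "src/api2/x".
            prefix = module.rstrip("/") + "/"
            if path.startswith(prefix):
                touched.add(module)
    return touched
```

Files outside every listed module (a top-level `README.md`, for instance) are ignored here, and nested modules are all reported when a file falls under more than one prefix.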

Token Budget Reference

| Model | Context Window | Safe Budget per Subagent |
|-------|----------------|--------------------------|
| Sonnet | 200,000 | 150,000 |
| Opus | 200,000 | 100,000 |
| Haiku | 200,000 | 100,000 |

Always use Sonnet subagents—best balance of capability and cost for file analysis.

Troubleshooting

**Scanner fails with tiktoken error:**
```bash
pip install tiktoken
# or with uv:
uv pip install tiktoken
```

**Python not found:**
Try `python3`, `python`, or use `uv run` which handles Python automatically.

**Codebase too large even for subagents:**
- Increase number of subagents
- Focus on src/ directories, skip vendored code
- Use `--max-tokens` flag to skip huge files

**Git not available:**
- Fall back to file count/path comparison
- Store file list hash in map frontmatter for change detection
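
The file-list hash suggested for the no-git fallback could be computed along these lines. This is a sketch under assumptions: `file_list_hash` is a hypothetical helper, and the frontmatter key under which the digest would be stored is not specified by the source.

```python
import hashlib


def file_list_hash(paths: list[str]) -> str:
    """Hash the sorted path list so added, removed, or renamed files change the digest."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")  # separator so ["ab", "c"] differs from ["a", "bc"]
    return digest.hexdigest()
```

Comparing the stored digest with a freshly computed one detects changes to the set of paths only; edits inside an existing file would go unnoticed unless sizes or token counts are folded into the hash as well.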

Output

After completion, report what was created:
- `docs/CODEBASE_MAP.md` - full architecture documentation
- Updated `CLAUDE.md` with summary

If cartographer helped you, consider starring: https://github.com/kingbootoshi/cartographer