architecture-md

ARCHITECTURE.md Generator

Generate high-quality ARCHITECTURE.md files that give newcomers a mental map of a codebase. Based on matklad's article: the biggest contributor bottleneck is not writing code, it's figuring out where to change it. ARCHITECTURE.md bridges that gap.

Core Principles

  1. Short and stable -- Only describe things unlikely to change frequently. Don't try to keep it synchronized with the code; instead, revisit it a couple of times a year.
  2. Bird's eye first -- Start with the problem being solved, not the solution.
  3. Codemap over prose -- Answer "where's the thing that does X?" and "what does this thing do?" for every module.
  4. Name, don't link -- Name important files, modules, types. Don't hyperlink (links go stale). Encourage symbol search.
  5. Invariants are gold -- Explicitly call out what's deliberately absent. Important invariants are often expressed as absence, and are impossible to divine from reading code.
  6. Mark boundaries -- API boundaries between layers constrain all possible implementations behind them. Finding a boundary by randomly reading code is hard.
  7. Cross-cutting concerns last -- After the codemap, address things that are everywhere and nowhere (error handling, testing, config).

Workflow

Step 1: Explore the Codebase

Use `tree`, glob, and read tools to understand the project:
  • Read README, package.json/Cargo.toml/pyproject.toml for the project's purpose
  • Run `tree -L 2 -d` (or similar) to see directory structure
  • Identify entry points (main files, index files, bin directories)
  • Read key files at module boundaries to understand the layers
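
Where `tree` is not installed, the directory survey above can be approximated in a few lines of Python. This is a minimal sketch, not part of the original workflow: the helper names `list_directories` and `find_manifests` and the `IGNORED` set are hypothetical, and the manifest list simply repeats the files named in the first bullet.

```python
from __future__ import annotations

from pathlib import Path

# Manifest names taken from the step above; illustrative, not exhaustive.
MANIFESTS = ("README.md", "package.json", "Cargo.toml", "pyproject.toml")
# Hypothetical ignore list; adjust it for the project at hand.
IGNORED = {".git", "node_modules", "target", "__pycache__"}


def list_directories(root: Path, max_depth: int = 2) -> list[str]:
    """Return directories up to max_depth levels below root, as relative POSIX paths."""
    found: list[str] = []

    def walk(directory: Path, depth: int) -> None:
        if depth > max_depth:
            return
        for child in sorted(directory.iterdir()):
            if child.is_dir() and child.name not in IGNORED:
                found.append(child.relative_to(root).as_posix())
                walk(child, depth + 1)

    walk(root, 1)
    return found


def find_manifests(root: Path) -> list[str]:
    """Return the manifest files that exist directly under root."""
    return [name for name in MANIFESTS if (root / name).is_file()]
```

Calling `list_directories(Path("."))` gives roughly the same picture as `tree -L 2 -d`, and `find_manifests(Path("."))` shows which project descriptions are worth reading first.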

Step 2: Identify the Architecture

Map out:
  • The problem being solved -- What does this project do? What's the input/output?
  • Coarse-grained modules -- What does each top-level directory/package do?
  • Data flow -- How does data move through the system? Input -> ??? -> Output
  • API boundaries -- Which modules are public interfaces vs internal implementation?
  • Architectural invariants -- What rules are enforced by structure? What's deliberately absent?
  • Cross-cutting concerns -- Error handling, testing strategy, configuration, observability
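
One way to keep track of the answers to these questions while exploring is a small record type. The sketch below is purely illustrative: `ModuleNote`, `ArchitectureNotes`, and `unanswered` are hypothetical names, and the rule that a data flow needs at least two stages is an assumption, not something the workflow prescribes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModuleNote:
    path: str                       # e.g. a top-level directory or package
    purpose: str                    # what the module does
    key_types: list[str] = field(default_factory=list)
    boundary: str | None = None     # set only if the module is an API boundary
    invariant: str | None = None    # what is enforced or deliberately absent


@dataclass
class ArchitectureNotes:
    problem: str                    # what the project does, input and output
    data_flow: list[str]            # ordered stages, input first, output last
    modules: list[ModuleNote] = field(default_factory=list)
    cross_cutting: dict[str, str] = field(default_factory=dict)

    def unanswered(self) -> list[str]:
        """Name the mapping questions that still lack an answer."""
        gaps = []
        if not self.problem.strip():
            gaps.append("problem")
        if len(self.data_flow) < 2:  # assumption: a flow needs an input and an output
            gaps.append("data flow")
        if not self.modules:
            gaps.append("modules")
        if not any(m.boundary for m in self.modules):
            gaps.append("API boundaries")
        if not any(m.invariant for m in self.modules):
            gaps.append("invariants")
        if not self.cross_cutting:
            gaps.append("cross-cutting concerns")
        return gaps
```

An empty list from `unanswered()` suggests there is enough material to start writing; anything it returns points back to more exploration.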

Step 3: Write the ARCHITECTURE.md

Follow the template below. Keep the total document under ~300 lines for most projects.

Template

```markdown
# Architecture

[One paragraph: what this project does at the highest level. What problem it solves.]

## Bird's Eye View

[How data flows through the system at the coarsest level. Input -> Processing stages -> Output. Keep this to 1-3 paragraphs.]

## Code Map

[Brief intro: "This section describes the high-level structure of the codebase. Pay attention to **Boundary** and **Invariant** callouts."]

### `path/to/module-a/`

[What this module does in 1-3 sentences. Key types: `ImportantType`, `AnotherType`.]

**Boundary:** [If this is an API boundary, say so and what it means.]

**Invariant:** [What's deliberately absent or enforced. E.g., "This module never does I/O" or "Nothing here depends on the HTTP layer."]

### `path/to/module-b/`

[Repeat for each significant module.]

### `path/to/module-c/`

[...]

## Cross-Cutting Concerns

### Error Handling

[How errors are handled across the codebase. Is it Result-based? Exceptions? Do errors propagate or get caught at boundaries?]

### Testing

[Testing strategy. Where do tests live? What kinds of tests exist? What are the important test boundaries?]

### [Other concerns as applicable]

[Configuration, observability/logging, code generation, build system, etc. Only include sections that are genuinely cross-cutting.]
```

Rules

What to Include

  • Directory/module purposes (1-3 sentences each)
  • Names of important types, traits, interfaces, functions (for symbol search)
  • API boundaries between layers
  • Architectural invariants -- especially things that are deliberately absent
  • Data flow at the system level
  • Cross-cutting concerns that affect multiple modules

What to Omit

  • Implementation details of how individual modules work (that's inline doc)
  • Links to specific files or lines (they go stale)
  • Anything that changes with routine PRs
  • Exhaustive API documentation (that's rustdoc/typedoc/javadoc territory)
  • Setup instructions (that's README)
  • Contribution guidelines (that's CONTRIBUTING.md)

Style Rules

  • Use ### `path/to/module/` headers with backtick-quoted paths for the codemap
  • Use **Boundary:** and **Invariant:** prefixed callouts (bold label, not blockquotes)
  • Keep module descriptions to 1-3 sentences
  • Name types in backticks: "Key types: `FooBar`, `BazQux`"
  • Write in present tense, active voice
  • Prefer concrete over abstract: "parses CLI arguments" not "handles input processing"
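
To make these conventions concrete, here is a minimal sketch of a function that renders one codemap entry in this style. The function name `render_module_entry` and its parameters are hypothetical; only the output format (a `###` header with a backtick-quoted path, backticked type names, bold callout labels) comes from the rules above.

```python
from __future__ import annotations

from typing import Sequence


def render_module_entry(
    path: str,
    purpose: str,
    key_types: Sequence[str] = (),
    boundary: str | None = None,
    invariant: str | None = None,
) -> str:
    """Render one codemap entry as Markdown text."""
    # Header: level-3 heading with the path in backticks.
    lines = [f"### `{path}`", ""]

    # Description, followed by the key types in backticks for symbol search.
    description = purpose
    if key_types:
        names = ", ".join(f"`{name}`" for name in key_types)
        description += f" Key types: {names}."
    lines.append(description)

    # Callouts use a bold label rather than a blockquote.
    if boundary:
        lines += ["", f"**Boundary:** {boundary}"]
    if invariant:
        lines += ["", f"**Invariant:** {invariant}"]
    return "\n".join(lines)
```

For example, `render_module_entry("src/parser/", "Parses CLI arguments into a typed request.", ["CliRequest"], invariant="This module never does I/O.")` yields a header, a one-sentence description ending in the key types, and a single bold **Invariant:** callout.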

Quality Checklist

Before finishing, verify:
  • Can a newcomer find "the thing that does X" using only this doc?
  • Are API boundaries clearly marked?
  • Are architectural invariants (especially absences) called out?
  • Is every section stable enough to survive 6 months without an update?
  • Are important types/modules named (not linked)?
  • Is there a bird's eye view before the codemap?
  • Are cross-cutting concerns addressed?
  • Does the codemap order match the data flow or dependency direction?
  • Is it under ~300 lines? (Shorter = more likely to be read and maintained)
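
A few of these checks are mechanical enough to automate. The sketch below covers only the line budget, the no-links rule, and section presence and order; the rest of the checklist still needs human judgment. It assumes the document uses `#`-style Markdown headings with the section names from the template, and the function name `check_architecture_doc` is hypothetical.

```python
from __future__ import annotations

import re

MAX_LINES = 300  # the "~300 lines" guideline from the checklist

# Matches inline Markdown links such as [text](target).
# Reference-style links are not covered by this sketch.
INLINE_LINK = re.compile(r"\[[^\]]+\]\([^)]+\)")

REQUIRED_SECTIONS = ("bird's eye view", "code map", "cross-cutting concerns")


def check_architecture_doc(text: str) -> list[str]:
    """Return a list of checklist problems found in an ARCHITECTURE.md text."""
    problems = []
    lines = text.splitlines()

    if len(lines) > MAX_LINES:
        problems.append(f"document has {len(lines)} lines, over {MAX_LINES}")

    if INLINE_LINK.search(text):
        problems.append("contains hyperlinks; name files and types instead")

    # Collect heading titles, lowercased, ignoring the heading level.
    headings = [
        line.lstrip("#").strip().lower() for line in lines if line.startswith("#")
    ]
    for required in REQUIRED_SECTIONS:
        if required not in headings:
            problems.append(f"missing section: {required}")

    if "bird's eye view" in headings and "code map" in headings:
        if headings.index("bird's eye view") > headings.index("code map"):
            problems.append("bird's eye view should come before the code map")

    return problems
```

An empty result means only that the mechanical checks pass; whether a newcomer can actually find "the thing that does X" still has to be judged by reading the document.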

Reference Example

See references/example.md for a complete example ARCHITECTURE.md for a hypothetical TypeScript project, demonstrating all the patterns above.