onboarding

Generate Onboarding Document

Crawl a repository and generate `ONBOARDING.md` at the repo root -- a document that helps new contributors understand the codebase without requiring the creator to explain it.

Onboarding is a general problem in software, but it is more acute in fast-moving codebases where code is written faster than documentation -- whether through AI-assisted development, rapid prototyping, or simply a team that ships faster than it documents. This skill reconstructs the mental model from the code itself.

This skill always regenerates the document from scratch. It does not read or diff a previous version. If `ONBOARDING.md` already exists, it is overwritten.

Core Principles

Write for humans first -- Clear prose that a new developer can read and understand. Agent utility is a side effect of good human writing, not a separate goal.
Show, don't just tell -- Use ASCII diagrams for architecture and flow, markdown tables for structured information, and backtick formatting for all file paths, commands, and code references.
Five sections, each earning its place -- Every section answers a question a new contributor will ask in their first hour. No speculative sections.
State what you can observe, not what you must infer -- Do not fabricate design rationale or assess fragility. If the code doesn't reveal why a decision was made, don't guess.
Never include secrets -- The onboarding document is committed to the repository. Never include API keys, tokens, passwords, connection strings with credentials, or any other secret values. Reference environment variable names (`STRIPE_SECRET_KEY`), never their values. If a `.env` file contains actual secrets, extract only the variable names.
Link, don't duplicate -- When existing documentation covers a topic well, link to it inline rather than re-explaining.

Execution Flow

Phase 1: Gather Inventory

Run the bundled inventory script (`scripts/inventory.mjs`) to get a structural map of the repository without reading every file:

```bash
node scripts/inventory.mjs --root .
```

Parse the JSON output. This provides:

Project name, languages, frameworks, package manager, test framework
Directory structure (top-level + one level into source directories)
Entry points per detected ecosystem
Available scripts/commands
Existing documentation files (with first-heading titles for triage)
Test infrastructure
Infrastructure and external dependencies (env files, docker services, detected integrations)
Monorepo structure (if applicable)

If the script fails or returns an error field, report the issue to the user and stop. Do not attempt to write `ONBOARDING.md` from incomplete data.
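
The stop-on-error rule could be sketched roughly as follows. The helper name `parseInventory` is hypothetical, and the only field taken from the text above is `error`; the real output shape of `scripts/inventory.mjs` may differ.

```javascript
// Hypothetical helper: validate the raw stdout of the inventory script.
// Only the `error` field comes from the skill text; nothing else about
// the JSON shape is assumed here.
function parseInventory(stdout) {
  let inventory;
  try {
    inventory = JSON.parse(stdout);
  } catch (cause) {
    throw new Error(`Inventory output is not valid JSON: ${cause.message}`);
  }
  const isPlainObject =
    inventory !== null &&
    typeof inventory === "object" &&
    !Array.isArray(inventory);
  if (!isPlainObject) {
    throw new Error("Inventory output is not a JSON object");
  }
  if ("error" in inventory) {
    // Stop here: the document must not be written from partial data.
    throw new Error(`Inventory script reported an error: ${inventory.error}`);
  }
  return inventory;
}
```

A caller would catch the thrown error, report its message to the user, and skip the remaining phases.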

Phase 2: Read Key Files

Guided by the inventory, read files that are essential for understanding the codebase. Use the native file-read tool (not shell commands).

What to read and why:

Read files in parallel batches where there are no dependencies between them. For example, batch README.md, entry points, and AGENTS.md/CLAUDE.md together in a single turn since none depend on each other's content.

Only read files whose content is needed to write the five sections with concrete, specific detail. The inventory already provides structure, languages, frameworks, scripts, and entry point paths -- don't re-read files just to confirm what the inventory already says. Different repos need different amounts of reading; a small CLI tool might need 4 files, a complex monorepo might need 20. Let the sections drive what you read, not an arbitrary count.

Priority order:

README.md (if exists) -- for project purpose and setup instructions
Primary entry points -- the files listed in `entryPoints` from the inventory. These reveal what the application does when it starts.
Route/controller files -- look for `routes/`, `app/controllers/`, `src/routes/`, `src/api/`, or similar directories from the inventory structure. Read the main route file to understand the primary flow.
Configuration files that reveal architecture and external dependencies -- `docker-compose.yml`, `.env.example`, `.env.sample`, database config, `next.config.*`, `vite.config.*`, or similar. Only read these if they exist in the inventory. Never read `.env` itself -- only `.env.example` or `.env.sample` templates. Extract variable names only, never values.
AGENTS.md or CLAUDE.md (if exists) -- for project conventions and patterns already documented.
Discovered documentation -- the inventory's `docs` list includes each file's title (first heading). Use those titles to decide which docs are relevant to the five sections without reading them first. Only read the full content of docs whose titles indicate direct relevance. Skip dated brainstorm/plan files unless the focus hint specifically calls for them.

Do not read files speculatively. Every file read should be justified by the inventory output and traceable to a section that needs it.
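
The "variable names only, never values" rule for `.env.example` and `.env.sample` templates might look like the sketch below. The function name `envVarNames` and the accepted line syntax (optional `export` prefix, `NAME=value`) are assumptions; real templates may use other conventions.

```javascript
// Hypothetical helper: list variable names found in a .env.example-style
// template. Values are discarded and never returned.
function envVarNames(templateText) {
  const names = [];
  for (const rawLine of templateText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // blanks, comments
    // Optional `export ` prefix, then NAME followed by `=`.
    const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    if (match && !names.includes(match[1])) names.push(match[1]);
  }
  return names;
}
```

Because the value side of each line is never captured, nothing from it can leak into the generated document.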

Phase 3: Write ONBOARDING.md

Synthesize the inventory data and key file contents into a document with exactly five sections. Write the file to the repo root.

Title: Use `# {Project Name} Onboarding Guide` as the document heading. Derive the project name from the inventory. Do not use the filename as a heading.

Writing style -- the document should read like a knowledgeable teammate explaining the project over coffee, not like generated documentation.

Voice and tone:

Write in second person ("you") -- speak directly to the new contributor
Use active voice and present tense: "The router dispatches requests to handlers" not "Requests are dispatched by the router to handlers"
Be direct. Lead sentences with what matters, not with setup: "Run `bun dev` to start the server" not "In order to start the development server, you will need to run the following command"
Match the formality of the codebase. A scrappy prototype gets casual prose. An enterprise system gets more precise language. Read the README and existing docs for tone cues.

Clarity:

Every sentence should teach the reader something or tell them what to do. Cut any sentence that doesn't.
Prefer concrete over abstract: "`src/services/billing.ts` charges the customer's card" not "The billing module handles payment-related business logic"
When introducing a term, define it immediately in context. Don't make the reader scroll to a glossary.
Use the simplest word that's accurate. "Use" not "utilize." "Start" not "initialize." "Send" not "transmit."

What to avoid:

Filler and throat-clearing: "It's important to note that", "As mentioned above", "In this section we will"
Vague summarization: "This module handles various aspects of..." -- say specifically what it does
Hedge words when stating facts: "This essentially serves as", "This is basically" -- if you know what it does, say it plainly
Superlatives and marketing language: "robust", "powerful", "comprehensive", "seamless"
Meta-commentary about the document itself: "This document aims to..." -- just do the thing

Formatting requirements -- apply consistently throughout:

Use backticks for all file names (`package.json`), paths (`src/routes/`), commands (`bun test`), function/class names, environment variables, and technical terms
Use markdown headers (`##`) for the five sections
Use ASCII diagrams and markdown tables where specified below
Use bold for emphasis sparingly
Keep paragraphs short -- 2-4 sentences

Section separators -- Insert a horizontal rule (`---`) between each `##` section. These documents are dense and benefit from strong visual breaks when scanning.

Width constraint for code blocks -- 80 columns max. Markdown code blocks render with `white-space: pre` and never wrap, so wide lines cause horizontal scrolling on GitHub, tablets, and narrow viewports. Tables are fine -- markdown renderers wrap them. Apply these rules to all content inside ``` fences:

ASCII architecture diagrams: Stack boxes vertically instead of laying them out horizontally. Never place more than 2 boxes on the same horizontal line, and keep each box label under 20 characters. This caps diagrams at ~60 chars wide.
Flow diagrams: Keep file path + annotation under 80 chars. If a description is too long, move it to a line below or shorten it.
Directory trees: Keep inline `# comments` under 30 characters. Prefer brief role descriptions ("Editor plugins") over exhaustive lists ("marks, heatmap, suggestions, collab cursors, etc.").
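
One possible way to check the 80-column limit mechanically is sketched below. The helper name `findWideFenceLines` is hypothetical, and it counts code points as an approximation of rendered columns, so double-width characters would be undercounted.

```javascript
// Build the fence marker programmatically so this block contains no
// literal fence line of its own.
const FENCE = "`".repeat(3);

// Hypothetical helper: report lines inside fenced code blocks that are
// wider than maxCols. Line numbers are 1-based.
function findWideFenceLines(markdown, maxCols = 80) {
  const offenders = [];
  let insideFence = false;
  markdown.split(/\r?\n/).forEach((line, index) => {
    if (line.trimStart().startsWith(FENCE)) {
      insideFence = !insideFence; // opening and closing markers toggle
      return;
    }
    // Array.from counts code points rather than UTF-16 units.
    const width = Array.from(line).length;
    if (insideFence && width > maxCols) {
      offenders.push({ line: index + 1, width });
    }
  });
  return offenders;
}
```

Prose and table lines outside fences are ignored, which matches the note above that renderers wrap them.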

Section 1: What Is This?

Answer: What does this project do, who is it for, and what problem does it solve?

Draw from `README.md`, manifest descriptions (e.g., `package.json` description field), and what the entry points reveal about the application's purpose.

If the project's purpose cannot be clearly determined from the code, state that plainly: "This project's purpose is not documented. Based on the code structure, it appears to be..."

Keep to 1-3 paragraphs.

Section 2: How Is It Organized?

Answer: What is the architecture, what are the key modules, how do they connect, and what does the system depend on externally?

This section covers both the internal structure and the system boundary -- what the application talks to outside itself.

System architecture -- When a project has multiple major surfaces or deployment targets (e.g., a native app, a web server, and an API), include an ASCII architecture diagram showing how they relate at the system level before diving into directory structure. This helps the reader build a mental model of the system before seeing individual files.

Use vertical stacking to keep diagrams under 80 columns:

```
+------------------+
| Native macOS App |
| (Swift/WKWebView)|
+--------+---------+
         |  bridge
         v
+------------------+
| Editor Engine    |  <-- shared core
| (Milkdown/Yjs)   |
+--------+---------+
         |  Vite build
         v
+------------------+   WebSocket   +-----------------+
| Browser Client   |<=============>| Express Server  |
+------------------+               +--------+--------+
                                            |
                                   +--------v--------+
                                   | SQLite + Yjs    |
                                   +-----------------+
```

Skip this for simple projects (single-purpose libraries, CLI tools) where the directory tree already tells the whole story.

Internal structure -- Include an ASCII directory tree showing the high-level layout:

```
project-name/
  src/
    routes/       # HTTP route handlers
    services/     # Business logic
    models/       # Data layer
  tests/          # Test suite
  config/         # Env and app configuration
```

Annotate directories with a brief comment explaining their role. Only include directories that matter -- skip build artifacts, config files, and boilerplate.

When there are distinct modules or components with clear responsibilities, present them in a table:

| Module | Responsibility |
|--------|---------------|
| `src/routes/` | HTTP request handling and routing |
| `src/services/` | Core business logic |
| `src/models/` | Database models and queries |

Describe how the modules connect -- what calls what, where data flows between them.

External dependencies and integrations -- Surface everything the system talks to outside its own codebase. This is often the biggest blocker for new contributors trying to run the project. Look for signals in:

`docker-compose.yml` (databases, caches, message queues)
Environment variable references in config files or `.env.example`
Import statements for client libraries (database drivers, API SDKs, cloud storage)
The inventory's detected frameworks (e.g., Prisma implies a database)

Present as a table when there are multiple dependencies:

| Dependency | What it's used for | Configured via |
|-----------|-------------------|---------------|
| PostgreSQL | Primary data store | `DATABASE_URL` |
| Redis | Session cache and job queue | `REDIS_URL` |
| Stripe API | Payment processing | `STRIPE_SECRET_KEY` |
| S3 | File uploads | `AWS_*` env vars |

If no external dependencies are detected, state that: "This project appears self-contained with no external service dependencies."
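
As an illustration, the dependency table and its empty-case sentence could be produced by a small formatter like the one below. The function name `dependencyTable` and the input fields `name`, `purpose`, and `configuredVia` are hypothetical; the inventory does not necessarily expose dependencies in this shape.

```javascript
// Hypothetical formatter: turn a list of detected dependencies into the
// three-column table shown above, or the self-contained sentence when
// the list is empty.
function dependencyTable(dependencies) {
  if (dependencies.length === 0) {
    return "This project appears self-contained with no external service dependencies.";
  }
  const header = [
    "| Dependency | What it's used for | Configured via |",
    "|-----------|-------------------|---------------|",
  ];
  const rows = dependencies.map(
    // Only the variable name is printed, never its value.
    (dep) => `| ${dep.name} | ${dep.purpose} | \`${dep.configuredVia}\` |`
  );
  return [...header, ...rows].join("\n");
}
```

For example, passing one entry with `name: "PostgreSQL"`, `purpose: "Primary data store"`, and `configuredVia: "DATABASE_URL"` yields the header plus a single row matching the first row of the table above.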

Section 3: Key Concepts and Abstractions

Answer: What vocabulary and patterns does someone need to understand to talk about this codebase?

This section covers two things:

Domain terms -- The project-specific vocabulary: entity names, API resource names, database tables, configuration concepts, and jargon that a new reader would not immediately recognize.

Architectural abstractions -- The structural patterns in the codebase that shape how code is organized and how a contributor should think about making changes. These are especially important in codebases where the original author may not have consciously chosen these patterns -- they may have been introduced by an AI or adopted from a template without documentation.

Examples of architectural abstractions worth surfacing:

"Business logic lives in the service layer (`src/services/`), not in route handlers"
"Authentication runs through middleware in `src/middleware/auth.ts` before every protected route"
"Database access uses the repository pattern -- each model has a corresponding repository class"
"Background jobs are defined in `src/jobs/` and dispatched through a Redis-backed queue"

Present both domain terms and abstractions in a single table:

| Concept | What it means in this codebase |
|---------|-------------------------------|
| `Widget` | The primary entity users create and manage |
| `Pipeline` | A sequence of processing steps applied to incoming data |
| Service layer | Business logic in `src/services/`, not handlers |
| Middleware chain | Requests flow through `src/middleware/` first |

Aim for 5-15 entries. Include only concepts that would confuse a new reader or that represent non-obvious architectural decisions. Skip universally understood terms.

Section 4: Primary Flows

Answer: What happens when the main things this app does actually happen?

Trace one flow per distinct surface or user type. A "surface" is a meaningfully different entry path into the system -- a native app, a web UI, an API consumer, a CLI user. Each flow should reveal parts of the architecture that previous flows didn't cover. Stop when the next flow would mostly retrace files already shown.

For a simple library or CLI, that's one flow. For a full-stack app with a web UI and an API, that's two. For a product with native + web + agent surfaces, that's three. Let the architecture drive the count, not an arbitrary number.

Include an ASCII flow diagram for the most important flow:

```
User Request
  |
  v
src/routes/widgets.ts
  validates input, extracts params
  |
  v
src/services/widget.ts
  applies business rules, calls DB
  |
  v
src/models/widget.ts
  persists to PostgreSQL
  |
  v
Response (201 Created)
```

At each step, reference the specific file path. Keep file path + annotation under 80 characters -- put the annotation on the next line if needed (as shown above).

Additional flows can use a numbered list instead of a full diagram if the first diagram already establishes the structural pattern.
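
The flow-diagram layout described above (file path on one line, annotation indented on the next, nothing wider than 80 columns) could be generated with a sketch like this. The name `renderFlow` and the `{ file, note }` step shape are assumptions made for illustration.

```javascript
// Hypothetical renderer: build a vertical ASCII flow from ordered steps.
// Each step prints its file path, then its note on an indented line.
function renderFlow(start, steps, end) {
  const arrow = ["  |", "  v"];
  const lines = [start];
  for (const step of steps) {
    lines.push(...arrow, step.file, `  ${step.note}`);
  }
  lines.push(...arrow, end);
  // Enforce the 80-column limit rather than silently emitting wide lines.
  const tooWide = lines.find((line) => line.length > 80);
  if (tooWide !== undefined) {
    throw new Error(`Line exceeds 80 columns: ${tooWide}`);
  }
  return lines.join("\n");
}
```

Called with `"User Request"`, the three steps from the diagram above, and `"Response (201 Created)"`, it reproduces that diagram line for line.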

Section 5: Where Do I Start?

Answer: How do I set up the project, run it, and make common changes?

Cover three things, plus two more for complex projects:

Setup -- Prerequisites, install steps, environment config. Draw from README and the inventory's scripts. Format commands in code blocks:
```
bun install
cp .env.example .env
bun dev
```
Running and testing -- How to start the dev server, run tests, lint. Use the inventory's detected scripts.
Common change patterns -- Where to go for the 2-3 most common types of changes. For example:
- "To add a new API endpoint, create a route handler in `src/routes/` and register it in `src/routes/index.ts`"
- "To add a new database model, create a file in `src/models/` and run `bun migrate`"
Key files to start with (for complex projects) -- A table mapping areas of the codebase to specific entry-point files with a brief "why start here" note. This gives a new contributor a concrete reading list instead of staring at a large directory tree. For example:
```
| Area | File | Why |
|------|------|-----|
| Editor core | `src/editor/index.ts` | All editor wiring |
| Data model | `src/formats/marks.ts` | The annotation system everything builds on |
| Server entry | `server/index.ts` | Express app setup and route mounting |
```
Skip this for projects with fewer than ~10 source files where the directory tree is already a sufficient reading list.
Practical tips (for complex projects) -- If the codebase has areas that are particularly large, complex, or have non-obvious gotchas, surface them as brief contributor tips. These communicate real situational awareness that helps a new contributor avoid pitfalls. For example:
- "The editor module is ~450KB. Most behavior is wired through plugins in `src/editor/plugins/` -- understand the plugin architecture before making editor changes."
- "The collab subsystem has many guards and epoch checks. Read the test names to understand what invariants are maintained."
Skip this for simple projects where the codebase is small enough to hold in your head.

Inline Documentation Links

While writing each section, check whether any file from the inventory's `docs` list is directly relevant to what the section explains. If so, link inline:

Authentication uses token-based middleware -- see `docs/solutions/auth-pattern.md` for the full pattern.

Do not create a separate references or further-reading section. If no relevant docs exist for a section, the section stands alone -- do not mention their absence.

Phase 4: Quality Check

Before writing the file, verify:

Every section answers its question without padding or filler
No secrets, API keys, tokens, passwords, or credential values anywhere in the document
No fabricated design rationale ("we chose X because...")
No fragility or risk assessments
File paths referenced in the document correspond to real files from the inventory
All file names, paths, commands, code references, and technical terms use backtick formatting
Document title uses "# {Project Name} Onboarding Guide" format, not the filename
System-level architecture diagram included for multi-surface projects (skipped for simple libraries/CLIs)
All code block content (diagrams, trees, flow traces) fits within 80 columns
ASCII diagrams are present in the architecture and/or primary flow sections
One flow per distinct surface or user type (architecture drives the count, not an arbitrary number)
External dependencies and integrations are surfaced in the architecture section (or explicitly noted as absent)
Tables are used for module responsibilities, domain terms/abstractions, and external dependencies
Markdown styling is consistent throughout (headers, bold, code blocks, tables)
Existing docs are linked inline only where directly relevant
Writing is direct and concrete -- no filler, no hedge words, no meta-commentary about the document
Tone matches the codebase (casual for scrappy projects, precise for enterprise)

Write the file to the repo root as `ONBOARDING.md`.
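
The check that referenced file paths correspond to real files could be approximated with a heuristic like the one below. The name `unknownPaths` is hypothetical, and treating any backticked token that contains a `/` and no spaces as a path is an assumption; it will also flag things such as URLs and glob patterns, so results need a manual look.

```javascript
// Hypothetical helper: collect backticked tokens that look like file
// paths but do not appear in the list of known paths.
function unknownPaths(markdown, knownPaths) {
  const known = new Set(knownPaths);
  const missing = new Set();
  // Matches single-backtick spans anywhere in the text.
  for (const match of markdown.matchAll(/`([^`\n]+)`/g)) {
    const token = match[1];
    // Heuristic: a path contains a slash and no spaces.
    const looksLikePath = token.includes("/") && !token.includes(" ");
    if (looksLikePath && !known.has(token)) missing.add(token);
  }
  return [...missing];
}
```

An empty result suggests every path-like reference was found in the supplied list; a non-empty one lists the references to re-check before writing the file.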

Phase 5: Present Result

After writing, inform the user that `ONBOARDING.md` has been generated. Offer next steps using the platform's blocking question tool when available (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). Otherwise, present numbered options in chat.

Options:

Open the file for review
Share to Proof
Done

Based on selection:

Open for review -> Open `ONBOARDING.md` using the current platform's file-open or editor mechanism

Share to Proof -> Upload the document:

```bash
# Read the generated document and build the share title
CONTENT=$(cat ONBOARDING.md)
TITLE="Onboarding: <project name from inventory>"
# Build the JSON payload with jq so quotes and newlines are escaped safely
RESPONSE=$(curl -s -X POST https://www.proofeditor.ai/share/markdown \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg title "$TITLE" --arg markdown "$CONTENT" --arg by "ai:compound" '{title: $title, markdown: $markdown, by: $by}')")
# Extract the shareable URL from the response
PROOF_URL=$(echo "$RESPONSE" | jq -r '.tokenUrl')
```

Display `View & collaborate in Proof: <PROOF_URL>` if successful, then return to the options

Done -> No further action
