harness-step2-fill-docs
Harness Step 2: Fill in docs/ Knowledge Base Content
Goal
Read the project code closely and write the architectural knowledge, naming conventions, and technical decisions that are implicit in it into the files under docs/, so that an agent can quickly grasp the whole project in any session.
Core principle: cite the source of every inferred statement, mark anything you cannot determine as "To be supplemented", and never paper over a gap with a vague placeholder.
Execution Steps
Step 1: In-Depth Scan
Before writing any documentation, read the project thoroughly. Run the following in order:
```bash
# 1. Confirm the docs/ skeleton exists
ls docs/

# 2. Map the directory structure (3 levels deep)
find . -maxdepth 3 \
  -not -path '*/node_modules/*' -not -path '*/.git/*' \
  -not -path '*/__pycache__/*' -not -path '*/dist/*' \
  -not -path '*/.next/*' -not -path '*/build/*' | sort

# 3. Read the main entry files
#    (depends on the tech stack: main.ts / main.py / app.go / index.js, etc.)

# 4. Read module boundaries (the index file or first file of each main directory)
#    Goal: work out what each directory is responsible for

# 5. Read the dependency declarations
cat package.json 2>/dev/null || cat pyproject.toml 2>/dev/null || \
  cat go.mod 2>/dev/null || cat Cargo.toml 2>/dev/null

# 6. Read existing documentation (reuse it, do not duplicate it)
cat README.md 2>/dev/null
cat AGENTS.md 2>/dev/null
```
Scanning Goals — Before writing documentation, you must be able to answer these questions:
- What are the main modules of this project? What does each module do?
- What is the code call chain like? (UI → ? → ? → Data Layer)
- Which main libraries/frameworks are used? Can you infer the reasons for the choice?
- What are the file naming rules? What are the variable naming rules?
- What situations will cause test failures? What are the acceptance criteria?
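The `cat ... || cat ...` chain in item 5 of the scan stops at the first manifest it finds, which can hide a second stack in a polyglot repository. The following is a minimal sketch that reports every manifest present, assuming the same four file names; the function name `list_manifests` is illustrative and not part of the skill.

```bash
# Hypothetical helper: print every dependency manifest found in a directory.
# The file names are the four used in the dependency step of the scan.
list_manifests() {
  dir=${1:-.}
  for name in package.json pyproject.toml go.mod Cargo.toml; do
    if [ -f "$dir/$name" ]; then
      printf '%s\n' "$name"
    fi
  done
}
```

Calling `list_manifests .` from the repository root prints one file name per line, or nothing when none of the four exists.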
---
Step 2: Write docs/ARCHITECTURE.md
What to write: module boundaries, dependency directions, and the main data flows. Describe what the structure is and why it is divided this way, not the implementation details.
⚠️ Mandatory requirement: verify imports before describing component/module relationships
Before asserting anything like "A is used by B", "A embeds B", or "page A contains component C", confirm the actual import with grep. Do not guess from file names or directory locations.
```bash
# Verify whether a component is actually referenced by a page
# (the path is quoted so the shell does not treat [locale] as a glob)
grep -r "ChatInterface" 'frontend/src/app/[locale]/book/[bookCode]/' 2>/dev/null

# List the files that actually reference a component
grep -rl "ComponentName" src/ 2>/dev/null
```
If grep returns nothing, there is no reference relationship: even when two components sit in the same directory, you cannot assert that one uses the other. Mark every unverified relationship as "To be verified: no import found, please confirm manually".
**Format Template**:
```markdown
Architecture Description
Overall Structure
[Describe the overall layering in prose, then illustrate it with a directory tree]
[Directory tree, key levels only; do not list every file]
Dependency Direction Rules
[Use an arrow diagram or a list to show which layers may reference which]
Key constraints:
- [Constraint 1, with the reason]
- [Constraint 2, with the reason]
Main Data Flow
[Describe the 1-2 most important request/data flows, from entry point to database]
To be Supplemented
- [Anything that could not be determined during the scan]
```
**Writing Requirements**:
- Make dependency rules specific; avoid empty statements such as "maintain clear layering"
- Give the reason for every constraint (e.g., "Do not call the DB from the UI layer because...")
- Explicitly mark anything that cannot be inferred from the code as "To be supplemented: needs manual confirmation"
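The grep check required in this step can be wrapped so that the "no import found" case always produces the mandated marker. This is a rough sketch; `check_reference` is a hypothetical name. It matches any literal mention of the name, including comments and the component's own definition file, so a hit still has to be read before it is recorded as a real import.

```bash
# Hypothetical helper: report whether any file under a directory mentions a
# component name. -F treats the name as a literal string, -l lists files only.
check_reference() {
  name=$1
  dir=$2
  hits=$(grep -rlF -- "$name" "$dir" 2>/dev/null)
  if [ -n "$hits" ]; then
    printf 'Verified: %s is referenced in:\n%s\n' "$name" "$hits"
  else
    printf 'To be verified: no import found for %s, please confirm manually\n' "$name"
  fi
}
```

For example, `check_reference ChatInterface 'frontend/src/app/[locale]/book/[bookCode]/'` checks the page directory used in the example in this step.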
---
Step 3: Write docs/CONVENTIONS.md
What to write: the naming and file-organization patterns observed in the code.
Scanning method:
```bash
# Look for file naming patterns
find src -name "*.ts" -o -name "*.py" -o -name "*.go" 2>/dev/null | head -30

# Look at function/variable naming (sample a few files)
head -50 [main source file path]
```
**Format Template**:
```markdown
Code Conventions
File Naming
- [Rule 1]: example XxxYyy.tsx
- [Rule 2]: example xxx-yyy.ts
Variable and Function Naming
- Variables/Functions: [Rule + Example]
- Classes/Components: [Rule + Example]
- Constants: [Rule + Example]
Directory Organization
[What types of files go in each main directory]
Git Commit Format
[Summarize from git log, or propose a recommended format]
Allowed values for type: feat / fix / docs / refactor / test
type(scope): description
To be Supplemented
- [Conventions that cannot be inferred from the code]
```
**Writing Requirements**:
- Attach code examples observed from the project for each rule
- If there are inconsistencies in code naming, record them truthfully and mark "Currently inconsistent, it is recommended to unify as..."
- Do not invent conventions that do not exist in the project
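To back a naming rule with observed counts rather than impressions, the file names can be bucketed by casing style. The sketch below is a rough heuristic under stated assumptions: the style labels, the function names `classify_name` and `tally_naming`, and the extension list are illustrative, and all-caps stems such as README land in the PascalCase bucket.

```bash
# Hypothetical helper: name the casing style of a file's stem.
classify_name() {
  base=${1##*/}     # drop the directory part
  stem=${base%%.*}  # drop everything from the first dot: "a.test.ts" -> "a"
  matches() { printf '%s\n' "$stem" | LC_ALL=C grep -Eq "$1"; }
  if   matches '^[a-z0-9]+(-[a-z0-9]+)+$'; then echo kebab-case
  elif matches '^[a-z0-9]+(_[a-z0-9]+)+$'; then echo snake_case
  elif matches '^[A-Z][A-Za-z0-9]*$'; then echo PascalCase
  elif matches '^[a-z][a-z0-9]*[A-Z][A-Za-z0-9]*$'; then echo camelCase
  elif matches '^[a-z0-9]+$'; then echo lowercase
  else echo other
  fi
}

# Count styles across source files, most common first.
tally_naming() {
  find "${1:-src}" -type f \( -name '*.ts' -o -name '*.tsx' -o -name '*.py' -o -name '*.go' \) 2>/dev/null |
    while IFS= read -r path; do classify_name "$path"; done |
    sort | uniq -c | sort -rn
}
```

A mixed result from `tally_naming src` is itself a finding: record it as "Currently inconsistent" instead of choosing a winner silently.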
---
Step 4: Write docs/TECH_DECISIONS.md
What to write: the reasons behind each technology choice. This is the hardest document to write, because the reasons are often not in the code.
Scanning method:
```bash
# List all direct dependencies
grep -A 50 '"dependencies"' package.json 2>/dev/null
# or (-F matches the section header literally instead of as a bracket expression)
grep -F -A 30 '[tool.poetry.dependencies]' pyproject.toml 2>/dev/null
```
**Format Template**:
```markdown
Technical Decision Records
[Framework/Library Name]
Purpose: [What this library/framework is used for]
Reasons for Selection: [Reasons that can be inferred, or mark "To be supplemented"]
Alternative Solutions: [If obvious alternatives exist, list them and explain why they were not chosen]
Notes: [Anything that needs special care in use]
To be Supplemented
- [Libraries whose selection reasons cannot be inferred from the code and need a human explanation]
```
**Writing Requirements**:
- Cover only the major frameworks and libraries; do not list every tool dependency
- Write down what can be inferred; where nothing can be inferred, mark it clearly as "To be supplemented: Original selection reason unknown, please supplement"
- Do not fabricate selection reasons
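One way to avoid invented rationale is to start every record from a stub whose fields are already marked as pending, then replace a marker only when the code supports an inference. A minimal sketch follows; `decision_stub` is a hypothetical name, and the field labels follow the template in this step.

```bash
# Hypothetical helper: print an empty decision record for one library, with
# every field pre-marked so gaps stay visible instead of being guessed.
decision_stub() {
  printf '%s\n' "$1"
  printf 'Purpose: To be supplemented\n'
  printf 'Reasons for Selection: To be supplemented: Original selection reason unknown, please supplement\n'
  printf 'Alternative Solutions: To be supplemented\n'
  printf 'Notes: To be supplemented\n'
}
```

Running `decision_stub some-library >> docs/TECH_DECISIONS.md` appends one record; `some-library` stands in for a real dependency name.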
---
Step 5: Write docs/QUALITY.md
What to write: what "done" means, plus the code review checklist.
Scanning method:
```bash
# Look at test file patterns
find . -name "*.test.*" -o -name "*.spec.*" -o -name "*_test.*" 2>/dev/null | head -10

# Look at the CI configuration (if any)
cat .github/workflows/*.yml 2>/dev/null | head -60
```
**Format Template**:
```markdown
Quality Standards
Definition of Done
A task counts as done only when:
- The feature runs correctly locally
- Matching tests are written (covering the normal path plus at least one failure path)
- [Add project-specific items, e.g., type checking passes, lint reports no errors]
- The git commit message is clear
- docs/ is updated in the same change if the architecture or conventions changed
Code Review Checklist
Correctness
- [Project-specific correctness checks, e.g., multi-tenant isolation, permission verification]
Maintainability
- Does the naming follow CONVENTIONS.md?
- Is there duplicated code that could be extracted?
- Is the business logic in the correct layer? (See ARCHITECTURE.md)
Testing Requirements
[Testing conventions summarized from existing test files, or a recommended standard]
To be Supplemented
- [Acceptance criteria that cannot be inferred from the code]
```
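The test-file scan in this step also descends into vendored directories, where third-party test files can swamp the project's own. The following sketch filters them out, assuming the same three name patterns; `list_test_files` is a hypothetical name.

```bash
# Hypothetical helper: list test files under a directory, skipping vendored
# code. The parentheses keep the -o alternatives grouped under the -not filters.
list_test_files() {
  find "${1:-.}" -type f \
    -not -path '*/node_modules/*' -not -path '*/.git/*' \
    \( -name '*.test.*' -o -name '*.spec.*' -o -name '*_test.*' \) 2>/dev/null | sort
}
```

An empty result for a core module is worth recording both in the testing section of QUALITY.md and in the tech debt tracker.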
---
Step 6: Write docs/exec-plans/tech-debt-tracker.md
What to write: potential problems and technical debt discovered during the scan.
Format:
```markdown
Technical Debt Tracker
Format of each entry:
[Priority: High/Medium/Low] Issue description — Scope of impact
Current Debts
[Issues found during the scan, recorded honestly]
Resolved
(Empty)
```
**Clues for Identifying Debts**:
- Duplicate code (same logic appearing in multiple places)
- Inconsistent naming (multiple names for the same concept)
- TODO / FIXME comments
- Overly large files (more than 300 lines)
- Core modules without tests
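Two of these clues, oversized files and TODO/FIXME comments, can be gathered mechanically. The sketch below is one possible way to do so; the function names and the excluded directories are assumptions, and the 300-line default comes from the list above.

```bash
# Hypothetical helpers for two of the clues above.

# Print "<lines> <path>" for each file longer than a threshold (default 300).
large_files() {
  dir=${1:-.}
  limit=${2:-300}
  find "$dir" -type f -not -path '*/node_modules/*' -not -path '*/.git/*' 2>/dev/null |
    while IFS= read -r path; do
      lines=$(awk 'END { print NR }' "$path")
      if [ "$lines" -gt "$limit" ]; then
        printf '%s %s\n' "$lines" "$path"
      fi
    done
}

# Print "path:line:text" for each TODO or FIXME marker.
todo_markers() {
  grep -rnE --exclude-dir=node_modules --exclude-dir=.git 'TODO|FIXME' "${1:-.}" 2>/dev/null
}
```

The output is only a list of candidates: each hit still needs a priority and a scope of impact before it goes into the tracker.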
---
Quality Inspection
After finishing each file, check it against the following:
- Are there any "To be supplemented" sections? → Compile a list and inform the user
- Is there any fabricated content? → Delete it and replace with "To be supplemented"
- Are the dependency rules in ARCHITECTURE.md specific and executable?
- Are the rules in CONVENTIONS.md supported by code examples?
- Does the DoD in QUALITY.md include project-specific check items?
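The first check, compiling the pending items, amounts to a literal search across docs/. The sketch below assumes inline markers are written exactly as "To be supplemented" or as 「待补充」 in the original wording; `list_pending` is a hypothetical name. The match is case-sensitive, so the "To be Supplemented" section headings in the templates are not reported, and items listed under those headings without an inline marker must be collected by hand.

```bash
# Hypothetical helper: list every pending marker under docs/ as path:line:text.
# Both the original Chinese marker and its English rendering are matched.
list_pending() {
  grep -rnF -e '待补充' -e 'To be supplemented' "${1:-docs}" 2>/dev/null
}
```

The output of `list_pending docs` can be pasted into the summary for the user as the list of items needing manual confirmation.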
Notify the User When Finished
Output a summary with three parts:
Content written: list what was written in each file
"To be supplemented" items needing manual confirmation:
Collect every entry marked "To be supplemented" across all files; this is the part the user most needs to review
Next steps:
- Once the "To be supplemented" items are filled in by a human, the knowledge base is ready for use
- Then run the harness-step3-state-management skill to set up cross-session state management (progress file + tasks.json)