Harness Step 2: Fill in the docs/ Knowledge Base

Goal

Read the project code in depth and write the architectural knowledge, naming conventions, and technical decisions hidden in it explicitly into the files under docs/, so that an agent can grasp the whole project quickly in any session.

Core principle: cite the source of every inference, mark anything you cannot determine as "To be supplemented", and never paper over a gap with a vague placeholder.

Execution Steps

Step 1: In-depth Scanning

Before writing any documentation, make sure you understand the project thoroughly. Run the following in order:

    # 1. Confirm the docs/ skeleton exists
    ls docs/

    # 2. Understand the directory structure (3 levels)
    find . -maxdepth 3 \
      -not -path '*/node_modules/*' -not -path '*/.git/*' \
      -not -path '*/__pycache__/*' -not -path '*/dist/*' \
      -not -path '*/.next/*' -not -path '*/build/*' | sort

    # 3. Read the main entry files
    # (depends on the tech stack: main.ts / main.py / app.go / index.js, etc.)

    # 4. Read module boundaries (the index file or first file of each main directory)
    # Goal: work out the responsibility of each directory

    # 5. Read the dependency declarations
    cat package.json 2>/dev/null || cat pyproject.toml 2>/dev/null || \
      cat go.mod 2>/dev/null || cat Cargo.toml 2>/dev/null

    # 6. Read existing documentation (reuse it, do not duplicate it)
    cat README.md 2>/dev/null
    cat AGENTS.md 2>/dev/null

Scanning goals: before writing any documentation, you must be able to answer these questions:
- What are the project's main modules, and what does each one do?
- What does the call chain look like? (UI → ? → ? → data layer)
- Which major libraries/frameworks are used? Can you infer why they were chosen?
- What patterns do file names follow? What patterns do variable names follow?
- What causes tests to fail? What are the acceptance criteria?
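
The scan leaves the choice of entry file to judgment. As a rough aid, the sketch below lists candidate entry files by name. The function name `find_entry_files` is hypothetical, and the name list only mirrors the examples given above (main.ts / main.py / app.go / index.js), so treat it as a starting point rather than a complete detector.

```bash
# Hypothetical helper: list files whose names match the entry-file examples above.
# Usage: find_entry_files [root_dir]
find_entry_files() {
  root="${1:-.}"
  find "$root" -maxdepth 3 \
    -not -path '*/node_modules/*' -not -path '*/.git/*' \
    \( -name 'main.ts' -o -name 'main.py' -o -name 'app.go' -o -name 'index.js' \) \
    -type f 2>/dev/null | sort
}
```

Running `find_entry_files .` from the repository root prints one candidate path per line. An empty result means the project uses a different entry-file name and the entry point has to be located by hand.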

---

Step 2: Write docs/ARCHITECTURE.md

What to write: module boundaries, dependency directions, and the main data flows. Describe what the structure is and why it is divided this way, not the implementation details.

⚠️ Mandatory requirement: verify imports before describing component/module relationships

Before asserting anything like "A is used by B", "A embeds B", or "page A contains component C", confirm the actual import with grep. Do not guess from file names or directory locations.

    # Verify whether a component is actually referenced by a page
    # (the path is quoted so the shell does not treat [locale] and [bookCode] as glob patterns)
    grep -r "ChatInterface" "frontend/src/app/[locale]/book/[bookCode]/" 2>/dev/null

    # Find which files actually reference a component
    grep -rl "ComponentName" src/ 2>/dev/null

If grep returns nothing, there is no reference relationship; even when the component sits in the same directory, you cannot assert that it is used. Mark every unverified relationship as "To be verified: no import found, please confirm manually".
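
The verify-then-mark rule above can be wrapped in a small helper so that the "To be verified" marker is produced mechanically. This is a minimal sketch: the function name `check_reference` and its arguments are hypothetical. A hit only shows that the name occurs in a file (including the component's own definition file), so the matching lines should still be read to confirm a real import.

```bash
# Hypothetical helper: report where a component name appears, or emit the
# "To be verified" marker when grep finds nothing.
# Usage: check_reference <ComponentName> <search_dir>
check_reference() {
  name="$1"
  dir="$2"
  # -F: treat the name as a fixed string; -w: whole-word match,
  # so "Chat" does not match "ChatInterface"
  hits=$(grep -rlFw -- "$name" "$dir" 2>/dev/null)
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
  else
    printf 'To be verified: no import of %s found under %s, please confirm manually\n' "$name" "$dir"
  fi
}
```

For example, `check_reference ChatInterface src/` prints the files that mention the component, or the marker line that can be pasted into ARCHITECTURE.md.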

**Format template**:

```markdown
# Architecture Description

## Overall Structure

[Describe the overall layering in prose, then illustrate it with a directory tree]

[Directory tree, down to the key levels only; do not list every file]

## Dependency Direction Rules

[Use an arrow diagram or a list to show which layers may reference which]

Key constraints:

- [Constraint 1, with the reason]
- [Constraint 2, with the reason]

## Main Data Flows

[Describe the 1-2 most important request/data flows, from entry point to database]

## To Be Supplemented

[Anything that could not be determined during scanning]
```

**Writing requirements**:
- Dependency rules must be specific; do not write filler such as "maintain clear layering"
- Give the reason for each constraint (e.g., "Do not call the DB from the UI layer because...")
- Clearly mark anything that cannot be inferred from the code as "To be supplemented: needs manual confirmation"

---

Step 3: Write docs/CONVENTIONS.md

What to write: the naming and file-organization patterns you can generalize from the code.

Scanning method:

    # Look at file naming patterns
    find src -name "*.ts" -o -name "*.py" -o -name "*.go" 2>/dev/null | head -30

    # Look at function/variable naming (sample a few files)
    head -50 [path to a main source file]
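
Reading thirty file names by eye makes it easy to miss a minority style. The sketch below counts rough naming styles so that a statement such as "components use PascalCase" can be backed by numbers. The function name `tally_name_styles` and the five style buckets are assumptions made for illustration; real projects may need different categories.

```bash
# Hypothetical helper: classify file base names read from stdin (one path per line)
# into rough naming styles and count each style.
tally_name_styles() {
  while IFS= read -r path; do
    base=$(basename "$path")
    stem=${base%%.*}              # drop everything from the first dot onward
    case "$stem" in
      *-*)            echo "kebab-case" ;;
      *_*)            echo "snake_case" ;;
      [[:upper:]]*)   echo "PascalCase" ;;
      *[[:upper:]]*)  echo "camelCase" ;;
      *)              echo "lowercase" ;;
    esac
  done | sort | uniq -c
}
```

A typical call is `find src -name "*.ts" | tally_name_styles`. When more than one style shows a significant count, that is the kind of inconsistency CONVENTIONS.md should record as found rather than smooth over.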


**Format template**:

```markdown
# Code Conventions

## File Naming

- [Rule 1]: e.g. `XxxYyy.tsx`
- [Rule 2]: e.g. `xxx-yyy.ts`

## Variable and Function Naming

- Variables/functions: [rule + example]
- Classes/components: [rule + example]
- Constants: [rule + example]

## Directory Organization

[What kind of files go in each main directory]

## Git Commit Format

[Summarize from git log, or write a recommended format]

type(scope): description

Allowed types: feat / fix / docs / refactor / test

## To Be Supplemented

[Conventions that cannot be inferred from the code]
```

**Writing requirements**:
- Back each rule with an instance observed in the code
- If naming in the code is inconsistent, record it as it is and mark it "Currently inconsistent; recommend unifying as..."
- Do not invent conventions that the project does not have
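
For the Git commit format section of the template, the format actually in use can be tallied from the history instead of assumed. The sketch below is one way to do that, on the assumption that subjects follow the `type(scope): description` shape shown in the template. The function name `tally_commit_types` is hypothetical, and subjects that do not match the shape are skipped.

```bash
# Hypothetical helper: read commit subjects from stdin and count the leading
# "type" of each "type(scope): description" or "type: description" subject.
tally_commit_types() {
  sed -n -E 's/^([[:lower:]]+)(\([^)]*\))?!?: .*/\1/p' | sort | uniq -c | sort -rn
}
```

A typical call is `git log --format=%s -n 200 | tally_commit_types`. If the output is empty or covers only a small share of the commits, the project has no consistent commit format yet, and the document should say so instead of presenting a recommended format as existing practice.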

---

Step 4: Write docs/TECH_DECISIONS.md

What to write: the reasons behind each technology choice. This is the hardest document to write, because the reasons are often not in the code.

Scanning method:

    # List all direct dependencies
    grep -A 50 '"dependencies"' package.json 2>/dev/null

    # or (-F matches the header literally; without it the brackets form a regex character class)
    grep -F -A 30 '[tool.poetry.dependencies]' pyproject.toml 2>/dev/null
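
A fixed window such as `-A 30` either runs past the end of the dependency table or cuts a long one short. The sketch below prints exactly one TOML table instead. The function name `toml_section` is hypothetical, and it assumes a simply formatted file with one header per line and no multi-line arrays that start a line with `[`.

```bash
# Hypothetical helper: print the body of one TOML table, i.e. the lines between
# its header and the next table header.
# Usage: toml_section <file> <table-name>
toml_section() {
  awk -v header="[$2]" '
    $0 == header     { inside = 1; next }   # start after the exact header line
    inside && /^\[/  { exit }               # stop at the next table header
    inside && NF     { print }              # print non-empty lines only
  ' "$1" 2>/dev/null
}
```

For example, `toml_section pyproject.toml tool.poetry.dependencies` prints only the direct dependencies, which is the list to walk through when deciding which libraries deserve an entry in TECH_DECISIONS.md.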


**Format template**:

```markdown
# Technical Decision Records

## [Framework/Library Name]

- Purpose: [what this library/framework is used for]
- Reason for selection: [the reason you can infer, or mark "To be supplemented"]
- Alternatives: [if there are obvious alternatives, list them and explain why they were not chosen]
- Notes: [anything that needs special care when using it]

## To Be Supplemented

[Libraries whose selection reasons cannot be inferred from the code and need a human explanation]
```

**Writing requirements**:
- Cover only the major frameworks and libraries; do not list every tooling dependency
- Write down what you can infer; where you cannot, mark it explicitly as "To be supplemented: original selection reason unknown, please fill in"
- Do not make up selection reasons

---

Step 5: Write docs/QUALITY.md

What to write: what "done" means, plus the code review checklist.

Scanning method:

    # Look at test file patterns
    find . -name "*.test.*" -o -name "*.spec.*" -o -name "*_test.*" 2>/dev/null | head -10

    # Look at the CI configuration (if any)
    cat .github/workflows/*.yml 2>/dev/null | head -60
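
The checks that CI actually runs are a good source for the project-specific Definition of Done items. The sketch below pulls the single-line `run:` commands out of the workflow files. The function name `list_ci_commands` is hypothetical, and the text matching is deliberately naive: it does not parse YAML, and multi-line `run: |` blocks appear only as `|`, so they still need to be read by hand.

```bash
# Hypothetical helper: list the distinct single-line `run:` commands found in a
# directory of workflow files.
# Usage: list_ci_commands [workflow_dir]
list_ci_commands() {
  dir="${1:-.github/workflows}"
  cat "$dir"/*.yml "$dir"/*.yaml 2>/dev/null \
    | sed -n -E 's/^[[:space:]]*(- )?run:[[:space:]]*(.+)$/\2/p' \
    | sort -u
}
```

Each command printed by `list_ci_commands` (a lint, type-check, or test invocation, for instance) is a candidate line for the Definition of Done, since a change that fails it cannot be merged anyway.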


**Format template**:

```markdown
# Quality Standards

## Definition of Done

A task counts as done only when all of the following hold:

- The feature runs correctly locally
- Matching tests are written (covering the normal path plus at least one failure path)
- [Add project-specific items, e.g. type checking passes, lint reports no errors]
- The git commit message is clear
- If the architecture or conventions changed, docs/ has been updated to match

## Code Review Checklist

### Correctness

- [Project-specific correctness checks, e.g. multi-tenant isolation, permission checks]

### Maintainability

- Does the naming follow CONVENTIONS.md?
- Is there duplicated code that could be extracted?
- Is the business logic in the correct layer? (See ARCHITECTURE.md)

## Testing Requirements

[Testing conventions generalized from the existing test files, or a recommended standard]

## To Be Supplemented

[Acceptance criteria that cannot be inferred from the code]
```

---

Step 6: Write docs/exec-plans/tech-debt-tracker.md

What to write: potential problems and technical debt discovered during scanning.

Format:

    # Technical Debt Tracker

    Entry format:

    - [Priority: High/Medium/Low] Issue description — scope of impact

    ## Current Debt

    [Problems found during scanning; record them honestly]

    ## Resolved

    (empty)

**Clues for spotting debt**:
- Duplicated code (the same logic appearing in several places)
- Inconsistent naming (several names for the same concept)
- TODO / FIXME comments
- Overly large files (more than 300 lines)
- Core modules without tests
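
Two of the clues above, oversized files and TODO / FIXME comments, can be collected mechanically. The sketch below shows one way. The function names `large_files` and `todo_comments` are hypothetical, the 300-line default comes from the list above, and the set of file extensions is an assumption to adjust per project.

```bash
# Hypothetical helper: print "<lines> <path>" for source files longer than the
# limit, largest first.
# Usage: large_files [dir] [limit]
large_files() {
  dir="${1:-.}"
  limit="${2:-300}"
  find "$dir" -type f \( -name '*.ts' -o -name '*.py' -o -name '*.go' \) \
    -not -path '*/node_modules/*' -not -path '*/.git/*' 2>/dev/null \
    | while IFS= read -r f; do
        n=$(wc -l < "$f" | tr -d ' ')
        if [ "$n" -gt "$limit" ]; then
          echo "$n $f"
        fi
      done | sort -rn
}

# Hypothetical helper: print "path:line:text" for every TODO / FIXME comment.
# Usage: todo_comments [dir]
todo_comments() {
  grep -rnE --exclude-dir=node_modules --exclude-dir=.git 'TODO|FIXME' "${1:-.}" 2>/dev/null
}
```

The output is raw material only. Each hit still needs a priority and a scope of impact before it becomes an entry in the tracker, and a long file is not debt by itself if it is cohesive.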

---

Quality Inspection

After finishing each file, check it against the following:

- Are there any "To be supplemented" items? → Collect them into a list for the user
- Is anything made up? → Delete it and replace it with "To be supplemented"
- Are the dependency rules in ARCHITECTURE.md specific and enforceable?
- Is every rule in CONVENTIONS.md backed by an instance from the code?
- Does the DoD in QUALITY.md include project-specific checks?
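
The first check, collecting the open items, is easy to get wrong by hand once several files are written. A minimal sketch of gathering them in one pass follows. The function name `list_open_items` is hypothetical, and it assumes the markers are written exactly as in this guide (in English or in the original Chinese).

```bash
# Hypothetical helper: list every "To be supplemented" / "To be verified" marker
# (and the original Chinese markers) under a docs directory, as "path:line:text".
# Usage: list_open_items [docs_dir]
list_open_items() {
  grep -rniE 'To be supplemented|To be verified|待补充|待验证' "${1:-docs}" 2>/dev/null
}
```

Because the match is case-insensitive, the "To Be Supplemented" section headings are listed as well, which helps locate the sections whose contents the user needs to review.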

Notify the User When Done

Output a summary with three parts:

1. What was written: list what went into each file

2. The "To be supplemented" list that needs manual confirmation: gather every entry marked "To be supplemented" across all files. This is the part the user most needs to look at

3. Next steps:
   - Once the "To be supplemented" items have been filled in by hand, the knowledge base is ready for use
   - Then run the `harness-step3-state-management` skill to set up cross-session state management (progress file + tasks.json)
