deep-dive-analysis

Deep Dive Analysis Skill

Overview

This skill combines mechanical structure extraction with Claude's semantic understanding to produce comprehensive codebase documentation. Unlike simple AST parsing, this skill captures:

- WHAT the code does (structure, functions, classes)
- WHY it exists (business purpose, design decisions)
- HOW it integrates (dependencies, contracts, flows)
- CONSEQUENCES of changes (side effects, failure modes)

Capabilities

**Mechanical Analysis (Scripts):**

- Extract code structure (classes, functions, imports)
- Map dependencies (internal/external)
- Find symbol usages across the codebase
- Track analysis progress
- Classify files by criticality

**Semantic Analysis (Claude AI):**

- Recognize architectural and design patterns
- Identify red flags and anti-patterns
- Trace data and control flows
- Document contracts and invariants
- Assess quality and maintainability

**Documentation Maintenance:**

- Review and maintain documentation (Phase 8)
- Fix broken links and update navigation indexes
- Analyze and rewrite code comments (antirez standards)

**Use this skill when:**

- Analyzing a codebase you're unfamiliar with
- Generating documentation that explains WHY, not just WHAT
- Identifying architectural patterns and anti-patterns
- Performing code review with semantic understanding
- Onboarding to a new project
- Creating documentation for new contributors

Prerequisites

- `analysis_progress.json` must exist in the project root (created by the DEEP_DIVE_PLAN setup)
- `DEEP_DIVE_PLAN.md` should be reviewed to understand the phase structure

CRITICAL PRINCIPLE: ABSOLUTE SOURCE OF TRUTH

THE DOCUMENTATION GENERATED BY THIS SKILL IS THE ABSOLUTE AND UNQUESTIONABLE SOURCE OF TRUTH FOR YOUR PROJECT.

ANY INFORMATION NOT VERIFIED WITH IRREFUTABLE EVIDENCE FROM SOURCE CODE IS FALSE, UNRELIABLE, AND UNACCEPTABLE.

IMPORTANT LIMITATION: Verification is Multi-Layer

╔══════════════════════════════════════════════════════════════════════════════╗
║                     VERIFICATION TRUST MODEL                                  ║
╠══════════════════════════════════════════════════════════════════════════════╣
║  Layer 1: TOOL-VALIDATED                                                     ║
║    └── Automated checks: file exists, line in range, AST symbol match        ║
║    └── Marker: [VALIDATED: file.py:123 @ 2025-12-20]                         ║
║                                                                              ║
║  Layer 2: HUMAN-VERIFIED                                                     ║
║    └── Manual review: semantic correctness, behavior match                   ║
║    └── Marker: [VERIFIED: file.py:123 by @reviewer @ 2025-12-20]             ║
║                                                                              ║
║  Layer 3: RUNTIME-CONFIRMED                                                  ║
║    └── Log/trace evidence of actual behavior                                 ║
║    └── Marker: [CONFIRMED: trace_id=abc123 @ 2025-12-20]                     ║
╚══════════════════════════════════════════════════════════════════════════════╝

Tool validation catches STRUCTURAL issues (file moved, line shifted, symbol renamed).
Human verification ensures SEMANTIC correctness (code does what doc says).
Runtime confirmation proves BEHAVIORAL truth (system actually works this way).

ALL THREE LAYERS are required for critical documentation.
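
As an illustration of how the three marker shapes in the trust model could be checked mechanically, here is a minimal sketch. The regular expression and the helper names `find_markers` and `missing_layers` are assumptions made for this example; they are not part of the skill's scripts.

```python
import re

# The three layers named in the trust model above.
REQUIRED_LAYERS = ("VALIDATED", "VERIFIED", "CONFIRMED")

# Hypothetical pattern covering the marker shapes shown in the box:
#   [VALIDATED: file.py:123 @ 2025-12-20]
#   [VERIFIED: file.py:123 by @reviewer @ 2025-12-20]
#   [CONFIRMED: trace_id=abc123 @ 2025-12-20]
MARKER_RE = re.compile(
    r"\[(?P<layer>VALIDATED|VERIFIED|CONFIRMED):\s*"
    r"(?P<evidence>[^\]@]+?)"
    r"(?:\s+by\s+@(?P<reviewer>\w+))?"
    r"\s*@\s*(?P<date>\d{4}-\d{2}-\d{2})\]"
)


def find_markers(text):
    """Return one dict per marker found (layer, evidence, reviewer, date)."""
    return [match.groupdict() for match in MARKER_RE.finditer(text)]


def missing_layers(text):
    """List the layers that have no marker in the given documentation text."""
    found = {marker["layer"] for marker in find_markers(text)}
    return [layer for layer in REQUIRED_LAYERS if layer not in found]
```

Calling `missing_layers` on a critical section gives a quick list of the layers that still need evidence before the section can be treated as fully verified.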

The Iron Law of Documentation

╔══════════════════════════════════════════════════════════════════════════════╗
║  DOCUMENTATION = f(SOURCE_CODE) + VERIFICATION                               ║
║                                                                              ║
║  If NOT verified_against_code(statement) → statement is FALSE                ║
║  If NOT exists_in_codebase(reference)    → reference is FABRICATED           ║
║  If NOT traceable_to_source(claim)       → claim is SPECULATION              ║
╚══════════════════════════════════════════════════════════════════════════════╝

Mandatory Rules (VIOLATION = FAILURE)

- NEVER document anything without reading the actual source code first
- NEVER assume any existing documentation, comment, or docstring is accurate
- NEVER write documentation based on memory, inference, or "what should be"
- ALWAYS derive truth EXCLUSIVELY from reading and tracing actual code
- ALWAYS provide source file + line number for every technical claim
- ALWAYS verify state machines, enums, constants against actual definitions
- TREAT all pre-existing docs as unverified claims requiring validation
- MARK any unverifiable statement as `[UNVERIFIED - REQUIRES CODE CHECK]`

Verification Requirements

| Documentation Type | Required Evidence |
| --- | --- |
| Enum/State values | Exact match with source code enum definition |
| Function behavior | Code path tracing, actual implementation reading |
| Constants/Timeouts | Variable definition in source with file:line |
| Message formats | Message class definition, field validation |
| Architecture claims | Import graph analysis, actual class relationships |
| Flow diagrams | Verified against runtime logs OR code path analysis |

Documentation Verification Status

Every section of documentation MUST have one of these status markers:

- `[VERIFIED: source_file.py:123]`: Confirmed against source code
- `[VERIFIED: trace_id=xyz]`: Confirmed against runtime logs
- `[UNVERIFIED]`: Requires verification before trusting
- `[DEPRECATED]`: Code has changed, documentation outdated

UNVERIFIED documentation is UNTRUSTED documentation.

CRITICAL PRINCIPLE: NO HISTORICAL DEPTH

DOCUMENTATION DESCRIBES ONLY THE CURRENT STATE OF THE ART.

NO HISTORY. NO ARCHAEOLOGY. NO "WAS". ONLY "IS".

╔══════════════════════════════════════════════════════════════════════════════╗
║                     THE TEMPORAL PURITY PRINCIPLE                            ║
╠══════════════════════════════════════════════════════════════════════════════╣
║  Documentation = PRESENT_TENSE(current_implementation)                       ║
║                                                                              ║
║  FORBIDDEN:                                                                  ║
║  ✗ "was/were/previously/formerly/used to"                                    ║
║  ✗ "deprecated since version X" → just REMOVE it                             ║
║  ✗ "changed from X to Y" → only describe Y                                   ║
║  ✗ "in the old system..." → irrelevant, delete                               ║
║  ✗ inline changelogs → use CHANGELOG.md or git                               ║
║                                                                              ║
║  REQUIRED:                                                                   ║
║  ✓ Present tense: "The system uses..." not "The system used..."              ║
║  ✓ Current state only: Document what IS, not what WAS                        ║
║  ✓ Git for archaeology: History lives in version control, not docs           ║
╚══════════════════════════════════════════════════════════════════════════════╝

The Rule:

When you find documentation containing historical language, DELETE IT. Git blame exists for archaeology. Documentation exists for the present.
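
A rough way to locate candidate passages for this rule is a keyword scan. The sketch below is a crude heuristic under stated assumptions: the phrase list is taken from the FORBIDDEN items above, the function name `find_historical_language` is hypothetical, and matches inside quotations or code samples will be false positives that need a human decision.

```python
import re

# Phrases taken from the FORBIDDEN list in the Temporal Purity box.
HISTORICAL_RE = re.compile(
    r"\b(was|were|previously|formerly|used to|deprecated since|"
    r"changed from|in the old system)\b",
    re.IGNORECASE,
)


def find_historical_language(text):
    """Return (line_number, phrase) pairs for lines that use historical wording."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in HISTORICAL_RE.finditer(line):
            hits.append((lineno, match.group(0).lower()))
    return hits
```

The output is a review list, not a deletion list: each hit still has to be read in context before the sentence is rewritten in the present tense or removed.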

Available Commands

1. Analyze Single File

Extract structure, dependencies, and usages for one file:

```bash
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
  --file src/utils/circuit_breaker.py \
  --output-format markdown
```

**Parameters:**

- `--file` / `-f`: Relative path to file to analyze - REQUIRED
- `--output-format` / `-o`: Output format (json, markdown, summary) - default: summary
- `--find-usages` / `-u`: Also find all usages of exported symbols - default: false
- `--update-progress` / `-p`: Update analysis_progress.json - default: false

**Output includes:**

- File classification (Critical/High-Complexity/Standard/Utility)
- Classes with methods and attributes
- Functions with signatures
- Internal imports (within project)
- External imports (third-party)
- External calls (database, network, filesystem, messaging, ipc)
- State mutations identified
- Error handling patterns

2. Check Progress

View analysis progress by phase:

```bash
python .claude/skills/deep-dive-analysis/scripts/check_progress.py \
  --phase 1 \
  --status pending
```

**Parameters:**

- `--phase` / `-p`: Filter by phase number (1-7)
- `--status` / `-s`: Filter by status (pending, analyzing, done, blocked)
- `--classification` / `-c`: Filter by classification (critical, high-complexity, standard, utility)
- `--verification-needed`: Show only files needing runtime verification

3. Find Usages

Find all usages of a symbol across the codebase:

```bash
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
  --symbol CircuitBreaker \
  --file src/utils/circuit_breaker.py
```

4. Generate Phase Report

Generate documentation for an entire phase:

```bash
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
  --phase 1 \
  --output-format markdown \
  --output-file docs/01_domains/COMMON_LIBRARY.md
```

Phase 8: Documentation Review Commands

5. Scan Documentation Health

Discover all documentation files and generate health report:

```bash
python .claude/skills/deep-dive-analysis/scripts/doc_review.py scan \
  --path docs/ \
  --output doc_health_report.json
```

**Output includes:**

- Total file count per directory
- Files with TODO/FIXME/TBD markers
- Files missing last_updated metadata
- Large files (>1500 lines) that are candidates for splitting

6. Validate Links

Find all broken links in documentation:

```bash
python .claude/skills/deep-dive-analysis/scripts/doc_review.py validate-links \
  --path docs/ \
  --fix  # Optional: auto-remove broken links
```

**Actions:**

- Extracts all relative markdown links (`](../path/to/file.md)`)
- Verifies target files exist
- Reports broken links with source file and line number
- With `--fix`: removes or updates broken references
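
The first three actions can be approximated in a few lines. The sketch below is an assumption-laden illustration, not the code of `doc_review.py`: the function name `find_broken_links` and the link regex are invented here, and the regex ignores reference-style links and links whose targets contain spaces.

```python
import re
from pathlib import Path

# Matches the "](target)" tail of an inline markdown link.
LINK_RE = re.compile(r"\]\(([^)\s]+)\)")


def find_broken_links(docs_root):
    """Return (source_file, line_number, target) for relative links whose target is missing."""
    root = Path(docs_root)
    broken = []
    for md_file in sorted(root.rglob("*.md")):
        lines = md_file.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for target in LINK_RE.findall(line):
                # Skip external URLs, mail links, and same-page anchors.
                if "://" in target or target.startswith(("#", "mailto:")):
                    continue
                # Drop any "#section" suffix before checking the file.
                path_part = target.split("#", 1)[0]
                if not (md_file.parent / path_part).exists():
                    broken.append((str(md_file.relative_to(root)), lineno, target))
    return broken
```

The sketch only reports; removing or rewriting a broken reference is left to a separate, reviewed step.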

7. Verify Against Source Code

Verify documentation accuracy against actual source code:

```bash
python .claude/skills/deep-dive-analysis/scripts/doc_review.py verify \
  --doc docs/agents/lifecycle.md \
  --source src/agents/lifecycle.py
```

**Verification includes:**

- Documented states vs actual enum values
- Documented methods vs actual class methods
- Documented constants vs actual values
- Flags discrepancies as DRIFT
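
To make the "documented states vs actual enum values" check concrete, here is a minimal sketch built on the standard `ast` module. The helper names `enum_members` and `find_drift`, and the shape of the returned dictionary, are assumptions for illustration; the real `verify` subcommand may work differently.

```python
import ast


def enum_members(source, class_name):
    """Collect the names assigned directly in the body of the named class."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            names = set()
            for stmt in node.body:
                if isinstance(stmt, ast.Assign):
                    for target in stmt.targets:
                        if isinstance(target, ast.Name):
                            names.add(target.id)
            return names
    raise LookupError(f"class {class_name!r} not found in source")


def find_drift(documented, source, class_name):
    """Compare documented member names with the members defined in code."""
    actual = enum_members(source, class_name)
    documented = set(documented)
    return {
        "missing_from_code": sorted(documented - actual),
        "missing_from_docs": sorted(actual - documented),
    }
```

A non-empty list on either side is what the verification step would report as DRIFT.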

8. Update Navigation Indexes

Refresh SEARCH_INDEX.md and BY_DOMAIN.md with current file counts:

```bash
python .claude/skills/deep-dive-analysis/scripts/doc_review.py update-indexes \
  --search-index docs/00_navigation/SEARCH_INDEX.md \
  --by-domain docs/00_navigation/BY_DOMAIN.md
```

**Updates:**

- Total file counts
- Files per directory statistics
- Version and last_updated timestamps
- Removes references to deleted files

9. Full Documentation Maintenance

Run complete Phase 8 workflow:

```bash
python .claude/skills/deep-dive-analysis/scripts/doc_review.py full-maintenance \
  --path docs/ \
  --auto-fix \
  --output doc_health_report.json
```

**Executes in order:**

1. Scan documentation health
2. Validate and fix broken links
3. Identify obsolete files (no inbound links, references deleted code)
4. Update navigation indexes
5. Generate final health report

Comment Quality Commands (Antirez Standards)

These commands analyze and rewrite code comments following the antirez commenting standards.

10. Analyze Comment Quality

Analyze comments in a single file:

```bash
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py analyze \
  src/main.py \
  --report
```

**Options:**

- `--report` / `-r`: Generate detailed markdown report
- `--json`: Output as JSON for programmatic use
- `--issues-only` / `-i`: Show only problematic comments

**Output includes:**

- Comment classification (function, design, why, teacher, checklist, guide)
- Issue detection (trivial, debt, backup comments)
- Suggested rewrites for problematic comments
- Statistics and ratios

11. Scan Directory for Comment Issues

Analyze all Python files in a directory:

```bash
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py scan \
  src/ \
  --recursive \
  --issues-only
```

**Options:**

- `--recursive` / `-r`: Include subdirectories
- `--issues-only` / `-i`: Show only files with issues
- `--json`: Output as JSON

12. Generate Comment Health Report

Create comprehensive markdown report for entire codebase:

```bash
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py report \
  src/ \
  --output comment_health.md
```

**Report includes:**

- Executive summary with totals
- Comment quality breakdown (keep/enhance/rewrite/delete)
- Comment type distribution
- Files needing attention (ranked by issue count)
- Sample issues with file:line references
- Actionable recommendations

13. Rewrite Comments

Apply comment improvements to a file:

```bash
# Dry run (preview changes)
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py rewrite \
  src/main.py

# Apply changes with backup
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py rewrite \
  src/main.py \
  --apply \
  --backup
```

**Options:**
- `--apply` / `-a`: Actually modify the file (default: dry run)
- `--backup` / `-b`: Create .bak backup before modifying
- `--output` / `-o`: Write to different file instead of in-place

**Actions taken:**
- DELETE: Remove trivial comments and backup comments (commented-out code)
- REWRITE: Add suggested improvements for debt comments (TODO/FIXME)

14. View Standards Reference

Display the antirez commenting standards:

```bash
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py standards
```

Shows the complete taxonomy of good vs bad comments with examples.

Comment Type Classification

| Type | Category | Description | Action |
| --- | --- | --- | --- |
| function | GOOD | API docs at function/class top | Keep/Enhance |
| design | GOOD | File-level algorithm explanations | Keep |
| why | GOOD | Explains reasoning behind code | Keep |
| teacher | GOOD | Educates about domain concepts | Keep |
| checklist | GOOD | Reminds of coordinated changes | Keep |
| guide | GOOD | Section dividers, structure | Keep sparingly |
| trivial | BAD | Restates what code says | Delete |
| debt | BAD | TODO/FIXME without plan | Rewrite/Resolve |
| backup | BAD | Commented-out code | Delete |
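
Two of the BAD types lend themselves to mechanical detection. The sketch below is a rough heuristic and not the logic of `rewrite_comments.py`: the names `looks_like_code` and `classify_comments` are hypothetical, `debt` is detected by a plain TODO/FIXME match, `backup` by checking whether the comment text parses as a Python statement, and everything else (including `trivial`, which needs semantic judgment) falls into `other`.

```python
import ast
import io
import re
import tokenize

DEBT_RE = re.compile(r"\b(TODO|FIXME)\b")


def looks_like_code(text):
    """Guess whether comment text is commented-out Python."""
    try:
        tree = ast.parse(text)
    except (SyntaxError, ValueError):
        return False
    # A bare name or literal ("done", "42") parses, but is more likely prose.
    return any(
        not (isinstance(stmt, ast.Expr)
             and isinstance(stmt.value, (ast.Name, ast.Constant)))
        for stmt in tree.body
    )


def classify_comments(source):
    """Return (line_number, kind) for each comment; kind is debt, backup, or other."""
    results = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        text = tok.string.lstrip("#").strip()
        if DEBT_RE.search(text):
            kind = "debt"
        elif looks_like_code(text):
            kind = "backup"
        else:
            kind = "other"
        results.append((tok.start[0], kind))
    return results
```

Short comments that happen to be valid Python (for example `fix: this`) will be misclassified as `backup`, so the result is a starting point for review rather than a verdict.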

Comment Quality Workflow

1. SCAN
   ├── Run: rewrite_comments.py scan <dir> --recursive
   ├── Review files with most issues
   └── Generate: rewrite_comments.py report <dir> --output report.md

2. TRIAGE
   ├── Identify high-priority files (critical modules)
   ├── Focus on DEBT comments (convert to issues or design docs)
   └── Plan bulk TRIVIAL/BACKUP deletions

3. REWRITE
   ├── Run: rewrite_comments.py rewrite <file> --apply --backup
   ├── Review changes in diff
   └── Verify no functional changes

4. VERIFY
   ├── Run tests to confirm no breakage
   ├── Re-scan to confirm improvements
   └── Update comment_health.md report

File Classification Criteria

| Classification | Criteria | Verification |
| --- | --- | --- |
| Critical | Handles authentication, security, encryption, sensitive data | Mandatory |
| High-Complexity | >300 LOC, >5 dependencies, state machines, async patterns | Mandatory |
| Standard | Normal business logic, data models, utilities | Recommended |
| Utility | Pure functions, helpers, constants | Optional |
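
The criteria in the table could be turned into a first-pass classifier along these lines. This is a sketch under several assumptions that the table does not state: the keyword list is invented, the High-Complexity criteria are combined with "or", state machines are not detected at all, and the Standard/Utility split (any class definition means Standard) is a simplification. The function name `classify` is hypothetical and is not the skill's `classifier.py`.

```python
import ast

# Assumed keywords hinting at authentication, security, or sensitive data.
CRITICAL_KEYWORDS = ("auth", "security", "encrypt", "password", "secret", "token")


def classify(source):
    """Return critical, high-complexity, standard, or utility for Python source text."""
    tree = ast.parse(source)
    # Count non-blank, non-comment lines as lines of code.
    loc = sum(
        1 for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )
    deps, has_async, has_class = set(), False, False
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            deps.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            deps.add(node.module.split(".")[0])
        elif isinstance(node, (ast.AsyncFunctionDef, ast.AsyncFor,
                               ast.AsyncWith, ast.Await)):
            has_async = True
        elif isinstance(node, ast.ClassDef):
            has_class = True
    lowered = source.lower()
    if any(keyword in lowered for keyword in CRITICAL_KEYWORDS):
        return "critical"
    if loc > 300 or len(deps) > 5 or has_async:
        return "high-complexity"
    return "standard" if has_class else "utility"
```

Keyword matching on raw text over-reports `critical` (a variable named `author` would trigger it), which errs on the side of more verification rather than less.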

AI-Powered Semantic Analysis

This skill leverages Claude's code comprehension capabilities for deep semantic analysis beyond mechanical structure extraction.

The Semantic Analysis Mandate

╔══════════════════════════════════════════════════════════════════════════════╗
║                    STRUCTURE vs MEANING                                      ║
╠══════════════════════════════════════════════════════════════════════════════╣
║                                                                              ║
║  Scripts extract STRUCTURE:  "class Foo with method bar()"                   ║
║  Claude extracts MEANING:    "Foo implements Repository pattern for         ║
║                               caching user sessions with TTL expiration"     ║
║                                                                              ║
║  NEVER stop at structure. ALWAYS pursue understanding.                      ║
║                                                                              ║
╚══════════════════════════════════════════════════════════════════════════════╝

Five Layers of Understanding

| Layer | What | Who Does It |
| --- | --- | --- |
| 1. WHAT | Classes, functions, imports | Scripts (AST) |
| 2. HOW | Algorithm details, data flow | Claude's first pass |
| 3. WHY | Business purpose, design decisions | Claude's deep analysis |
| 4. WHEN | Triggers, lifecycle, concurrency | Claude's behavioral analysis |
| 5. CONSEQUENCES | Side effects, failure modes | Claude's systems thinking |

Semantic Analysis Questions

For every code unit, Claude must answer:

**Identity:**

- What is this code's single responsibility?
- What abstraction does it represent?
- What would break if this didn't exist?

**Behavior:**

- What are ALL inputs and outputs (including side effects)?
- What state does it read? What does it mutate?
- What are preconditions and postconditions?

**Integration:**

- Who calls this? Under what circumstances?
- What does this call? Why those dependencies?
- What contracts does it fulfill?

**Quality:**

- What could go wrong? How is failure handled?
- Are there implicit assumptions that could break?
- Are there race conditions or timing dependencies?

Pattern Recognition

Claude should actively recognize and document common patterns:

| Pattern Type | Examples | Documentation Focus |
| --- | --- | --- |
| Architectural | Repository, Service, CQRS, Event-Driven | Responsibilities, boundaries |
| Behavioral | State Machine, Strategy, Observer, Chain | Transitions, variations |
| Resilience | Circuit Breaker, Retry, Bulkhead, Timeout | Thresholds, fallbacks |
| Data | DTO, Value Object, Aggregate | Invariants, relationships |
| Concurrency | Producer-Consumer, Worker Pool | Thread safety, backpressure |

See `references/SEMANTIC_PATTERNS.md` for detailed recognition guides.

Red Flags to Identify

Claude should actively flag these issues:

ARCHITECTURE:
⚠ GOD CLASS: >10 public methods or >500 LOC
⚠ CIRCULAR DEPENDENCY: A → B → C → A
⚠ LEAKY ABSTRACTION: Implementation details in interface

RELIABILITY:
⚠ SWALLOWED EXCEPTION: Empty catch blocks
⚠ MISSING TIMEOUT: Network calls without timeout
⚠ RACE CONDITION: Shared mutable state without sync

SECURITY:
⚠ HARDCODED SECRET: Passwords, API keys in code
⚠ SQL INJECTION: String concatenation in queries
⚠ MISSING VALIDATION: Unsanitized user input
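
Two of these flags, SWALLOWED EXCEPTION and GOD CLASS, have thresholds concrete enough for a mechanical first pass. The sketch below uses the standard `ast` module; the function names are hypothetical, "empty catch block" is interpreted as an `except` body containing only `pass` or `...`, and class size is measured from the `class` line to its last line, all of which are assumptions of this example.

```python
import ast


def find_swallowed_exceptions(source):
    """Return line numbers of except clauses whose body is only pass or '...'."""
    def is_noop(stmt):
        if isinstance(stmt, ast.Pass):
            return True
        return (isinstance(stmt, ast.Expr)
                and isinstance(stmt.value, ast.Constant)
                and stmt.value.value is Ellipsis)

    return sorted(
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.ExceptHandler)
        and all(is_noop(stmt) for stmt in node.body)
    )


def find_god_classes(source, max_public_methods=10, max_loc=500):
    """Return names of classes exceeding the GOD CLASS thresholds listed above."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        public = [
            stmt for stmt in node.body
            if isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef))
            and not stmt.name.startswith("_")
        ]
        loc = node.end_lineno - node.lineno + 1
        if len(public) > max_public_methods or loc > max_loc:
            flagged.append(node.name)
    return flagged
```

The remaining flags (leaky abstractions, race conditions, missing validation) depend on intent and context, which is where the semantic pass is expected to do the work.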

Semantic Analysis Template

Use `templates/semantic_analysis.md` for comprehensive per-file analysis that includes:

- Executive summary (purpose, responsibility, patterns)
- Behavioral analysis (triggers, processing, side effects)
- Dependency analysis (why each dependency exists)
- Quality assessment (strengths, concerns, red flags)
- Contract documentation (full interface semantics)
- Flow tracing (primary and error paths)
- Testing implications (what must be tested)

AI Analysis Workflow

1. SCRIPTS RUN FIRST
   ├── classifier.py → File classification
   ├── ast_parser.py → Structure extraction
   └── usage_finder.py → Cross-references

2. CLAUDE ANALYZES
   ├── Read actual source code
   ├── Apply semantic questions
   ├── Recognize patterns
   ├── Identify red flags
   └── Trace flows

3. CLAUDE DOCUMENTS
   ├── Use semantic_analysis.md template
   ├── Explain WHY, not just WHAT
   ├── Document contracts and invariants
   └── Flag concerns with severity

4. VERIFY
   ├── Check against runtime behavior
   ├── Validate with code traces
   └── Mark verification status

Reference Documents

- `references/AI_ANALYSIS_METHODOLOGY.md`: Complete analysis methodology
- `references/SEMANTIC_PATTERNS.md`: Pattern recognition guide
- `templates/semantic_analysis.md`: Per-file analysis template

Analysis Loop Workflow

When analyzing a file, follow this sequence:

1. CLASSIFY
   ├── Count lines of code
   ├── Count dependencies
   ├── Check for critical patterns (auth, security, encryption)
   └── Assign classification

2. READ & MAP
   ├── Parse AST to extract structure
   ├── Identify classes and their methods
   ├── Identify standalone functions
   ├── Find global variables and constants
   └── Detect state mutations

3. DEPENDENCY CHECK
   ├── Internal imports (from project modules)
   ├── External imports (third-party)
   └── External calls (database, network, filesystem, messaging, ipc)

4. CONTEXT ANALYSIS
   ├── Where are exported symbols used?
   ├── What modules import this file?
   └── What message types flow through here?

5. RUNTIME VERIFICATION (if Critical/High-Complexity)
   ├── Use log analysis to trace actual behavior
   ├── Verify documented flow matches actual flow
   └── Note any discrepancies

6. DOCUMENTATION
   ├── Update analysis_progress.json
   ├── Generate module report section
   └── Cross-reference with CONTEXT.md

分析文件时，请遵循以下步骤：

1. 分类
   ├── 统计代码行数
   ├── 统计依赖数量
   ├── 检查关键模式（认证、安全、加密）
   └── 分配分类

2. 阅读与映射
   ├── 解析AST提取结构
   ├── 识别类及其方法
   ├── 识别独立函数
   ├── 查找全局变量和常量
   └── 检测状态变更

3. 依赖检查
   ├── 内部导入（项目模块内）
   ├── 外部导入（第三方）
   └── 外部调用（数据库、网络、文件系统、消息传递、进程间通信）

4. 上下文分析
   ├── 导出符号的引用位置
   ├── 哪些模块导入了该文件
   └── 哪些消息类型流经此处

5. 运行时验证（关键/高复杂度文件）
   ├── 使用日志分析追踪实际行为
   ├── 验证文档记录的流程与实际流程一致
   └── 记录差异

6. 生成文档
   ├── 更新analysis_progress.json
   ├── 生成模块报告章节
   └── 与CONTEXT.md交叉引用
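
Steps 1 to 3 of the loop above (classify, read and map, dependency check) can be sketched with Python's standard `ast` module. This is an illustrative sketch only, not the implementation of `classifier.py` or `ast_parser.py`: the keyword list, the size thresholds, and the `"standard"` label are assumptions, and the sketch handles Python source only.

```python
import ast

# Hypothetical keyword list and thresholds; the real classifier may differ.
CRITICAL_KEYWORDS = ("auth", "security", "encrypt")


def map_structure(source: str) -> dict:
    """Extract classes, standalone functions, and imports from Python source."""
    tree = ast.parse(source)
    classes, functions, imports = [], [], []
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.ClassDef):
            methods = [n.name for n in node.body
                       if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
            classes.append({"name": node.name, "methods": methods})
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append(node.name)
        elif isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"
            imports.append(node.module or ".")
    return {"classes": classes, "functions": functions, "imports": imports}


def classify(source: str, structure: dict) -> str:
    """Assign a coarse classification from keywords, size, and dependency count."""
    loc = sum(1 for line in source.splitlines() if line.strip())
    lowered = source.lower()
    if any(word in lowered for word in CRITICAL_KEYWORDS):
        return "critical"
    if loc > 300 or len(structure["imports"]) > 10:
        return "high-complexity"
    return "standard"
```

A plain substring match on keywords will over-report (for example, `author` contains `auth`), so a result of `"critical"` is best treated as a prompt for closer reading rather than a verdict.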

Runtime Verification Integration

运行时验证集成

For runtime verification of critical/high-complexity files, use your project's log aggregation system:

Trace actual behavior through components using correlation IDs
Verify that documented flows match actual runtime behavior
Use distributed tracing or structured logs to follow request paths

The goal is to confirm that code paths match documented behavior through runtime evidence.

对于关键/高复杂度文件的运行时验证，请使用项目的日志聚合系统：

使用关联ID追踪组件的实际行为
验证文档记录的流程与实际运行时行为一致
使用分布式追踪或结构化日志跟踪请求路径

目标是通过运行时证据确认代码路径与文档记录的行为一致。
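
As one way to picture the comparison described above, the sketch below groups structured (JSON-per-line) log records by correlation ID and reports any request whose observed component path differs from the documented one. The field names `correlation_id` and `component` are assumptions; substitute whatever your log aggregation system emits.

```python
import json
from collections import defaultdict


def observed_flows(log_lines):
    """Group structured log lines by correlation ID, preserving order."""
    flows = defaultdict(list)
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(record, dict):
            continue
        cid = record.get("correlation_id")    # assumed field name
        component = record.get("component")   # assumed field name
        if cid and component:
            # Collapse consecutive entries from the same component.
            if not flows[cid] or flows[cid][-1] != component:
                flows[cid].append(component)
    return dict(flows)


def find_discrepancies(flows, documented_flow):
    """Return correlation IDs whose observed path differs from the documented one."""
    return {cid: path for cid, path in flows.items() if path != documented_flow}
```

An empty result from `find_discrepancies` only shows that the sampled requests followed the documented path; error paths need log samples that actually exercise them.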

Output Interpretation

输出解读

JSON Output Structure

JSON输出结构

json

{
  "file": "src/utils/circuit_breaker.py",
  "classification": "critical",
  "metrics": {
    "lines_of_code": 245,
    "num_classes": 2,
    "num_functions": 8,
    "num_dependencies": 12
  },
  "structure": {
    "classes": [...],
    "functions": [...],
    "constants": [...]
  },
  "dependencies": {
    "internal": [...],
    "external": [...],
    "external_calls": [...]
  },
  "usages": [...],
  "verification_required": true
}

json

{
  "file": "src/utils/circuit_breaker.py",
  "classification": "critical",
  "metrics": {
    "lines_of_code": 245,
    "num_classes": 2,
    "num_functions": 8,
    "num_dependencies": 12
  },
  "structure": {
    "classes": [...],
    "functions": [...],
    "constants": [...]
  },
  "dependencies": {
    "internal": [...],
    "external": [...],
    "external_calls": [...]
  },
  "usages": [...],
  "verification_required": true
}
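
A consumer of this output might first check that the top-level keys shown above are present and then build a short summary. The helper below is a hypothetical sketch, not part of the skill's scripts; it relies only on the key names that appear in the example output.

```python
# Top-level keys taken from the example output above.
REQUIRED_KEYS = {"file", "classification", "metrics", "structure",
                 "dependencies", "usages", "verification_required"}


def summarize_report(report: dict) -> str:
    """Return a one-line summary of a report, or raise if top-level keys are missing."""
    missing = REQUIRED_KEYS - report.keys()
    if missing:
        raise ValueError(f"report is missing keys: {sorted(missing)}")
    metrics = report["metrics"]
    if report["verification_required"]:
        flag = "needs runtime verification"
    else:
        flag = "no runtime verification"
    return (f'{report["file"]} [{report["classification"]}]: '
            f'{metrics["lines_of_code"]} LOC, '
            f'{metrics["num_dependencies"]} dependencies, {flag}')
```

In practice the `report` dictionary would come from `json.load` on the tool's actual output file.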

Markdown Output Format

Markdown输出格式

The Markdown output follows the template in `templates/analysis_report.md` and produces sections suitable for inclusion in phase deliverable documents.

Markdown输出遵循`templates/analysis_report.md`中的模板，生成适合纳入阶段交付文档的内容。

Best Practices

最佳实践

Source Code Analysis (Phases 1-7)

源代码分析（第1-7阶段）

Start with Phase 1: Foundation modules inform understanding of everything else
Track Progress: Always use `--update-progress` when completing analysis
Verify Critical Files: Never skip runtime verification for critical/high-complexity files
Cross-Reference: After analysis, update CONTEXT.md links
Document Drift: Note any discrepancies between existing docs and actual code

从第1阶段开始：基础模块的理解是后续所有分析的前提
追踪进度：完成分析时始终使用`--update-progress`参数
验证关键文件：绝不要跳过关键/高复杂度文件的运行时验证
交叉引用：分析完成后更新CONTEXT.md中的链接
记录偏差：记录现有文档与实际代码之间的任何差异

Documentation Maintenance (Phase 8)

文档维护（第8阶段）

Run scan first: Always start with `doc_review.py scan` to understand the current state
Fix links before content: Broken links indicate structural issues to address first
Verify against code: Never update documentation without verifying against actual source
Update indexes last: Navigation indexes should reflect final state after all changes
Generate health report: Always produce `doc_health_report.json` as evidence of completion

先运行扫描：始终从`doc_review.py scan`开始，了解当前状态
先修复链接再处理内容：失效链接表明存在需要优先解决的结构问题
对照代码验证：切勿在未对照实际源代码的情况下更新文档
最后更新索引：导航索引应反映所有变更后的最终状态
生成健康报告：始终生成`doc_health_report.json`作为完成的证据

Documentation Maintenance Workflow

文档维护工作流

When invoking Phase 8 documentation maintenance, follow this sequence:

1. PLANNING
   ├── Run: doc_review.py scan --path docs/
   ├── Review health report
   ├── Identify priority fixes (broken links, obsolete files)
   └── Create todo list with specific actions

2. EXECUTION (in batches)
   ├── Batch 1: Fix broken links
   │   └── Run: doc_review.py validate-links --fix
   ├── Batch 2: Verify critical docs against source
   │   └── Run: doc_review.py verify --doc <file> --source <code>
   ├── Batch 3: Delete obsolete files
   │   └── Manual review + deletion
   ├── Batch 4: Update navigation indexes
   │   └── Run: doc_review.py update-indexes
   └── Batch 5: Update timestamps
       └── Set last_updated on verified files

3. VERIFICATION
   ├── Run: doc_review.py scan (confirm improvements)
   ├── Run: doc_review.py validate-links (confirm zero broken)
   └── Generate final doc_health_report.json

执行第8阶段文档维护时，请遵循以下步骤：

1. 规划
   ├── 运行: doc_review.py scan --path docs/
   ├── 查看健康报告
   ├── 确定优先修复项（失效链接、过时文件）
   └── 创建包含具体操作的待办列表

2. 执行（分批）
   ├── 批次1：修复失效链接
   │   └── 运行: doc_review.py validate-links --fix
   ├── 批次2：验证关键文档与源代码的一致性
   │   └── 运行: doc_review.py verify --doc <file> --source <code>
   ├── 批次3：删除过时文件
   │   └── 人工审查 + 删除
   ├── 批次4：更新导航索引
   │   └── 运行: doc_review.py update-indexes
   └── 批次5：更新时间戳
       └── 为已验证文件设置last_updated

3. 验证
   ├── 运行: doc_review.py scan（确认改进）
   ├── 运行: doc_review.py validate-links（确认无失效链接）
   └── 生成最终的doc_health_report.json
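
To make the link-fixing batch more concrete, here is a rough sketch of what a relative-link check over a docs tree can look like. It is not the implementation behind `doc_review.py validate-links`, whose behavior is not shown here; the function name `find_broken_links` and the simplified link pattern are assumptions for illustration.

```python
import re
from pathlib import Path

# Matches inline Markdown links of the form [text](target); a simplification.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_links(docs_root):
    """Return (file, target) pairs for relative links whose target is missing."""
    broken = []
    for md_file in sorted(Path(docs_root).rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            if target.startswith(("http://", "https://", "mailto:", "#")):
                continue  # external links and in-page anchors are out of scope
            path_part = target.split("#", 1)[0]  # drop any fragment
            if not (md_file.parent / path_part).exists():
                broken.append((str(md_file), target))
    return broken
```

Running a check like this before and after Batch 1 gives a simple way to confirm that the broken-link count has dropped to zero.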

Resources

资源

Scripts: `scripts/` - Python analysis tools
- `analyze_file.py` - Source code analysis (Phases 1-7)
- `check_progress.py` - Progress tracking
- `doc_review.py` - Documentation maintenance (Phase 8)
- `comment_rewriter.py` - Comment analysis engine (antirez standards)
- `rewrite_comments.py` - Comment quality CLI tool
Templates: `templates/` - Output templates
- `analysis_report.md` - Module-level report template
- `semantic_analysis.md` - AI-powered per-file analysis template
References: `references/` - Analysis methodology docs
- `DEEP_DIVE_PLAN.md` - Master analysis plan with all phase definitions
- `ANTIREZ_COMMENTING_STANDARDS.md` - Complete antirez comment taxonomy
- `AI_ANALYSIS_METHODOLOGY.md` - AI semantic analysis methodology
- `SEMANTIC_PATTERNS.md` - Pattern recognition guide for Claude
analysis_progress.json: Progress tracking state
doc_health_report.json: Documentation health metrics (generated)
comment_health.md: Comment quality report (generated)

脚本：`scripts/` - Python分析工具
- `analyze_file.py` - 源代码分析（第1-7阶段）
- `check_progress.py` - 进度追踪
- `doc_review.py` - 文档维护（第8阶段）
- `comment_rewriter.py` - 注释分析引擎（遵循antirez标准）
- `rewrite_comments.py` - 注释质量CLI工具
模板：`templates/` - 输出模板
- `analysis_report.md` - 模块级报告模板
- `semantic_analysis.md` - 基于AI的单文件分析模板
参考文档：`references/` - 分析方法论文档
- `DEEP_DIVE_PLAN.md` - 包含所有阶段定义的主分析计划
- `ANTIREZ_COMMENTING_STANDARDS.md` - 完整的antirez注释分类体系
- `AI_ANALYSIS_METHODOLOGY.md` - AI语义分析方法论
- `SEMANTIC_PATTERNS.md` - Claude的模式识别指南
analysis_progress.json：进度追踪状态文件
doc_health_report.json：文档健康指标（生成的文件）
comment_health.md：注释质量报告（生成的文件）