Code Deep Understanding Analyzer v2.3
A professional code analysis tool based on cognitive science research, supporting three analysis depths to ensure true code understanding rather than generating fluency illusions.
Three Analysis Modes
| User Intent | Recommended Mode | Trigger Word Examples | Analysis Duration |
|---|---|---|---|
| Quick browsing/code review | Quick Mode | "Take a quick look", "What does this code do", "Scan briefly" | 5-10 minutes |
| Learning comprehension/technical research | Standard Mode ⭐ | "Analyze this", "Help me understand", "Explain this", "What's the principle" | 15-20 minutes |
| In-depth mastery/large-scale projects | Deep Mode 🚀 | "Thorough analysis", "Complete mastery", "In-depth research", "Interview preparation", "Overall project analysis" | 30+ minutes |
Standard Mode is used by default, and the system will automatically select the most appropriate mode based on code scale and user intent.
🚀 Deep Mode Internal Intelligent Strategy:
- Code ≤ 2000 lines: Uses progressive generation (sequential chapter filling)
- Code > 2000 lines: Automatically enables parallel processing (sub-Agents analyze chapters in parallel)
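As a minimal sketch, the line-count dispatch described above could look like this (the function name is illustrative; only the 2000-line threshold comes from this document):

```python
def select_deep_mode_strategy(code: str) -> str:
    """Pick the Deep Mode strategy from the line-count threshold above."""
    total_lines = len(code.splitlines())
    if total_lines <= 2000:
        return "progressive"  # sequential chapter filling
    return "parallel"         # sub-Agents analyze chapters in parallel
```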
Core Philosophy: Understanding First, Memory Second
Combat Fluency Illusion
"Able to read code ≠ Able to write code"
"Able to understand explanations ≠ Able to implement independently"
"Feel like understanding ≠ Truly understand"
Core Principles:
- Understand the WHY, not just the WHAT
- Enforce self-explanation to verify true understanding
- Establish conceptual connections, not isolated memory
- Test transfer ability through application variants
Research Support:
- Dunlosky et al. - Elaborative interrogation is significantly more effective than passive reading
- Chi et al. - Self-explainers are more likely to acquire correct mental models
- Karpicke & Roediger - Retrieval practice improves retention by roughly 250% compared with repeated reading
Mandatory Pre-Analysis Check: Understanding Verification Checkpoint
Execute corresponding verification processes based on the selected mode:
Quick Mode - Simplified Verification
- Quickly identify code type and core functions
- List key concepts (no in-depth verification required)
Standard Mode - Standard Verification
- Conduct self-explanation tests on core concepts
- Verify ability to explain the "WHY"
Deep Mode - Complete Verification
- Full self-explanation test
- Application transfer ability verification
Output Format (at the beginning of the analysis document):
Understanding Verification Status [Standard/Deep Mode Only]
| Core Concept | Self-Explanation | Understand "WHY" | Application Transfer | Status |
|---|---|---|---|---|
| User Authentication Flow | ✅ | ✅ | ✅ | Understood |
| JWT Token Mechanism | ✅ | ⚠️ | ❌ | ⚠️ Needs in-depth understanding |
| Password Hashing | ✅ | ✅ | ⚠️ | Basic understanding |
Output Structures for Three Modes
Quick Mode Output Structure (5-10 minutes)
[Code Name] Quick Analysis
1. Quick Overview
- Programming language and version
- Code scale and type
- Core dependencies
2. Function Description
- What is the main function (WHAT)
- Brief explanation of WHY it's needed
3. Core Algorithm/Design
- Algorithm complexity (if applicable)
- Design patterns used (if applicable)
- WHY this algorithm/pattern was chosen
4. Key Code Snippets
- 3-5 core code snippets
- Brief explanation of each snippet's role
5. Dependency Relationships
- List of external libraries and their uses
6. Quick Usage Example
Standard Mode Output Structure (15-20 minutes) ⭐Recommended
[Code Name] Deep Understanding Analysis
Understanding Verification Status
[Self-explanation test result table]
1. Quick Overview
- Programming language, scale, dependencies
2. Background and Motivation (Elaborative Interrogation)
- WHY this code is needed
- WHY this solution was chosen
- WHY other solutions were not chosen
3. Core Concept Explanation
- List key concepts
- Answer 2-3 WHY questions for each concept
4. Algorithms and Theory
- Complexity analysis
- WHY this algorithm was chosen
- Reference materials
5. Design Patterns
- Identified patterns
- WHY they are used
6. In-Depth Key Code Analysis
- Line-by-line WHY analysis
- Execution flow example
7. Dependencies and Usage Examples
Deep Mode Output Structure (30+ minutes)
Deep Mode automatically selects the optimal strategy based on code scale to ensure sufficient depth for each chapter:
Strategy A: Progressive Generation (Code ≤ 2000 lines)
Suitable for small and medium-sized code; chapters are generated sequentially:
[Code Name] Complete Mastery Analysis
[Includes all content from Standard Mode, plus the following sections]
3+. Concept Network Diagram
- Core concept list (3 WHY questions each)
- Concept relationship matrix
- Connection to existing knowledge
6+. Complete Execution Example
- Multi-scenario execution flow
- Boundary condition explanation
- Error-prone point annotations
8. Test Case Analysis (if code includes tests)
- Test file list and coverage analysis
- Boundary conditions discovered from tests
- Test-driven understanding verification
9. Application Transfer Scenarios (at least 2)
- Scenario 1: Invariant principles + modified parts + WHY
- Scenario 2: Invariant principles + modified parts + WHY
- Extract general patterns
10. Dependency Relationships and Usage Examples
11. Quality Verification Checklist
- Understanding depth verification
- Technical accuracy verification
- Practicality verification
- Final "Four Abilities" test
Strategy B: Parallel Processing (Code > 2000 lines) 🚀
Suitable for large projects; uses a parallel sub-Agent architecture:
```
┌─────────────────────────────────────────────────────────────┐
│ Main Coordinator Agent                                      │
│ - Generates the analysis outline and directory framework    │
│ - Identifies the core concept list (shared with sub-Agents) │
│ - Assigns chapter tasks                                     │
│ - Aggregates sub-Agent results                              │
│ - Performs final quality verification                       │
└─────────────────────────────────────────────────────────────┘
                              │
            ┌─────────────────┼─────────────────┐
            ▼                 ▼                 ▼
 ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
 │   Sub-Agent 1   │ │   Sub-Agent 2   │ │   Sub-Agent 3   │
 │  Background &   │ │  Core Concepts  │ │  Algorithms &   │
 │   Motivation    │ │                 │ │     Theory      │
 └─────────────────┘ └─────────────────┘ └─────────────────┘
            │                 │                 │
            └─────────────────┼─────────────────┘
                              ▼
 ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
 │   Sub-Agent 4   │ │   Sub-Agent 5   │ │   Sub-Agent 6   │
 │ Design Patterns │ │  Code Analysis  │ │   Application   │
 │                 │ │                 │ │    Transfer     │
 └─────────────────┘ └─────────────────┘ └─────────────────┘
```
Parallel Execution Flow
| Phase | Executor | Operation | Output |
|---|---|---|---|
| 1. Framework Preparation | Main Agent | Quick overview of code, generates outline and core concept list | |
| 2. Task Distribution | Main Agent | Creates independent task descriptions for each chapter | Task list |
| 3. Parallel Processing | Sub-Agents | Each sub-Agent focuses on one chapter, generates in-depth content | |
| 4. Result Aggregation | Main Agent | Merges all chapters, unifies format | |
| 5. Quality Verification | Main Agent | Checks depth standards, supplements weak sections | Final document |
Chapter Task Definition (Instruction Template for Sub-Agents)
Sub-Agent Task: [Chapter Name]
- Project/Code Name: [Project/Code Name]
- Programming Language: [Language]
- Code Scale: [Line count]
- Core Concepts: [Concept list from Main Agent]
You are a specialized analysis expert responsible for the "[Chapter Name]" section. Please conduct in-depth analysis of this section and generate detailed content.
- Content Depth: This chapter must be at least [X] words
- WHY Analysis: Each key point must answer 3 WHY questions
- Code Comments: Use scenario/step + WHY style
- Citation Sources: Provide authoritative reference links
- Independence: Generate complete independent chapter content, no need to reference other chapters
Output the chapter content directly in Markdown format, beginning with the chapter heading.
Main Agent Aggregation Logic
Parallel Deep Mode Aggregation Specification
1. Read all sub-chapters
   - chapter_1_background_and_motivation.md
   - chapter_2_core_concepts.md
   - chapter_3_algorithms_and_theory.md
   - chapter_4_design_patterns.md
   - chapter_5_code_analysis.md
   - chapter_6_test_case_analysis.md (if applicable)
   - chapter_7_application_transfer.md
   - chapter_8_dependency_relationships.md
   - chapter_9_quality_verification.md
2. Merge order

   ```markdown
   # [Project/Code Name] Complete Mastery Analysis (Parallel Deep Version)

   ## Understanding Verification Status
   [Generated from the Main Agent's preliminary analysis]

   [Insert chapter content in order]
   ```
3. Cross-check
   - Core concepts are defined consistently across chapters
   - WHY explanations contain no contradictions
   - Cited code examples are consistent
4. Depth verification
   - Each chapter meets its word-count requirement
   - WHY analysis is sufficient
   - Execution examples are complete
Implementation Pseudocode
```
Function ParallelDeepMode(code, work_directory):
    // ========== Phase 1: Framework Preparation ==========
    framework = {
        "project_name": extract_name(code),
        "language": identify_language(code),
        "total_lines": count_lines(code),
        "core_concepts": extract_core_concepts(code),  // shared with all sub-Agents
        "chapters": [
            "Background and Motivation",
            "Core Concepts",
            "Algorithms and Theory",
            "Design Patterns",
            "Key Code Analysis",
            "Test Case Analysis",
            "Application Transfer Scenarios",
            "Dependency Relationships",
            "Quality Verification"
        ]
    }
    write_file(f"{work_directory}/00-framework.json", framework)

    // ========== Phase 2: Create Sub-Tasks ==========
    subtask_list = []
    for each chapter in framework["chapters"]:
        task_description = generate_task_template(chapter, framework)
        task_file = f"{work_directory}/tasks/{chapter}-task.md"
        write_file(task_file, task_description)
        subtask_list.append((chapter, task_file))

    // ========== Phase 3: Execute Sub-Agents in Parallel ==========
    // Note: actual execution uses the Task tool to create parallel sub-Agents
    chapter_file_list = []
    for each (chapter, task_file) in subtask_list:
        sub_agent = create_agent(              // create sub-Agent (runs in parallel)
            name: f"Analyst-{chapter}",
            task: read_file(task_file),
            code: code,
            output_file: f"{work_directory}/chapters/{chapter}.md"
        )
        sub_agent.start(parallel=True)         // start parallel execution
        chapter_file_list.append(sub_agent.output_file)
    wait_for_all(chapter_file_list)            // wait for all sub-Agents to complete

    // ========== Phase 4: Result Aggregation ==========
    complete_document = f"# {framework['project_name']} Complete Mastery Analysis\n\n"
    complete_document += "## Understanding Verification Status\n\n"
    complete_document += generate_verification_table(framework) + "\n\n"
    for each chapter_file in chapter_file_list:
        complete_document += read_file(chapter_file) + "\n\n"

    // ========== Phase 5: Quality Verification ==========
    if not pass_depth_check(complete_document):
        weak_chapters = identify_weak_sections(complete_document)
        for each chapter in weak_chapters:
            re_execute(chapter)  // re-run this chapter's sub-Agent, requiring deeper content
            complete_document = update_chapter(complete_document, chapter)

    // ========== Final Output ==========
    final_file = f"{work_directory}/{framework['project_name']}-complete-mastery-analysis.md"
    write_file(final_file, complete_document)
    return final_file
```
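A minimal runnable sketch of phases 3-4, using Python threads in place of real sub-Agents (`generate_chapter` is a stand-in for a sub-Agent call; the chapter list is shortened for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

CHAPTERS = [
    "Background and Motivation",
    "Core Concepts",
    "Algorithms and Theory",
]

def generate_chapter(chapter: str) -> str:
    # Stand-in for a sub-Agent: return the chapter's Markdown content.
    return f"## {chapter}\n\n[analysis of {chapter}]"

def parallel_deep_mode(project_name: str) -> str:
    # Phase 3: chapters are generated concurrently, like the sub-Agents above
    with ThreadPoolExecutor(max_workers=len(CHAPTERS)) as pool:
        sections = list(pool.map(generate_chapter, CHAPTERS))
    # Phase 4: aggregate in chapter order (map preserves input order)
    header = f"# {project_name} Complete Mastery Analysis\n\n"
    return header + "\n\n".join(sections)
```

Because `ThreadPoolExecutor.map` preserves input order, the aggregation step needs no extra bookkeeping to keep chapters in sequence.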
Depth Standards for Each Chapter:
Depth Self-Check Checklist (Check after completing each chapter)
Content Completeness
**Implementation Method (Pseudocode Flow):**
```
Function DeepModeProgressiveGeneration(code, file_path):
    // Phase 1: Generate the framework
    framework = generate_complete_outline(Standard structure + Deep extensions)
    write_file(file_path, framework)

    // Phase 2: Fill chapters one by one
    chapter_list = [
        "1. Quick Overview",
        "2. Background and Motivation",
        "3. Core Concepts",
        "4. Algorithms and Theory",
        "5. Design Patterns",
        "6. In-Depth Key Code Analysis",
        "7. Test Case Analysis (if applicable)",
        "8. Application Transfer Scenarios",
        "9. Dependency Relationships",
        "10. Quality Verification"
    ]
    for each chapter in chapter_list:
        current_content = read_file(file_path)
        // Generate chapter content (one chapter at a time to ensure depth)
        chapter_content = generate_deep_chapter(chapter, code)
        // Requirement: at least 300-500 words per chapter; code snippets fully commented
        if not pass_depth_check(chapter_content):   // depth self-check
            chapter_content = append_details(chapter_content)
        new_content = current_content.replace(chapter_placeholder, chapter_content)
        write_file(file_path, new_content)          // update the file

    // Phase 3: Overall verification
    complete_document = read_file(file_path)
    if not pass_overall_check(complete_document):
        weak_chapters = identify_weak_sections(complete_document)
        for chapter in weak_chapters:
            supplement_content(chapter)
    return file_path
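The placeholder-replacement step in phase 2 can be sketched concretely as follows (the `<!-- TODO: ... -->` placeholder format is illustrative, not specified by this document):

```python
def fill_chapter(document: str, chapter: str, content: str) -> str:
    """Replace one chapter placeholder in the Markdown skeleton."""
    placeholder = f"<!-- TODO: {chapter} -->"  # illustrative placeholder format
    return document.replace(placeholder, content)

skeleton = (
    "# Demo Analysis\n\n"
    "## 1. Quick Overview\n"
    "<!-- TODO: 1. Quick Overview -->\n"
)
doc = fill_chapter(skeleton, "1. Quick Overview", "Python, 1200 lines, no external deps.")
```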
Analysis Process (Research-Driven)
Step 1: Quick Overview
Goal: Establish an overall mental model
Must Identify:
- Programming Language and version
- File/project scale
- Core Dependencies
- Code type (algorithm, business logic, framework code, etc.)
Step 2: Elaborative Interrogation - Background and Motivation
Core Questions (Must Answer):
1. WHY: Why is this code needed?
   - What practical problem does it solve?
   - What would happen if this code didn't exist?
2. WHY: Why was this technical solution chosen?
   - What alternative solutions are there?
   - Why weren't other solutions chosen?
   - What are the trade-offs of this solution?
3. WHY: Why is it needed in this timing/scenario?
   - In what business process is it used?
   - What are the preconditions and postconditions?
Output Format:
Background and Motivation Analysis
Problem to Solve: [Describe in one sentence]
WHY It Needs to Be Solved: [Consequences of not solving it]
Selected Solution: [Current implementation method]
WHY This Solution Was Chosen:
- Advantages: [List 2-3 key advantages]
- Disadvantages: [List 1-2 known limitations]
- Trade-offs: [Explain what trade-offs were made]
Alternative Solution Comparison:
- Solution A: [Brief description] - WHY not chosen: [Reason]
- Solution B: [Brief description] - WHY not chosen: [Reason]
Application Scenarios
Applicable Scenarios: [Specific scenario description]
WHY Applicable: [Explain why this scenario is suitable]
Inapplicable Scenarios: [List boundary conditions]
WHY Inapplicable: [Explain why certain scenarios are not suitable]
Step 3: Concept Network Construction
Goal: Establish connections between concepts, not isolated memory
Must Include:
1. Core concept extraction
   - Identify all key concepts (classes, functions, algorithms, data structures)
   - Each concept must answer 3 WHY questions
2. Concept relationship mapping
   - Dependency: A depends on B - WHY?
   - Comparison: A vs B - WHY choose A?
   - Combination: A + B → C - WHY combine this way?
3. Knowledge connection
   - Connect to known concepts
   - Connect to design patterns
   - Connect to theoretical foundations
Output Format:
Concept Network Diagram
Concept 1: User Authentication
- What it is: The process of verifying user identity
- WHY needed: Protect system resources from unauthorized access
- WHY implemented this way: Use JWT for stateless authentication to reduce server pressure
- WHY not use other methods: Session-based methods require server storage, which is not conducive to horizontal scaling
Concept 2: Password Hashing
- What it is: Convert plaintext passwords into irreversible hash values
- WHY needed: Even if the database is compromised, attackers cannot obtain original passwords
- WHY use bcrypt: Built-in salt, adjustable computational cost to resist brute-force attacks
- WHY not use MD5/SHA1: Too fast to compute, vulnerable to brute-force attacks
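As a stdlib-only illustration of the salted, deliberately slow hashing idea described above (bcrypt itself is a third-party package; PBKDF2 is used here as a stand-in, and the function names are illustrative):

```python
import hashlib
import hmac
import os

def hash_password(password: str, *, iterations: int = 100_000) -> tuple[bytes, bytes]:
    salt = os.urandom(16)  # per-password salt defeats precomputed rainbow tables
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes,
                    *, iterations: int = 100_000) -> bool:
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(digest, expected)  # constant-time comparison
```

The tunable `iterations` count plays the same role as bcrypt's cost factor: it makes each guess expensive for a brute-force attacker.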
Concept Relationship Matrix
| Relationship Type | Concept A | Concept B | WHY This Association |
|---|---|---|---|
| Dependency | User Authentication | Password Hashing | Authentication requires password verification, which must be hashed first for comparison |
| Sequence | Password Hashing | Token Generation | Access Token can only be generated after password verification passes |
| Comparison | JWT | Session | JWT is stateless, suitable for distributed systems; Session is stateful, increases server pressure |
Connection to Existing Knowledge
- Connection to Design Patterns: [Detailed below]
- Connection to Algorithm Theory: [Detailed below]
- Connection to Security Principles: Least privilege principle, defense-in-depth principle
Step 4: In-Depth Algorithm and Theory Analysis
Mandatory Requirements: All algorithms and core theories must:
- Mark time/space complexity
- Explain "WHY this complexity is acceptable"
- Provide authoritative reference materials
- Explain scenarios where performance degrades
Output Format:
Algorithm and Theory Analysis
Algorithm: Quick Sort
Basic Information:
- Time Complexity: Average O(n log n), Worst O(n²)
- Space Complexity: O(log n)
Elaborative Interrogation:
WHY Choose Quick Sort?
- Excellent average performance, usually the fastest in practical applications
- In-place sorting, high space efficiency
- Cache-friendly, good access locality
WHY Is Worst-Case O(n²) Acceptable?
- Worst-case scenario has very low probability (can be avoided through randomization)
- Actual data is usually not fully sorted/reverse sorted
- Can be optimized with Median-of-Three method
WHY Not Choose Other Sorting Algorithms?
- Merge Sort: Requires O(n) additional space, not suitable for memory-constrained scenarios
- Heap Sort: Although stable O(n log n), poor cache performance, slower than Quick Sort in practice
- Insertion Sort: Excellent for small datasets, but O(n²) is not suitable for large-scale data
When Does Performance Degrade?
- Input is already sorted or reverse sorted (can be solved with randomization)
- Poor pivot selection (can be solved with Median-of-Three)
- Large number of duplicate elements (can be optimized with three-way Quick Sort)
Reference Materials:
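The mitigations discussed above (random pivot against sorted input, three-way partitioning against duplicates) can be sketched as follows. This is a clarity-first sketch, not a production sort — unlike the in-place variant the document describes, it allocates new lists:

```python
import random

def quicksort(items: list) -> list:
    # A random pivot avoids the sorted/reverse-sorted worst case; the
    # three-way split (less / equal / greater) handles duplicate-heavy input.
    if len(items) <= 1:
        return items
    pivot = random.choice(items)
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```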
Theoretical Foundation: JWT (JSON Web Token)
WHY Use JWT?
- Stateless authentication, no need for server to store Sessions
- Self-contained, Token carries all necessary information
- Cross-domain friendly, suitable for microservice architecture
WHY Is JWT Secure?
- Uses signature to verify integrity
- Cannot be forged (unless private key is leaked)
- Can set expiration time (exp)
WHY Does JWT Have Limitations?
- Cannot be invalidated proactively (unless maintaining a blacklist, which undermines stateless advantage)
- Token size is relatively large (Base64 encoding increases size by about 33%)
- Sensitive information needs encryption, signature alone does not provide confidentiality
Reference Materials:
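To make the self-contained structure concrete, here is a minimal, stdlib-only sketch of HS256 signing (real code should use a maintained library such as PyJWT; the claims and key below are illustrative):

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # JWT uses unpadded base64url encoding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(claims: dict, key: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    # header.payload, each compact-JSON-encoded then base64url-encoded
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    # The signature binds header and payload to the secret key
    sig = hmac.new(key, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"

token = sign_jwt({"sub": "42", "exp": 1700000000}, b"demo-key")
```

Note that the payload is only encoded, not encrypted — anyone can decode it, which is why sensitive data must not be placed in JWT claims.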
Step 5: Design Pattern Identification and Interrogation
Mandatory Check: Each design pattern used in the code must:
- Clearly mark the pattern name
- Explain WHY this pattern is used
- Explain what would happen if this pattern was not used
- Provide standard references
Output Format:
Design Pattern Analysis
Pattern 1: Singleton Pattern
Application Location: `DatabaseConnection` class
WHY Use Singleton?
- Database connections have high overhead, reusing a single instance saves resources
- Avoids connection pool chaos, unified connection lifecycle management
- Global unique access point, easy to control concurrency
WHY Not Use Singleton?
- Creating new connections for each operation leads to resource exhaustion
- Multiple connection instances may cause transaction inconsistencies
- Difficult to control concurrent access
Implementation Details:

```python
class DatabaseConnection:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            # WHY create in __new__: __new__ controls instance creation
            # itself, so every construction returns the same object
        return cls._instance
```
WHY Implement This Way?
- Use `__new__` instead of `__init__`: controls instance creation, not initialization
- Class variable `_instance`: stores the unique instance
- Lazy Loading: the instance is only created on first use
Potential Issues:
- ⚠️ Not thread-safe (needs locking in multi-threaded environments)
- ⚠️ Difficult unit testing (global state is hard to isolate)
- ⚠️ Violates single responsibility principle (class manages its own instance)
Better Alternative Solutions:
- Dependency Injection: More flexible, easier to test
- Module-level variables: Python modules are naturally singletons
Reference Materials:
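The thread-safety caveat above can be addressed with double-checked locking; a sketch (this is a common fix, not code from the analyzed project):

```python
import threading

class DatabaseConnection:
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        if cls._instance is None:           # fast path: no lock once created
            with cls._lock:
                if cls._instance is None:   # re-check inside the lock
                    cls._instance = super().__new__(cls)
        return cls._instance
```

The second `is None` check is essential: two threads can both pass the unlocked check, and without the re-check the second one would overwrite the instance.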
Step 6: In-Depth Line-by-Line Analysis (Key Code Snippets)
Core Principles:
- Select 3-5 most critical code snippets
- Each line of code must explain "what it does" + "WHY it's done this way"
- Provide execution flow examples with specific data
- Annotate error-prone points and boundary conditions
Output Format:
In-Depth Key Code Analysis
Code Snippet 1: User Authentication Function
Overall Role: Verify username and password, return JWT Token or None
WHY This Function Is Needed: Authentication is the first line of defense for system security, must be reliable and efficient
Original Code:

```python
def authenticate_user(username, password):
    user = db.find_user(username)
    if not user:
        return None
    if verify_password(password, user.password_hash):
        return generate_token(user.id)
    return None
```
In-Depth Line-by-Line Analysis (Recommended Comment Style): Scenario-Based + Execution Flow Tracking

Comment Style Explanation:
- `# Scenario N: [description]` / `// Scenario N: [description]` - mark the different execution paths of conditional branches (if/else, switch, match, etc.)
- `# Step N: [description]` / `// Step N: [description]` - mark serial execution flows (initialization order, function call sequences, etc.)
- Comment symbols match the language: `#` for Python, `//` for C++/Java
- Track the execution flow with concrete variable values (e.g. `# Current state: user.id = 42`)
- Note the iteration status of loops/recursion
- Mark the change trajectories of key data
```python
def authenticate_user(username, password):
    # Step 1: Query user
    user = db.find_user(username)
    # WHY query user first: avoid hashing passwords for non-existent
    # usernames (saves computation)

    # Scenario 1: If the user does not exist, immediately return None
    if not user:
        return None
        # WHY return None instead of raising: authentication failure is a
        # normal business outcome, not an exceptional condition
        # WHY not distinguish "user does not exist" from "wrong password":
        # prevents username enumeration attacks

    # Scenario 2: If password verification passes, generate and return a Token
    if verify_password(password, user.password_hash):
        # verify_password internal flow:
        # 1. Extract the salt from password_hash
        # 2. Hash the plaintext password with the same salt
        # 3. Compare the two hashes in constant time (prevents timing attacks)
        return generate_token(user.id)
        # Current state: user.id = 42 (example)
        # generate_token(42) → "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."

    # Scenario 3: Wrong password, return None
    return None
    # WHY the same return value as "user does not exist": prevents attackers
    # from distinguishing the two failure cases
```
Complete Execution Flow Example (Multi-Scenario Tracking):
```cpp
// Example: Trace the producer of a tensor value (typical compiler code style)
Value getProducerOfTensor(Value tensor) {
  while (true) {
    // Scenario 1: If tensor is defined by a LinalgOp, return it directly
    if (auto linalgOp = tensor.getDefiningOp<LinalgOp>()) {
      // while loop runs only once
      return cast<OpResult>(tensor);
    }
    // According to this section's example, first call to this function: tensor = %2_tile
    // Scenario 2: If tensor is linked via ExtractSliceOp, continue tracing source
    if (auto sliceOp = tensor.getDefiningOp<tensor::ExtractSliceOp>()) {
      tensor = sliceOp.getSource();
      // Current state: tensor = %2, defined by linalg.matmul
      // Second while iteration will enter the Scenario 1 branch (linalg.matmul is a LinalgOp)
      continue;
    }
    // Scenario 3: Via scf.for iteration parameter
    // Example IR:
    //   %1 = linalg.generic ins(%A) outs(%init) { ... }
    //   %2 = scf.for %i = 0 to 10 iter_args(%arg = %1) {
    //     %3 = linalg.generic ins(%arg) outs(%init2) { ... }
    //     scf.yield %3
    //   }
    // getProducerOfTensor(%arg)
    if (auto blockArg = dyn_cast<BlockArgument>(tensor)) {
      // First while iteration: tensor = %arg, which is a BlockArgument
      // A BlockArgument has no defining op; inspect the parent op of its block
      if (auto forOp = dyn_cast<scf::ForOp>(blockArg.getOwner()->getParentOp())) {
        // %arg is defined by scf.for; fetch the loop's initial value: %1
        // blockArg.getArgNumber() = 1 (block argument 0 is the induction variable %i)
        // forOp.getInitArgs()[1 - 1] = %1
        tensor = forOp.getInitArgs()[blockArg.getArgNumber() - 1];
        // Current state: tensor = %1, defined by linalg.generic
        // Second while iteration will enter the Scenario 1 branch
        continue;
      }
    }
    return Value(); // Not found (may be a function parameter)
  }
}
```
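The tracing loop above can be mirrored in a small Python model of def-use chains. This is a toy sketch to make the three scenarios concrete; `Op` and `Value` here are illustrative stand-ins, not MLIR's real classes.

```python
# Toy def-use model mimicking the producer-tracing loop above.
class Op:
    def __init__(self, name, operands=()):
        self.name = name
        self.operands = list(operands)

class Value:
    def __init__(self, defining_op=None, init_value=None):
        self.defining_op = defining_op  # None models a block argument
        self.init_value = init_value    # loop-carried arg: its iter_args init

def get_producer(value):
    """Walk slice/loop indirections until a linalg producer is found."""
    while True:
        op = value.defining_op
        # Scenario 1: defined directly by a linalg op -> done
        if op is not None and op.name.startswith("linalg."):
            return op
        # Scenario 2: defined by an extract_slice -> follow its source
        if op is not None and op.name == "tensor.extract_slice":
            value = op.operands[0]
            continue
        # Scenario 3: a loop-carried block argument -> follow the init value
        if op is None and value.init_value is not None:
            value = value.init_value
            continue
        return None  # e.g. a plain function argument

# %1 = linalg.generic ...; %arg = iter_args(%arg = %1)
v1 = Value(defining_op=Op("linalg.generic"))
arg = Value(init_value=v1)
print(get_producer(arg).name)  # -> linalg.generic
```

The same loop shape (check terminating case, otherwise rewrite `value` and `continue`) is what the C++ version expresses with `getDefiningOp` and `dyn_cast`.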
Recommended Execution Flow Example Style:
Scenario 1: Authentication Success
输入:username="alice", password="Secret123!"
Input: username="alice", password="Secret123!"
步骤 1: db.find_user("alice")
→ 查询数据库
→ 返回 User(id=42, username="alice", password_hash="$2b$12$KIX...")
此时:user 存在,跳过场景 1 的 return None
步骤 2: 进入场景 2 分支(密码验证)
→ verify_password("Secret123!", "$2b$12$KIX...")
→ 提取盐值:$2b$12$KIX...
→ 哈希 "Secret123!" with salt
→ 恒定时间比较哈希值
→ 返回 True
步骤 3: generate_token(42)
→ 创建 payload: {"user_id": 42, "exp": 1643723400}
→ 使用私钥签名
→ 返回 "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyX2lkIjo0Miwi..."
最终返回:Token 字符串
耗时:~100ms(主要是 bcrypt 计算)
Step 1: db.find_user("alice")
→ Query database
→ Return User(id=42, username="alice", password_hash="$2b$12$KIX...")
Current state: user exists, skip return None in Scenario 1
Step 2: Enter Scenario 2 branch (password verification)
→ verify_password("Secret123!", "$2b$12$KIX...")
→ Extract salt: $2b$12$KIX...
→ Hash "Secret123!" with salt
→ Constant-time comparison of hash values
→ Return True
Step 3: generate_token(42)
→ Create payload: {"user_id": 42, "exp": 1643723400}
→ Sign with private key
→ Return "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyX2lkIjo0Miwi..."
Final return: Token string
Time consumed: ~100ms (mainly bcrypt computation)
**Scenario 2: User Does Not Exist**
输入:username="bob", password="anything"
Input: username="bob", password="anything"
步骤 1: db.find_user("bob")
→ 查询数据库
→ 返回 None
此时:user = None,进入场景 1 分支
步骤 2: if not user: # true
→ 直接返回 None
场景 2、3 都不执行
Step 1: db.find_user("bob")
→ Query database
→ Return None
Current state: user = None, enter Scenario 1 branch
Step 2: if not user: # true
→ Directly return None
Scenarios 2 and 3 are not executed
耗时:~5ms(仅数据库查询)
⚠️ 注意:比认证成功快得多,可能泄露用户是否存在
Time consumed: ~5ms (only database query)
⚠️ Note: Much faster than authentication success, may leak whether user exists
安全建议:添加固定延迟或假哈希计算,使两种情况耗时接近
Security Recommendation: Add fixed delay or fake hash computation to make response times similar for both cases
**Scenario 3: Wrong Password**
输入:username="alice", password="WrongPass"
Input: username="alice", password="WrongPass"
步骤 1: db.find_user("alice")
→ 返回 User(id=42, ...)
此时:user 存在,跳过场景 1 的 return None
步骤 2: 进入场景 2 分支(密码验证)
→ verify_password("WrongPass", "$2b$12$KIX...")
→ 哈希 "WrongPass"
→ 比较哈希值
→ 返回 False
步骤 3: 密码验证失败,不执行 generate_token
→ 继续执行到最后的 return None
场景 3:密码验证失败,返回 None
Step 1: db.find_user("alice")
→ Return User(id=42, ...)
Current state: user exists, skip return None in Scenario 1
Step 2: Enter Scenario 2 branch (password verification)
→ verify_password("WrongPass", "$2b$12$KIX...")
→ Hash "WrongPass"
→ Compare hash values
→ Return False
Step 3: Password verification fails, do not execute generate_token
→ Continue to final return None
Scenario 3: Password verification fails, return None
耗时:~100ms(与认证成功相近)
✅ 好处:无法通过响应时间判断密码是否正确
**关键要点总结:**
1. **安全性考虑:**
- ✅ 明文密码仅在内存中短暂存在,立即哈希验证
- ✅ 失败原因不泄露(防止用户名枚举)
- ✅ 时间恒定比较(防止时序攻击)
- ⚠️ 潜在问题:用户不存在时响应更快(需优化)
2. **性能优化:**
- ✅ 用户不存在时快速返回,不浪费哈希计算
- ⚠️ 但这会导致时序泄露,需权衡安全与性能
3. **错误处理:**
- ✅ 用 None 表示失败,清晰且符合 Python 惯例
- ⚠️ 调用方需检查返回值,否则可能误用 None
4. **可改进之处:**
- 添加日志记录失败尝试(检测暴力破解)
- 添加速率限制(Rate Limiting)
- 统一失败场景响应时间
Time consumed: ~100ms (similar to authentication success)
✅ Advantage: Cannot determine if password is correct via response time
**Key Takeaways Summary:**
1. **Security Considerations:**
- ✅ Plaintext password only exists briefly in memory, immediately hashed for verification
- ✅ Failure reasons are not disclosed (prevent username enumeration)
- ✅ Constant-time comparison (prevent timing attacks)
- ⚠️ Potential issue: Faster response when user does not exist (needs optimization)
2. **Performance Optimization:**
- ✅ Quick return when user does not exist, no wasted hash computation
- ⚠️ But this causes timing leakage, need to balance security and performance
3. **Error Handling:**
- ✅ Use None to indicate failure, clear and conforms to Python conventions
- ⚠️ Caller must check return value, otherwise may misuse None
4. **Improvement Areas:**
- Add logging for failed attempts (detect brute-force attacks)
- Add Rate Limiting
- Unify response times for failure scenarios
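The rate-limiting suggestion above could be sketched as a fixed-window counter. This is an illustrative design, not the document's implementation; the limit and window values are arbitrary, and the injectable `clock` exists only to make the behavior testable.

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Allow at most `limit` attempts per `window` seconds per key."""
    def __init__(self, limit=5, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._state = defaultdict(lambda: (0.0, 0))  # key -> (window_start, count)

    def allow(self, key):
        now = self.clock()
        start, count = self._state[key]
        if now - start >= self.window:
            # Window expired: start a new window with this attempt counted.
            self._state[key] = (now, 1)
            return True
        if count < self.limit:
            self._state[key] = (start, count + 1)
            return True
        return False  # over the limit: reject this attempt

t = [0.0]  # fake clock for demonstration
limiter = FixedWindowLimiter(limit=3, window=60.0, clock=lambda: t[0])
print([limiter.allow("alice") for _ in range(4)])  # -> [True, True, True, False]
```

A real deployment would typically key on username plus client IP and store counters in shared storage (e.g. Redis) rather than process memory.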
第 6.5 步:测试用例反向理解(如有测试)
Step 6.5: Reverse Understanding via Test Cases (If Tests Exist)
目标: 通过测试用例反向验证和深化对代码功能的理解
为什么重要:
- 测试用例反映了代码的预期行为,是最准确的"使用说明书"
- 测试通常覆盖边界条件和异常场景,这些在主代码中容易被忽略
- 通过测试可以验证理解是否正确,避免产生错误的假设
当检测到代码包含测试文件时,必须执行此步骤。
Goal: Reverse verify and deepen understanding of code functionality through test cases
Why It's Important:
- Test cases reflect the expected behavior of the code, making them the most accurate "user manual"
- Tests usually cover boundary conditions and exception scenarios, which are easily overlooked in the main code
- Tests can verify if understanding is correct, avoiding false assumptions
Must execute this step when code contains test files.
6.5.1 测试文件识别
6.5.1 Test File Identification
常见测试文件模式:
| 语言 | 测试文件模式 | 测试目录结构 |
|---|---|---|
| Python | `test_*.py`, `*_test.py` | `tests/`, `test/` |
| JavaScript/TypeScript | `*.test.js`, `*.spec.ts` | `__tests__/`, `test/` |
| Go | `*_test.go` | 与源码同目录,`testdata/` |
| Java | `*Test.java`, `*Tests.java` | `src/test/java/` |
| C++ | `*_test.cpp`(包含测试), gtest | `test/`, `unittest/`, `benchmark/` |
| Rust | `#[cfg(test)]`, `tests/*.rs` | `tests/` |
| MLIR/LLVM | `*.mlir`(测试文件) | `mlir/test/` |
大型项目测试目录结构示例:
Common Test File Patterns:
| Language | Test File Patterns | Test Directory Structure |
|---|---|---|
| Python | `test_*.py`, `*_test.py` | `tests/`, `test/` |
| JavaScript/TypeScript | `*.test.js`, `*.spec.ts` | `__tests__/`, `test/` |
| Go | `*_test.go` | Same directory as source code, `testdata/` |
| Java | `*Test.java`, `*Tests.java` | `src/test/java/` |
| C++ | `*_test.cpp` (contains tests), gtest | `test/`, `unittest/`, `benchmark/` |
| Rust | `#[cfg(test)]`, `tests/*.rs` | `tests/` |
| MLIR/LLVM | `*.mlir` (test files) | `mlir/test/` |
Large Project Test Directory Structure Example:
MLIR 风格(测试独立目录)
MLIR Style (independent test directory)
mlir/test/Dialect/Linalg/
├── ops.mlir # Linalg 方言操作测试
├── transformation.mlir # 变换测试
├── interfaces.mlir # 接口测试
└── invalid.mlir # 错误处理测试
mlir/test/Dialect/Linalg/
├── ops.mlir # Linalg dialect operation tests
├── transformation.mlir # Transformation tests
├── interfaces.mlir # Interface tests
└── invalid.mlir # Error handling tests
传统 C++ 项目风格
Traditional C++ Project Style
project/test/
├── unittest/ # 单元测试
├── integration/ # 集成测试
└── benchmark/ # 性能测试
project/test/
├── unittest/ # Unit tests
├── integration/ # Integration tests
└── benchmark/ # Performance tests
6.5.2 测试覆盖分析
6.5.2 Test Coverage Analysis
Analyze Functionality Covered by Tests:
测试用例覆盖分析
Test Case Coverage Analysis
| 测试文件/目录 | 测试的模块 | 测试用例数量 |
|---|---|---|
| test/Dialect/Linalg/ops.mlir | Linalg Ops | 156 |
| test/Dialect/Linalg/invalid.mlir | 错误处理 | 43 |
| | | 12 |
| Test File/Directory | Tested Module | Number of Test Cases |
|---|---|---|
| test/Dialect/Linalg/ops.mlir | Linalg Ops | 156 |
| test/Dialect/Linalg/invalid.mlir | Error Handling | 43 |
| | | 12 |
功能覆盖矩阵
Function Coverage Matrix
| 核心功能 | 主代码位置 | 测试覆盖 | 覆盖率评估 |
|---|---|---|---|
| linalg.matmul 操作 | | ✅ 有测试 | 覆盖正常+边界 |
| linalg.generic 接口 | | ✅ 有测试 | 覆盖完整 |
| Tile 变换 | | ⚠️ 测试不足 | 缺少嵌套场景 |
| Core Function | Main Code Location | Test Coverage | Coverage Evaluation |
|---|---|---|---|
| linalg.matmul operation | | ✅ Has tests | Covers normal + boundary cases |
| linalg.generic interface | | ✅ Has tests | Fully covered |
| Tile transformation | | ⚠️ Insufficient tests | Missing nested scenarios |
6.5.3 通过测试理解边界条件
6.5.3 Understanding Boundary Conditions Through Tests
Extract Key Boundary Conditions from Tests:
从测试中发现的边界条件
Boundary Conditions Discovered from Tests
MLIR 示例:理解 linalg.generic 的区域约束
MLIR Example: Understanding linalg.generic Region Constraints
测试文件:test/Dialect/Linalg/invalid.mlir
Test File: test/Dialect/Linalg/invalid.mlir
```mlir
// 测试:generic 的 region 必须有且仅有一个 block
func.func @invalid_generic_empty_region(%arg0: tensor<10xf32>) -> tensor<10xf32> {
  %0 = linalg.generic {indexing_maps = [affine_map<(d0) -> (d0)>],
                       iterator_types = ["parallel"]}
      outs(%arg0 : tensor<10xf32>) {
    // 空 region - 应该报错
  } -> tensor<10xf32>
  return %0 : tensor<10xf32>
}
```
WHY 这个测试重要:
- 揭示了 `linalg.generic` 的结构约束:必须有 block
- 通过负向测试(invalid test)明确错误条件
- 边界条件:region 的 block 数量必须 = 1
```mlir
// Test: generic region must have exactly one block
func.func @invalid_generic_empty_region(%arg0: tensor<10xf32>) -> tensor<10xf32> {
  %0 = linalg.generic {indexing_maps = [affine_map<(d0) -> (d0)>],
                       iterator_types = ["parallel"]}
      outs(%arg0 : tensor<10xf32>) {
    // Empty region - should report error
  } -> tensor<10xf32>
  return %0 : tensor<10xf32>
}
```
WHY This Test Is Important:
- Reveals structural constraints of `linalg.generic`: Must have a block
- Clearly defines error conditions through negative testing (invalid test)
- Boundary condition: Number of region blocks must = 1
测试文件:test/Dialect/Linalg/ops.mlir
Test File: test/Dialect/Linalg/ops.mlir
```mlir
// 测试:输入和输出数量必须与 indexing_maps 一致
func.func @generic_mismatched_maps(%a: tensor<10xf32>, %b: tensor<10xf32>) -> tensor<10xf32> {
  %0 = linalg.generic {
    indexing_maps = [
      affine_map<(d0) -> (d0)>,  // 1 个输入的 map
      affine_map<(d0) -> (d0)>   // 1 个输出的 map
    ],
    iterator_types = ["parallel"]
  } ins(%a, %b : tensor<10xf32>, tensor<10xf32>) // 但有 2 个输入
    outs(%a : tensor<10xf32>) {
  ^bb0(%in: f32, %in_2: f32, %out: f32):
    linalg.yield %in : f32
  } -> tensor<10xf32>
  return %0 : tensor<10xf32>
}
```
WHY 这样处理:
- 验证了类型系统约束:输入/输出数量必须与 map 一致
- 测试了静态验证逻辑,在编译期捕获错误
- 说明了 MLIR 的静态强类型特性
```mlir
// Test: Number of inputs and outputs must match indexing_maps
func.func @generic_mismatched_maps(%a: tensor<10xf32>, %b: tensor<10xf32>) -> tensor<10xf32> {
  %0 = linalg.generic {
    indexing_maps = [
      affine_map<(d0) -> (d0)>,  // Map for 1 input
      affine_map<(d0) -> (d0)>   // Map for 1 output
    ],
    iterator_types = ["parallel"]
  } ins(%a, %b : tensor<10xf32>, tensor<10xf32>) // But there are 2 inputs
    outs(%a : tensor<10xf32>) {
  ^bb0(%in: f32, %in_2: f32, %out: f32):
    linalg.yield %in : f32
  } -> tensor<10xf32>
  return %0 : tensor<10xf32>
}
```
WHY This Is Handled This Way:
- Verifies type system constraints: Number of inputs/outputs must match maps
- Tests static verification logic, catches errors at compile time
- Illustrates MLIR's static strong typing feature
C++ 示例:通过测试理解并发安全性
C++ Example: Understanding Concurrent Security Through Tests
测试文件:unittest/concurrent_map_test.cpp
Test File: unittest/concurrent_map_test.cpp
```cpp
// 测试:并发插入相同键
TEST(ConcurrentMapTest, ConcurrentInsertSameKey) {
  ConcurrentMap<int, int> map;
  const int num_threads = 10;
  const int key = 42;
  std::vector<std::thread> threads;
  for (int i = 0; i < num_threads; ++i) {
    threads.emplace_back([&map, key, i]() {
      map.Insert(key, i); // 所有线程插入同一个 key
    });
  }
  for (auto& t : threads) t.join();
  // 验证:只有一个插入成功
  EXPECT_EQ(map.Size(), 1);
  EXPECT_TRUE(map.Contains(key));
}
```
WHY 这个测试存在:
- 验证了线程安全性:多线程并发访问不会崩溃
- 说明了冲突处理策略:后插入覆盖先插入(或反之)
- 测试了一致性保证:最终状态符合预期
```cpp
// Test: Concurrent insertion of the same key
TEST(ConcurrentMapTest, ConcurrentInsertSameKey) {
  ConcurrentMap<int, int> map;
  const int num_threads = 10;
  const int key = 42;
  std::vector<std::thread> threads;
  for (int i = 0; i < num_threads; ++i) {
    threads.emplace_back([&map, key, i]() {
      map.Insert(key, i); // All threads insert the same key
    });
  }
  for (auto& t : threads) t.join();
  // Verify: Only one insertion succeeds
  EXPECT_EQ(map.Size(), 1);
  EXPECT_TRUE(map.Contains(key));
}
```
WHY This Test Exists:
- Verifies thread safety: Multi-threaded concurrent access does not cause crashes
- Illustrates conflict handling strategy: Later insertions overwrite earlier ones (or vice versa)
- Tests consistency guarantees: Final state meets expectations
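The same property can be checked in Python with a lock-guarded insert-if-absent. The class below is hypothetical, written only to make the C++ test's intent concrete; it picks the first-writer-wins strategy, one of the two conflict-handling options mentioned above.

```python
import threading

class ConcurrentMap:
    """Minimal thread-safe map where the first insert of a key wins."""
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def insert(self, key, value):
        with self._lock:
            # First writer wins; later writers return False, never overwrite.
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def size(self):
        with self._lock:
            return len(self._data)

m = ConcurrentMap()
threads = [threading.Thread(target=m.insert, args=(42, i)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(m.size())  # -> 1: all ten threads raced on the same key, exactly one won
```

As in the gtest example, the assertion is on the final state (size 1, key present), not on which thread won: that is the consistency guarantee the test pins down.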
6.5.4 测试驱动理解示例
6.5.4 Test-Driven Understanding Example
Complete Example: Understanding Transformation Through MLIR Tests
测试用例反向理解:linalg.tile 变换
Reverse Understanding via Test Cases: linalg.tile Transformation
问题:仅看文档能理解 tile 的全部行为吗?
Question: Can we fully understand tile behavior just by reading documentation?
文档说明(简化):
`tile` 将 linalg 操作分解为更小的片段
可能遗漏的细节:
- Tile 大小如何确定?
- 支持哪些操作的 tile?
- Tile 后的循环顺序是什么?
- 如何处理剩余元素?
Documentation Description (Simplified):
`tile` decomposes linalg operations into smaller fragments
Potentially Missing Details:
- How is tile size determined?
- Which operations support tiling?
- What is the loop order after tiling?
- How to handle remaining elements?
从测试中发现的答案
Answers Discovered from Tests
测试 1:test/tile-mlir.mlir - 基本 tile 行为
Test 1: test/tile-mlir.mlir - Basic Tile Behavior
```mlir
// 原始操作
%0 = linalg.matmul ins(%A: tensor<128x128xf32>, %B: tensor<128x128xf32>)
                   outs(%C: tensor<128x128xf32>)
// Tile 大小为 32x32
%1 = linalg.tile %0 tile_sizes[32, 32]
```
发现: Tile 大小直接指定,输出包含嵌套循环结构
```mlir
// Original operation
%0 = linalg.matmul ins(%A: tensor<128x128xf32>, %B: tensor<128x128xf32>)
                   outs(%C: tensor<128x128xf32>)
// Tile size 32x32
%1 = linalg.tile %0 tile_sizes[32, 32]
```
Discovery: Tile size is specified directly, output contains nested loop structure
测试 2:test/tile-mlir.mlir - 剩余元素处理
Test 2: test/tile-mlir.mlir - Handling Remaining Elements
```mlir
// 127x127 矩阵,tile 大小 32x32
%0 = linalg.matmul ins(%A: tensor<127x127xf32>, ...)
%1 = linalg.tile %0 tile_sizes[32, 32]
```
发现: 自动生成边界检查处理不均匀的剩余部分
```mlir
// 127x127 matrix, tile size 32x32
%0 = linalg.matmul ins(%A: tensor<127x127xf32>, ...)
%1 = linalg.tile %0 tile_sizes[32, 32]
```
Discovery: Automatically generates boundary checks to handle uneven remaining elements
测试 3:test/tile-mlir.mlir - 不可 tile 的操作
Test 3: test/tile-mlir.mlir - Operations That Cannot Be Tiled
```mlir
// 尝试 tile 不支持的操作
%0 = linalg.generic ...
%1 = linalg.tile %0 tile_sizes[16]
// 预期:编译错误或运行时失败
```
发现: 并非所有操作都支持 tile,有明确的限制条件
```mlir
// Attempt to tile unsupported operation
%0 = linalg.generic ...
%1 = linalg.tile %0 tile_sizes[16]
// Expected: Compilation error or runtime failure
```
Discovery: Not all operations support tiling, there are clear constraints
测试前后理解对比
Understanding Comparison Before and After Tests
| 问题 | 仅看文档 | 看测试后 |
|---|---|---|
| Tile 大小如何指定? | ⚠️ 不清楚 | ✅ 直接作为参数 |
| 剩余元素如何处理? | ❓ 文档未提及 | ✅ 自动边界检查 |
| 支持哪些操作? | ❓ 列表不完整 | ✅ 测试覆盖所有支持的操作 |
| 循环顺序是什么? | ⚠️ 描述模糊 | ✅ 从测试 IR 可看出顺序 |
结论: 测试用例补充了约 50% 的实现细节!
| Question | After Reading Documentation Only | After Reading Tests |
|---|---|---|
| How to specify tile size? | ⚠️ Unclear | ✅ Directly as parameter |
| How to handle remaining elements? | ❓ Not mentioned in documentation | ✅ Automatic boundary checks |
| Which operations are supported? | ❓ Incomplete list | ✅ Tests cover all supported operations |
| What is the loop order? | ⚠️ Vague description | ✅ Can see order from test IR |
Conclusion: Test cases supplement approximately 50% of implementation details!
6.5.5 不同语言测试文件解析要点
6.5.5 Key Points for Parsing Test Files in Different Languages
Notes for Testing in Each Language:
各语言测试文件解析要点
Key Points for Parsing Test Files in Different Languages
Python (pytest/unittest)
Python (pytest/unittest)
- 查找 `test_*.py` 或 `*_test.py`
- 注意 `@pytest.mark.parametrize` 参数化测试
- 关注 `pytest.raises` 异常测试
- 查找 fixtures(`conftest.py`)了解测试上下文
- Look for `test_*.py` or `*_test.py`
- Pay attention to `@pytest.mark.parametrize` parameterized tests
- Focus on `pytest.raises` exception tests
- Look for fixtures (`conftest.py`) to understand test context
C++ (gtest)
C++ (gtest)
- 查找 `TEST()` 或 `TEST_F()`
- `TEST_F` 表示 fixture 测试,有前置条件
- `EXPECT_*` vs `ASSERT_*`:失败后是否继续
- `TEST_P` 表示参数化测试
- Look for `TEST()` or `TEST_F()`
- `TEST_F` indicates fixture tests with preconditions
- `EXPECT_*` vs `ASSERT_*`: Whether to continue after failure
- `TEST_P` indicates parameterized tests
MLIR/LLVM (lit + FileCheck)
MLIR/LLVM (lit + FileCheck)
- 测试文件通常是 `.mlir` 或 `.ll`
- `RUN:` 命令指定如何执行测试
- `CHECK:` 标记预期输出
- `expected-error` 标记预期的编译错误
- FileCheck 指令:`CHECK-NEXT`、`CHECK-DAG`、`CHECK-LABEL`
- Test files are usually `.mlir` or `.ll`
- `RUN:` commands specify how to execute tests
- `CHECK:` marks expected output
- `expected-error` marks expected compilation errors
- FileCheck directives: `CHECK-NEXT`, `CHECK-DAG`, `CHECK-LABEL`
JavaScript/TypeScript (Jest)
JavaScript/TypeScript (Jest)
- `test()` / `it()`, `describe()`
- `describe` nested structure
- `expect(...).toThrow()` exception tests
- `beforeEach` / `afterEach` hook functions
Go (testing)
Go (testing)
- 测试与源码在同一目录:`*_test.go`
- `TestXxx(t *testing.T)` 基础测试
- 表格驱动测试(table-driven tests)
- `TestMain` 测试入口
- Tests are in the same directory as source code: `*_test.go`
- `TestXxx(t *testing.T)` basic tests
- Table-driven tests
- `TestMain` test entry point
Rust (cargo test)
Rust (cargo test)
- `#[cfg(test)] mod tests` inline tests
- `tests/` directory integration tests
- `#[should_panic]` exception tests
- `#[ignore]` skipped tests
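The table-driven style listed for Go carries over to plain Python asserts (and mirrors what `@pytest.mark.parametrize` automates). The function under test and the cases below are illustrative.

```python
def clamp(x, lo, hi):
    """Toy function under test: restrict x to the range [lo, hi]."""
    return max(lo, min(hi, x))

# Table-driven test: each row is (args, expected). Adding a boundary case
# means adding a row, not a new test function.
CASES = [
    ((5, 0, 10), 5),    # normal flow
    ((-3, 0, 10), 0),   # below lower bound
    ((99, 0, 10), 10),  # above upper bound
    ((0, 0, 0), 0),     # degenerate range (boundary input)
]

for args, expected in CASES:
    assert clamp(*args) == expected, (args, expected)
```

Reading such a table when analyzing unfamiliar code gives an immediate inventory of which boundary conditions the author considered.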
6.5.6 测试质量评估
6.5.6 Test Quality Evaluation
Evaluate Whether Tests Are Sufficient:
测试质量评估
Test Quality Evaluation
覆盖的功能点
Covered Function Points
- ✅ 正常流程
- ✅ 边界输入
- ✅ 异常输入
- ⚠️ 并发场景
- ❌ 性能测试
- ✅ Normal flow
- ✅ Boundary inputs
- ✅ Exception inputs
- ⚠️ Concurrent scenarios
- ❌ Performance tests
MLIR 特定评估
MLIR-Specific Evaluation
- ✅ 正向测试(valid.mlir)
- ✅ 负向测试(invalid.mlir)
- ⚠️ 性能回归测试
- ❌ 跨方言交互测试
- ✅ Positive tests (valid.mlir)
- ✅ Negative tests (invalid.mlir)
- ⚠️ Performance regression tests
- ❌ Cross-dialect interaction tests
测试缺失警告
Test Deficiency Warnings
⚠️ Warning: This module has insufficient test coverage
- Uncovered scenarios: [List specifically]
- Recommended supplements: [Specific suggestions]
6.5.7 测试用例分析输出模板
6.5.7 Test Case Analysis Output Template
测试文件结构
Test File Structure
[List test files/directories and their corresponding source code modules]
关键测试用例解读
Key Test Case Interpretation
[Select 3-5 most valuable test cases]
从测试中发现的隐藏行为
Hidden Behavior Discovered from Tests
[List details easily overlooked when only reading main code]
测试覆盖度评估
Test Coverage Evaluation
- 核心功能覆盖率:X%
- 边界条件覆盖:[充分/不足]
- Core function coverage: X%
- Boundary condition coverage: [Sufficient/Insufficient]
测试质量建议
Test Quality Recommendations
[If tests are insufficient, propose improvement suggestions]
第 9 步:应用迁移测试(检验真实理解)
Step 9: Application Transfer Test (Verify True Understanding)
目标: 测试概念能否应用到不同场景
必须包含:
- 至少 2 个不同领域的应用场景
- 说明如何调整代码以适应新场景
- 标注哪些原理保持不变,哪些需要修改
输出格式:
Goal: Test whether concepts can be applied to different scenarios
Must Include:
- At least 2 application scenarios in different domains
- Explain how to adjust code to adapt to new scenarios
- Mark which principles remain unchanged and which need modification
Output Format:
应用迁移场景
Application Transfer Scenarios
场景 1:将用户认证应用到 API 密钥验证
Scenario 1: Apply User Authentication to API Key Verification
原始场景: Web 用户登录认证
新场景: 第三方 API 密钥验证
不变的原理:
- 验证调用方身份的核心流程
- 哈希存储凭证(API 密钥也应哈希)
- 生成访问令牌的机制
需要修改的部分:
Original Scenario: Web user login authentication
New Scenario: Third-party API key verification
Invariant Principles:
- Core process of verifying "who is calling"
- Hash-stored credentials (API keys should also be hashed)
- Access token generation mechanism
Modified Parts:
原始:用户名+密码
Original: Username + Password
```python
def authenticate_user(username, password):
    user = db.find_user(username)
    if not user:
        return None
    if verify_password(password, user.password_hash):
        return generate_token(user.id)
    return None
```
迁移:API 密钥
Transferred: API Key
```python
def authenticate_api_key(api_key):
    # WHY 只需要一个参数:API 密钥本身就是身份+凭证
    app = db.find_app_by_key_prefix(api_key[:8])
    # WHY 用前缀查询:避免全表扫描,API 密钥前缀作为索引
    if not app:
        return None
    if verify_api_key(api_key, app.key_hash):
        # WHY 也要哈希:防止数据库泄露导致密钥泄露
        return generate_token(app.id, scope=app.permissions)
        # WHY 增加 scope:API 密钥通常有不同权限级别
    return None
```
**WHY 这样迁移:**
- 保留核心安全原则(哈希存储、恒定时间比较)
- 调整业务逻辑(单参数、权限范围)
- 优化查询性能(前缀索引)
**学到的通用模式:**
- 任何需要验证"谁在调用"的场景都可用类似结构
- 核心:查找实体 → 验证凭证 → 生成令牌
- 变化:凭证形式、查询方式、令牌内容
```python
def authenticate_api_key(api_key):
    # WHY only one parameter: API key itself is both identity and credential
    app = db.find_app_by_key_prefix(api_key[:8])
    # WHY query by prefix: Avoid full table scan, API key prefix as index
    if not app:
        return None
    if verify_api_key(api_key, app.key_hash):
        # WHY hash too: Prevent key leakage if database is compromised
        return generate_token(app.id, scope=app.permissions)
        # WHY add scope: API keys usually have different permission levels
    return None
```
**WHY Transfer This Way:**
- Retain core security principles (hash storage, constant-time comparison)
- Adjust business logic (single parameter, permission scope)
- Optimize query performance (prefix index)
**Learned General Pattern:**
- Similar structure can be used in any scenario that needs to verify "who is calling"
- Core: Find entity → Verify credential → Generate token
- Variations: Credential form, query method, token content
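The "find entity → verify credential → generate token" core can be factored into one higher-order skeleton. This is a sketch: every name is an assumption, and the plaintext comparison is only a stand-in for a real `verify_password`.

```python
def authenticate(find_entity, verify, make_token, *credentials):
    """Generic skeleton shared by user login and API-key verification."""
    entity = find_entity(*credentials)   # step 1: look up who is calling
    if entity is None:
        return None                      # uniform failure value
    if verify(entity, *credentials):     # step 2: check the credential
        return make_token(entity)        # step 3: issue an access token
    return None

# Instantiation for user login: credentials = (username, password)
users = {"alice": {"id": 42, "password": "Secret123!"}}
token = authenticate(
    lambda u, p: users.get(u),
    lambda e, u, p: e["password"] == p,  # stand-in for verify_password
    lambda e: "token-%d" % e["id"],
    "alice", "Secret123!",
)
print(token)  # -> token-42
```

The API-key variant would plug in `find_app_by_key_prefix`, `verify_api_key`, and a scoped token factory: only the three callables change, not the skeleton.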
场景 2:将快速排序应用到日志分析
Scenario 2: Apply Quick Sort to Log Analysis
原始场景: 对用户列表按 ID 排序
新场景: 对数百万条日志按时间戳排序
不变的原理:
- 分治思想:递归分解问题
- Pivot 选择:影响性能的关键
- 原地排序:节省空间
需要调整的部分:
Original Scenario: Sort user list by ID
New Scenario: Sort millions of logs by timestamp
Invariant Principles:
- Divide and conquer idea: Recursively decompose problems
- Pivot selection: Key factor affecting performance
- In-place sorting: Saves space
Adjusted Parts:
原始:简单快排
Original: Simple Quick Sort
```python
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)
```
迁移:日志排序(外部排序 + 优化)
Transferred: Log Sorting (External Sort + Optimization)
```python
def quicksort_logs(log_file, output_file, memory_limit):
    # WHY 外部排序:数据量超过内存,无法一次性加载
    # 1. 分块排序
    chunks = split_file_into_chunks(log_file, memory_limit)
    # WHY 分块:每块可载入内存单独排序
    for chunk in chunks:
        logs = load_chunk(chunk)
        # WHY 用 timsort 而非快排:
        # - 日志通常部分有序(按时间追加)
        # - timsort 对部分有序数据优化到 O(n)
        # - Python 内置 sorted() 就是 timsort
        logs.sort(key=lambda log: log.timestamp)
        save_sorted_chunk(chunk, logs)
    # 2. 归并已排序的分块
    merge_sorted_chunks(chunks, output_file)
    # WHY 归并:多个有序序列合并为一个有序序列
    return output_file
```
**WHY 不直接用快排:**
- 数据量超过内存:需要外部排序
- 日志部分有序:timsort 更优
- 需要稳定排序:保持相同时间戳的日志顺序
**学到的通用模式:**
- 算法选择取决于数据特征(规模、有序性、稳定性需求)
- 基本原理可迁移(分治、比较),但实现需调整
- 超大数据需要外部算法(分块+归并)
```python
def quicksort_logs(log_file, output_file, memory_limit):
    # WHY external sort: Data volume exceeds memory, cannot be loaded all at once
    # 1. Split and sort chunks
    chunks = split_file_into_chunks(log_file, memory_limit)
    # WHY split into chunks: Each chunk can be loaded into memory and sorted individually
    for chunk in chunks:
        logs = load_chunk(chunk)
        # WHY use timsort instead of quicksort:
        # - Logs are usually partially ordered (appended by time)
        # - Timsort is optimized for partially ordered data to O(n)
        # - Python's built-in sorted() is timsort
        logs.sort(key=lambda log: log.timestamp)
        save_sorted_chunk(chunk, logs)
    # 2. Merge sorted chunks
    merge_sorted_chunks(chunks, output_file)
    # WHY merge: Combine multiple sorted sequences into one sorted sequence
    return output_file
```
**WHY Not Use Quick Sort Directly:**
- Data volume exceeds memory: Needs external sorting
- Logs are partially ordered: Timsort is better
- Stable sorting required: Maintain order of logs with same timestamp
**Learned General Pattern:**
- Algorithm selection depends on data characteristics (scale, order, stability requirements)
- Basic principles can be transferred (divide and conquer, comparison), but implementation needs adjustment
- Ultra-large data requires external algorithms (split + merge)
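The split-sort-merge scheme above maps directly onto `heapq.merge` once the file operations are replaced with in-memory stand-ins (`records` and `chunk_size` here play the role of the log file and the memory limit):

```python
import heapq

def external_sort(records, key, chunk_size):
    """Sort more records than 'memory' allows: sort fixed-size chunks,
    then k-way merge the already-sorted chunks."""
    chunks = []
    for i in range(0, len(records), chunk_size):
        # Each chunk fits in "memory" and is sorted independently.
        chunks.append(sorted(records[i:i + chunk_size], key=key))
    # heapq.merge lazily merges sorted iterables in O(n log k),
    # holding only one record per chunk in memory at a time.
    return list(heapq.merge(*chunks, key=key))

logs = [(5, "e"), (1, "a"), (4, "d"), (2, "b"), (3, "c")]
print(external_sort(logs, key=lambda log: log[0], chunk_size=2))
# -> [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
```

In a real pipeline each sorted chunk would be a temporary file and the merge would stream from file iterators, but the algorithmic shape is identical.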
第 10 步:依赖关系与使用示例
Step 10: Dependency Relationships and Usage Examples
(Similar to original version, but with added WHY explanations)
依赖关系分析
Dependency Relationship Analysis
bcrypt (v5.1.0)
- 用途: 密码哈希 (Password Hashing)
- WHY 选择 bcrypt:
- 自带盐值,无需手动管理
- 可调节计算成本(cost factor)
- 抵抗 GPU/ASIC 加速攻击
- WHY 不用 SHA256: 计算太快,容易暴力破解
- WHY 不用 scrypt/argon2: bcrypt 更成熟,兼容性好
jsonwebtoken (v9.0.0)
- 用途: JWT token 生成与验证
- WHY 选择 JWT: 无状态认证,适合分布式系统
- WHY 不用 Session: Session 需要服务器存储,不利于扩展
bcrypt (v5.1.0)
- Purpose: Password Hashing
- WHY Choose bcrypt:
- Built-in salt, no manual management needed
- Adjustable computational cost (cost factor)
- Resists GPU/ASIC accelerated attacks
- WHY Not Use SHA256: Too fast to compute, vulnerable to brute-force attacks
- WHY Not Use scrypt/argon2: bcrypt is more mature and has better compatibility
jsonwebtoken (v9.0.0)
- Purpose: JWT token generation and verification
- WHY Choose JWT: Stateless authentication, suitable for distributed systems
- WHY Not Use Session: Session requires server storage, not conducive to scaling
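The stateless-token idea behind the JWT choice can be demonstrated with a minimal HMAC-signed payload. This is a conceptual sketch, not the `jsonwebtoken` API: the secret is illustrative, and a real JWT additionally carries a header and an expiry claim.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # illustrative; load from config in practice

def _b64(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def issue_token(payload):
    # Server signs the payload; no per-user state is stored anywhere.
    body = _b64(json.dumps(payload, sort_keys=True).encode())
    sig = _b64(hmac.new(SECRET, body, hashlib.sha256).digest())
    return body + b"." + sig  # payload.signature, JWT-like shape

def verify_token(token):
    body, _, sig = token.partition(b".")
    expected = _b64(hmac.new(SECRET, body, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None  # tampered payload or wrong key
    return json.loads(base64.urlsafe_b64decode(body + b"=" * (-len(body) % 4)))

tok = issue_token({"user_id": 42})
print(verify_token(tok))  # -> {'user_id': 42}
```

Any server holding `SECRET` can verify the token without a session store, which is exactly why this design scales horizontally better than server-side sessions.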
内部模块依赖
Internal Module Dependencies
database.js → auth.js
- 依赖原因: 认证需要查询用户数据
- WHY 这样设计: 分离数据访问和业务逻辑(单一职责原则)
utils/crypto.js → auth.js
- 依赖原因: 认证需要密码哈希和验证
- WHY 封装工具模块: 加密逻辑复杂,集中管理更安全
database.js → auth.js
- Dependency Reason: Authentication requires querying user data
- WHY This Design: Separate data access and business logic (single responsibility principle)
utils/crypto.js → auth.js
- Dependency Reason: Authentication requires password hashing and verification
- WHY Encapsulate into Utility Module: Encryption logic is complex, centralized management is more secure
完整使用示例
Complete Usage Example
(Includes detailed WHY comments)
示例 1:标准用户登录流程
Example 1: Standard User Login Flow
```javascript
// 1. 导入认证模块
const auth = require('./auth');

// 2. 接收用户输入(来自登录表单)
const username = req.body.username; // 例如:"alice"
const password = req.body.password; // 例如:"Secret123!"
// WHY 不在客户端哈希密码:
// - 客户端哈希后,哈希值本身就成了"密码"
// - 攻击者获取哈希值后可以直接登录
// - 必须在服务端用盐值哈希,客户端永远传明文

// 3. 调用认证函数
const token = await auth.authenticate_user(username, password);

// 4. 根据结果响应
if (token) {
  // 认证成功
  res.json({
    success: true,
    token: token,
    // WHY 返回 token:客户端后续请求需要携带
    message: '登录成功'
  });
  // WHY 设置 HTTP-only Cookie(可选):
  // res.cookie('auth_token', token, {
  //   httpOnly: true,    // WHY:防止 XSS 攻击读取
  //   secure: true,      // WHY:仅 HTTPS 传输
  //   sameSite: 'strict' // WHY:防止 CSRF 攻击
  // });
} else {
  // 认证失败(用户不存在或密码错误)
  // WHY 不区分失败原因:防止用户名枚举
  res.status(401).json({
    success: false,
    message: '用户名或密码错误' // 模糊的错误信息
  });
  // WHY 返回 401 而非 403:
  // 401 = 未认证(需要提供凭证)
  // 403 = 已认证但无权限
}
```
执行结果分析:
成功路径:
客户端请求 → 服务端验证 → 返回 Token
时间:~100ms
Token 示例:"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
失败路径:
客户端请求 → 服务端验证 → 返回 401 错误
时间:~100ms(与成功相近,防止时序攻击)
```javascript
// 1. Import authentication module
const auth = require('./auth');

// 2. Receive user input (from login form)
const username = req.body.username; // Example: "alice"
const password = req.body.password; // Example: "Secret123!"
// WHY not hash password on client:
// - After hashing on client, the hash value itself becomes the "password"
// - Attackers can log in directly if they obtain the hash value
// - Must hash with salt on server, client always sends plaintext

// 3. Call authentication function
const token = await auth.authenticate_user(username, password);

// 4. Respond based on result
if (token) {
  // Authentication success
  res.json({
    success: true,
    token: token,
    // WHY return token: Client needs to carry it in subsequent requests
    message: 'Login successful'
  });
  // WHY set HTTP-only Cookie (optional):
  // res.cookie('auth_token', token, {
  //   httpOnly: true,    // WHY: Prevent XSS attacks from reading it
  //   secure: true,      // WHY: Only transmit over HTTPS
  //   sameSite: 'strict' // WHY: Prevent CSRF attacks
  // });
} else {
  // Authentication failure (user does not exist or wrong password)
  // WHY not distinguish failure reasons: Prevent username enumeration
  res.status(401).json({
    success: false,
    message: 'Incorrect username or password' // Vague error message
  });
  // WHY return 401 instead of 403:
  // 401 = Unauthenticated (needs to provide credentials)
  // 403 = Authenticated but no permission
}
```
Execution Result Analysis:
Success Path:
Client request → Server verification → Return Token
Time: ~100ms
Token example: "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
Failure Path:
Client request → Server verification → Return 401 error
Time: ~100ms (similar to success, prevents timing attacks)
第 11 步:自我评估检查清单
Step 11: Self-Assessment Checklist
After completing analysis, mandatory verification of the following items:
质量验证清单
Quality Verification Checklist
理解深度验证
Understanding Depth Verification
技术准确性验证
Technical Accuracy Verification
实用性验证
Practicality Verification
最终验证问题
Final Verification Questions
如果不看原代码,根据这份分析文档:
- ✅ 能否理解代码的设计思路?
- ✅ 能否独立实现类似功能?
- ✅ 能否应用到不同场景?
- ✅ 能否向他人清晰解释?
如果有任何一项答"否",说明分析不够深入,需要补充。
If you don't look at the original code, based on this analysis document:
- ✅ Can you understand the code's design ideas?
- ✅ Can you implement similar functions independently?
- ✅ Can you apply it to different scenarios?
- ✅ Can you explain it clearly to others?
If any answer is "No", the analysis is not deep enough and needs supplementation.
输出格式总结
Output Format Summary
Complete Analysis Document Structure:
[代码名称] 深度理解分析
[Code Name] Deep Understanding Analysis
理解验证状态
Understanding Verification Status
[Self-explanation test result table]
1. 基本信息
1. Basic Information
- Programming language:
- Code scale:
- Core dependencies:
2. 背景与动机分析(精细询问)
2. Background and Motivation Analysis (Elaborative Interrogation)
- 问题本质(WHY 需要)
- 方案选择(WHY 选择 + WHY 不选其他)
- 应用场景(WHY 适用 + WHY 不适用)
- Problem essence (WHY needed)
- Solution selection (WHY chosen + WHY other solutions not chosen)
- Application scenarios (WHY applicable + WHY not applicable)
3. 概念网络图
3. Concept Network Diagram
- 核心概念清单(每个概念 3 个 WHY)
- 概念关系矩阵
- 连接到已有知识
- Core concept list (3 WHY questions per concept)
- Concept relationship matrix
- Connection to existing knowledge
4. 算法与理论深度分析
4. In-Depth Algorithm and Theory Analysis
- 每个算法:复杂度 + WHY 选择 + WHY 可接受 + 参考资料
- 每个理论:WHY 使用 + WHY 有效 + WHY 有限制
- Each algorithm: Complexity + WHY chosen + WHY acceptable + reference materials
- Each theory: WHY used + WHY effective + WHY limited
5. 设计模式分析
5. Design Pattern Analysis
- 每个模式:WHY 使用 + WHY 不用会怎样 + 实现细节 + 参考资料
- Each pattern: WHY used + WHY not used + implementation details + reference materials
6. 关键代码深度解析
6. In-Depth Key Code Analysis
- 每个代码段:逐行解析(做什么 + WHY) + 执行示例 + 关键要点
- Each code snippet: Line-by-line analysis (what it does + WHY) + execution examples + key takeaways
7. 测试用例分析(如有)
7. Test Case Analysis (if applicable)
- 测试文件清单与覆盖分析
- 从测试中发现的边界条件
- 测试驱动的理解验证
- Test file list and coverage analysis
- Boundary conditions discovered from tests
- Test-driven understanding verification
8. 应用迁移场景(至少 2 个)
8. Application Transfer Scenarios (at least 2)
- 每个场景:不变的原理 + 需要修改的部分 + WHY 这样迁移
- Each scenario: Invariant principles + modified parts + WHY transferred this way
9. 依赖关系与使用示例
9. Dependency Relationships and Usage Examples
- 每个依赖:WHY 选择 + WHY 不用其他
- 示例包含详细 WHY 注释
- Each dependency: WHY chosen + WHY other solutions not chosen
- Examples include detailed WHY comments
10. 质量验证清单
10. Quality Verification Checklist
[Check all verification items]
特殊场景处理
Special Scenario Handling
- 整体架构分析
  - 项目结构树 + WHY 这样组织
  - 入口文件 + WHY 从这里开始
  - 模块划分 + WHY 这样划分
- 模块间关系
  - 依赖图 + WHY 这样依赖
  - 数据流图 + WHY 这样流动
  - 调用链 + WHY 这样调用
- 逐模块分析
  - 每个核心模块按标准流程分析
  - 强调模块间的 WHY 关系
- Overall Architecture Analysis
  - Project structure tree + WHY it is organized this way
  - Entry file + WHY to start here
  - Module division + WHY it is divided this way
- Inter-Module Relationships
  - Dependency graph + WHY the dependencies run this way
  - Data flow graph + WHY the data flows this way
  - Call chain + WHY the calls are made this way
- Module-by-Module Analysis
  - Analyze each core module according to the standard process
  - Emphasize the WHY relationships between modules
- 分层解释
  - 先用自然语言描述思路
  - 再用伪代码展示结构
  - 最后逐行解析实现
- WHY 贯穿始终
  - WHY 选择这个算法
  - WHY 每一步这样做
  - WHY 复杂度是这样的
- 可视化辅助
  - 用具体数据展示执行过程
  - 每一步解释 WHY
- Layered Explanation
  - First describe the idea in natural language
  - Then show the structure with pseudocode
  - Finally analyze the implementation line by line
- WHY Throughout
  - WHY this algorithm was chosen
  - WHY each step is done this way
  - WHY the complexity is what it is
- Visualization Assistance
  - Show the execution process with specific data
  - Explain WHY at each step
不熟悉的技术栈
Unfamiliar Technology Stacks
- 技术背景说明
  - 这个技术栈是什么
  - WHY 存在这个技术栈
  - WHY 项目选择它
- 关键概念解释
  - 技术栈特有的概念
  - WHY 这样设计
  - 与其他技术栈对比
- 学习资源
  - 官方文档链接
  - WHY 推荐这些资源
  - 学习路径建议
- Technology Background Explanation
  - What this technology stack is
  - WHY this technology stack exists
  - WHY the project chose it
- Key Concept Explanation
  - Concepts unique to this technology stack
  - WHY it is designed this way
  - Comparison with other technology stacks
- Learning Resources
  - Official documentation links
  - WHY these resources are recommended
  - Learning path suggestions
分析前最终检查
Final Pre-Analysis Check
在开始分析前,确认:
记住:目标不是"看完代码",而是"真正理解代码"。
Before starting analysis, confirm:
Remember: The goal is not to "finish reading the code", but to "truly understand the code".
📤 输出要求(Token 优化版)
📤 Output Requirements (Token-Optimized Version)
分析完成后,必须生成独立的 Markdown 文档!
After completing analysis, must generate independent Markdown document!
三种模式的文档生成策略
Document Generation Strategies for Three Modes
| 模式 | 生成方式 | 文件数量 | 适用场景 |
| --- | --- | --- | --- |
| Quick | 单次 Write | 1 | 快速代码审查 |
| Standard | 单次 Write | 1 | 学习理解代码 |
| Deep | 根据规模自动选择策略 | 1-2 | 深度掌握、大型项目 |
| → 代码 ≤ 2000 行 | 渐进式 Write | 1-2 | 面试准备、完全掌握 |
| → 代码 > 2000 行 | 并行处理 + 汇总 | 多个临时章节 → 1 个最终文档 | 大型项目、复杂代码库 |
| Mode | Generation Method | Number of Files | Applicable Scenarios |
| --- | --- | --- | --- |
| Quick | Single Write | 1 | Quick code review |
| Standard | Single Write | 1 | Learning and understanding code |
| Deep | Automatically select strategy based on scale | 1-2 | In-depth mastery, large projects |
| → Code ≤ 2000 lines | Progressive Write | 1-2 | Interview preparation, complete mastery |
| → Code > 2000 lines | Parallel Processing + Aggregation | Multiple temporary chapters → 1 final document | Large projects, complex codebases |
⚡ Token 节省策略
⚡ Token Saving Strategies
重要原则:避免重复输出,直接写入文件
- 禁止在对话中输出完整分析
  - 完整分析直接写入文件,不输出到对话
  - 对话中仅输出:分析摘要 + 文件路径
- 分块处理大型项目
  - 单文件分析:生成单个文档
  - 多文件项目:按模块生成多个文档
  - 超长分析:拆分为多个文档(如 `module-name-detailed-analysis.md`)
- 渐进式生成(适用于 Deep Mode)
  - 先生成框架文档(目录 + 概要)
  - 逐节填充内容,每次调用 Write 追加更新
Important Principle: Avoid duplicate output, write directly to files
- Prohibit outputting the complete analysis in conversation
  - The complete analysis is written directly to a file, not output to the conversation
  - Only output the analysis summary + file path in the conversation
- Chunk processing for large projects
  - Single-file analysis: generate a single document
  - Multi-file project: generate multiple documents by module
  - Ultra-long analysis: split into multiple documents (e.g. `module-name-detailed-analysis.md`)
- Progressive Generation (for Deep Mode)
  - First generate a framework document (table of contents + overview)
  - Fill in content section by section, calling Write each time to append updates
文档生成规则
Document Generation Rules
- 文件命名格式
  - 单文件:`[代码名称]-深度分析.md` 或 `[code-name]-deep-analysis.md`
  - 多文件项目:`[项目名]-概述.md` + `[模块名]-分析.md`
  - 例如:`jwt-authentication-deep-analysis.md`、`quicksort-deep-analysis.md`
- 生成方式(Token 优化流程)
方式一:直接写入(推荐)
用户: 深入分析这段代码
1. [完成分析过程,不输出完整内容]
2. 直接使用 Write 工具生成文档:
文件路径: [代码名称]-深度分析.md
内容: [完整分析内容]
3. 在对话中输出简要摘要:
- 分析模式:Standard/Deep
- 核心发现:3-5 条要点
- 文件路径:[代码名称]-深度分析.md
方式二:多文件项目分块生成
1. [完成整体分析]
2. 生成概述文档:
Write: [项目名]-概述.md
内容:整体架构、模块关系图、分析框架
3. 逐模块生成详细文档:
Write: [模块A]-分析.md
Write: [模块B]-分析.md
Write: [模块C]-分析.md
4. 输出摘要:
- 生成了 4 个文档
- 列出所有文件路径
方式三:Deep Mode(根据代码规模自动选择策略)
Deep Mode 会根据代码规模自动选择最优生成策略:
【策略 A:渐进式生成】代码 ≤ 2000 行时
- 先生成框架文档(目录 + 概要)
- 逐节填充内容,每次调用 Write 追加更新
- 参见前文 "Deep Mode 输出结构 - 策略 A" 章节
【策略 B:并行处理】代码 > 2000 行时
1. 主 Agent 生成框架和任务分配
2. 使用 Task tool 创建多个并行子 Agent
3. 每个子 Agent 专注一个章节,生成独立文件
4. 主 Agent 汇总所有章节,生成最终文档
文件结构:
work/
├── 00-框架.json # 主 Agent 生成的框架
├── tasks/ # 子任务描述目录
├── chapters/ # 子 Agent 生成的章节
└── [项目名]-完全掌握分析.md # 最终汇总文档
示例 Task 调用:
Task(
description: "深度分析[章节名]章节",
prompt: "你是[章节名]分析专家,请深度分析...[具体指令]",
subagent_type: "general-purpose"
)
- 对话输出格式(精简版)

```markdown
## 分析完成
**模式:** Standard Mode
**核心发现:**
- 代码实现了 [核心功能]
- 使用 [算法/模式] 解决 [问题]
- 关键优化点:[优化点1]、[优化点2]
- 潜在问题:[问题1]、[问题2]
**完整文档:** `[代码名称]-深度分析.md`
```
- File Naming Format
  - Single file: `[代码名称]-深度分析.md` or `[code-name]-deep-analysis.md`
  - Multi-file project: `[project-name]-overview.md` + `[module-name]-analysis.md`
  - Examples: `jwt-authentication-deep-analysis.md`, `quicksort-deep-analysis.md`
- Generation Method (Token-Optimized Flow)
Method 1: Direct Write (Recommended)
User: Conduct in-depth analysis of this code
1. [Complete analysis process, do not output complete content]
2. Use Write tool directly to generate document:
File path: [code-name]-deep-analysis.md
Content: [Complete analysis content]
3. Output brief summary in conversation:
- Mode: Standard/Deep
- Key findings: 3-5 key points
- File path: [code-name]-deep-analysis.md
Method 2: Chunk Generation for Multi-File Projects
1. [Complete overall analysis]
2. Generate overview document:
Write: [project-name]-overview.md
Content: Overall architecture, module relationship diagram, analysis framework
3. Generate detailed documents by module:
Write: [moduleA]-analysis.md
Write: [moduleB]-analysis.md
Write: [moduleC]-analysis.md
4. Output summary:
- Generated 4 documents
- List all file paths
Method 3: Deep Mode (Automatically select strategy based on code scale)
Deep Mode automatically selects optimal strategy based on code scale.
[Strategy A: Progressive Generation] When code ≤ 2000 lines
- First generate framework document (table of contents + overview)
- Fill content section by section, use Write to append updates each time
- Refer to "Deep Mode Output Structure - Strategy A" section above
[Strategy B: Parallel Processing] When code > 2000 lines
1. Main Agent generates framework and task allocation
2. Use Task tool to create multiple parallel sub-Agents
3. Each sub-Agent focuses on one chapter, generates independent file
4. Main Agent aggregates all chapters, generates final document
File structure:
work/
├── 00-framework.json # Framework generated by Main Agent
├── tasks/ # Sub-task description directory
├── chapters/ # Chapters generated by sub-Agents
└── [project-name]-complete-mastery-analysis.md # Final aggregated document
Example Task call:
Task(
description: "In-depth analysis of [chapter-name] chapter",
prompt: "You are a [chapter-name] analysis expert, please conduct in-depth analysis...[specific instructions]",
subagent_type: "general-purpose"
)
- Conversation Output Format (Simplified Version)

```markdown
## Analysis Completed
**Mode:** Standard Mode
**Key Findings:**
- The code implements [core function]
- Uses [algorithm/pattern] to solve [problem]
- Key optimization points: [optimization point 1], [optimization point 2]
- Potential issues: [issue 1], [issue 2]
**Complete Document:** `[code-name]-deep-analysis.md`
```
输出流程对比
Output Process Comparison
❌ 高 Token 消耗方式(避免):
1. 在对话中输出 5000 token 的完整分析
2. 再次用 Write 工具写入 5000 token
→ 总计:10000+ token 输出
✅ Token 优化方式(推荐):
1. 直接用 Write 工具写入 5000 token
2. 对话中输出 200 token 摘要
→ 总计:5200 token 输出(节省 ~50%)
❌ High Token Consumption Method (Avoid):
1. Output 5000-token complete analysis in conversation
2. Use Write tool to write another 5000 tokens
→ Total: 10000+ tokens output
✅ Token-Optimized Method (Recommended):
1. Use Write tool directly to write 5000 tokens
2. Output 200-token summary in conversation
→ Total: 5200 tokens output (saves ~50%)
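The arithmetic behind that comparison can be checked directly. A tiny sketch using the illustrative token counts from the example above (variable names are ours):

```python
# Illustrative token budget from the comparison above.
analysis_tokens = 5000   # full analysis document
summary_tokens = 200     # short summary in conversation

duplicated = analysis_tokens * 2              # analysis in chat, then again via Write
optimized = analysis_tokens + summary_tokens  # Write once, summary in chat

savings = 1 - optimized / duplicated
print(duplicated, optimized, f"{savings:.0%}")  # → 10000 5200 48%
```

The exact saving is 48%, which the guide rounds to "~50%".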
大型项目分块指南
Large Project Chunking Guide
| 项目规模 | 推荐模式 | 生成策略 | 文件结构 |
| --- | --- | --- | --- |
| < 500 行 | Quick/Standard | 单文档 | |
| 500-2000 行 | Standard | 单文档(可能较长) | |
| 2000-10000 行 | Deep(自动并行) | 并行章节 | 多个临时章节 → 1个最终文档 |
| > 10000 行 | Deep(自动并行) | 分层并行 | 模块级并行 + 章节级并行 |
重要:不要在对话中输出完整分析结果,直接写入文件,仅输出摘要!
| Project Scale | Recommended Mode | Generation Strategy | File Structure |
| --- | --- | --- | --- |
| < 500 lines | Quick/Standard | Single document | |
| 500-2000 lines | Standard | Single document (may be long) | |
| 2000-10000 lines | Deep (automatic parallel) | Parallel chapters | Multiple temporary chapters → 1 final document |
| > 10000 lines | Deep (automatic parallel) | Hierarchical parallel | Module-level parallel + chapter-level parallel |
Important: Do not output complete analysis results in conversation, write directly to file, only output summary!
🚀 Deep Mode 自动实现指南(给 Claude 的具体指令)
🚀 Deep Mode Automatic Implementation Guide (Specific Instructions for Claude)
Deep Mode 会根据代码规模自动选择最优策略。当需要并行处理时:
Deep Mode automatically selects optimal strategy based on code scale. When parallel processing is needed:
步骤 1: 识别是否需要并行处理
Step 1: Identify if Parallel Processing Is Needed
自动触发条件(满足任一即使用并行处理):
- 代码文件数 > 10
- 代码总行数 > 2000
- 用户明确说"大项目"、"完整项目"、"项目整体分析"
- 用户使用"彻底"、"完全掌握"、"深入研究"等深度触发词且代码规模较大
Automatic trigger conditions (use parallel processing if any are met):
- Number of code files > 10
- Total code lines > 2000
- User explicitly mentions "large project", "complete project", "overall project analysis"
- User uses depth trigger words like "thoroughly", "complete mastery", "in-depth research" and code scale is large
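The trigger list above can be sketched as a small predicate. This is a hypothetical illustration: the function name and signature are ours, and because the guide leaves "code scale is large" (较大) unquantified, that judgment is passed in as a flag rather than guessed:

```python
def needs_parallel(file_count: int, total_lines: int, user_message: str,
                   scale_is_large: bool) -> bool:
    """Return True if any of the guide's parallel-processing triggers fire."""
    if file_count > 10:        # more than 10 code files
        return True
    if total_lines > 2000:     # more than 2000 total lines
        return True
    # Explicit large-project wording triggers parallel processing by itself.
    if any(p in user_message for p in ["大项目", "完整项目", "项目整体分析"]):
        return True
    # Depth trigger words count only when the code scale is judged large.
    return scale_is_large and any(
        w in user_message for w in ["彻底", "完全掌握", "深入研究"])
```

For example, `needs_parallel(3, 500, "帮我理解这段代码", False)` is `False`, while `needs_parallel(12, 500, "", False)` is `True`.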
步骤 2: 选择处理策略
Step 2: Select Processing Strategy
if 代码行数 <= 2000:
使用策略 A:渐进式生成(顺序处理)
else:
使用策略 B:并行处理(下文详述)
if code_lines <= 2000:
use Strategy A: Progressive Generation (sequential processing)
else:
use Strategy B: Parallel Processing (detailed below)

步骤 3: 并行处理准备(策略 B)
Step 3: Parallel Processing Preparation (Strategy B)
创建工作目录
Create working directory
mkdir -p code-analysis/{tasks,chapters}
mkdir -p code-analysis/{tasks,chapters}
生成框架文件
Generate framework file
cat > code-analysis/00-framework.json << 'EOF'
{
"project_name": "[项目名]",
"language": "[语言]",
"total_lines": [行数],
"core_concepts": [概念列表],
"chapters": [
"背景与动机", "核心概念", "算法理论",
"设计模式", "代码解析", "应用迁移",
"依赖关系", "质量验证"
]
}
EOF
cat > code-analysis/00-framework.json << 'EOF'
{
"project_name": "[project-name]",
"language": "[language]",
"total_lines": [line-count],
"core_concepts": [concept-list],
"chapters": [
"Background and Motivation", "Core Concepts", "Algorithm Theory",
"Design Patterns", "Code Analysis", "Application Transfer",
"Dependency Relationships", "Quality Verification"
]
}
EOF
步骤 4: 创建并行子 Agent
Step 4: Create Parallel Sub-Agents
对于每个章节,使用 Task tool 创建独立的子 Agent:
Task(
description: "深度分析[章节名称]章节",
prompt: """
你是[章节名称]分析专家。
## 上下文
- 项目:{project_name}
- 语言:{language}
- 核心概念:{core_concepts}
## 任务
深度分析代码的[章节名称]部分,生成详细章节内容(至少{min_words}字)。
## 要求
- 使用场景/步骤 + WHY 风格注释
- 每个关键点回答 3 个 WHY
- 提供具体执行示例
- 引用权威来源
## 输出
将完整章节内容写入文件:
code-analysis/chapters/{章节名}.md
""",
subagent_type: "general-purpose"
)
For each chapter, use Task tool to create independent sub-Agents:
Task(
description: "In-depth analysis of [chapter-name] chapter",
prompt: """
You are a [chapter-name] analysis expert.
## Context
- Project: {project_name}
- Language: {language}
- Core Concepts: {core_concepts}
## Task
Conduct in-depth analysis of the [chapter-name] section of the code, generate detailed chapter content (at least {min_words} words).
## Requirements
- Use scenario/step + WHY style comments
- Each key point answers 3 WHY questions
- Provide specific execution examples
- Cite authoritative sources
## Output
Write complete chapter content to file:
code-analysis/chapters/{chapter-name}.md
""",
subagent_type: "general-purpose"
)
步骤 5: 汇总结果
Step 5: Aggregate Results
等待所有子 Agent 完成后,使用 Read 工具读取所有章节文件,按顺序合并:
1. 读取 code-analysis/00-framework.json
2. 读取 code-analysis/chapters/*.md(按顺序)
3. 合并为最终文档
4. 写入 {项目名}-完全掌握分析.md
After all sub-Agents are completed, use Read tool to read all chapter files, merge in order:
1. Read code-analysis/00-framework.json
2. Read code-analysis/chapters/*.md (in order)
3. Merge into final document
4. Write to {project-name}-complete-mastery-analysis.md
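The four aggregation steps above can be sketched as a helper. This is an illustration, not part of the guide: the `aggregate` name is ours, and the paths follow the `code-analysis/` work layout shown earlier:

```python
import json
from pathlib import Path

def aggregate(workdir: str = "code-analysis") -> Path:
    """Merge sub-agent chapter files into the final document, in framework order."""
    root = Path(workdir)
    framework = json.loads((root / "00-framework.json").read_text(encoding="utf-8"))
    parts = [f"# {framework['project_name']} 完全掌握分析"]
    # Merge chapters in the order declared by the framework file.
    for name in framework["chapters"]:
        chapter = root / "chapters" / f"{name}.md"
        if chapter.exists():  # skip any chapter a sub-agent failed to write
            parts.append(chapter.read_text(encoding="utf-8"))
    final = Path(f"{framework['project_name']}-完全掌握分析.md")
    final.write_text("\n\n".join(parts), encoding="utf-8")
    return final
```

Reading the framework first keeps chapter order deterministic even if the `chapters/` directory listing is not.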
📋 章节深度自检标准(确保质量)
📋 Chapter Depth Self-Check Standards (Ensure Quality)
Deep Mode 生成时,每章完成后必须通过以下检查:
When generating in Deep Mode, each chapter must pass the following checks:
章节深度自检清单
Chapter Depth Self-Check Checklist
1. 内容完整性(必填项)
1. Content Completeness (Mandatory)
2. 分析深度(按章节类型)
2. Analysis Depth (By Chapter Type)
概念类章节(第 3 章):
算法类章节(第 4 章):
设计模式类章节(第 5 章):
代码解析类章节(第 6 章):
Concept Chapters (Chapter 3):
Algorithm Chapters (Chapter 4):
Design Pattern Chapters (Chapter 5):
Code Analysis Chapters (Chapter 6):
3. 实用性(应用价值)
3. Practicality (Application Value)
4. 格式规范
4. Format Specification
不合格章节的处理
Handling of Unqualified Chapters
情况 A:内容过少(< 300 字)
→ 追加细节:添加更多解释、示例、对比
情况 B:WHY 分析不足
→ 补充 WHY:对每个核心点追问"为什么"
情况 C:代码注释不完整
→ 添加详细注释:使用 场景/步骤 + WHY 风格
情况 D:执行流程缺失
→ 添加具体数据示例:追踪变量变化轨迹
**快速深度评估标准:**
| 章节 | 最低字数 | 必含元素 |
|-----|---------|---------|
| 1. 快速概览 | 200 | 语言、规模、依赖、类型 |
| 2. 背景与动机 | 400 | 问题本质、方案选择、应用场景 |
| 3. 核心概念 | 600 | 每概念 3 WHY、关系矩阵 |
| 4. 算法与理论 | 500 | 复杂度、WHY、参考资料 |
| 5. 设计模式 | 400 | 模式名、WHY、标准参考 |
| 6. 关键代码解析 | 800 | 逐行解析、执行示例、场景追踪 |
| 7. 测试用例分析 | 400 | 测试覆盖、边界条件、测试发现 |
| 8. 应用迁移 | 500 | 至少 2 场景、不变原理、修改部分 |
| 9. 依赖关系 | 300 | 每依赖的 WHY、使用示例 |
| 10. 质量验证 | 200 | 验证清单、四能测试 |
**总计:Deep Mode 文档应 ≥ 4300 字**
Case A: Insufficient Content (<300 words)
→ Append details: Add more explanations, examples, comparisons
Case B: Insufficient WHY Analysis
→ Supplement WHY: Ask "why" for each core point
Case C: Incomplete Code Comments
→ Add detailed comments: Use scenario/step + WHY style
Case D: Missing Execution Flow
→ Add specific data examples: Track variable change trajectories
**Quick Depth Evaluation Standards:**
| Chapter | Minimum Word Count | Mandatory Elements |
|-----|---------|---------|
| 1. Quick Overview | 200 | Language, scale, dependencies, type |
| 2. Background and Motivation | 400 | Problem essence, solution selection, application scenarios |
| 3. Core Concepts | 600 | 3 WHY per concept, relationship matrix |
| 4. Algorithms and Theory | 500 | Complexity, WHY, reference materials |
| 5. Design Patterns | 400 | Pattern name, WHY, standard reference |
| 6. In-Depth Key Code Analysis | 800 | Line-by-line analysis, execution examples, scenario tracking |
| 7. Test Case Analysis | 400 | Test coverage, boundary conditions, test findings |
| 8. Application Transfer | 500 | At least 2 scenarios, invariant principles, modified parts |
| 9. Dependency Relationships | 300 | WHY for each dependency, usage examples |
| 10. Quality Verification | 200 | Verification checklist, four abilities test |
**Total: Deep Mode document should be ≥ 4300 words**
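The minimums in the table can be checked mechanically. A sketch of such a self-check (our illustration: since Chinese 字 has no whitespace word boundaries, "word count" is approximated here by character count, an assumption rather than a rule from this guide):

```python
# Minimum sizes per chapter, copied from the evaluation table above.
MIN_CHARS = {
    "快速概览": 200, "背景与动机": 400, "核心概念": 600, "算法与理论": 500,
    "设计模式": 400, "关键代码解析": 800, "测试用例分析": 400,
    "应用迁移": 500, "依赖关系": 300, "质量验证": 200,
}

def underfilled(chapters: dict) -> list:
    """Return names of chapters whose body falls below the table's minimum."""
    return [name for name, minimum in MIN_CHARS.items()
            if len(chapters.get(name, "")) < minimum]

# The per-chapter minimums sum to the document-level floor stated above.
assert sum(MIN_CHARS.values()) == 4300
```

Any chapter reported by `underfilled` would then go through the Case A–D remediation steps listed earlier.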