re-structure-analysis

Structure Analysis — Reverse Engineering Phase 1

Analyze the target codebase structure exhaustively. Identify ALL entry points, dependencies, modules, and components within the user-specified scope. Produce a structure map document and initialize `manifest.json` for pipeline chaining.

Three Principles

These principles are non-negotiable. Violating any principle invalidates the analysis.

1. Code is Truth

  • Document what IS implemented, not what SHOULD be implemented
  • If code appears buggy, document the behavior as-is and note it in the Question List
  • Prioritize actual code over comments, variable names, or documentation

2. Traceability to Line

  • Every finding MUST include a `file:line` reference
  • If a line number cannot be determined, exclude the finding and note it in the Question List
  • References without line numbers are considered hallucination
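As an illustration of the traceability rule, the check below separates findings that carry a `file:line` reference from those that belong in the Question List. It is a minimal sketch: the regular expression and the function names are assumptions made for this example, not part of the skill.

```python
import re

# Assumed shape of a reference: a path with an extension, a colon, a 1-based line number.
FILE_LINE = re.compile(r"(?P<file>[\w./-]+\.\w+):(?P<line>[1-9]\d*)\b")


def has_file_line_ref(finding: str) -> bool:
    """Return True if the finding text contains at least one file:line reference."""
    return FILE_LINE.search(finding) is not None


def split_findings(findings):
    """Split findings into (kept, questions) based on whether they cite a line."""
    kept = [f for f in findings if has_file_line_ref(f)]
    questions = [f for f in findings if not has_file_line_ref(f)]
    return kept, questions
```

Findings returned in the second list would be excluded from the structure map and recorded in the Question List instead.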

3. Behavior over Intent

  • Focus on observable behavior: inputs, outputs, side effects
  • Do NOT infer business intent or "why" from code
  • Document "what" and "how" only

Execution

Step 1: Setup

Determine analysis name:
  • Use the `name` argument if provided
  • Otherwise derive from target: class name, directory name, or file stem
Detect language:
  • If the `language` argument is provided, use it
  • Otherwise detect from file extensions and project files:
    • `.cs` / `*.csproj` → `csharp`
    • `.py` / `pyproject.toml` / `setup.py` → `python`
    • `.ts` / `.tsx` / `package.json` with typescript → `typescript`
    • `.java` / `pom.xml` / `build.gradle` → `java`
    • Other → `generic`
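The detection table can be sketched as a lookup. This is a simplified illustration under stated assumptions: the first matching file wins, and the `package.json` check is omitted because it requires reading the file's contents. The function and constant names are hypothetical.

```python
from pathlib import Path

# Markers copied from the detection list; values are the language identifiers.
EXTENSION_LANGUAGES = {
    ".cs": "csharp", ".csproj": "csharp",
    ".py": "python",
    ".ts": "typescript", ".tsx": "typescript",
    ".java": "java",
}
PROJECT_FILE_LANGUAGES = {
    "pyproject.toml": "python", "setup.py": "python",
    "pom.xml": "java", "build.gradle": "java",
}


def detect_language(paths, language=None):
    """Return the explicit language if given, else the first language matched by a path."""
    if language:
        return language
    for raw in paths:
        path = Path(raw)
        if path.name in PROJECT_FILE_LANGUAGES:
            return PROJECT_FILE_LANGUAGES[path.name]
        if path.suffix in EXTENSION_LANGUAGES:
            return EXTENSION_LANGUAGES[path.suffix]
    return "generic"  # the "Other" case
```

A mixed-language scope would need a tie-breaking rule that the list above does not define, so the first-match behavior here is only a placeholder.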
Load language reference:
  • Check if `references/{language}.md` exists in this skill's directory
  • If it exists, read and apply the language-specific patterns
  • If not, proceed with language-agnostic analysis
Detect tool availability:
  • Check if Serena MCP tools are available (`mcp__serena__find_symbol`, etc.)
  • If available, use Serena as the primary analysis tool
  • If not, use Read, Grep, Glob, Bash exclusively
Create output directory and manifest:
```bash
mkdir -p docs/reverse/{name}
mkdir -p docs/reverse/{name}/verification
```
Initialize `manifest.json`:
```json
{
  "name": "{name}",
  "language": "{detected-language}",
  "created": "{YYYY-MM-DD}",
  "updated": "{YYYY-MM-DD}",
  "targets": {
    "entry_points": ["{user-specified targets}"],
    "classes": ["{extracted class names}"],
    "specified_by": "user"
  },
  "phase1": {
    "status": "in_progress",
    "output": null,
    "verification": null,
    "targets_for_phase2": []
  },
  "phase2": {
    "status": "pending",
    "completed": [],
    "remaining": [],
    "targets_for_phase3": []
  },
  "phase3": {
    "status": "pending",
    "completed": [],
    "remaining": []
  },
  "phase4": {
    "status": "pending",
    "output": null,
    "verification": null
  }
}
```
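For illustration, the initial manifest could be built programmatically as below. This is a sketch only: the function name `new_manifest` is hypothetical, and the location `docs/reverse/{name}/manifest.json` mentioned in the comment is an assumption, since the text does not state where the file is written.

```python
import json
from datetime import date


def new_manifest(name, language, entry_points, classes):
    """Build the initial manifest: phase 1 in progress, later phases pending."""
    today = date.today().isoformat()  # YYYY-MM-DD
    return {
        "name": name,
        "language": language,
        "created": today,
        "updated": today,
        "targets": {
            "entry_points": list(entry_points),
            "classes": list(classes),
            "specified_by": "user",
        },
        "phase1": {"status": "in_progress", "output": None,
                   "verification": None, "targets_for_phase2": []},
        "phase2": {"status": "pending", "completed": [],
                   "remaining": [], "targets_for_phase3": []},
        "phase3": {"status": "pending", "completed": [], "remaining": []},
        "phase4": {"status": "pending", "output": None, "verification": None},
    }


manifest = new_manifest("order", "python", ["src/services/order.py"], ["OrderService"])
# Assumed destination: docs/reverse/order/manifest.json
manifest_text = json.dumps(manifest, indent=2)
```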

Step 2: Entry Point Detection

Identify all entry points within the target scope.
With Serena:
```
mcp__serena__get_symbols_overview(relative_path="{target}", depth=1)
mcp__serena__find_symbol(name_path_pattern="Main|main|__main__|app")
```
Without Serena:
  • Use Grep to search for entry point patterns from the language reference
  • Use Read to examine candidate files
  • Use Glob to find project configuration files
For each entry point, record:
  • File path (relative to workspace root)
  • Line number
  • Function/method name
  • Purpose (derived from code, not inferred)
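Without Serena, the search amounts to a line-by-line pattern scan that keeps the line number of each hit. The sketch below assumes two Python entry point patterns chosen for this example; in practice the patterns would come from the language reference. It reads source text only and never executes it.

```python
import re

# Assumed patterns for Python; real patterns belong in references/{language}.md.
ENTRY_PATTERNS = [
    re.compile(r'^if __name__ == ["\']__main__["\']:'),
    re.compile(r"^def main\("),
]


def find_entry_points(path, source):
    """Return one record per source line that matches an entry point pattern."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if any(pattern.search(line) for pattern in ENTRY_PATTERNS):
            hits.append({"file": path, "line": lineno, "text": line.strip()})
    return hits
```

Each record carries the file path and line number required by the list above; the purpose still has to be read from the code at that location.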

Step 3: Dependency Mapping

Identify dependencies within the target scope.
Package/library dependencies:
  • Read project configuration files (language-specific: `*.csproj`, `pyproject.toml`, `package.json`, `pom.xml`)
  • List external dependencies with versions
Internal module dependencies:
  • Trace imports/using statements from the target files
  • Build a dependency graph
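For Python targets, import tracing can be done statically with the standard `ast` module, which parses source without running it. This is a sketch for one language only; the function name is hypothetical and other languages would need their own parser or a Grep-based approach.

```python
import ast


def imported_modules(source):
    """Return (module, line) pairs for every import statement, in line order."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.extend((alias.name, node.lineno) for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports keep their leading dots, e.g. ".models".
            found.append(("." * node.level + (node.module or ""), node.lineno))
    return sorted(found, key=lambda pair: pair[1])
```

Pairing each result with the file it came from gives the edges of the dependency graph, each traceable to a line.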
Produce Mermaid diagram:
```mermaid
graph TD
    ModuleA --> ModuleB
    ModuleA --> ExternalLib
    ModuleB --> ModuleC
```
Mermaid syntax rules (prevent parse errors):
  • No special characters in text (`!=`, `>=`, `[]`, `()`)
  • No array/generic syntax (`byte[]`, `List<string>` → `byte array`, `string list`)
  • No method call parentheses (`Method()` → `Method`)
  • Use quotes for node text with spaces: `A["Node text (L45)"]`
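The rewrite rules above can be applied mechanically before emitting a diagram. The helpers below are a sketch with hypothetical names; they cover only the three substitutions shown in the rules and a simple `graph TD` edge list.

```python
import re


def plain_label(text):
    """Rewrite array, generic, and call syntax into plain words."""
    text = re.sub(r"(\w+)\[\]", r"\1 array", text)    # byte[] -> byte array
    text = re.sub(r"List<(\w+)>", r"\1 list", text)   # List<string> -> string list
    return text.replace("()", "")                     # Method() -> Method


def mermaid_id(name):
    """Reduce a module name to characters that are safe in an unquoted node ID."""
    return re.sub(r"\W+", "_", name).strip("_")


def to_mermaid(edges):
    """Render (source, target) pairs as a top-down Mermaid graph."""
    lines = ["graph TD"]
    lines += [f"    {mermaid_id(src)} --> {mermaid_id(dst)}" for src, dst in edges]
    return "\n".join(lines)
```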

Step 4: Module and Component Listing

List ALL classes, interfaces, functions, and significant components within the target scope.
With Serena:
```
mcp__serena__search_for_pattern(
    substring_pattern="class |interface |def |function ",
    restrict_search_to_code_files=true
)
mcp__serena__get_symbols_overview(relative_path="{directory}", depth=2)
```
Without Serena:
  • Use Grep to find class/function definitions
  • Use Read to examine each file
  • Use Glob to enumerate source files
For each component, record:
| Component | Type | File:Line | Description |
|---|---|---|---|
| OrderService | class | src/services/order.py:15 | Order processing service |
CRITICAL: List ALL components. Selecting only "important" ones is prohibited. Partial listing invalidates the analysis for downstream phases.
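An exhaustive listing is easier to guarantee when it is generated rather than hand-picked. The sketch below enumerates every class and function definition in a Python file using `ast`; the function name is hypothetical, methods are reported with type `function`, and the Description column is left for the analyst to fill in from the code.

```python
import ast


def list_components(path, source):
    """Return (name, type, file:line) rows for every class and function definition."""
    rows = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            rows.append((node.name, "class", node.lineno))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            rows.append((node.name, "function", node.lineno))
    rows.sort(key=lambda row: row[2])  # present components in source order
    return [(name, kind, f"{path}:{lineno}") for name, kind, lineno in rows]
```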

Step 5: Build Phase 2 Target List

From the component list, identify classes/modules that contain methods requiring logic visualization:
  • Classes with business logic methods
  • Handlers/controllers with processing flows
  • Services with complex operations
For each, record:
```json
{
  "class": "OrderService",
  "file": "src/services/order.py",
  "methods": ["process_order", "validate_order", "calculate_total"]
}
```
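Records of this shape can be drafted from the parsed source and then filtered by hand. The sketch below collects every top-level class with its methods; skipping double-underscore methods is an assumption of this example, and deciding which classes actually contain business logic remains a manual judgment.

```python
import ast


def phase2_targets(path, source):
    """Return one candidate record per top-level class that defines methods."""
    targets = []
    for node in ast.parse(source).body:
        if not isinstance(node, ast.ClassDef):
            continue
        methods = [
            item.name
            for item in node.body
            if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            and not item.name.startswith("__")  # assumption: skip dunder methods
        ]
        if methods:
            targets.append({"class": node.name, "file": path, "methods": methods})
    return targets
```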

Step 6: Generate Output

Write the structure map to `docs/reverse/{name}/01-structure-map.md`, using the template below:

Structure Map: {name}

Analysis Date: {YYYY-MM-DD}
Target: {relative path}
Language: {language}
Framework: {detected framework and version}
Confidence: {High / Medium / Low}

Important: All paths are relative to workspace root.

Technology Stack

Framework and Language

| Component | Version | Source |
|---|---|---|
| {language} | {version} | {file:line} |

Dependencies

| Package | Version | Purpose | Source |
|---|---|---|---|
| {package} | {version} | {purpose} | {config-file:line} |

Directory Tree

{tree output with functional annotations}

Entry Points

| Entry Point | File:Line | Purpose | Evidence |
|---|---|---|---|
| {name} | {file}:{line} | {purpose} | Line {N} |

Dependency Diagram

```mermaid
graph TD
    ...
```

Module and Component List

{Category} (e.g., Services, Handlers, Models)

| Component | Type | File:Line | Description |
|---|---|---|---|
| {name} | {class/interface/function} | {file}:{line} | {description} |

Question List

Unconfirmed Findings

  • [Unconfirmed] {description} — {file}:{line}

Suspected Issues

  • [Suspected Bug] {description} — {file}:{line}

Analysis Constraints

Confidence Factors

  • Code comments: {sparse/moderate/rich}
  • Test coverage: {estimated %}
  • Architecture patterns: {observed patterns}

Evidence Strength

  • Strong: Implementation + tests + clear behavior
  • ⚠️ Medium: Implementation + partial tests
  • Weak: Implementation only, inferred from structure

**Update manifest:**
- Set `phase1.status` to `"completed"`
- Set `phase1.output` to `"01-structure-map.md"`
- Populate `phase1.targets_for_phase2` with the Phase 2 target list
- Update `phase2.remaining` with all component names from `targets_for_phase2`
- Update `updated` timestamp
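The manifest update can be expressed as a single function over the parsed JSON. This is a sketch: the function name is hypothetical, and reading "component names" as the `class` field of each Phase 2 target is an assumption.

```python
from datetime import date


def complete_phase1(manifest, targets):
    """Apply the phase 1 completion updates to a manifest dict and return it."""
    manifest["phase1"]["status"] = "completed"
    manifest["phase1"]["output"] = "01-structure-map.md"
    manifest["phase1"]["targets_for_phase2"] = targets
    # Assumption: the component name is the "class" field of each target record.
    manifest["phase2"]["remaining"] = [target["class"] for target in targets]
    manifest["updated"] = date.today().isoformat()
    return manifest
```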

Validation Before Completion

Before writing output, verify:
  • Every finding has a `file:line` reference
  • No speculative content (all claims verifiable in source)
  • Mermaid diagrams use valid syntax (no special characters)
  • All file paths are relative to workspace root
  • Component list is exhaustive within target scope
  • manifest.json is valid JSON with all required fields
  • `targets_for_phase2` contains entries for all components with analyzable methods
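Two of these checks, manifest validity and relative paths, lend themselves to automation. The sketch below assumes that "required fields" means the top-level keys of the manifest template; the function and constant names are hypothetical.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath

# Assumption: the required fields are the template's top-level keys.
REQUIRED_KEYS = {"name", "language", "created", "updated", "targets",
                 "phase1", "phase2", "phase3", "phase4"}


def validation_problems(manifest_text, paths):
    """Return a list of human-readable problems; an empty list means the checks pass."""
    problems = []
    try:
        manifest = json.loads(manifest_text)
    except json.JSONDecodeError as exc:
        problems.append(f"manifest.json is not valid JSON: {exc.msg}")
    else:
        if not isinstance(manifest, dict):
            problems.append("manifest.json must be a JSON object")
        else:
            missing = sorted(REQUIRED_KEYS - manifest.keys())
            if missing:
                problems.append("manifest.json is missing: " + ", ".join(missing))
    for path in paths:
        # Reject both POSIX-style and Windows-style absolute paths.
        if PurePosixPath(path).is_absolute() or PureWindowsPath(path).is_absolute():
            problems.append(f"absolute path in output: {path}")
    return problems
```

The remaining items, such as exhaustiveness of the component list and absence of speculative content, still require review against the source.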

Prohibited Actions

  • Do NOT execute code or run build commands
  • Do NOT infer business logic from variable names alone
  • Do NOT add components not present in the source code
  • Do NOT modify any source files
  • Do NOT include absolute file paths in output