reverse-engineering-specs

Reverse Engineering Specifications

Overview

For brownfield/legacy projects without documentation, this skill generates implementation-free specifications by exhaustively analyzing existing code. The output is a complete behavioral description that drives autonomous development on top of the existing codebase — enabling safe refactoring, feature addition, and modernization.

Key principle: Document actual behavior, including bugs. Bugs are "documented features" until explicitly marked for fixing.

This is a RIGID skill. Every code path must be traced. No assumptions, no skipping.

Phase 1: Exhaustive Code Investigation

[HARD-GATE] Every code path must be traced. No assumptions, no skipping.

Deploy parallel subagents via the `Agent` tool (up to 500, with `subagent_type="Explore"`) to analyze:

Analysis Target	What to Document	Priority
Entry points	All ways the system can be invoked (HTTP, CLI, events, cron)	P0
Code paths	Every branch, loop, conditional, early return	P0
Data flows	Input → transformation → output for every pipeline	P0
State mutations	Every place state is read, written, or deleted	P0
Error handling	Try/catch blocks, error codes, fallback behaviors	P0
Side effects	External calls, file I/O, database writes, event emissions	P1
Configuration	Environment variables, config files, feature flags	P1
Dependencies	External services, libraries, APIs consumed	P1
Concurrency	Async operations, race conditions, locking mechanisms	P2
Implicit behavior	Convention-based routing, middleware chains, decorators	P2

Investigation Strategy Decision Table

Codebase Size	Strategy	Subagent Count
Small (<50 files)	Single-pass full scan	5-10
Medium (50-500 files)	Module-by-module scan	50-100
Large (500+ files)	Entry-point-first, then depth scan	200-500
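
The decision table above can be sketched as a small lookup. This is a minimal illustration, not part of the skill itself: the function name `chooseStrategy` and the returned object shape are hypothetical, and because the table lists 500 files under both "Medium" and "Large", the sketch assumes exactly 500 files counts as large.

```javascript
// Hypothetical helper: maps a file count to a row of the decision table.
// Assumption: a codebase with exactly 500 files is treated as "large".
function chooseStrategy(fileCount) {
  if (!Number.isInteger(fileCount) || fileCount < 0) {
    throw new RangeError('fileCount must be a non-negative integer');
  }
  if (fileCount < 50) {
    return { size: 'small', strategy: 'single-pass full scan', subagents: [5, 10] };
  }
  if (fileCount < 500) {
    return { size: 'medium', strategy: 'module-by-module scan', subagents: [50, 100] };
  }
  return { size: 'large', strategy: 'entry-point-first, then depth scan', subagents: [200, 500] };
}

// Usage: the subagents pair is the [min, max] range from the table.
console.log(chooseStrategy(120).strategy);
```

Returning the subagent count as a range rather than a single number leaves the final choice to whoever runs the investigation.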

STOP after investigation — present a summary of discovered entry points, data flows, and behaviors. Get confirmation before generating specs.

Phase 2: Behavioral Specification Generation

Transform code analysis into implementation-free specs following the `spec-writing` skill format.

Transformation Rules

Rule	Explanation
Strip ALL implementation details	No function names, variable names, technology references
Describe WHAT, never HOW	Observable behavior only
Document actual behavior (bugs included)	Bugs become "current behavior" in specs
Use Given/When/Then format	For all acceptance criteria
Include data contracts	Input shapes, output shapes, invariants
Separate known issues	Bugs go in KNOWN_ISSUES.md, not inline
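
The Given/When/Then rule lends itself to a mechanical check. The sketch below is one possible reading of that rule, not a definition taken from the `spec-writing` skill: the function name `isGivenWhenThen` is hypothetical, and it assumes each criterion has exactly one Given, one When, and one Then line in that order, with optional And lines after any of them.

```javascript
// Hypothetical checker for a single acceptance criterion.
// Assumption: Given, When, Then appear once each, in order; And lines may
// follow any step but cannot open the criterion.
function isGivenWhenThen(criterion) {
  const lines = criterion.split('\n').map((l) => l.trim()).filter(Boolean);
  const order = ['Given', 'When', 'Then'];
  let step = -1; // index in `order` of the last keyword seen
  for (const line of lines) {
    const keyword = line.split(/\s+/)[0];
    if (keyword === 'And') {
      if (step < 0) return false;
      continue;
    }
    const index = order.indexOf(keyword);
    if (index !== step + 1) return false; // unknown keyword or out of order
    step = index;
  }
  return step === order.length - 1; // all three steps were seen
}

// Usage: a criterion that skips the Given step is rejected.
console.log(isGivenWhenThen('When the request is processed\nThen a 401 status is returned'));
```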

Implementation Detail Stripping

Code Artifact	What You See	What You Write in Spec
`jwt.verify(token, secret)`	Token validation with JWT	"Credentials are validated against the authentication system"
`redis.get(cacheKey)`	Redis cache lookup	"Previously computed results are retrieved from cache"
`if (user.role === 'admin')`	Role check	"Privileged operations require administrator access"
`res.status(429).json(...)`	Rate limiting response	"Excessive requests receive a rate limit error"
`bcrypt.hash(pw, 12)`	Password hashing	"Passwords are stored in a non-reversible format"
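
One way to catch leaked implementation details is to scan draft spec text for technology terms seen during investigation. The sketch below is an illustration under assumptions: `findImplementationLeaks` and `DEFAULT_DENYLIST` are hypothetical names, and the denylist is only seeded with terms from the table above. A term match is a hint for a human reviewer, not proof of a leak.

```javascript
// Hypothetical denylist, seeded from the examples in the table above.
// A real list would be built from the terms found during investigation.
const DEFAULT_DENYLIST = ['jwt', 'redis', 'bcrypt', 'express', 'middleware'];

// Returns one entry per (line, term) match, using whole-word,
// case-insensitive matching.
function findImplementationLeaks(specText, denylist = DEFAULT_DENYLIST) {
  const leaks = [];
  specText.split('\n').forEach((line, i) => {
    for (const term of denylist) {
      // Escape regex metacharacters so terms are matched literally.
      const escaped = term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
      if (new RegExp(`\\b${escaped}\\b`, 'i').test(line)) {
        leaks.push({ line: i + 1, term });
      }
    }
  });
  return leaks;
}

// Usage: the "What You See" wording leaks, the "What You Write" wording does not.
console.log(findImplementationLeaks('Token validation with JWT'));
console.log(findImplementationLeaks('Credentials are validated against the authentication system'));
```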

STOP after spec generation — run the completeness checklist before organizing.

Phase 3: Specification Organization

Create spec files following the naming convention:

specs/
├── 01-[first-capability].md
├── 02-[second-capability].md
├── ...
├── NN-[last-capability].md
└── KNOWN_ISSUES.md
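
The naming convention above can be generated from an ordered list of capability names. This is a sketch under assumptions: `specFileNames` is a hypothetical helper, and the slug rule (lowercase, non-alphanumeric runs replaced by a hyphen) is an assumption, since the convention only shows `[first-capability]` as a placeholder.

```javascript
// Hypothetical helper: builds spec file paths from capability names.
// Assumption: slugs are lowercase with hyphens; numbers are zero-padded to
// at least two digits, matching the 01, 02, ..., NN pattern.
function specFileNames(capabilities) {
  const width = Math.max(2, String(capabilities.length).length);
  const files = capabilities.map((name, i) => {
    const slug = name
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, '-')
      .replace(/^-+|-+$/g, '');
    const number = String(i + 1).padStart(width, '0');
    return `specs/${number}-${slug}.md`;
  });
  // KNOWN_ISSUES.md always sits alongside the numbered specs.
  return [...files, 'specs/KNOWN_ISSUES.md'];
}

// Usage
console.log(specFileNames(['Request Authentication', 'Rate Limiting']));
```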

KNOWN_ISSUES.md Format

markdown

Known Issues

[Issue Title]

Current behavior: [What actually happens]
Expected behavior: [What should happen, if known]
Affected specs: [Which spec files reference this behavior]
Severity: [Critical | High | Medium | Low]
Notes: [Additional context]

Severity Classification

Severity	Criteria	Action
Critical	Data loss, security vulnerability, system crash	Fix before any new features
High	Incorrect results, broken workflow	Fix in next release
Medium	Poor UX, performance issue	Plan for future fix
Low	Cosmetic, minor inconsistency	Fix opportunistically
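
The severity table can double as a triage order for the issues collected in KNOWN_ISSUES.md. The following is a minimal sketch: the `triage` function and the issue object shape (`title`, `severity`) are hypothetical, and the action strings are copied from the table above.

```javascript
// Severity levels in priority order, with the action from the table above.
const SEVERITY_ORDER = ['Critical', 'High', 'Medium', 'Low'];
const SEVERITY_ACTION = {
  Critical: 'Fix before any new features',
  High: 'Fix in next release',
  Medium: 'Plan for future fix',
  Low: 'Fix opportunistically',
};

// Hypothetical helper: attaches the action to each issue and sorts the
// list from most to least severe. Unknown severities are rejected.
function triage(issues) {
  return issues
    .map((issue) => {
      if (!SEVERITY_ORDER.includes(issue.severity)) {
        throw new Error(`Unknown severity: ${issue.severity}`);
      }
      return { ...issue, action: SEVERITY_ACTION[issue.severity] };
    })
    .sort((a, b) => SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity));
}

// Usage with made-up issue titles
console.log(triage([
  { title: 'Misaligned footer', severity: 'Low' },
  { title: 'Orders lost on timeout', severity: 'Critical' },
]));
```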

STOP after organization — present the spec file list and KNOWN_ISSUES for review.

Phase 4: Quality Verification

[HARD-GATE] All checks must pass before this phase is complete.

#	Check	Question	Status
1	Entry points	Are ALL entry points documented?	[ ]
2	Code paths	Are ALL branches and conditionals traced?	[ ]
3	Data flows	Are ALL input→output pipelines described?	[ ]
4	State mutations	Are ALL state changes captured?	[ ]
5	Error handling	Are ALL error paths documented?	[ ]
6	Side effects	Are ALL external interactions noted?	[ ]
7	Edge cases	Are boundary conditions described?	[ ]
8	Concurrency	Are async behaviors documented?	[ ]
9	Configuration	Are ALL config options listed?	[ ]
10	Dependencies	Are ALL external dependencies identified?	[ ]
11	Implementation-free	Zero code, tech names, or architecture in specs?	[ ]
12	Given/When/Then	All acceptance criteria in correct format?	[ ]
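
The hard gate amounts to "all 12 checks are true". A minimal sketch of that rule follows; `evaluateGate` and the shape of the `results` object are hypothetical, and the sketch assumes a check that was never recorded counts as failing.

```javascript
// The 12 check names from the checklist above.
const CHECKS = [
  'Entry points', 'Code paths', 'Data flows', 'State mutations',
  'Error handling', 'Side effects', 'Edge cases', 'Concurrency',
  'Configuration', 'Dependencies', 'Implementation-free', 'Given/When/Then',
];

// Hypothetical gate: `results` maps a check name to true or false.
// Assumption: a missing or non-true entry is treated as a failure.
function evaluateGate(results) {
  const failing = CHECKS.filter((name) => results[name] !== true);
  return { passed: failing.length === 0, failing };
}

// Usage: one unchecked item keeps the gate closed.
const allPassing = Object.fromEntries(CHECKS.map((name) => [name, true]));
console.log(evaluateGate({ ...allPassing, Concurrency: false }));
```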

Concrete Example: Code to Spec Transformation

Code (input — what you analyze):

javascript

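// Assumes jwt comes from the jsonwebtoken package.
const jwt = require('jsonwebtoken');

function checkAuth(req, res, next) {
  // Take the second space-separated part of the Authorization header.
  const token = req.headers.authorization?.split(' ')[1];
  // No header, or no second part: reject with 401.
  if (!token) return res.status(401).json({ error: 'No token' });
  try {
    // verify() throws on a bad signature or an expired token.
    const decoded = jwt.verify(token, process.env.JWT_SECRET);
    req.user = decoded; // expose the decoded payload to later handlers
    next();
  } catch (e) {
    // Any verification failure is reported as 403.
    return res.status(403).json({ error: 'Invalid token' });
  }
}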

Spec (output — what you produce):

markdown

Request Authentication

Job to Be Done

When a request arrives at a protected endpoint, I want to verify the caller's identity, so I can ensure only authorized users access the system.

Acceptance Criteria

Valid Credentials

Given a request with valid credentials in the authorization header
When the request is processed
Then the request proceeds to the next handler
And the authenticated user identity is available to downstream handlers

Missing Credentials

Given a request without credentials
When the request is processed
Then a 401 status is returned
And an error message indicates missing credentials

Invalid Credentials

Given a request with invalid or expired credentials
When the request is processed
Then a 403 status is returned
And an error message indicates invalid credentials

Edge Cases

Malformed authorization header (missing "Bearer" prefix): treated as missing credentials
Expired credentials: treated as invalid credentials

Data Contracts

Input: Authorization header in "Bearer <credential>" format
Output on success: User identity object attached to request context
Output on failure: JSON error response with appropriate status code


Notice: No mention of JWT, middleware, Express, environment variables, or any implementation detail.
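
The acceptance criteria in this example can be turned into executable scenarios. The sketch below is an illustration, not the `acceptance-testing` skill's method: `makeAuthCheck`, `fakeResponse`, and `runScenario` are hypothetical names, and the credential check is injected as a stub so the example runs without any token library. The stub treats the credential `good` as valid and everything else as invalid.

```javascript
// Hypothetical stand-in for the handler under test. It mirrors the analyzed
// code, but the credential check is injected so no token library is needed.
function makeAuthCheck(verifyCredential) {
  return function authCheck(req, res, next) {
    const credential = req.headers.authorization?.split(' ')[1];
    if (!credential) return res.status(401).json({ error: 'No token' });
    try {
      req.user = verifyCredential(credential);
      return next();
    } catch (e) {
      return res.status(403).json({ error: 'Invalid token' });
    }
  };
}

// Minimal fake response that records the status and body it was given.
function fakeResponse() {
  const res = { statusCode: null, body: null };
  res.status = (code) => { res.statusCode = code; return res; };
  res.json = (body) => { res.body = body; return res; };
  return res;
}

// Runs one scenario: Given a header value, When the request is processed,
// the returned object holds what Then-clauses can assert on.
function runScenario(authorization) {
  const verify = (credential) => {
    if (credential !== 'good') throw new Error('rejected'); // stub rule
    return { id: 'user-1' };
  };
  const req = { headers: authorization === undefined ? {} : { authorization } };
  const res = fakeResponse();
  let proceeded = false;
  makeAuthCheck(verify)(req, res, () => { proceeded = true; });
  return { proceeded, status: res.statusCode, user: req.user };
}

// Usage: valid, missing, invalid, and malformed (no scheme prefix) cases.
console.log(runScenario('Bearer good'));
console.log(runScenario(undefined));
console.log(runScenario('Bearer bad'));
console.log(runScenario('good'));
```

Each scenario maps to one criterion or edge case in the spec, so a refactored handler can be swapped in for `makeAuthCheck` and checked against the same expectations.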

Anti-Patterns / Common Mistakes

Mistake	Why It Is Wrong	What To Do Instead
Skipping "boring" code paths	Undocumented behavior causes bugs during refactoring	Trace EVERY path, even error handlers
Leaking implementation details into specs	Defeats the purpose of behavioral specs	Strip all tech names, function names, code
Marking bugs as "correct behavior"	Loses the information that it is a bug	Document in KNOWN_ISSUES.md with severity
Skipping async/concurrency analysis	Race conditions are the hardest bugs to find	Document all async behavior
Analyzing only happy paths	Most bugs live in error paths	Document ALL error handling paths
Guessing behavior instead of tracing code	Spec becomes fiction	Read every line — no assumptions
Generating specs without user review	Misunderstandings propagate	Present for review after each phase

Anti-Rationalization Guards

[HARD-GATE] Do NOT skip any code path — every branch, conditional, and error handler must be traced
[HARD-GATE] Do NOT include ANY implementation details in specs — no code, tech names, or architecture
[HARD-GATE] Do NOT mark the completeness checklist as done until ALL 12 items pass
Do NOT skip concurrency analysis — even if the code "looks synchronous"
Do NOT skip configuration analysis — env vars and feature flags change behavior
Do NOT assume behavior from function names — read the actual code
Do NOT fix bugs while reverse-engineering — document them in KNOWN_ISSUES.md

Integration Points

Skill	Relationship
`spec-writing`	Output follows spec-writing format; use for audit after generation
`autonomous-loop`	Specs feed into planning mode for gap analysis
`acceptance-testing`	Tests derived from reverse-engineered acceptance criteria
`self-learning`	Populate memory files with discovered project context
`planning`	After specs exist, plan improvements or new features
`systematic-debugging`	Known issues inform debugging priorities

Workflow After Reverse Engineering

Step	Skill	Purpose
1	`reverse-engineering-specs` (this)	Generate behavioral specs from code
2	`spec-writing` (audit mode)	Verify quality and completeness
3	`planning`	Identify gaps, plan improvements
4	`autonomous-loop`	Implement features or fixes with specs as guide

Verification Gate

Before claiming reverse engineering is complete:

VERIFY the completeness checklist (all 12 items) passes
VERIFY zero implementation details in any spec file
VERIFY all acceptance criteria use Given/When/Then format
VERIFY KNOWN_ISSUES.md exists and categorizes all discovered bugs
VERIFY the user has reviewed the spec set and KNOWN_ISSUES

Skill Type

Flexible in execution, rigid in coverage: adapt investigation depth and subagent count to codebase size, but keep the exhaustive-investigation and implementation-free output rules. No code paths may be skipped.
