reverse-engineering-specs

Reverse Engineering Specifications

Overview

For brownfield/legacy projects without documentation, this skill generates implementation-free specifications by exhaustively analyzing existing code. The output is a complete behavioral description that drives autonomous development on top of the existing codebase — enabling safe refactoring, feature addition, and modernization.
Key principle: Document actual behavior, including bugs. Bugs are "documented features" until explicitly marked for fixing.
This is a RIGID skill. Every code path must be traced. No assumptions, no skipping.

Phase 1: Exhaustive Code Investigation

[HARD-GATE] Every code path must be traced. No assumptions, no skipping.

Deploy parallel subagents via the `Agent` tool (up to 500, with `subagent_type="Explore"`) to analyze:

| Analysis Target | What to Document | Priority |
|---|---|---|
| Entry points | All ways the system can be invoked (HTTP, CLI, events, cron) | P0 |
| Code paths | Every branch, loop, conditional, early return | P0 |
| Data flows | Input → transformation → output for every pipeline | P0 |
| State mutations | Every place state is read, written, or deleted | P0 |
| Error handling | Try/catch blocks, error codes, fallback behaviors | P0 |
| Side effects | External calls, file I/O, database writes, event emissions | P1 |
| Configuration | Environment variables, config files, feature flags | P1 |
| Dependencies | External services, libraries, APIs consumed | P1 |
| Concurrency | Async operations, race conditions, locking mechanisms | P2 |
| Implicit behavior | Convention-based routing, middleware chains, decorators | P2 |

Investigation Strategy Decision Table

| Codebase Size | Strategy | Subagent Count |
|---|---|---|
| Small (<50 files) | Single-pass full scan | 5-10 |
| Medium (50-500 files) | Module-by-module scan | 50-100 |
| Large (500+ files) | Entry-point-first, then depth scan | 200-500 |

STOP after investigation: present a summary of discovered entry points, data flows, and behaviors. Get confirmation before generating specs.
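
The decision table can be expressed as a small lookup. The following is a minimal sketch, not part of the skill itself: the function name `chooseInvestigationStrategy` and the returned object shape are hypothetical, and because the table lists 500 files under both "Medium" and "Large", the sketch assumes exactly 500 counts as large.

```javascript
// Hypothetical helper: maps a file count to a row of the decision table.
// Strategy names and subagent ranges come from the table; the function name,
// the return shape, and the handling of exactly 500 files are assumptions.
function chooseInvestigationStrategy(fileCount) {
  if (!Number.isInteger(fileCount) || fileCount < 0) {
    throw new RangeError("fileCount must be a non-negative integer");
  }
  if (fileCount < 50) {
    return { size: "small", strategy: "Single-pass full scan", subagents: [5, 10] };
  }
  if (fileCount < 500) {
    return { size: "medium", strategy: "Module-by-module scan", subagents: [50, 100] };
  }
  return { size: "large", strategy: "Entry-point-first, then depth scan", subagents: [200, 500] };
}
```

The `subagents` pair is the minimum and maximum count from the table; choosing a value inside that range is left to the caller.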

Phase 2: Behavioral Specification Generation

Transform code analysis into implementation-free specs following the `spec-writing` skill format.

Transformation Rules

| Rule | Explanation |
|---|---|
| Strip ALL implementation details | No function names, variable names, technology references |
| Describe WHAT, never HOW | Observable behavior only |
| Document actual behavior (bugs included) | Bugs become "current behavior" in specs |
| Use Given/When/Then format | For all acceptance criteria |
| Include data contracts | Input shapes, output shapes, invariants |
| Separate known issues | Bugs go in KNOWN_ISSUES.md, not inline |

Implementation Detail Stripping

| Code Artifact | What You See | What You Write in Spec |
|---|---|---|
| `jwt.verify(token, secret)` | Token validation with JWT | "Credentials are validated against the authentication system" |
| `redis.get(cacheKey)` | Redis cache lookup | "Previously computed results are retrieved from cache" |
| `if (user.role === 'admin')` | Role check | "Privileged operations require administrator access" |
| `res.status(429).json(...)` | Rate limiting response | "Excessive requests receive a rate limit error" |
| `bcrypt.hash(pw, 12)` | Password hashing | "Passwords are stored in a non-reversible format" |

STOP after spec generation: run the completeness checklist before organizing.
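
One way to catch leaked implementation details before review is a simple term scan over the spec text. This is a rough sketch under stated assumptions: the deny-list `IMPLEMENTATION_TERMS` and the function `findImplementationLeaks` are hypothetical, and a deny-list can only flag known terms, so a clean result does not prove a spec is implementation-free.

```javascript
// Hypothetical deny-list seeded from the technologies named in the table above.
// A real project would extend it with its own stack.
const IMPLEMENTATION_TERMS = ["jwt", "redis", "bcrypt", "express", "middleware"];

// Escapes characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns one { line, term } record for each deny-listed term found on each line.
// Matching is case-insensitive and limited to whole words.
function findImplementationLeaks(specText, terms = IMPLEMENTATION_TERMS) {
  const leaks = [];
  specText.split("\n").forEach((line, index) => {
    for (const term of terms) {
      const pattern = new RegExp(`\\b${escapeRegExp(term)}\\b`, "i");
      if (pattern.test(line)) {
        leaks.push({ line: index + 1, term });
      }
    }
  });
  return leaks;
}
```

An empty array means no listed term was found; any other result points at lines to rewrite in behavioral terms.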

Phase 3: Specification Organization

Create spec files following the naming convention:
specs/
├── 01-[first-capability].md
├── 02-[second-capability].md
├── ...
├── NN-[last-capability].md
└── KNOWN_ISSUES.md
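
The naming convention can be generated mechanically from an ordered list of capability names. This is an illustrative sketch only: `specFileName` and `specFileList` are hypothetical helpers, and the slug rule (lowercase, with runs of non-alphanumeric characters collapsed to a hyphen) is an assumption the convention does not spell out.

```javascript
// Builds "NN-capability.md" from a 1-based position and a capability name.
// The slug rule is an assumption, not something the convention prescribes.
function specFileName(position, capability) {
  const slug = capability
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  if (!slug) {
    throw new Error("capability must contain at least one letter or digit");
  }
  return `${String(position).padStart(2, "0")}-${slug}.md`;
}

// Returns the file names for a specs/ directory, with KNOWN_ISSUES.md last.
function specFileList(capabilities) {
  return [...capabilities.map((name, i) => specFileName(i + 1, name)), "KNOWN_ISSUES.md"];
}
```

For example, `specFileList(["Request Authentication", "Rate Limiting"])` yields `01-request-authentication.md`, `02-rate-limiting.md`, and `KNOWN_ISSUES.md`.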

KNOWN_ISSUES.md Format

Known Issues

[Issue Title]

  • Current behavior: [What actually happens]
  • Expected behavior: [What should happen, if known]
  • Affected specs: [Which spec files reference this behavior]
  • Severity: [Critical | High | Medium | Low]
  • Notes: [Additional context]

Severity Classification

| Severity | Criteria | Action |
|---|---|---|
| Critical | Data loss, security vulnerability, system crash | Fix before any new features |
| High | Incorrect results, broken workflow | Fix in next release |
| Medium | Poor UX, performance issue | Plan for future fix |
| Low | Cosmetic, minor inconsistency | Fix opportunistically |

STOP after organization: present the spec file list and KNOWN_ISSUES for review.
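
When preparing KNOWN_ISSUES for review, it can help to attach the recommended action to each issue and list the most severe first. The sketch below assumes each issue is a plain object with a `severity` field; that shape and the `triageIssues` name are hypothetical, while the severity levels and actions are taken from the table.

```javascript
// Severity levels and actions from the classification table, most severe first.
const SEVERITY_ACTIONS = {
  Critical: "Fix before any new features",
  High: "Fix in next release",
  Medium: "Plan for future fix",
  Low: "Fix opportunistically",
};
const SEVERITY_ORDER = Object.keys(SEVERITY_ACTIONS);

// Returns a new array of issues, each with an `action` field added,
// sorted from Critical down to Low. Unknown severities are rejected.
function triageIssues(issues) {
  return issues
    .map((issue) => {
      if (!SEVERITY_ORDER.includes(issue.severity)) {
        throw new Error(`Unknown severity: ${issue.severity}`);
      }
      return { ...issue, action: SEVERITY_ACTIONS[issue.severity] };
    })
    .sort((a, b) => SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity));
}
```

The input array is left unchanged, so the original discovery order remains available if it is needed later.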

Phase 4: Quality Verification

[HARD-GATE] All checks must pass before this phase is complete.

| # | Check | Question | Status |
|---|---|---|---|
| 1 | Entry points | Are ALL entry points documented? | [ ] |
| 2 | Code paths | Are ALL branches and conditionals traced? | [ ] |
| 3 | Data flows | Are ALL input→output pipelines described? | [ ] |
| 4 | State mutations | Are ALL state changes captured? | [ ] |
| 5 | Error handling | Are ALL error paths documented? | [ ] |
| 6 | Side effects | Are ALL external interactions noted? | [ ] |
| 7 | Edge cases | Are boundary conditions described? | [ ] |
| 8 | Concurrency | Are async behaviors documented? | [ ] |
| 9 | Configuration | Are ALL config options listed? | [ ] |
| 10 | Dependencies | Are ALL external dependencies identified? | [ ] |
| 11 | Implementation-free | Zero code, tech names, or architecture in specs? | [ ] |
| 12 | Given/When/Then | All acceptance criteria in correct format? | [ ] |
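
The hard gate amounts to an all-or-nothing check over the twelve items. A minimal sketch, assuming results are recorded as an object keyed by check name (the `evaluateChecklist` name and that record shape are hypothetical):

```javascript
// The twelve check names, in the order given in the checklist.
const CHECKLIST = [
  "Entry points", "Code paths", "Data flows", "State mutations",
  "Error handling", "Side effects", "Edge cases", "Concurrency",
  "Configuration", "Dependencies", "Implementation-free", "Given/When/Then",
];

// A check passes only when its result is exactly `true`; a missing or
// falsy entry counts as failing, so unreviewed items cannot slip through.
function evaluateChecklist(results) {
  const failing = CHECKLIST.filter((check) => results[check] !== true);
  return { passed: failing.length === 0, failing };
}
```

Treating missing entries as failures mirrors the gate's intent: an item that was never evaluated should block completion just like one that was evaluated and failed.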

Concrete Example: Code to Spec Transformation

Code (input — what you analyze):

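```javascript
// Assumption: `jwt` is the jsonwebtoken package, which provides jwt.verify.
const jwt = require('jsonwebtoken');

function checkAuth(req, res, next) {
  // Take the second space-separated part of the Authorization header.
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'No token' });
  try {
    // jwt.verify throws when the token is invalid or expired.
    const decoded = jwt.verify(token, process.env.JWT_SECRET);
    req.user = decoded;
    next();
  } catch (e) {
    return res.status(403).json({ error: 'Invalid token' });
  }
}
```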

Spec (output — what you produce):

Request Authentication

Job to Be Done

When a request arrives at a protected endpoint, I want to verify the caller's identity, so I can ensure only authorized users access the system.

Acceptance Criteria

Valid Credentials

  • Given a request with valid credentials in the authorization header
  • When the request is processed
  • Then the request proceeds to the next handler
  • And the authenticated user identity is available to downstream handlers

Missing Credentials

  • Given a request without credentials
  • When the request is processed
  • Then a 401 status is returned
  • And an error message indicates missing credentials

Invalid Credentials

  • Given a request with invalid or expired credentials
  • When the request is processed
  • Then a 403 status is returned
  • And an error message indicates invalid credentials

Edge Cases

  • Malformed authorization header (missing "Bearer" prefix): treated as missing credentials
  • Expired credentials: treated as invalid credentials

Data Contracts

  • Input: Authorization header in "Bearer <credential>" format
  • Output on success: User identity object attached to request context
  • Output on failure: JSON error response with appropriate status code

Notice: No mention of JWT, middleware, Express, environment variables, or any implementation detail.
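
The acceptance criteria above can be checked against the analyzed behavior with a small harness. This is a sketch under assumptions rather than the skill's prescribed test format: `makeAuthHandler` is a stand-in for the analyzed handler with the verifier injected so the example runs without external packages, and `fakeResponse`, `runScenario`, and the `"good"` credential are hypothetical names and values chosen for illustration.

```javascript
// Stand-in for the analyzed handler. `verify` returns the user identity
// or throws for invalid credentials (assumption made for this sketch).
function makeAuthHandler(verify) {
  return function authHandler(req, res, next) {
    const token = req.headers.authorization?.split(" ")[1];
    if (!token) return res.status(401).json({ error: "No token" });
    try {
      req.user = verify(token);
      next();
    } catch (e) {
      return res.status(403).json({ error: "Invalid token" });
    }
  };
}

// Minimal response double that records the status code and body.
function fakeResponse() {
  const res = { statusCode: null, body: null };
  res.status = (code) => { res.statusCode = code; return res; };
  res.json = (body) => { res.body = body; return res; };
  return res;
}

// Runs one Given/When/Then scenario and reports what was observed.
function runScenario(handler, headers) {
  const req = { headers };
  const res = fakeResponse();
  let proceeded = false;
  handler(req, res, () => { proceeded = true; });
  return { proceeded, status: res.statusCode, user: req.user };
}

// Hypothetical verifier: only the credential "good" is accepted.
const handler = makeAuthHandler((token) => {
  if (token !== "good") throw new Error("invalid");
  return { id: 1 };
});
```

Calling `runScenario(handler, { authorization: "Bearer good" })` corresponds to the Valid Credentials criterion, an empty headers object to Missing Credentials, and `"Bearer bad"` to Invalid Credentials. A header without the "Bearer" prefix, such as `"good"` alone, exercises the first edge case.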

Anti-Patterns / Common Mistakes

| Mistake | Why It Is Wrong | What To Do Instead |
|---|---|---|
| Skipping "boring" code paths | Undocumented behavior causes bugs during refactoring | Trace EVERY path, even error handlers |
| Leaking implementation details into specs | Defeats the purpose of behavioral specs | Strip all tech names, function names, code |
| Marking bugs as "correct behavior" | Loses the information that it is a bug | Document in KNOWN_ISSUES.md with severity |
| Skipping async/concurrency analysis | Race conditions are the hardest bugs to find | Document all async behavior |
| Analyzing only happy paths | Most bugs live in error paths | Document ALL error handling paths |
| Guessing behavior instead of tracing code | Spec becomes fiction | Read every line; make no assumptions |
| Generating specs without user review | Misunderstandings propagate | Present for review after each phase |

Anti-Rationalization Guards

  • [HARD-GATE] Do NOT skip any code path — every branch, conditional, and error handler must be traced
  • [HARD-GATE] Do NOT include ANY implementation details in specs — no code, tech names, or architecture
  • [HARD-GATE] Do NOT mark the completeness checklist as done until ALL 12 items pass
  • Do NOT skip concurrency analysis — even if the code "looks synchronous"
  • Do NOT skip configuration analysis — env vars and feature flags change behavior
  • Do NOT assume behavior from function names — read the actual code
  • Do NOT fix bugs while reverse-engineering — document them in KNOWN_ISSUES.md

Integration Points

| Skill | Relationship |
|---|---|
| `spec-writing` | Output follows spec-writing format; use for audit after generation |
| `autonomous-loop` | Specs feed into planning mode for gap analysis |
| `acceptance-testing` | Tests derived from reverse-engineered acceptance criteria |
| `self-learning` | Populate memory files with discovered project context |
| `planning` | After specs exist, plan improvements or new features |
| `systematic-debugging` | Known issues inform debugging priorities |

Workflow After Reverse Engineering

| Step | Skill | Purpose |
|---|---|---|
| 1 | `reverse-engineering-specs` (this) | Generate behavioral specs from code |
| 2 | `spec-writing` (audit mode) | Verify quality and completeness |
| 3 | `planning` | Identify gaps, plan improvements |
| 4 | `autonomous-loop` | Implement features or fixes with specs as guide |

Verification Gate

Before claiming reverse engineering is complete:
  1. VERIFY the completeness checklist (all 12 items) passes
  2. VERIFY zero implementation details in any spec file
  3. VERIFY all acceptance criteria use Given/When/Then format
  4. VERIFY KNOWN_ISSUES.md exists and categorizes all discovered bugs
  5. VERIFY the user has reviewed the spec set and KNOWN_ISSUES

Skill Type

Flexible in scale only: adapt investigation depth and subagent count to codebase size. The exhaustive-investigation and implementation-free output rules stay rigid, as stated in the Overview. No code paths may be skipped.