to-spec — Reverse-Engineer Project Specification

Analyze an existing codebase and produce a structured SPEC document that captures what the project does, how it's built, and what contracts it exposes. The output is a living specification that could be used to rebuild the project from scratch or onboard new contributors.

When to Use

Building a comprehensive understanding of an existing project
Onboarding new team members who need a high-level overview
Documenting a project that was built without a spec
Comparing actual implementation against intended design
Preparing for a rewrite or major refactor
Auditing what a project actually does vs. what people think it does

The Job

Scope confirmation — ask user what to analyze (entire repo, specific directory, or specific aspect)
Deep scan — systematically read project structure, entry points, config, tests, and core logic
Synthesize — produce a structured SPEC document
Review — present to user for feedback and iteration
Save — write final SPEC to agreed location

Step 1: Scope Confirmation

Before scanning, ask the user:

What should I analyze?

A. Entire repository (recommended for small-medium projects)
B. Specific directory or module: [path]
C. Specific aspect only (e.g., API surface, data model, auth flow)

Depth level:
1. Overview — high-level architecture + tech stack + key features (fast, ~5 min)
2. Standard — includes API contracts, data models, config, dependencies (default)
3. Deep — adds internal module interactions, error handling patterns, test coverage analysis

If the project is large (>500 files), recommend starting with Overview or a specific module.
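
The size check above could be automated along these lines. This is a minimal sketch, not part of the original workflow: the function names and the `SKIP_DIRS` list are assumptions made for illustration, and only the 500-file threshold comes from the text.

```python
import os

# Directories that usually hold vendored or generated files.
# This list is an assumption for illustration; adjust it per project.
SKIP_DIRS = {".git", "node_modules", "vendor", "dist", "build", "__pycache__", ".venv"}

# The ">500 files" guideline stated above.
LARGE_PROJECT_THRESHOLD = 500


def count_source_files(root: str) -> int:
    """Count files under root, ignoring directories listed in SKIP_DIRS."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        total += len(filenames)
    return total


def recommend_depth(file_count: int) -> str:
    """Suggest a starting depth level based on project size."""
    if file_count > LARGE_PROJECT_THRESHOLD:
        return "Overview"
    return "Standard"
```

Calling `recommend_depth(count_source_files("."))` from the repository root gives a starting suggestion; the user's answer still takes priority.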

Step 2: Deep Scan

Systematically analyze the following (adapt to what exists):

2.1 Project Identity

package.json, go.mod, Cargo.toml, pyproject.toml, pom.xml, etc.

README, LICENSE
Git history (first commit date, recent activity, contributor count)
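
As a rough illustration of the manifest check, the sketch below looks for the files named above at the repository root. The function name `detect_ecosystems` and the ecosystem labels are assumptions for illustration; a real scan would also look in subdirectories for monorepos.

```python
from pathlib import Path

# Manifest file -> ecosystem, limited to the manifests named above.
MANIFESTS = {
    "package.json": "Node.js",
    "go.mod": "Go",
    "Cargo.toml": "Rust",
    "pyproject.toml": "Python",
    "pom.xml": "Java (Maven)",
}


def detect_ecosystems(root: str) -> dict[str, str]:
    """Return {manifest filename: ecosystem} for manifests found directly under root."""
    root_path = Path(root)
    return {
        name: ecosystem
        for name, ecosystem in MANIFESTS.items()
        if (root_path / name).is_file()
    }
```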

2.2 Architecture

Directory structure and organization pattern (monorepo, layered, hexagonal, etc.)
Entry points (main files, CLI commands, server bootstrap)
Module boundaries and dependency graph (internal)

2.3 Tech Stack

Language(s) and version constraints
Frameworks and major libraries
Build tools and bundlers
Runtime requirements (Node version, Docker, etc.)

2.4 Features & Behavior

Route definitions / CLI commands / exported functions
Business logic modules and their responsibilities
Background jobs, cron tasks, event handlers

2.5 Data Model

Database schemas, migrations, ORMs
Key data structures and their relationships
State management approach

2.6 API Surface

HTTP endpoints (method, path, request/response shapes)
GraphQL schema / gRPC protos / WebSocket events
CLI interface (commands, flags, arguments)
Exported library API (public functions, classes, types)

2.7 Configuration & Environment

Environment variables and their purpose
Config files and their schema
Feature flags, toggles
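
One way to discover environment variables is to scan source text for common access idioms. The sketch below is an assumption-laden starting point: the patterns cover only `process.env.X` (Node.js), `os.environ[...]`, `os.environ.get(...)` and `os.getenv(...)` (Python), and `os.Getenv(...)` (Go), and they will miss names built dynamically. The function name `find_env_vars` is hypothetical.

```python
import re

# Regexes for a few well-known environment-variable access idioms.
ENV_PATTERNS = [
    re.compile(r"process\.env\.([A-Za-z_][A-Za-z0-9_]*)"),           # Node.js
    re.compile(r"os\.environ(?:\.get)?[\[(]\s*['\"]([^'\"]+)['\"]"),  # Python os.environ
    re.compile(r"os\.getenv\(\s*['\"]([^'\"]+)['\"]"),                # Python os.getenv
    re.compile(r"os\.Getenv\(\s*\"([^\"]+)\""),                       # Go
]


def find_env_vars(source: str) -> set[str]:
    """Return the environment variable names referenced in a source string."""
    names: set[str] = set()
    for pattern in ENV_PATTERNS:
        names.update(pattern.findall(source))
    return names
```

The resulting names only tell you which variables are read; whether each is required, and what its default is, still has to be read from the surrounding code.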

2.8 External Dependencies

Third-party services (databases, queues, APIs)
Infrastructure requirements (cloud services, storage)
Authentication/authorization providers

2.9 Testing & Quality

Test framework and approach (unit, integration, e2e)
Coverage patterns (what's tested, what's not)
Linting, formatting, type checking setup

2.10 Deployment & Operations

CI/CD configuration
Deployment targets and strategies
Monitoring, logging, health checks

Step 3: SPEC Document Structure

Generate the SPEC with these sections. Omit sections that don't apply.

SPEC: [Project Name]

Reverse-engineered specification — generated [date] from commit [short-hash]

1. Overview

1.1 Purpose

[One paragraph: what problem this project solves and for whom]

1.2 Key Capabilities

[Bullet list of what the system can do, from a user's perspective]

1.3 Architecture Style

[e.g., "Monolithic Express.js API with React SPA frontend", "CLI tool with plugin system", "Microservices communicating over gRPC"]

2. Tech Stack

Layer	Technology	Version
Language	...	...
Framework	...	...
Database	...	...
Build	...	...
Test	...	...
Deploy	...	...

3. Project Structure

[Directory tree with annotations explaining each top-level directory's purpose]

4. Data Model

4.1 Core Entities

[For each entity: name, fields, relationships, constraints]

4.2 State Transitions

[If applicable: lifecycle states and valid transitions]

5. API Surface

5.1 [Interface Type: REST / CLI / Library / etc.]

[For each endpoint/command/function:]

Method	Path/Command	Description	Auth
...	...	...	...

5.2 Request/Response Schemas

[Key request/response shapes with field types]

6. Configuration

Variable / Key	Required	Default	Description
...	...	...	...

7. External Dependencies

Service	Purpose	Failure Impact
...	...	...

8. Business Rules & Constraints

[Numbered list of invariants, validation rules, and business logic constraints discovered in the code]

9. Non-Functional Characteristics

9.1 Performance

[Observed patterns: caching, pagination, batch processing, etc.]

9.2 Security

[Auth mechanism, input validation patterns, secrets management]

9.3 Error Handling

[Error strategy: custom error types, error codes, retry policies]

10. Testing Strategy

Type	Framework	Coverage Pattern
Unit	...	...
Integration	...	...
E2E	...	...

11. Known Gaps & Assumptions

[Things that are unclear from the code alone]
[Assumptions made during analysis]
[Areas with no tests or documentation]

12. Appendix

A. Dependency Graph

[Key module dependencies, import relationships]

B. Environment Setup

[Steps to run the project locally, derived from config and scripts]

---

Step 4: Review & Iteration

After generating the SPEC, present it and ask:

SPEC generated. Please review:

- Are there sections that need more detail?
- Are there inaccuracies I should correct?
- Should I add/remove any sections?
- Is the depth level appropriate?

Reply OK to save, or provide feedback for iteration.

Apply feedback and re-present until user confirms.

Step 5: Save

Ask user for save location:

Where should I save the SPEC?

A. docs/SPEC.md (recommended)
B. SPEC.md (project root)
C. Custom path: [specify]
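
Once the user picks a location, writing the file is straightforward. A minimal sketch, assuming a hypothetical helper named `save_spec` and the recommended `docs/SPEC.md` default from option A:

```python
from pathlib import Path


def save_spec(content: str, destination: str = "docs/SPEC.md") -> Path:
    """Write the SPEC text to destination, creating parent directories if needed."""
    path = Path(destination)
    # docs/ may not exist yet in a project that was built without a spec.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return path
```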

Analysis Heuristics

Identifying Purpose

Look at README first line, package description field, CLI help text
Check the main entry point — what does it bootstrap?
Look at test descriptions — they often describe expected behavior in plain language

Discovering Architecture

Map `import`/`require` statements to build dependency graph

Identify layers by directory naming: controllers, services, models, routes, handlers, domain, infra

Check for dependency injection patterns, middleware chains, plugin registrations
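
The import-mapping heuristic can be sketched for Python sources with the standard `ast` module. This is an illustration only: the function names are assumptions, relative imports are ignored, and JavaScript `require` calls would need a separate parser.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a Python source string."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped here.
            modules.add(node.module.split(".")[0])
    return modules


def build_dependency_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each module name to the other modules in `sources` that it imports."""
    known = set(sources)
    return {
        name: imported_modules(code) & (known - {name})
        for name, code in sources.items()
    }
```

Restricting edges to modules present in `sources` keeps the graph internal, so third-party and standard-library imports do not clutter it.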

Extracting Business Rules

Look for validation functions, guard clauses, assertion statements
Check error messages — they often describe what went wrong in business terms
Examine test assertions — they encode expected behavior

Finding API Contracts

Route registrations (Express: `app.get()`, FastAPI: `@app.get()`, Go: `mux.HandleFunc()`)
OpenAPI/Swagger files if present
Request validation schemas (Joi, Zod, Pydantic, struct tags)
CLI flag/argument definitions (cobra, argparse, yargs)
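
A first pass over route registrations can be done with regular expressions. The sketch below is a heuristic under stated assumptions: it expects the receiver to be named `app` or `router` (the latter is an assumed convention, not from the text), it handles only literal path strings, and it does not infer an HTTP method from `HandleFunc` calls.

```python
import re

# app.get("/path", ...) in Express and @app.get("/path") in FastAPI share this shape.
ROUTE_PATTERN = re.compile(
    r"\b(?:app|router)\.(get|post|put|patch|delete)\(\s*['\"]([^'\"]+)['\"]"
)
# Go: mux.HandleFunc("/path", handler)
GO_PATTERN = re.compile(r"\bHandleFunc\(\s*\"([^\"]+)\"")


def find_routes(source: str) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs for route registrations found in a source string."""
    routes = [(method.upper(), path) for method, path in ROUTE_PATTERN.findall(source)]
    # The method is not inferred for HandleFunc registrations in this sketch.
    routes += [("ANY", path) for path in GO_PATTERN.findall(source)]
    return routes
```

Regex matches are candidates for the API Surface table; request and response shapes still have to come from the handlers or validation schemas.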

Detecting Data Models

ORM model definitions (Prisma, SQLAlchemy, GORM, TypeORM)
Migration files (in chronological order)
Type/interface definitions for core domain objects
Database seed files
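
Reading migrations in chronological order usually means sorting by a numeric prefix rather than alphabetically. A minimal sketch, assuming filenames begin with a sequence number or timestamp (a common but not universal convention); `sort_migrations` is a hypothetical helper:

```python
import re


def sort_migrations(filenames: list[str]) -> list[str]:
    """Sort migration filenames by their leading number; unnumbered files go last."""

    def sort_key(name: str) -> tuple[int, int, str]:
        match = re.match(r"(\d+)", name)
        if match:
            # Compare numerically so that 10_... sorts after 2_...
            return (0, int(match.group(1)), name)
        return (1, 0, name)

    return sorted(filenames, key=sort_key)
```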

Edge Cases

Scenario	Handling
Project has no README or documentation	Note this in "Known Gaps"; infer purpose from code
Monorepo with multiple services	Ask user which service(s) to analyze; produce one SPEC per service or a unified SPEC with clear boundaries
Project uses code generation	Document the generated code's purpose but focus on the source of truth (schemas, proto files, templates)
Legacy project with mixed patterns	Document all observed patterns, note inconsistencies in "Known Gaps"
Project is a library (no runtime)	Focus on exported API surface, type contracts, and usage patterns from tests
Incomplete or broken code	Document what exists, mark broken/incomplete areas explicitly
Project >1000 files	Start with entry points and trace key flows; don't exhaustively read every file
Multiple languages in one repo	Document each language's role and how they interact

Quality Criteria

A good reverse-engineered SPEC should pass these checks:

A developer unfamiliar with the project could understand its purpose in 60 seconds
The tech stack section is complete enough to set up a dev environment
API contracts are specific enough to write a client against
Data models are complete enough to recreate the schema
Business rules are explicit (not buried in "see code")
Known gaps are honestly listed (don't invent what you can't determine)
The SPEC matches the actual code (not aspirational documentation)

Anti-Patterns to Avoid

Don't invent intent. If you can't determine WHY something exists, say so. Don't fabricate rationale.
Don't copy code into the SPEC. Describe behavior and contracts, don't paste implementations.
Don't include transient state. The SPEC describes the system's design, not its current runtime state.
Don't over-specify internals. Focus on boundaries, contracts, and behavior. Internal implementation details belong in code comments, not specs.
Don't assume the README is accurate. READMEs often lag behind code. Verify claims against actual implementation.
