code-to-spec
to-spec: Reverse-Engineer Project Specification
Analyze an existing codebase and produce a structured SPEC document that captures what the project does, how it's built, and what contracts it exposes. The output is a living specification that could be used to rebuild the project from scratch or onboard new contributors.
When to Use
- You want a comprehensive understanding of an existing project
- Onboarding new team members who need a high-level overview
- Documenting a project that was built without a spec
- Comparing actual implementation against intended design
- Preparing for a rewrite or major refactor
- Auditing what a project actually does vs. what people think it does
The Job
- Scope confirmation — ask user what to analyze (entire repo, specific directory, or specific aspect)
- Deep scan — systematically read project structure, entry points, config, tests, and core logic
- Synthesize — produce a structured SPEC document
- Review — present to user for feedback and iteration
- Save — write final SPEC to agreed location
Step 1: Scope Confirmation
Before scanning, ask the user:
What should I analyze?
A. Entire repository (recommended for small-medium projects)
B. Specific directory or module: [path]
C. Specific aspect only (e.g., API surface, data model, auth flow)
Depth level:
1. Overview: high-level architecture + tech stack + key features (fast, ~5 min)
2. Standard: includes API contracts, data models, config, dependencies (default)
3. Deep: adds internal module interactions, error handling patterns, test coverage analysis
If the project is large (>500 files), recommend starting with Overview or a specific module.
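The size check behind that recommendation can be approximated with a quick file count before scanning. The following is a minimal sketch, not part of the skill itself: the helper names `count_source_files` and `recommend_depth` and the ignore list are hypothetical, and only the 500-file threshold comes from the text above.

```python
import os

# Directories that usually hold dependencies or build output.
# Illustrative assumption, not an exhaustive list.
IGNORED_DIRS = {".git", "node_modules", "vendor", "dist", "build", "__pycache__"}
LARGE_PROJECT_THRESHOLD = 500  # threshold taken from the guidance above


def count_source_files(root: str) -> int:
    """Count files under root, skipping the ignored directories."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into ignored directories.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        total += len(filenames)
    return total


def recommend_depth(file_count: int) -> str:
    """Suggest a starting depth level for a project of the given size."""
    if file_count > LARGE_PROJECT_THRESHOLD:
        return "Overview"
    return "Standard"
```

A caller would typically run `recommend_depth(count_source_files("."))` and present the result as a suggestion, leaving the final choice to the user.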
Step 2: Deep Scan
Systematically analyze the following (adapt to what exists):
2.1 Project Identity
- `package.json`, `go.mod`, `Cargo.toml`, `pyproject.toml`, `pom.xml`, etc.
- README, LICENSE
- Git history (first commit date, recent activity, contributor count)
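The manifest check can be sketched as a simple lookup at the repository root. This is an illustration only: `detect_manifests` is a hypothetical helper, and the ecosystem labels are assumptions added for readability; only the file names come from the list above.

```python
from pathlib import Path

# Manifest file names from the list above, mapped to the ecosystem
# they usually indicate (labels are illustrative assumptions).
MANIFESTS = {
    "package.json": "JavaScript/TypeScript",
    "go.mod": "Go",
    "Cargo.toml": "Rust",
    "pyproject.toml": "Python",
    "pom.xml": "Java (Maven)",
}


def detect_manifests(root: str) -> dict[str, str]:
    """Return the known manifest files present directly under root."""
    root_path = Path(root)
    return {
        name: ecosystem
        for name, ecosystem in MANIFESTS.items()
        if (root_path / name).is_file()
    }
```

An empty result does not mean the project has no manifest; it may use a build system outside this short list, which is worth recording under "Known Gaps".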
2.2 Architecture
- Directory structure and organization pattern (monorepo, layered, hexagonal, etc.)
- Entry points (main files, CLI commands, server bootstrap)
- Module boundaries and dependency graph (internal)
2.3 Tech Stack
- Language(s) and version constraints
- Frameworks and major libraries
- Build tools and bundlers
- Runtime requirements (Node version, Docker, etc.)
2.4 Features & Behavior
- Route definitions / CLI commands / exported functions
- Business logic modules and their responsibilities
- Background jobs, cron tasks, event handlers
2.5 Data Model
- Database schemas, migrations, ORMs
- Key data structures and their relationships
- State management approach
2.6 API Surface
- HTTP endpoints (method, path, request/response shapes)
- GraphQL schema / gRPC protos / WebSocket events
- CLI interface (commands, flags, arguments)
- Exported library API (public functions, classes, types)
2.7 Configuration & Environment
- Environment variables and their purpose
- Config files and their schema
- Feature flags, toggles
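Environment variables are often easiest to find by searching for the calls that read them. The sketch below is a rough, regex-based approximation covering three common read styles (Python `os.environ` / `os.getenv`, Go `os.Getenv`, and Node.js `process.env`); `find_env_vars` is a hypothetical helper, and it will miss variables accessed through wrappers or dynamic names.

```python
import re

ENV_PATTERNS = [
    # Python: os.environ["NAME"] or os.environ.get("NAME")
    re.compile(r"""os\.environ(?:\.get\(|\[)\s*["']([A-Za-z_][A-Za-z0-9_]*)["']"""),
    # Python os.getenv("NAME") and Go os.Getenv("NAME")
    re.compile(r"""os\.getenv\(\s*["']([A-Za-z_][A-Za-z0-9_]*)["']""", re.IGNORECASE),
    # Node.js: process.env.NAME
    re.compile(r"process\.env\.([A-Za-z_][A-Za-z0-9_]*)"),
]


def find_env_vars(source: str) -> set[str]:
    """Return the environment variable names read in a piece of source text."""
    found: set[str] = set()
    for pattern in ENV_PATTERNS:
        found.update(pattern.findall(source))
    return found
```

The names found this way still need their purpose, default, and required/optional status filled in by reading the surrounding code.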
2.8 External Dependencies
- Third-party services (databases, queues, APIs)
- Infrastructure requirements (cloud services, storage)
- Authentication/authorization providers
2.9 Testing & Quality
- Test framework and approach (unit, integration, e2e)
- Coverage patterns (what's tested, what's not)
- Linting, formatting, type checking setup
2.10 Deployment & Operations
- CI/CD configuration
- Deployment targets and strategies
- Monitoring, logging, health checks
Step 3: SPEC Document Structure
Generate the SPEC with these sections. Omit sections that don't apply.
SPEC: [Project Name]
Reverse-engineered specification — generated [date] from commit [short-hash]
1. Overview
1.1 Purpose
[One paragraph: what problem this project solves and for whom]
1.2 Key Capabilities
- [Bullet list of what the system can do, from a user's perspective]
1.3 Architecture Style
[e.g., "Monolithic Express.js API with React SPA frontend", "CLI tool with plugin system", "Microservices communicating over gRPC"]
2. Tech Stack
| Layer | Technology | Version |
|---|---|---|
| Language | ... | ... |
| Framework | ... | ... |
| Database | ... | ... |
| Build | ... | ... |
| Test | ... | ... |
| Deploy | ... | ... |
3. Project Structure
[Directory tree with annotations explaining each top-level directory's purpose]
4. Data Model
4.1 Core Entities
[For each entity: name, fields, relationships, constraints]
4.2 State Transitions
[If applicable: lifecycle states and valid transitions]
5. API Surface
5.1 [Interface Type: REST / CLI / Library / etc.]
[For each endpoint/command/function:]
| Method | Path/Command | Description | Auth |
|---|---|---|---|
| ... | ... | ... | ... |
5.2 Request/Response Schemas
[Key request/response shapes with field types]
6. Configuration
| Variable / Key | Required | Default | Description |
|---|---|---|---|
| ... | ... | ... | ... |
7. External Dependencies
| Service | Purpose | Failure Impact |
|---|---|---|
| ... | ... | ... |
8. Business Rules & Constraints
- [Numbered list of invariants, validation rules, and business logic constraints discovered in the code]
9. Non-Functional Characteristics
9.1 Performance
[Observed patterns: caching, pagination, batch processing, etc.]
9.2 Security
[Auth mechanism, input validation patterns, secrets management]
9.3 Error Handling
[Error strategy: custom error types, error codes, retry policies]
10. Testing Strategy
| Type | Framework | Coverage Pattern |
|---|---|---|
| Unit | ... | ... |
| Integration | ... | ... |
| E2E | ... | ... |
11. Known Gaps & Assumptions
- [Things that are unclear from the code alone]
- [Assumptions made during analysis]
- [Areas with no tests or documentation]
12. Appendix
A. Dependency Graph
[Key module dependencies, import relationships]
B. Environment Setup
[Steps to run the project locally, derived from config and scripts]
Step 4: Review & Iteration
After generating the SPEC, present it and ask:
SPEC generated. Please review:
- Are there sections that need more detail?
- Are there inaccuracies I should correct?
- Should I add/remove any sections?
- Is the depth level appropriate?
Reply OK to save, or provide feedback for iteration.
Apply feedback and re-present until the user confirms.
Step 5: Save
Ask user for save location:
Where should I save the SPEC?
A. docs/SPEC.md (recommended)
B. SPEC.md (project root)
C. Custom path: [specify]
Analysis Heuristics
Identifying Purpose
- Look at README first line, package description field, CLI help text
- Check the main entry point — what does it bootstrap?
- Look at test descriptions — they often describe expected behavior in plain language
Discovering Architecture
- Map `import`/`require` statements to build a dependency graph
- Identify layers by directory naming: `controllers`, `services`, `models`, `routes`, `handlers`, `domain`, `infra`
- Check for dependency injection patterns, middleware chains, plugin registrations
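For a Python codebase, the import-mapping heuristic can be sketched with the standard-library `ast` module. The helper names `imported_modules` and `build_import_graph` are hypothetical, and the sketch makes two simplifying assumptions: modules are identified by their top-level name, and relative imports are ignored. A JavaScript project using `require` would need a different parser.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            modules.add(node.module.split(".")[0])
    return modules


def build_import_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each module in `sources` to the other modules in `sources` it imports.

    `sources` maps a module name to its source text; imports of modules
    outside that mapping (standard library, third-party) are dropped.
    """
    return {
        name: {m for m in imported_modules(text) if m in sources and m != name}
        for name, text in sources.items()
    }
```

Restricting the graph to internal modules keeps the output focused on module boundaries; third-party imports are better reported in the tech stack section.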
Extracting Business Rules
- Look for validation functions, guard clauses, assertion statements
- Check error messages — they often describe what went wrong in business terms
- Examine test assertions — they encode expected behavior
Finding API Contracts
- Route registrations (Express: `app.get()`, FastAPI: `@app.get()`, Go: `mux.HandleFunc()`)
- OpenAPI/Swagger files if present
- Request validation schemas (Joi, Zod, Pydantic, struct tags)
- CLI flag/argument definitions (cobra, argparse, yargs)
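A first pass over route registrations can be done with a text search before reading handlers in detail. The sketch below is a rough approximation under stated assumptions: `find_routes` is a hypothetical helper, the receiver is assumed to be named `app` or `router`, and paths are assumed to be string literals. Dynamically built routes will not be found.

```python
import re

# Express-style app.get('/path', ...) and FastAPI-style @app.get("/path").
# Assumes the receiver is named `app` or `router`.
METHOD_ROUTE = re.compile(
    r"""\b(?:app|router)\.(get|post|put|patch|delete)\(\s*["']([^"']+)["']"""
)
# Go-style mux.HandleFunc("/path", handler).
GO_ROUTE = re.compile(r'\bHandleFunc\(\s*"([^"]+)"')


def find_routes(source: str) -> list[tuple[str, str]]:
    """Return (method, path) pairs for route registrations found in source."""
    routes = [(method.upper(), path) for method, path in METHOD_ROUTE.findall(source)]
    # HandleFunc does not take the method as a separate argument, so it is
    # reported as "ANY" and the pattern string is passed through unchanged.
    routes += [("ANY", path) for path in GO_ROUTE.findall(source)]
    return routes
```

Each hit is only a starting point; request and response shapes still have to come from the handler code or validation schemas.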
Detecting Data Models
- ORM model definitions (Prisma, SQLAlchemy, GORM, TypeORM)
- Migration files (in chronological order)
- Type/interface definitions for core domain objects
- Database seed files
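Reading migrations in chronological order usually means sorting by a numeric prefix rather than alphabetically, since plain string sorting puts `10_...` before `2_...`. A minimal sketch, assuming file names start with a sequence number or timestamp (a common convention, but not universal); `order_migrations` is a hypothetical helper.

```python
import re


def migration_sort_key(filename: str) -> tuple[int, int, str]:
    """Sort key: numeric prefix first, then name; unprefixed files go last."""
    match = re.match(r"\d+", filename)
    if match:
        return (0, int(match.group()), filename)
    return (1, 0, filename)


def order_migrations(filenames: list[str]) -> list[str]:
    """Return migration file names in the order they were likely applied."""
    return sorted(filenames, key=migration_sort_key)
```

Where the migration tool keeps its own ordering (for example, an explicit dependency list inside each file), that ordering takes precedence over file names.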
Edge Cases
| Scenario | Handling |
|---|---|
| Project has no README or documentation | Note this in "Known Gaps"; infer purpose from code |
| Monorepo with multiple services | Ask user which service(s) to analyze; produce one SPEC per service or a unified SPEC with clear boundaries |
| Project uses code generation | Document the generated code's purpose but focus on the source of truth (schemas, proto files, templates) |
| Legacy project with mixed patterns | Document all observed patterns, note inconsistencies in "Known Gaps" |
| Project is a library (no runtime) | Focus on exported API surface, type contracts, and usage patterns from tests |
| Incomplete or broken code | Document what exists, mark broken/incomplete areas explicitly |
| Project >1000 files | Start with entry points and trace key flows; don't exhaustively read every file |
| Multiple languages in one repo | Document each language's role and how they interact |
Quality Criteria
A good reverse-engineered SPEC should pass these checks:
- A developer unfamiliar with the project could understand its purpose in 60 seconds
- The tech stack section is complete enough to set up a dev environment
- API contracts are specific enough to write a client against
- Data models are complete enough to recreate the schema
- Business rules are explicit (not buried in "see code")
- Known gaps are honestly listed (don't invent what you can't determine)
- The SPEC matches the actual code (not aspirational documentation)
Anti-Patterns to Avoid
- Don't invent intent. If you can't determine WHY something exists, say so. Don't fabricate rationale.
- Don't copy code into the SPEC. Describe behavior and contracts, don't paste implementations.
- Don't include transient state. The SPEC describes the system's design, not its current runtime state.
- Don't over-specify internals. Focus on boundaries, contracts, and behavior. Internal implementation details belong in code comments, not specs.
- Don't assume the README is accurate. READMEs often lag behind code. Verify claims against actual implementation.