code-to-spec

to-spec — Reverse-Engineer Project Specification

Analyze an existing codebase and produce a structured SPEC document that captures what the project does, how it's built, and what contracts it exposes. The output is a living specification that could be used to rebuild the project from scratch or onboard new contributors.

When to Use

  • You want a comprehensive understanding of an existing project
  • Onboarding new team members who need a high-level overview
  • Documenting a project that was built without a spec
  • Comparing actual implementation against intended design
  • Preparing for a rewrite or major refactor
  • Auditing what a project actually does vs. what people think it does

The Job

  1. Scope confirmation — ask user what to analyze (entire repo, specific directory, or specific aspect)
  2. Deep scan — systematically read project structure, entry points, config, tests, and core logic
  3. Synthesize — produce a structured SPEC document
  4. Review — present to user for feedback and iteration
  5. Save — write final SPEC to agreed location

Step 1: Scope Confirmation

Before scanning, ask the user:

What should I analyze?

A. Entire repository (recommended for small-medium projects)
B. Specific directory or module: [path]
C. Specific aspect only (e.g., API surface, data model, auth flow)

Depth level:
1. Overview — high-level architecture + tech stack + key features (fast, ~5 min)
2. Standard — includes API contracts, data models, config, dependencies (default)
3. Deep — adds internal module interactions, error handling patterns, test coverage analysis

If the project is large (>500 files), recommend starting with Overview or a specific module.
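
The size check behind that recommendation can be sketched as a small helper. This is a minimal illustration, not part of the skill itself: the names `count_files` and `recommend_depth`, the list of skipped directories, and the choice to treat exactly 500 files as "not large" are all assumptions.

```python
import os

# Directories that usually hold vendored or generated files.
# This list is an assumption; adjust it for the project being analyzed.
SKIP_DIRS = {".git", "node_modules", "vendor", "dist", "build", "__pycache__"}


def count_files(root: str) -> int:
    """Count files under root, skipping common vendored/generated directories."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        total += len(filenames)
    return total


def recommend_depth(file_count: int, large_threshold: int = 500) -> str:
    """Suggest a starting depth level based on project size."""
    if file_count > large_threshold:
        return "Overview"
    return "Standard"
```

The result is only a suggestion to show alongside the depth question; the user still picks the level.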

Step 2: Deep Scan

Systematically analyze the following (adapt to what exists):

2.1 Project Identity

  • package.json, go.mod, Cargo.toml, pyproject.toml, pom.xml, etc.
  • README, LICENSE
  • Git history (first commit date, recent activity, contributor count)
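
As a rough sketch of the first item, the manifest files listed above can be probed at the repository root. The function name `detect_manifests` and the ecosystem labels are assumptions made for illustration; only the file names come from the list.

```python
from pathlib import Path

# Manifest file name -> ecosystem label. The file names come from the list
# above; the labels are illustrative.
MANIFESTS = {
    "package.json": "Node.js",
    "go.mod": "Go",
    "Cargo.toml": "Rust",
    "pyproject.toml": "Python",
    "pom.xml": "Java (Maven)",
}


def detect_manifests(root: str) -> dict[str, str]:
    """Return {manifest file name: ecosystem} for manifests found at the root."""
    base = Path(root)
    return {name: eco for name, eco in MANIFESTS.items() if (base / name).is_file()}
```

A monorepo may keep manifests in subdirectories, so an empty result at the root does not mean the project has none.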

2.2 Architecture

  • Directory structure and organization pattern (monorepo, layered, hexagonal, etc.)
  • Entry points (main files, CLI commands, server bootstrap)
  • Module boundaries and dependency graph (internal)

2.3 Tech Stack

  • Language(s) and version constraints
  • Frameworks and major libraries
  • Build tools and bundlers
  • Runtime requirements (Node version, Docker, etc.)

2.4 Features & Behavior

  • Route definitions / CLI commands / exported functions
  • Business logic modules and their responsibilities
  • Background jobs, cron tasks, event handlers

2.5 Data Model

  • Database schemas, migrations, ORMs
  • Key data structures and their relationships
  • State management approach

2.6 API Surface

  • HTTP endpoints (method, path, request/response shapes)
  • GraphQL schema / gRPC protos / WebSocket events
  • CLI interface (commands, flags, arguments)
  • Exported library API (public functions, classes, types)

2.7 Configuration & Environment

  • Environment variables and their purpose
  • Config files and their schema
  • Feature flags, toggles
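
One way to collect the environment variables a project reads is to scan source text for common access patterns. The sketch below is a heuristic under stated assumptions: the helper name `find_env_vars` is hypothetical, and the patterns cover only a few idioms (`process.env.X` in Node.js, `os.getenv` / `os.environ` in Python, `os.Getenv` / `os.LookupEnv` in Go).

```python
import re

# Patterns for a few common ways code reads environment variables.
# Illustrative, not exhaustive: config libraries and dynamic keys are missed.
ENV_PATTERNS = [
    # Node.js: process.env.PORT
    re.compile(r"process\.env\.([A-Z_][A-Z0-9_]*)"),
    # Python: os.getenv("X") or os.environ.get("X")
    re.compile(r"os\.(?:getenv|environ\.get)\(\s*['\"]([A-Za-z_]\w*)['\"]"),
    # Python: os.environ["X"]
    re.compile(r"os\.environ\[\s*['\"]([A-Za-z_]\w*)['\"]\s*\]"),
    # Go: os.Getenv("X") or os.LookupEnv("X")
    re.compile(r"os\.(?:Getenv|LookupEnv)\(\s*\"([A-Za-z_]\w*)\""),
]


def find_env_vars(source: str) -> set[str]:
    """Return the environment variable names referenced in a source string."""
    names: set[str] = set()
    for pattern in ENV_PATTERNS:
        names.update(pattern.findall(source))
    return names
```

The names found this way still need a purpose and default, which usually come from reading the surrounding code or an example env file.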

2.8 External Dependencies

  • Third-party services (databases, queues, APIs)
  • Infrastructure requirements (cloud services, storage)
  • Authentication/authorization providers

2.9 Testing & Quality

  • Test framework and approach (unit, integration, e2e)
  • Coverage patterns (what's tested, what's not)
  • Linting, formatting, type checking setup

2.10 Deployment & Operations

  • CI/CD configuration
  • Deployment targets and strategies
  • Monitoring, logging, health checks

Step 3: SPEC Document Structure

Generate the SPEC with these sections. Omit sections that don't apply.

SPEC: [Project Name]

Reverse-engineered specification — generated [date] from commit [short-hash]

1. Overview

1.1 Purpose

[One paragraph: what problem this project solves and for whom]

1.2 Key Capabilities

  • [Bullet list of what the system can do, from a user's perspective]

1.3 Architecture Style

[e.g., "Monolithic Express.js API with React SPA frontend", "CLI tool with plugin system", "Microservices communicating over gRPC"]

2. Tech Stack

| Layer | Technology | Version |
| --- | --- | --- |
| Language | ... | ... |
| Framework | ... | ... |
| Database | ... | ... |
| Build | ... | ... |
| Test | ... | ... |
| Deploy | ... | ... |

3. Project Structure

[Directory tree with annotations explaining each top-level directory's purpose]

4. Data Model

4.1 Core Entities

[For each entity: name, fields, relationships, constraints]

4.2 State Transitions

[If applicable: lifecycle states and valid transitions]

5. API Surface

5.1 [Interface Type: REST / CLI / Library / etc.]

[For each endpoint/command/function:]

| Method | Path/Command | Description | Auth |
| --- | --- | --- | --- |
| ... | ... | ... | ... |

5.2 Request/Response Schemas

[Key request/response shapes with field types]

6. Configuration

| Variable / Key | Required | Default | Description |
| --- | --- | --- | --- |
| ... | ... | ... | ... |

7. External Dependencies

| Service | Purpose | Failure Impact |
| --- | --- | --- |
| ... | ... | ... |

8. Business Rules & Constraints

  • [Numbered list of invariants, validation rules, and business logic constraints discovered in the code]

9. Non-Functional Characteristics

9.1 Performance

[Observed patterns: caching, pagination, batch processing, etc.]

9.2 Security

[Auth mechanism, input validation patterns, secrets management]

9.3 Error Handling

[Error strategy: custom error types, error codes, retry policies]

10. Testing Strategy

| Type | Framework | Coverage Pattern |
| --- | --- | --- |
| Unit | ... | ... |
| Integration | ... | ... |
| E2E | ... | ... |

11. Known Gaps & Assumptions

  • [Things that are unclear from the code alone]
  • [Assumptions made during analysis]
  • [Areas with no tests or documentation]

12. Appendix

A. Dependency Graph

[Key module dependencies, import relationships]

B. Environment Setup

[Steps to run the project locally, derived from config and scripts]

---

Step 4: Review & Iteration

After generating the SPEC, present it and ask:

SPEC generated. Please review:

- Are there sections that need more detail?
- Are there inaccuracies I should correct?
- Should I add/remove any sections?
- Is the depth level appropriate?

Reply OK to save, or provide feedback for iteration.

Apply feedback and re-present until the user confirms.

Step 5: Save

Ask the user for a save location:

Where should I save the SPEC?

A. docs/SPEC.md (recommended)
B. SPEC.md (project root)
C. Custom path: [specify]

Analysis Heuristics

Identifying Purpose

  • Look at README first line, package description field, CLI help text
  • Check the main entry point — what does it bootstrap?
  • Look at test descriptions — they often describe expected behavior in plain language
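
The first heuristic can be sketched for a Node.js-style project as follows. The function name `purpose_hints` and the returned keys are hypothetical; the sketch assumes the README and `package.json` contents have already been read into strings.

```python
import json


def purpose_hints(readme_text: str, package_json_text: str) -> dict[str, str]:
    """Collect the README's first non-empty line and the package description."""
    hints: dict[str, str] = {}
    for line in readme_text.splitlines():
        # Drop markdown heading markers so "# Acme" becomes "Acme".
        stripped = line.strip().lstrip("#").strip()
        if stripped:
            hints["readme_first_line"] = stripped
            break
    description = json.loads(package_json_text).get("description")
    if description:
        hints["package_description"] = description
    return hints
```

These are hints only. They should be checked against what the entry point actually bootstraps before they go into the SPEC.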

Discovering Architecture

  • Map import/require statements to build dependency graph
  • Identify layers by directory naming: controllers, services, models, routes, handlers, domain, infra
  • Check for dependency injection patterns, middleware chains, plugin registrations
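
For a Python codebase, mapping imports to an internal dependency graph can be sketched with the standard `ast` module. The names `imported_modules` and `build_dependency_graph` are assumptions, and the sketch treats each source as a single top-level module; JavaScript `require` calls would need a different parser.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a Python source string."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level > 0 marks a relative import; those are skipped in this sketch.
            modules.add(node.module.split(".")[0])
    return modules


def build_dependency_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each module name to the other modules in `sources` that it imports."""
    internal = set(sources)
    return {
        name: (imported_modules(code) & internal) - {name}
        for name, code in sources.items()
    }
```

Filtering to names present in `sources` keeps third-party and standard-library imports out of the internal graph.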

Extracting Business Rules

  • Look for validation functions, guard clauses, assertion statements
  • Check error messages — they often describe what went wrong in business terms
  • Examine test assertions — they encode expected behavior

Finding API Contracts

  • Route registrations (Express: app.get(), FastAPI: @app.get(), Go: mux.HandleFunc())
  • OpenAPI/Swagger files if present
  • Request validation schemas (Joi, Zod, Pydantic, struct tags)
  • CLI flag/argument definitions (cobra, argparse, yargs)
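
A first pass over route registrations can be sketched with regular expressions for the three call styles named above. This is a rough heuristic: the name `find_routes`, the pattern table, and the assumption that the app or mux object is literally called `app` or `mux` are all illustrative, and routers, blueprints, and dynamically built paths are missed.

```python
import re

# One pattern per framework named above. The variable names `app` and `mux`
# are assumed; real projects may use other names.
ROUTE_PATTERNS = {
    "express": re.compile(
        r"(?<!@)\bapp\.(get|post|put|patch|delete)\(\s*['\"]([^'\"]+)['\"]"
    ),
    "fastapi": re.compile(
        r"@app\.(get|post|put|patch|delete)\(\s*['\"]([^'\"]+)['\"]"
    ),
    "go": re.compile(r"\bmux\.HandleFunc\(\s*\"([^\"]+)\""),
}


def find_routes(source: str, framework: str) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs found in source for the given framework."""
    routes: list[tuple[str, str]] = []
    for match in ROUTE_PATTERNS[framework].finditer(source):
        if framework == "go":
            # HandleFunc takes no separate method argument; any restriction is
            # in the pattern string or inside the handler, so report "ANY".
            routes.append(("ANY", match.group(1)))
        else:
            routes.append((match.group(1).upper(), match.group(2)))
    return routes
```

If an OpenAPI/Swagger file exists, it is usually a better starting point than a scan like this, with the scan used to cross-check it.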

Detecting Data Models

  • ORM model definitions (Prisma, SQLAlchemy, GORM, TypeORM)
  • Migration files (in chronological order)
  • Type/interface definitions for core domain objects
  • Database seed files
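
Reading migrations in chronological order usually means sorting by a numeric or timestamp prefix, since plain alphabetical order puts `10_...` before `2_...`. The sketch below assumes that naming convention; the function name `migration_order` and the sample file names are hypothetical.

```python
import re


def migration_order(filenames: list[str]) -> list[str]:
    """Sort migration file names by their leading numeric prefix."""

    def key(name: str) -> tuple[float, str]:
        match = re.match(r"(\d+)", name)
        # Files without a numeric prefix sort last, alphabetically.
        return (int(match.group(1)) if match else float("inf"), name)

    return sorted(filenames, key=key)


files = ["10_add_index.sql", "2_create_users.sql", "1_init.sql"]
ordered = migration_order(files)
```

Projects whose migration tool records order elsewhere (for example in a manifest or revision chain) need that source consulted instead of file names.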

Edge Cases

| Scenario | Handling |
| --- | --- |
| Project has no README or documentation | Note this in "Known Gaps"; infer purpose from code |
| Monorepo with multiple services | Ask user which service(s) to analyze; produce one SPEC per service or a unified SPEC with clear boundaries |
| Project uses code generation | Document the generated code's purpose but focus on the source of truth (schemas, proto files, templates) |
| Legacy project with mixed patterns | Document all observed patterns, note inconsistencies in "Known Gaps" |
| Project is a library (no runtime) | Focus on exported API surface, type contracts, and usage patterns from tests |
| Incomplete or broken code | Document what exists, mark broken/incomplete areas explicitly |
| Project >1000 files | Start with entry points and trace key flows; don't exhaustively read every file |
| Multiple languages in one repo | Document each language's role and how they interact |

Quality Criteria

A good reverse-engineered SPEC should pass these checks:
  • A developer unfamiliar with the project could understand its purpose in 60 seconds
  • The tech stack section is complete enough to set up a dev environment
  • API contracts are specific enough to write a client against
  • Data models are complete enough to recreate the schema
  • Business rules are explicit (not buried in "see code")
  • Known gaps are honestly listed (don't invent what you can't determine)
  • The SPEC matches the actual code (not aspirational documentation)
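
Two of these checks lend themselves to a mechanical first pass: template placeholders left unfilled, and rules deferred to "see code". The sketch below is one possible lint, with the function name `lint_spec` and the message wording as assumptions; the remaining criteria need human review.

```python
import re

# A line that consists only of a bracketed template placeholder,
# optionally preceded by a bullet marker.
PLACEHOLDER = re.compile(r"^\s*(?:[-•*]\s*)?\[[^\]]+\]\s*$")


def lint_spec(spec_text: str) -> list[str]:
    """Flag unfilled placeholders and 'see code' deferrals in a SPEC draft."""
    problems: list[str] = []
    for number, line in enumerate(spec_text.splitlines(), start=1):
        if PLACEHOLDER.match(line):
            problems.append(f"line {number}: unfilled template placeholder")
        if "see code" in line.lower():
            problems.append(f"line {number}: rule deferred to 'see code'")
    return problems
```

An empty result means only that these two surface problems are absent, not that the SPEC is accurate.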

Anti-Patterns to Avoid

  • Don't invent intent. If you can't determine WHY something exists, say so. Don't fabricate rationale.
  • Don't copy code into the SPEC. Describe behavior and contracts, don't paste implementations.
  • Don't include transient state. The SPEC describes the system's design, not its current runtime state.
  • Don't over-specify internals. Focus on boundaries, contracts, and behavior. Internal implementation details belong in code comments, not specs.
  • Don't assume the README is accurate. READMEs often lag behind code. Verify claims against actual implementation.