autopilot

<Purpose> Autopilot takes a brief product idea and autonomously handles the full lifecycle: requirements analysis, technical design, planning, parallel implementation, QA cycling, and multi-perspective validation. It produces working, verified code from a 2-3 line description. </Purpose>

<Use_When>

User wants end-to-end autonomous execution from an idea to working code
User says "autopilot", "auto pilot", "autonomous", "build me", "create me", "make me", "full auto", "handle it all", or "I want a/an..."
Task requires multiple phases: planning, coding, testing, and validation
User wants hands-off execution and is willing to let the system run to completion </Use_When>
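
The trigger phrases above can be sketched as a simple matcher. This is only an illustration: the document does not say how triggers are actually detected, and `looksLikeAutopilotRequest` is a hypothetical name. It also ignores the exclusions listed under Do_Not_Use_When.

```typescript
// Hypothetical trigger matcher; the phrase list is copied from the Use_When section.
const TRIGGER_PHRASES = [
  "autopilot",
  "auto pilot",
  "autonomous",
  "build me",
  "create me",
  "make me",
  "full auto",
  "handle it all",
];

// "I want a/an ..." is a pattern rather than a fixed phrase, so it gets its own regex.
const I_WANT_PATTERN = /\bi want an?\b/i;

function looksLikeAutopilotRequest(message: string): boolean {
  const text = message.toLowerCase();
  return TRIGGER_PHRASES.some((phrase) => text.includes(phrase)) || I_WANT_PATTERN.test(message);
}
```

A plain substring match like this will produce false positives (for example "make me understand this"), so it is best read as a first filter rather than a routing decision.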

<Do_Not_Use_When>

User wants to explore options or brainstorm -- use the `plan` skill instead
User says "just explain", "draft only", or "what would you suggest" -- respond conversationally
User wants a single focused code change -- use `ralph` or delegate to an executor agent
User wants to review or critique an existing plan -- use `plan --review`
Task is a quick fix or small bug -- use direct executor delegation </Do_Not_Use_When>

<Why_This_Exists> Most non-trivial software tasks require coordinated phases: understanding requirements, designing a solution, implementing in parallel, testing, and validating quality. Autopilot orchestrates all of these phases automatically so the user can describe what they want and receive working code without managing each step. </Why_This_Exists>

<Execution_Policy>

Each phase must complete before the next begins
Parallel execution is used within phases where possible (Phase 2 and Phase 4)
QA cycles repeat up to 5 times; if the same error persists 3 times, stop and report the fundamental issue
Validation requires approval from all reviewers; rejected items get fixed and re-validated
Cancel with `/cancel` at any time; progress is preserved for resume
If a deep-interview spec exists, use it as high-clarity phase input instead of re-expanding from scratch
If input is too vague for reliable expansion, offer/trigger `$deep-interview` first
Do not enter expansion/planning/execution-heavy phases until pre-context grounding exists; if fast execution is forced, proceed only with explicit risk notes
Default to concise, evidence-dense progress and completion reporting unless the user or risk level requires more detail
Treat newer user task updates as local overrides for the active workflow branch while preserving earlier non-conflicting constraints
If correctness depends on additional inspection, retrieval, execution, or verification, keep using the relevant tools until the workflow is grounded
Continue through clear, low-risk, reversible next steps automatically; ask only when the next step is materially branching, destructive, or preference-dependent </Execution_Policy>
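
The QA limits in this policy (up to 5 cycles, stop when the same error shows up 3 times) can be sketched as a small loop. This is a minimal sketch under stated assumptions: `runQaCycles` and the `runCycle` callback are hypothetical names, and "persists 3 times" is interpreted as three consecutive cycles with the same error signature.

```typescript
type QaOutcome =
  | { status: "passed"; cycles: number }
  | { status: "stuck"; cycles: number; error: string }
  | { status: "exhausted"; cycles: number; error: string };

// runCycle is a hypothetical callback: it runs build/lint/test once and returns
// null on success, or a stable error signature (e.g. the first failing message).
function runQaCycles(
  runCycle: (cycle: number) => string | null,
  maxCycles = 5,
  maxSameError = 3,
): QaOutcome {
  let lastError = "";
  let sameErrorCount = 0;

  for (let cycle = 1; cycle <= maxCycles; cycle++) {
    const error = runCycle(cycle);
    if (error === null) return { status: "passed", cycles: cycle };

    // Count consecutive repeats of the same error signature.
    sameErrorCount = error === lastError ? sameErrorCount + 1 : 1;
    lastError = error;

    if (sameErrorCount >= maxSameError) {
      // Same failure keeps coming back: report it instead of burning more cycles.
      return { status: "stuck", cycles: cycle, error };
    }
  }
  return { status: "exhausted", cycles: maxCycles, error: lastError };
}
```

Separating `stuck` from `exhausted` lets the final report distinguish a fundamental issue from a run that was still making progress when the cycle budget ran out.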

<Steps>

0. **Pre-context Intake (required before Phase 0 starts)**:
   - Derive a task slug from the request.
   - Load the latest relevant snapshot from `.omx/context/{slug}-*.md` when available.
   - If no snapshot exists, create `.omx/context/{slug}-{timestamp}.md` (UTC `YYYYMMDDTHHMMSSZ`) with:
     - Task statement
     - Desired outcome
     - Known facts/evidence
     - Constraints
     - Unknowns/open questions
     - Likely codebase touchpoints
   - If ambiguity remains high, run `explore` first for brownfield facts, then run `$deep-interview --quick <task>` before proceeding.
   - Carry the snapshot path into autopilot artifacts/state so all phases share grounded context.

Phase 0 - Expansion: Turn the user's idea into a detailed spec
- If `.omx/specs/deep-interview-*.md` exists for this task: reuse it and skip redundant expansion work
- If prompt is highly vague: route to `$deep-interview` for Socratic ambiguity-gated clarification
- Analyst (THOROUGH tier): Extract requirements
- Architect (THOROUGH tier): Create technical specification
- Output: `.omx/plans/autopilot-spec.md`

Phase 1 - Planning: Create an implementation plan from the spec
- Architect (THOROUGH tier): Create plan (direct mode, no interview)
- Critic (THOROUGH tier): Validate plan
- Output: `.omx/plans/autopilot-impl.md`

Phase 2 - Execution: Implement the plan using Ralph + Ultrawork
- LOW-tier executor/search roles: Simple tasks
- STANDARD-tier executor roles: Standard tasks
- THOROUGH-tier executor/architect roles: Complex tasks
- Run independent tasks in parallel

Phase 3 - QA: Cycle until all tests pass (UltraQA mode)
- Build, lint, test, fix failures
- Repeat up to 5 cycles
- Stop early if the same error repeats 3 times (indicates a fundamental issue)

Phase 4 - Validation: Multi-perspective review in parallel
- Architect: Functional completeness
- Security-reviewer: Vulnerability check
- Code-reviewer: Quality review
- All must approve; fix and re-validate on rejection

Phase 5 - Cleanup: Clear all mode state via OMX MCP tools on successful completion

```
state_clear({mode: "autopilot"})
state_clear({mode: "ralph"})
state_clear({mode: "ultrawork"})
state_clear({mode: "ultraqa"})
```
Or run `/cancel` for clean exit </Steps>
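
The snapshot naming from the pre-context intake step (`.omx/context/{slug}-{timestamp}.md` with a UTC `YYYYMMDDTHHMMSSZ` timestamp) can be sketched as follows. The helper names and the slug rules (lower case, hyphen-separated, 48-character cap) are assumptions for illustration; the document only fixes the path pattern and the timestamp format.

```typescript
// Hypothetical slug rule: lower-case, hyphen-separated, capped at maxLength characters.
function deriveSlug(request: string, maxLength = 48): string {
  const slug = request
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading/trailing hyphens
  return slug.slice(0, maxLength).replace(/-+$/, "") || "task";
}

// Formats a Date as UTC YYYYMMDDTHHMMSSZ.
function utcStamp(date: Date): string {
  // toISOString() yields e.g. 2025-01-02T03:04:05.000Z
  return date.toISOString().replace(/[-:]/g, "").replace(/\.\d{3}Z$/, "Z");
}

function snapshotPath(request: string, now: Date = new Date()): string {
  return `.omx/context/${deriveSlug(request)}-${utcStamp(now)}.md`;
}
```

Because the timestamp sorts lexicographically, the "latest relevant snapshot" for a slug is simply the last match of `.omx/context/{slug}-*.md` in sorted order.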

<Tool_Usage>

Before first MCP tool use, call `ToolSearch("mcp")` to discover deferred MCP tools
Use `ask_codex` with `agent_role: "architect"` for Phase 4 architecture validation
Use `ask_codex` with `agent_role: "security-reviewer"` for Phase 4 security review
Use `ask_codex` with `agent_role: "code-reviewer"` for Phase 4 quality review
Agents form their own analysis first, then consult Codex for cross-validation
If ToolSearch finds no MCP tools or Codex is unavailable, proceed without it -- never block on external tools </Tool_Usage>

State Management

Use `omx_state` MCP tools for autopilot lifecycle state.

On start:

`state_write({mode: "autopilot", active: true, current_phase: "expansion", started_at: "<now>", state: {context_snapshot_path: "<snapshot-path>"}})`

On phase transitions:

`state_write({mode: "autopilot", current_phase: "planning"})`

`state_write({mode: "autopilot", current_phase: "execution"})`

`state_write({mode: "autopilot", current_phase: "qa"})`

`state_write({mode: "autopilot", current_phase: "validation"})`

On completion:

`state_write({mode: "autopilot", active: false, current_phase: "complete", completed_at: "<now>"})`

On cancellation/cleanup: run `$cancel` (which should call `state_clear(mode="autopilot")`)
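
The lifecycle above is a fixed phase sequence, so the `state_write` payloads can be built by a small helper. This is a sketch, not the `omx_state` API: the type and function names are hypothetical, and using ISO 8601 strings for `<now>` is an assumption. The helper only builds payloads; passing them to `state_write` is left to the caller.

```typescript
// Phase names taken from the state_write examples; everything else is illustrative.
const PHASES = ["expansion", "planning", "execution", "qa", "validation", "complete"] as const;
type Phase = (typeof PHASES)[number];

interface AutopilotStateUpdate {
  mode: "autopilot";
  current_phase: Phase;
  active?: boolean;
  started_at?: string;
  completed_at?: string;
  state?: { context_snapshot_path: string };
}

// Payload for the "On start" write.
function startUpdate(contextSnapshotPath: string, now: Date): AutopilotStateUpdate {
  return {
    mode: "autopilot",
    active: true,
    current_phase: "expansion",
    started_at: now.toISOString(), // assumption: "<now>" is an ISO 8601 timestamp
    state: { context_snapshot_path: contextSnapshotPath },
  };
}

// Payload for moving from `current` to the phase that follows it.
function nextUpdate(current: Phase, now: Date): AutopilotStateUpdate {
  const index = PHASES.indexOf(current);
  if (index === PHASES.length - 1) throw new Error("autopilot is already complete");
  const next = PHASES[index + 1];
  if (next === "complete") {
    return { mode: "autopilot", active: false, current_phase: next, completed_at: now.toISOString() };
  }
  return { mode: "autopilot", current_phase: next };
}
```

Deriving the next phase from an ordered list keeps a phase from being skipped by accident, which matches the policy that each phase must complete before the next begins.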

Scenario Examples

Good: The user says `continue` after the workflow already has a clear next step. Continue the current branch of work instead of restarting or re-asking the same question.

Good: The user changes only the output shape or downstream delivery step (for example `make a PR`). Preserve earlier non-conflicting workflow constraints and apply the update locally.

Bad: The user says `continue`, and the workflow restarts discovery or stops before the missing verification/evidence is gathered.

<Examples>
<Good>
User: "autopilot A REST API for a bookstore inventory with CRUD operations using TypeScript"
Why good: Specific domain (bookstore), clear features (CRUD), technology constraint (TypeScript). Autopilot has enough context to expand into a full spec.
</Good>
<Good>
User: "build me a CLI tool that tracks daily habits with streak counting"
Why good: Clear product concept with a specific feature. The "build me" trigger activates autopilot.
</Good>
<Bad>
User: "fix the bug in the login page"
Why bad: This is a single focused fix, not a multi-phase project. Use direct executor delegation or ralph instead.
</Bad>
<Bad>
User: "what are some good approaches for adding caching?"
Why bad: This is an exploration/brainstorming request. Respond conversationally or use the plan skill.
</Bad>
</Examples>

<Escalation_And_Stop_Conditions>

Stop and report when the same QA error persists across 3 cycles (fundamental issue requiring human input)
Stop and report when validation keeps failing after 3 re-validation rounds
Stop when the user says "stop", "cancel", or "abort"
If requirements were too vague and expansion produces an unclear spec, pause and redirect to `$deep-interview` before proceeding </Escalation_And_Stop_Conditions>
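
The validation stop condition can be sketched as a gate that requires every reviewer to approve and gives up after a bounded number of rounds. The names below are hypothetical, and treating the limit as 3 review rounds in total (rather than 1 initial review plus 3 re-validations) is an assumption.

```typescript
type Reviewer = "architect" | "security-reviewer" | "code-reviewer";

interface Verdict {
  reviewer: Reviewer;
  approved: boolean;
  issues: string[];
}

type ValidationOutcome =
  | { status: "approved"; rounds: number }
  | { status: "escalate"; rounds: number; openIssues: string[] };

// review and fix are hypothetical callbacks: review collects one verdict per
// reviewer, fix applies changes for the rejected items before the next round.
function validateWithRetries(
  review: (round: number) => Verdict[],
  fix: (issues: string[]) => void,
  maxRounds = 3,
): ValidationOutcome {
  let openIssues: string[] = [];

  for (let round = 1; round <= maxRounds; round++) {
    const verdicts = review(round);
    const rejected = verdicts.filter((v) => !v.approved);

    // All reviewers must approve; an empty verdict list does not count as approval.
    if (verdicts.length > 0 && rejected.length === 0) {
      return { status: "approved", rounds: round };
    }

    openIssues = [];
    for (const verdict of rejected) {
      for (const issue of verdict.issues) openIssues.push(`${verdict.reviewer}: ${issue}`);
    }
    if (round < maxRounds) fix(openIssues); // no point fixing after the final round
  }
  return { status: "escalate", rounds: maxRounds, openIssues };
}
```

Returning the open issues with the `escalate` outcome gives the stop report something concrete to show the user.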

<Final_Checklist>

All 5 phases completed (Expansion, Planning, Execution, QA, Validation)
All validators approved in Phase 4
Tests pass (verified with fresh test run output)
Build succeeds (verified with fresh build output)
State files cleaned up
User informed of completion with summary of what was built </Final_Checklist>

Configuration

Optional settings in `~/.codex/config.toml`:

```toml
[omx.autopilot]
maxIterations = 10
maxQaCycles = 5
maxValidationRounds = 3
pauseAfterExpansion = false
pauseAfterPlanning = false
skipQa = false
skipValidation = false
```
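
Since every setting is optional, a consumer has to merge whatever the user provided with fallback values. The sketch below assumes the values shown above are the defaults, which the document does not state outright; `AutopilotConfig` and `resolveConfig` are hypothetical names, and parsing the TOML file itself is out of scope.

```typescript
interface AutopilotConfig {
  maxIterations: number;
  maxQaCycles: number;
  maxValidationRounds: number;
  pauseAfterExpansion: boolean;
  pauseAfterPlanning: boolean;
  skipQa: boolean;
  skipValidation: boolean;
}

// Assumption: the values listed under [omx.autopilot] are the defaults.
const DEFAULTS: AutopilotConfig = {
  maxIterations: 10,
  maxQaCycles: 5,
  maxValidationRounds: 3,
  pauseAfterExpansion: false,
  pauseAfterPlanning: false,
  skipQa: false,
  skipValidation: false,
};

// Merges user overrides (already parsed from TOML) over the defaults and
// rejects limits that would make the QA or validation loops meaningless.
function resolveConfig(overrides: Partial<AutopilotConfig> = {}): AutopilotConfig {
  const merged: AutopilotConfig = { ...DEFAULTS, ...overrides };
  for (const key of ["maxIterations", "maxQaCycles", "maxValidationRounds"] as const) {
    if (!Number.isInteger(merged[key]) || merged[key] < 1) {
      throw new Error(`${key} must be a positive integer`);
    }
  }
  return merged;
}
```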

Resume

If autopilot was cancelled or failed, run `/autopilot` again to resume from where it stopped.

Recommended Clarity Pipeline

Best Practices for Input

Be specific about the domain -- "bookstore" not "store"
Mention key features -- "with CRUD", "with authentication"
Specify constraints -- "using TypeScript", "with PostgreSQL"
Let it run -- avoid interrupting unless truly needed

Pipeline Orchestrator (v0.8+)

Autopilot can be driven by the configurable pipeline orchestrator (`src/pipeline/`), which sequences stages through a uniform `PipelineStage` interface:

RALPLAN (consensus planning) -> team-exec (Codex CLI workers) -> ralph-verify (architect verification)

Pipeline configuration options:

```toml
[omx.autopilot.pipeline]
maxRalphIterations = 10    # Ralph verification iteration ceiling
workerCount = 2            # Number of Codex CLI team workers
agentType = "executor"     # Agent type for team workers
```

The pipeline persists state via `pipeline-state.json` and supports resume from the last incomplete stage. See `src/pipeline/orchestrator.ts` for the full API.
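
To make the resume behaviour concrete, here is a minimal sketch of a stage runner that persists after each stage and skips stages already marked complete. It is illustrative only: the `PipelineStage` shape, the state layout, and `runPipeline` are assumptions, not the interface defined in `src/pipeline/`, and real stages would most likely be asynchronous.

```typescript
// Illustrative shapes only; the real PipelineStage interface may differ.
interface PipelineStage {
  name: string;
  run(context: Record<string, unknown>): Record<string, unknown>;
}

interface PipelineState {
  completed: string[]; // names of stages that already finished
  context: Record<string, unknown>; // accumulated stage outputs
}

// save stands in for writing pipeline-state.json; it is passed in so the
// sketch stays free of file-system code.
function runPipeline(
  stages: PipelineStage[],
  state: PipelineState,
  save: (state: PipelineState) => void,
): PipelineState {
  for (const stage of stages) {
    if (state.completed.includes(stage.name)) continue; // resume: skip finished stages
    const output = stage.run(state.context);
    state.context = { ...state.context, ...output };
    state.completed.push(stage.name);
    save(state); // persist after every stage so an interruption loses at most one stage
  }
  return state;
}
```

Keeping the accumulated context inside the persisted state is what lets a resumed run hand earlier outputs (for example the plan) to later stages without re-running them.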

Troubleshooting

Stuck in a phase? Check the TODO list for blocked tasks, run `state_read({mode: "autopilot"})`, or cancel and resume.

QA cycles exhausted? The same error 3 times indicates a fundamental issue. Review the error pattern; manual intervention may be needed.

Validation keeps failing? Review the specific issues. Requirements may have been too vague -- cancel and provide more detail. </Advanced>
