autopilot

<Purpose> Autopilot takes a brief product idea and autonomously handles the full lifecycle: requirements analysis, technical design, planning, parallel implementation, QA cycling, and multi-perspective validation. It produces working, verified code from a 2-3 line description. </Purpose>
<Use_When>
  • User wants end-to-end autonomous execution from an idea to working code
  • User says "autopilot", "auto pilot", "autonomous", "build me", "create me", "make me", "full auto", "handle it all", or "I want a/an..."
  • Task requires multiple phases: planning, coding, testing, and validation
  • User wants hands-off execution and is willing to let the system run to completion </Use_When>
<Do_Not_Use_When>
  • User wants to explore options or brainstorm -- use the `plan` skill instead
  • User says "just explain", "draft only", or "what would you suggest" -- respond conversationally
  • User wants a single focused code change -- use `ralph` or delegate to an executor agent
  • User wants to review or critique an existing plan -- use `plan --review`
  • Task is a quick fix or small bug -- use direct executor delegation </Do_Not_Use_When>
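The trigger and opt-out phrases listed above can be sketched as a small classifier. This is only an illustration: `routeRequest`, the `Route` type, plain substring matching, and the rule that opt-out phrases take precedence are all assumptions, not something the skill specifies.

```typescript
// Hypothetical router; names and matching strategy are assumptions for illustration.
type Route = "autopilot" | "conversational" | "other";

// Phrases copied from the Use_When / Do_Not_Use_When lists.
const AUTOPILOT_TRIGGERS = [
  "autopilot", "auto pilot", "autonomous", "build me", "create me",
  "make me", "full auto", "handle it all",
];
const CONVERSATIONAL_TRIGGERS = ["just explain", "draft only", "what would you suggest"];

function routeRequest(input: string): Route {
  const text = input.toLowerCase();
  // Assumption: opt-out phrases win over autopilot triggers.
  if (CONVERSATIONAL_TRIGGERS.some((phrase) => text.includes(phrase))) return "conversational";
  if (AUTOPILOT_TRIGGERS.some((phrase) => text.includes(phrase))) return "autopilot";
  // "I want a/an ..." is listed as a trigger as well.
  if (/\bi want an? /.test(text)) return "autopilot";
  return "other";
}
```

Requests that fall into `"other"` (single fixes, plan reviews) would still need the finer routing described in the list, which this sketch does not attempt.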
<Why_This_Exists> Most non-trivial software tasks require coordinated phases: understanding requirements, designing a solution, implementing in parallel, testing, and validating quality. Autopilot orchestrates all of these phases automatically so the user can describe what they want and receive working code without managing each step. </Why_This_Exists>
<Execution_Policy>
  • Each phase must complete before the next begins
  • Parallel execution is used within phases where possible (Phase 2 and Phase 4)
  • QA cycles repeat up to 5 times; if the same error persists 3 times, stop and report the fundamental issue
  • Validation requires approval from all reviewers; rejected items get fixed and re-validated
  • Cancel with `/cancel` at any time; progress is preserved for resume
  • If a deep-interview spec exists, use it as high-clarity phase input instead of re-expanding from scratch
  • If input is too vague for reliable expansion, offer/trigger `$deep-interview` first
  • Do not enter expansion/planning/execution-heavy phases until pre-context grounding exists; if fast execution is forced, proceed only with explicit risk notes
  • Default to concise, evidence-dense progress and completion reporting unless the user or risk level requires more detail
  • Treat newer user task updates as local overrides for the active workflow branch while preserving earlier non-conflicting constraints
  • If correctness depends on additional inspection, retrieval, execution, or verification, keep using the relevant tools until the workflow is grounded
  • Continue through clear, low-risk, reversible next steps automatically; ask only when the next step is materially branching, destructive, or preference-dependent </Execution_Policy>
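The QA rule in this policy (up to 5 cycles, stop when the same error persists 3 times) can be sketched as a loop. This is a minimal sketch under stated assumptions: `runQaCycle` is a hypothetical callback standing in for build/lint/test, errors are compared by an opaque signature string, and "persists" is read as three consecutive occurrences.

```typescript
// Sketch only: runQaCycle is a hypothetical callback that builds, lints, and tests,
// returning null on success or an error signature string on failure.
type QaOutcome =
  | { status: "passed"; cycles: number }
  | { status: "fundamental-issue"; cycles: number; error: string }
  | { status: "exhausted"; cycles: number; error: string };

function runQaLoop(
  runQaCycle: (cycle: number) => string | null,
  maxCycles = 5,
  maxSameError = 3,
): QaOutcome {
  let lastError = "";
  let repeatCount = 0;
  for (let cycle = 1; cycle <= maxCycles; cycle++) {
    const error = runQaCycle(cycle);
    if (error === null) return { status: "passed", cycles: cycle };
    // Count consecutive repeats of the same error signature.
    repeatCount = error === lastError ? repeatCount + 1 : 1;
    lastError = error;
    if (repeatCount >= maxSameError) {
      return { status: "fundamental-issue", cycles: cycle, error };
    }
  }
  return { status: "exhausted", cycles: maxCycles, error: lastError };
}
```

Returning a tagged outcome rather than throwing keeps the "stop and report" decision with the caller, which can then summarize the error pattern for the user.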
<Steps>
  0. Pre-context Intake (required before Phase 0 starts):
    • Derive a task slug from the request.
    • Load the latest relevant snapshot from `.omx/context/{slug}-*.md` when available.
    • If no snapshot exists, create `.omx/context/{slug}-{timestamp}.md` (UTC `YYYYMMDDTHHMMSSZ`) with:
      • Task statement
      • Desired outcome
      • Known facts/evidence
      • Constraints
      • Unknowns/open questions
      • Likely codebase touchpoints
    • If ambiguity remains high, run `explore` first for brownfield facts, then run `$deep-interview --quick <task>` before proceeding.
    • Carry the snapshot path into autopilot artifacts/state so all phases share grounded context.
  1. Phase 0 - Expansion: Turn the user's idea into a detailed spec
    • If `.omx/specs/deep-interview-*.md` exists for this task: reuse it and skip redundant expansion work
    • If prompt is highly vague: route to `$deep-interview` for Socratic ambiguity-gated clarification
    • Analyst (THOROUGH tier): Extract requirements
    • Architect (THOROUGH tier): Create technical specification
    • Output: `.omx/plans/autopilot-spec.md`
  2. Phase 1 - Planning: Create an implementation plan from the spec
    • Architect (THOROUGH tier): Create plan (direct mode, no interview)
    • Critic (THOROUGH tier): Validate plan
    • Output: `.omx/plans/autopilot-impl.md`
  3. Phase 2 - Execution: Implement the plan using Ralph + Ultrawork
    • LOW-tier executor/search roles: Simple tasks
    • STANDARD-tier executor roles: Standard tasks
    • THOROUGH-tier executor/architect roles: Complex tasks
    • Run independent tasks in parallel
  4. Phase 3 - QA: Cycle until all tests pass (UltraQA mode)
    • Build, lint, test, fix failures
    • Repeat up to 5 cycles
    • Stop early if the same error repeats 3 times (indicates a fundamental issue)
  5. Phase 4 - Validation: Multi-perspective review in parallel
    • Architect: Functional completeness
    • Security-reviewer: Vulnerability check
    • Code-reviewer: Quality review
    • All must approve; fix and re-validate on rejection
  6. Phase 5 - Cleanup: Clear all mode state via OMX MCP tools on successful completion
    • state_clear({mode: "autopilot"})
    • state_clear({mode: "ralph"})
    • state_clear({mode: "ultrawork"})
    • state_clear({mode: "ultraqa"})
    • Or run `/cancel` for clean exit </Steps>
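The snapshot naming from the pre-context intake step can be sketched as follows. The slug rules (lowercase, hyphen-separated, truncated to 40 characters) and the helper names `deriveSlug`, `utcStamp`, and `snapshotPath` are assumptions for illustration; the skill only fixes the path pattern and the UTC `YYYYMMDDTHHMMSSZ` timestamp shape.

```typescript
// Assumption: slugs are lowercase, hyphen-separated, and truncated; the skill does not define this.
function deriveSlug(request: string, maxLength = 40): string {
  return request
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one hyphen
    .replace(/^-+|-+$/g, "")     // trim leading/trailing hyphens
    .slice(0, maxLength)
    .replace(/-+$/, "");         // truncation may leave a trailing hyphen
}

// Formats a Date as UTC YYYYMMDDTHHMMSSZ, the timestamp shape named in step 0.
function utcStamp(date: Date): string {
  // toISOString() yields e.g. 2024-01-02T03:04:05.678Z
  return date.toISOString().replace(/[-:]/g, "").replace(/\.\d{3}Z$/, "Z");
}

// Builds the snapshot path following the .omx/context/{slug}-{timestamp}.md pattern.
function snapshotPath(request: string, now: Date = new Date()): string {
  return `.omx/context/${deriveSlug(request)}-${utcStamp(now)}.md`;
}
```

Because the timestamp sorts lexicographically, picking the "latest relevant snapshot" for a slug can be done by sorting matching file names.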
<Tool_Usage>
  • Before first MCP tool use, call `ToolSearch("mcp")` to discover deferred MCP tools
  • Use `ask_codex` with `agent_role: "architect"` for Phase 4 architecture validation
  • Use `ask_codex` with `agent_role: "security-reviewer"` for Phase 4 security review
  • Use `ask_codex` with `agent_role: "code-reviewer"` for Phase 4 quality review
  • Agents form their own analysis first, then consult Codex for cross-validation
  • If ToolSearch finds no MCP tools or Codex is unavailable, proceed without it -- never block on external tools </Tool_Usage>
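The cross-validation and fallback behavior described above, combined with the all-must-approve rule from Phase 4, can be sketched like this. Everything here is hypothetical: `consultCodex` merely stands in for an `ask_codex` call, the `Verdict` shape is invented for the example, and the policy of approving only when both the agent and Codex approve is an assumption.

```typescript
type ReviewerRole = "architect" | "security-reviewer" | "code-reviewer";

// Hypothetical verdict shape; not part of the skill.
interface Verdict {
  role: ReviewerRole;
  approved: boolean;
  issues: string[];
}

const REQUIRED_ROLES: ReviewerRole[] = ["architect", "security-reviewer", "code-reviewer"];

// The agent forms its own analysis first; consultCodex is optional and may throw.
function reviewWithFallback(
  role: ReviewerRole,
  ownAnalysis: Verdict,
  consultCodex?: (role: ReviewerRole) => Verdict,
): Verdict {
  if (!consultCodex) return ownAnalysis; // no external tool: proceed without it
  try {
    const second = consultCodex(role);
    // Assumed policy: approve only if both agree; merge reported issues.
    return {
      role,
      approved: ownAnalysis.approved && second.approved,
      issues: [...ownAnalysis.issues, ...second.issues],
    };
  } catch {
    return ownAnalysis; // never block on an unavailable external tool
  }
}

// Validation passes only when every required role has an approving verdict.
function allApproved(verdicts: Verdict[]): boolean {
  return REQUIRED_ROLES.every((role) =>
    verdicts.some((v) => v.role === role && v.approved),
  );
}
```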

State Management

Use `omx_state` MCP tools for autopilot lifecycle state.
  • On start:
    state_write({mode: "autopilot", active: true, current_phase: "expansion", started_at: "<now>", state: {context_snapshot_path: "<snapshot-path>"}})
  • On phase transitions:
    state_write({mode: "autopilot", current_phase: "planning"})
    state_write({mode: "autopilot", current_phase: "execution"})
    state_write({mode: "autopilot", current_phase: "qa"})
    state_write({mode: "autopilot", current_phase: "validation"})
  • On completion:
    state_write({mode: "autopilot", active: false, current_phase: "complete", completed_at: "<now>"})
  • On cancellation/cleanup: run `$cancel` (which should call `state_clear({mode: "autopilot"})`)
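The phase transitions above can be sketched as a small helper that produces the next `state_write` payload. The phase names and payload fields are taken from the calls above; the helper `nextTransition`, the `StateWritePayload` type, and the idea of deriving transitions from an ordered list are assumptions for illustration.

```typescript
// Phase names are taken from the state_write calls above; the helper itself is hypothetical.
const PHASES = ["expansion", "planning", "execution", "qa", "validation", "complete"] as const;
type Phase = (typeof PHASES)[number];

interface StateWritePayload {
  mode: "autopilot";
  current_phase: Phase;
  active?: boolean;
  completed_at?: string;
}

// Returns the payload for the next transition, or null when already complete.
function nextTransition(current: Phase, now: string): StateWritePayload | null {
  const index = PHASES.indexOf(current);
  if (index === PHASES.length - 1) return null;
  const next = PHASES[index + 1];
  return next === "complete"
    ? { mode: "autopilot", active: false, current_phase: next, completed_at: now }
    : { mode: "autopilot", current_phase: next };
}
```

A caller would pass the returned object to `state_write`; keeping the order in one list makes it harder to skip a phase, in line with the rule that each phase completes before the next begins.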

Scenario Examples

Good: The user says `continue` after the workflow already has a clear next step. Continue the current branch of work instead of restarting or re-asking the same question.
Good: The user changes only the output shape or downstream delivery step (for example `make a PR`). Preserve earlier non-conflicting workflow constraints and apply the update locally.
Bad: The user says `continue`, and the workflow restarts discovery or stops before the missing verification/evidence is gathered.
<Examples>
<Good> User: "autopilot A REST API for a bookstore inventory with CRUD operations using TypeScript"
Why good: Specific domain (bookstore), clear features (CRUD), technology constraint (TypeScript). Autopilot has enough context to expand into a full spec. </Good>
<Good> User: "build me a CLI tool that tracks daily habits with streak counting"
Why good: Clear product concept with a specific feature. The "build me" trigger activates autopilot. </Good>
<Bad> User: "fix the bug in the login page"
Why bad: This is a single focused fix, not a multi-phase project. Use direct executor delegation or ralph instead. </Bad>
<Bad> User: "what are some good approaches for adding caching?"
Why bad: This is an exploration/brainstorming request. Respond conversationally or use the plan skill. </Bad>
</Examples>
<Escalation_And_Stop_Conditions>
  • Stop and report when the same QA error persists across 3 cycles (fundamental issue requiring human input)
  • Stop and report when validation keeps failing after 3 re-validation rounds
  • Stop when the user says "stop", "cancel", or "abort"
  • If requirements were too vague and expansion produces an unclear spec, pause and redirect to `$deep-interview` before proceeding </Escalation_And_Stop_Conditions>
<Final_Checklist>
  • All 5 phases completed (Expansion, Planning, Execution, QA, Validation)
  • All validators approved in Phase 4
  • Tests pass (verified with fresh test run output)
  • Build succeeds (verified with fresh build output)
  • State files cleaned up
  • User informed of completion with summary of what was built </Final_Checklist>
<Advanced>
<Advanced>

Configuration

Optional settings in `~/.codex/config.toml`:
[omx.autopilot]
maxIterations = 10
maxQaCycles = 5
maxValidationRounds = 3
pauseAfterExpansion = false
pauseAfterPlanning = false
skipQa = false
skipValidation = false
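How these settings might be consumed can be sketched as a typed config with defaults. The field names and values mirror the table above; treating the listed values as defaults, and the names `AutopilotConfig`, `DEFAULT_CONFIG`, and `resolveConfig`, are assumptions rather than the tool's actual loader.

```typescript
// Field names mirror the [omx.autopilot] table; the types and loader are hypothetical.
interface AutopilotConfig {
  maxIterations: number;
  maxQaCycles: number;
  maxValidationRounds: number;
  pauseAfterExpansion: boolean;
  pauseAfterPlanning: boolean;
  skipQa: boolean;
  skipValidation: boolean;
}

// Assumption: the values shown in the document are the defaults.
const DEFAULT_CONFIG: AutopilotConfig = {
  maxIterations: 10,
  maxQaCycles: 5,
  maxValidationRounds: 3,
  pauseAfterExpansion: false,
  pauseAfterPlanning: false,
  skipQa: false,
  skipValidation: false,
};

// Merges user overrides (already parsed from TOML) over the defaults.
function resolveConfig(overrides: Partial<AutopilotConfig> = {}): AutopilotConfig {
  return { ...DEFAULT_CONFIG, ...overrides };
}
```

For example, `resolveConfig({ pauseAfterPlanning: true })` would keep every other setting at the value listed above.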

Resume

If autopilot was cancelled or failed, run `/autopilot` again to resume from where it stopped.

Recommended Clarity Pipeline

For ambiguous requests, prefer: `deep-interview -> ralplan -> autopilot`
  • `deep-interview`: ambiguity-gated Socratic requirements
  • `ralplan`: consensus planning (planner/architect/critic)
  • `autopilot`: execution + QA + validation

Best Practices for Input

  1. Be specific about the domain -- "bookstore" not "store"
  2. Mention key features -- "with CRUD", "with authentication"
  3. Specify constraints -- "using TypeScript", "with PostgreSQL"
  4. Let it run -- avoid interrupting unless truly needed

Pipeline Orchestrator (v0.8+)

Autopilot can be driven by the configurable pipeline orchestrator (`src/pipeline/`), which sequences stages through a uniform `PipelineStage` interface:
RALPLAN (consensus planning) -> team-exec (Codex CLI workers) -> ralph-verify (architect verification)
Pipeline configuration options (TOML):
[omx.autopilot.pipeline]
maxRalphIterations = 10    # Ralph verification iteration ceiling
workerCount = 2            # Number of Codex CLI team workers
agentType = "executor"     # Agent type for team workers
The pipeline persists state via `pipeline-state.json` and supports resume from the last incomplete stage. See `src/pipeline/orchestrator.ts` for the full API.
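The resume behavior can be illustrated with a minimal in-memory sketch. The shapes below are hypothetical: the real `PipelineStage` interface lives in `src/pipeline/` and may look quite different, and a real orchestrator would read and write `pipeline-state.json` instead of passing state around.

```typescript
// Hypothetical shapes: the real PipelineStage interface in src/pipeline/ may differ.
interface PipelineStage {
  name: string;
  run(): "done" | "failed";
}

// Stand-in for what pipeline-state.json might record (an assumption).
interface PipelineState {
  completed: string[];
}

// Runs stages in order, skipping those already completed, and stops at the first failure.
function runPipeline(stages: PipelineStage[], state: PipelineState): PipelineState {
  const completed = [...state.completed];
  for (const stage of stages) {
    if (completed.includes(stage.name)) continue; // finished in an earlier run
    if (stage.run() === "failed") break;          // stop; a later call resumes here
    completed.push(stage.name);
  }
  return { completed };
}
```

Calling `runPipeline` again with the returned state re-enters at the first stage that is not yet recorded as completed, which is the "resume from the last incomplete stage" behavior described above.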

Troubleshooting

Stuck in a phase? Check the TODO list for blocked tasks, run `state_read({mode: "autopilot"})`, or cancel and resume.
QA cycles exhausted? The same error 3 times indicates a fundamental issue. Review the error pattern; manual intervention may be needed.
Validation keeps failing? Review the specific issues. Requirements may have been too vague -- cancel and provide more detail. </Advanced>
</Advanced>