ce-work-beta

Work Execution Command

Execute work efficiently while maintaining quality and finishing features.

Introduction

This command takes a work document (plan or specification) or a bare prompt describing the work, and executes it systematically. The focus is on shipping complete features by understanding requirements quickly, following existing patterns, and maintaining quality throughout.

Beta rollout note: Invoke `ce-work-beta` manually when you want to trial Codex delegation. During the beta period, planning and workflow handoffs remain pointed at stable `ce-work` to avoid dual-path orchestration complexity.

Input Document

<input_document> #$ARGUMENTS </input_document>

Argument Parsing

Parse `$ARGUMENTS` for the following optional tokens. Strip each recognized token before interpreting the remainder as the plan file path or bare prompt.

| Token | Example | Effect |
|---|---|---|
| `delegate:codex` | `delegate:codex` | Activate Codex delegation mode for plan execution |
| `delegate:local` | `delegate:local` | Deactivate delegation even if enabled in config |

All tokens are optional. When absent, fall back to the resolution chain below.

Fuzzy activation: Also recognize imperative delegation-intent phrases such as "use codex", "delegate to codex", "codex mode", or "delegate mode" as equivalent to `delegate:codex`. A bare mention of "codex" in a prompt (e.g., "fix codex converter bugs") must NOT activate delegation -- only clear delegation intent triggers it.

Fuzzy deactivation: Also recognize phrases such as "no codex", "local mode", or "standard mode" as equivalent to `delegate:local`.
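
The token rules above can be sketched as a small shell function. This is a minimal illustration, not the command's actual parser: the name `parse_delegate`, the `flag|remainder` output format, and the example plan path are all hypothetical, and plain substring matching is only a rough stand-in for judging whether a phrase carries real delegation intent.

```shell
# Hypothetical helper: prints "<flag>|<remainder>", where flag is
# "codex", "local", or empty when no delegation token or phrase is found.
parse_delegate() (
  set -f                      # no glob expansion while splitting words
  flag=""
  rest=""
  for word in $1; do
    case "$word" in
      delegate:codex) flag="codex" ;;   # explicit tokens are stripped
      delegate:local) flag="local" ;;
      *) rest="${rest:+$rest }$word" ;;
    esac
  done
  if [ -z "$flag" ]; then
    # Fuzzy phrases: checked only when no explicit token was given (assumption).
    lower=$(printf '%s' "$rest" | tr '[:upper:]' '[:lower:]')
    case "$lower" in
      *"no codex"*|*"local mode"*|*"standard mode"*) flag="local" ;;
      *"use codex"*|*"delegate to codex"*|*"codex mode"*|*"delegate mode"*) flag="codex" ;;
    esac
  fi
  printf '%s|%s\n' "$flag" "$rest"
)

parse_delegate "delegate:codex docs/plans/example-plan.md"   # prints: codex|docs/plans/example-plan.md
parse_delegate "fix codex converter bugs"                    # prints: |fix codex converter bugs
```

The second call shows the guard described above: a bare mention of "codex" leaves the flag empty, so resolution falls through to the config and default.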

Settings Resolution Chain

After extracting tokens from arguments, resolve the delegation state using this precedence chain:

Argument flag -- `delegate:codex` or `delegate:local` from the current invocation (highest priority)
Config file -- extract settings from the config block below. Value `codex` for `work_delegate` activates delegation; `false` deactivates.
Hard default -- `false` (delegation off)

Config (pre-resolved): !`cat "$(git rev-parse --show-toplevel 2>/dev/null)/.compound-engineering/config.local.yaml" 2>/dev/null || cat "$(dirname "$(git rev-parse --path-format=absolute --git-common-dir 2>/dev/null)")/.compound-engineering/config.local.yaml" 2>/dev/null || echo '__NO_CONFIG__'`

If the block above contains YAML key-value pairs, extract values for the keys listed below. If it shows `__NO_CONFIG__`, the file does not exist -- all settings fall through to defaults. If it shows an unresolved command string, read `.compound-engineering/config.local.yaml` from the repo root using the native file-read tool (e.g., Read in Claude Code, read_file in Codex). If the file does not exist, all settings fall through to defaults.

If any setting has an unrecognized value, fall through to the hard default for that setting.
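
The precedence chain for the delegation state can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `resolve_delegation` and its two-word output are hypothetical, and reporting `config` as the source when the config value is `false` is an interpretation of the chain above rather than something the text spells out.

```shell
# Hypothetical helper. $1 = flag from the arguments ("codex", "local", or empty);
# $2 = work_delegate value read from config (may be empty or unrecognized).
# Prints "<delegation_active> <delegation_source>".
resolve_delegation() {
  case "$1" in
    codex) echo "true argument";  return 0 ;;
    local) echo "false argument"; return 0 ;;
  esac
  case "$2" in
    codex) echo "true config" ;;
    false) echo "false config" ;;
    *)     echo "false default" ;;   # missing or unrecognized: hard default
  esac
}

resolve_delegation local codex   # prints: false argument
resolve_delegation "" codex      # prints: true config
resolve_delegation "" maybe      # prints: false default
```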

Config keys:

- `work_delegate` -- `codex` or default `false`
- `work_delegate_consent` -- `true` or default `false`
- `work_delegate_sandbox` -- `yolo` (default) or `full-auto`
- `work_delegate_decision` -- `auto` (default) or `ask`
- `work_delegate_model` -- Codex model to use (default `gpt-5.4`). Passthrough: any valid model name is accepted.
- `work_delegate_effort` -- `minimal`, `low`, `medium`, `high` (default), or `xhigh`
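
As a rough illustration of the config file and a default fallback, the sketch below writes a sample `config.local.yaml` to a temporary directory and reads single keys from it. The helper `config_value` is hypothetical and assumes flat `key: value` lines without quotes or trailing comments; the sample values are illustrative, not recommended settings.

```shell
# Sample config written to a temporary directory (illustrative values only).
tmp=$(mktemp -d)
cat > "$tmp/config.local.yaml" <<'EOF'
work_delegate: codex
work_delegate_consent: true
work_delegate_sandbox: full-auto
work_delegate_model: gpt-5.4
EOF

# Hypothetical helper: print the value of a flat top-level key, or nothing.
config_value() {
  sed -n "s/^$1:[[:space:]]*//p" "$2" | head -n 1
}

config_value work_delegate_sandbox "$tmp/config.local.yaml"   # prints: full-auto

# A key that is absent falls through to its documented default.
effort=$(config_value work_delegate_effort "$tmp/config.local.yaml")
echo "${effort:-high}"                                        # prints: high
```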

Store the resolved state for downstream consumption:

- `delegation_active` -- boolean, whether delegation mode is on
- `delegation_source` -- `argument` or `config` or `default` -- how delegation was resolved (used by environment guard to decide notification verbosity)
- `sandbox_mode` -- `yolo` or `full-auto` (from config or default `yolo`)
- `consent_granted` -- boolean (from config `work_delegate_consent`)
- `delegate_model` -- string (from config or default `gpt-5.4`)
- `delegate_effort` -- string (from config or default `high`)

Execution Workflow

Phase 0: Input Triage

Determine how to proceed based on what was provided in `<input_document>`.

Plan document (input is a file path to an existing plan or specification) → skip to Phase 1.

Bare prompt (input is a description of work, not a file path):

Scan the work area
- Identify files likely to change based on the prompt
- Find existing test files for those areas (search for test/spec files that import, reference, or share names with the implementation files)
- Note local patterns and conventions in the affected areas

Assess complexity and route

| Complexity | Signals | Action |
|---|---|---|
| Trivial | 1-2 files, no behavioral change (typo, config, rename) | Proceed to Phase 1 step 2 (environment setup), then implement directly -- no task list, no execution loop. Apply Test Discovery if the change touches behavior-bearing code |
| Small / Medium | Clear scope, under ~10 files | Build a task list from discovery. Proceed to Phase 1 step 2 |
| Large | Cross-cutting, architectural decisions, 10+ files, touches auth/payments/migrations | Inform the user this would benefit from `/ce-brainstorm` or `/ce-plan` to surface edge cases and scope boundaries. Honor their choice. If proceeding, build a task list and continue to Phase 1 step 2 |
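
The routing above is a judgment call, but its signals can be approximated mechanically. The sketch below is only a rough heuristic under stated assumptions: the function name `route_complexity` and its three inputs are hypothetical, and reducing "cross-cutting" or "touches auth/payments/migrations" to a single yes/no flag loses nuance that the table leaves to the reader.

```shell
# Hypothetical helper. $1 = estimated file count, $2 = "yes" if behavior changes,
# $3 = "yes" if the work is cross-cutting or touches auth/payments/migrations.
route_complexity() {
  if [ "$3" = "yes" ] || [ "$1" -ge 10 ]; then
    echo "large"
  elif [ "$1" -le 2 ] && [ "$2" != "yes" ]; then
    echo "trivial"
  else
    echo "small-medium"
  fi
}

route_complexity 1 no no     # prints: trivial
route_complexity 6 yes no    # prints: small-medium
route_complexity 3 yes yes   # prints: large
```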

Phase 1: Quick Start

Read Plan and Clarify (skip if arriving from Phase 0 with a bare prompt)
- Read the work document completely
- Treat the plan as a decision artifact, not an execution script
- If the plan includes sections such as `Implementation Units`, `Work Breakdown`, `Requirements Trace`, `Files`, `Test Scenarios`, or `Verification`, use those as the primary source material for execution
- Check for `Execution note` on each implementation unit -- these carry the plan's execution posture signal for that unit (for example, test-first or characterization-first). Note them when creating tasks.
- Check for a `Deferred to Implementation` or `Implementation-Time Unknowns` section -- these are questions the planner intentionally left for you to resolve during execution. Note them before starting so they inform your approach rather than surprising you mid-task
- Check for a `Scope Boundaries` section -- these are explicit non-goals. Refer back to them if implementation starts pulling you toward adjacent work
- Review any references or links provided in the plan
- If the user explicitly asks for TDD, test-first, or characterization-first execution in this session, honor that request even if the plan has no `Execution note`
- If anything is unclear or ambiguous, ask clarifying questions now
- If clarifying questions were needed above, get user approval on the resolved answers. If no clarifications were needed, proceed without a separate approval step -- plan scope is the plan's authority, not something to renegotiate
- Do not skip this -- better to ask questions now than build the wrong thing
- Do not edit the plan body during execution. The plan is a decision artifact; progress lives in git commits and the task tracker. The only plan mutation during ce-work is the final `status: active → completed` flip at shipping (see `references/shipping-workflow.md` Phase 4 Step 2). Legacy plans may contain `- [ ]` / `- [x]` marks on unit headings -- ignore them as state; per-unit completion is determined during execution by reading the current file state.
Setup Environment

First, check the current branch:
```bash
current_branch=$(git branch --show-current)
default_branch=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@')

# Fallback if remote HEAD isn't set
if [ -z "$default_branch" ]; then
  default_branch=$(git rev-parse --verify origin/main >/dev/null 2>&1 && echo "main" || echo "master")
fi
```
If already on a feature branch (not the default branch):
First, check whether the branch name is meaningful -- a name like `feat/crowd-sniff` or `fix/email-validation` tells future readers what the work is about. Auto-generated worktree names (e.g., `worktree-jolly-beaming-raven`) or other opaque names do not.
If the branch name is meaningless or auto-generated, suggest renaming it before continuing:
```bash
git branch -m <meaningful-name>
```
Derive the new name from the plan title or work description (e.g., `feat/crowd-sniff`). Present the rename as a recommended option alongside continuing as-is.
Then ask: "Continue working on `[current_branch]`, or create a new branch?"
- If continuing (with or without rename), proceed to step 3
- If creating new, follow Option A or B below
If on the default branch, choose how to proceed:

Option A: Create a new branch
```bash
git pull origin [default_branch]
git checkout -b feature-branch-name
```
Use a meaningful name based on the work (e.g., `feat/user-authentication`, `fix/email-validation`).
Option B: Use a worktree (recommended for parallel development)
```bash
skill: ce-worktree
# The skill will create a new branch from the default branch in an isolated worktree
```
Option C: Continue on the default branch
- Requires explicit user confirmation
- Only proceed after user explicitly says "yes, commit to [default_branch]"
- Never commit directly to the default branch without explicit permission
Recommendation: Use worktree if:
- You want to work on multiple features simultaneously
- You want to keep the default branch clean while experimenting
- You plan to switch between branches frequently
Create Task List (skip if Phase 0 already built one, or if Phase 0 routed as Trivial)
- Use the platform's task tracking tool (`TaskCreate` / `TaskUpdate` / `TaskList` in Claude Code, `update_plan` in Codex, or the equivalent on other harnesses) to break the plan into actionable tasks
- Derive tasks from the plan's implementation units, dependencies, files, test targets, and verification criteria
- When the plan defines U-IDs for Implementation Units, preserve the unit's U-ID as a prefix in the task subject (e.g., "U3: Add parser coverage"). This keeps blocker references, deferred-work notes, and final summaries anchored to the same identifier the plan uses, so progress and traceability remain unambiguous across plan edits
- Carry each unit's `Execution note` into the task when present
- For each unit, read the `Patterns to follow` field before implementing -- these point to specific files or conventions to mirror
- Use each unit's `Verification` field as the primary "done" signal for that task
- Do not expect the plan to contain implementation code, micro-step TDD instructions, or exact shell commands
- Include dependencies between tasks
- Prioritize based on what needs to be done first
- Include testing and quality check tasks
- Keep tasks specific and completable

Choose Execution Strategy

Delegation routing gate: If `delegation_active` is true AND the input is a plan file (not a bare prompt), read `references/codex-delegation-workflow.md` and follow its Pre-Delegation Checks and Delegation Decision flow. If all checks pass and delegation proceeds, force serial execution and proceed directly to Phase 2 using the workflow's batched execution loop. If any check disables delegation, fall through to the standard strategy table below. If delegation is active but the input is a bare prompt (no plan file), set `delegation_active` to false with a brief note: "Codex delegation requires a plan file -- using standard mode." and continue with the standard strategy selection below.

After creating the task list, decide how to execute based on the plan's size and dependency structure:

| Strategy | When to use |
|---|---|
| Inline | 1-2 small tasks, or tasks needing user interaction mid-flight. Default for bare-prompt work -- bare prompts rarely produce enough structured context to justify subagent dispatch |
| Serial subagents | 3+ tasks with dependencies between them. Each subagent gets a fresh context window focused on one unit -- prevents context degradation across many tasks. Requires plan-unit metadata (Goal, Files, Approach, Test scenarios) |
| Parallel subagents | 3+ tasks that pass the Parallel Safety Check (below). Dispatch independent units simultaneously, run dependent units after their prerequisites complete. Requires plan-unit metadata |

Parallel Safety Check — required before choosing parallel dispatch:

Build a file-to-unit mapping from every candidate unit's `Files:` section (Create, Modify, and Test paths)
Check for intersection -- any file path appearing in 2+ units means overlap
If any overlap is found, downgrade to serial subagents. Log the reason (e.g., "Units 2 and 4 share `config/routes.rb` -- using serial dispatch"). Serial subagents still provide context-window isolation without shared-directory risks
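
The intersection check can be sketched with standard text tools. This is a minimal sketch, assuming the file-to-unit mapping has already been flattened to one `<unit> <path>` pair per line; that input format, the function name `find_overlaps`, and the sample unit IDs and paths are hypothetical, and paths containing spaces are not handled.

```shell
# Input on stdin: one "<unit> <path>" pair per line (hypothetical format).
# Output: every path claimed by two or more distinct units.
find_overlaps() {
  # sort -u drops repeated pairs, so a unit listing the same file twice
  # (e.g., under both Modify and Test) is not counted as an overlap.
  sort -u | awk '{ print $2 }' | sort | uniq -d
}

overlaps=$(printf '%s\n' \
  'U2 config/routes.rb' \
  'U2 app/models/user.rb' \
  'U2 config/routes.rb' \
  'U4 config/routes.rb' \
  'U5 lib/parser.rb' | find_overlaps)

if [ -n "$overlaps" ]; then
  echo "overlap on: $overlaps -- using serial dispatch"
else
  echo "no declared overlap -- units are eligible for parallel dispatch"
fi
```

With the sample input this reports `config/routes.rb`, since units U2 and U4 both declare it. An empty result only clears the declared file lists; the shared-working-directory risks still apply.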

Even with no file overlap, parallel subagents sharing a working directory face git index contention (concurrent staging/committing corrupts the index) and test interference (concurrent test runs pick up each other's in-progress changes). The parallel subagent constraints below mitigate these.

Subagent dispatch uses your available subagent or task spawning mechanism. For each unit, give the subagent:

The full plan file path (for overall context)
The specific unit's Goal, Files, Approach, Execution note, Patterns, Test scenarios, and Verification
Any resolved deferred questions relevant to that unit
Instruction to check whether the unit's test scenarios cover all applicable categories (happy paths, edge cases, error paths, integration) and supplement gaps before writing tests

Parallel subagent constraints — when dispatching units in parallel (not serial or inline):

Instruct each subagent: "Do not stage files (`git add`), create commits, or run the project test suite. The orchestrator handles testing, staging, and committing after all parallel units complete."
These constraints prevent git index contention and test interference between concurrent subagents

Permission mode: Omit the `mode` parameter when dispatching subagents so the user's configured permission settings apply. Do not pass `mode: "auto"` -- it overrides user-level settings like `bypassPermissions`.

After each subagent completes (serial mode):

Review the subagent's diff -- verify changes match the unit's scope and `Files:` list
Run the relevant test suite to confirm the tree is healthy
If tests fail, diagnose and fix before proceeding — do not dispatch dependent units on a broken tree
Update the task list (do not edit the plan body — progress is carried by the commit)
Dispatch the next unit

After all parallel subagents in a batch complete:

Wait for every subagent in the current parallel batch to finish before acting on any of their results
Cross-check for discovered file collisions: compare the actual files modified by all subagents in the batch (not just their declared `Files:` lists). Subagents may create or modify files not anticipated during planning -- this is expected, since plans describe what, not how. A collision only matters when 2+ subagents in the same batch modified the same file. In a shared working directory, only the last writer's version survives -- the other unit's changes to that file are lost. If a collision is detected: commit all non-colliding files from all units first, then re-run the affected units serially for the shared file so each builds on the other's committed work
For each completed unit, in dependency order: review the diff, run the relevant test suite, stage only that unit's files, and commit with a conventional message derived from the unit's Goal
If tests fail after committing a unit's changes, diagnose and fix before committing the next unit
Update the task list (do not edit the plan body — progress is carried by the commits just made)
Dispatch the next batch of independent units, or the next dependent unit

Phase 2: Execute

Task Execution Loop

For each task in priority order:

while (tasks remain):
  - Mark task as in-progress
  - Read any referenced files from the plan or discovered during Phase 0
  - **If the unit's work is already present and matches the plan's intent** (files exist with the expected capability, or the unit's `Verification` criteria are already satisfied by the current code), the work has likely shipped on a prior branch or session. Verify it matches, mark the task complete, and move on. Do not silently reimplement.
  - Look for similar patterns in codebase
  - Find existing test files for implementation files being changed (Test Discovery — see below)
  - If delegation_active: branch to the Codex Delegation Execution Loop
    (see `references/codex-delegation-workflow.md`)
  - Otherwise: implement following existing conventions
  - Add, update, or remove tests to match implementation changes (see Test Discovery below)
  - Run System-Wide Test Check (see below)
  - Run tests after changes
  - Assess testing coverage: did this task change behavior? If yes, were tests written or updated? If no tests were added, is the justification deliberate (e.g., pure config, no behavioral change)?
  - Mark task as completed
  - Evaluate for incremental commit (see below)

When a unit carries an `Execution note`, honor it. For test-first units, write the failing test before implementation for that unit. For characterization-first units, capture existing behavior before changing it. For units without an `Execution note`, proceed pragmatically.

Guardrails for execution posture:

Do not write the test and implementation in the same step when working test-first
Do not skip verifying that a new test fails before implementing the fix or feature
Do not over-implement beyond the current behavior slice when working test-first
Skip test-first discipline for trivial renames, pure configuration, and pure styling work

Test Discovery — Before implementing changes to a file, find its existing test files (search for test/spec files that import, reference, or share naming patterns with the implementation file). When a plan specifies test scenarios or test files, start there, then check for additional test coverage the plan may not have enumerated. Changes to implementation files should be accompanied by corresponding test updates — new tests for new behavior, modified tests for changed behavior, removed or updated tests for deleted behavior.

Test Scenario Completeness -- Before writing tests for a feature-bearing unit, check whether the plan's `Test scenarios` cover all categories that apply to this unit. If a category is missing or scenarios are vague (e.g., "validates correctly" without naming inputs and expected outcomes), supplement from the unit's own context before writing tests:

| Category | When it applies | How to derive if missing |
|---|---|---|
| Happy path | Always for feature-bearing units | Read the unit's Goal and Approach for core input/output pairs |
| Edge cases | When the unit has meaningful boundaries (inputs, state, concurrency) | Identify boundary values, empty/nil inputs, and concurrent access patterns |
| Error/failure paths | When the unit has failure modes (validation, external calls, permissions) | Enumerate invalid inputs the unit should reject, permission/auth denials it should enforce, and downstream failures it should handle |
| Integration | When the unit crosses layers (callbacks, middleware, multi-service) | Identify the cross-layer chain and write a scenario that exercises it without mocks |

System-Wide Test Check — Before marking a task done, pause and ask:

| Question | What to do |
|---|---|
| What fires when this runs? Callbacks, middleware, observers, event handlers -- trace two levels out from your change. | Read the actual code (not docs) for callbacks on models you touch, middleware in the request chain, `after_*` hooks. |
| Do my tests exercise the real chain? If every dependency is mocked, the test proves your logic works in isolation -- it says nothing about the interaction. | Write at least one integration test that uses real objects through the full callback/middleware chain. No mocks for the layers that interact. |
| Can failure leave orphaned state? If your code persists state (DB row, cache, file) before calling an external service, what happens when the service fails? Does retry create duplicates? | Trace the failure path with real objects. If state is created before the risky call, test that failure cleans up or that retry is idempotent. |
| What other interfaces expose this? Mixins, DSLs, alternative entry points (Agent vs Chat vs ChatMethods). | Grep for the method/behavior in related classes. If parity is needed, add it now -- not as a follow-up. |
| Do error strategies align across layers? Retry middleware + application fallback + framework error handling -- do they conflict or create double execution? | List the specific error classes at each layer. Verify your rescue list matches what the lower layer actually raises. |

When to skip: Leaf-node changes with no callbacks, no state persistence, no parallel interfaces. If the change is purely additive (new helper method, new view partial), the check takes 10 seconds and the answer is "nothing fires, skip."

When this matters most: Any change that touches models with callbacks, error handling with fallback/retry, or functionality exposed through multiple interfaces.

Incremental Commits

After completing each task, evaluate whether to create an incremental commit:

| Commit when... | Don't commit when... |
|---|---|
| Logical unit complete (model, service, component) | Small part of a larger unit |
| Tests pass + meaningful progress | Tests failing |
| About to switch contexts (backend → frontend) | Purely scaffolding with no behavior |
| About to attempt risky/uncertain changes | Would need a "WIP" commit message |

Heuristic: "Can I write a commit message that describes a complete, valuable change? If yes, commit. If the message would be 'WIP' or 'partial X', wait."

If the plan has Implementation Units, use them as a starting guide for commit boundaries — but adapt based on what you find during implementation. A unit might need multiple commits if it's larger than expected, or small related units might land together. Use each unit's Goal to inform the commit message.

Commit workflow:

```bash
# 1. Verify tests pass (use project's test command)
# Examples: bin/rails test, npm test, pytest, go test, etc.

# 2. Stage only files related to this logical unit (not `git add .`)
git add <files related to this logical unit>

# 3. Commit with conventional message
git commit -m "feat(scope): description of this unit"
```

Handling merge conflicts: If conflicts arise during rebasing or merging, resolve them immediately. Incremental commits make conflict resolution easier since each commit is small and focused.

Note: Incremental commits use clean conventional messages without attribution footers. The final Phase 4 commit/PR includes the full attribution.

Parallel subagent mode: When units run as parallel subagents, the subagents do not commit — the orchestrator handles staging and committing after the entire parallel batch completes (see Parallel subagent constraints in Phase 1 Step 4). The commit guidance in this section applies to inline and serial execution, and to the orchestrator's commit decisions after parallel batch completion.

Follow Existing Patterns
- The plan should reference similar code - read those files first
- Match naming conventions exactly
- Reuse existing components where possible
- Follow project coding standards (see AGENTS.md; use CLAUDE.md only if the repo still keeps a compatibility shim)
- When in doubt, grep for similar implementations
Test Continuously
- Run relevant tests after each significant change
- Don't wait until the end to test
- Fix failures immediately
- Add new tests for new behavior, update tests for changed behavior, remove tests for deleted behavior
- Unit tests with mocks prove logic in isolation. Integration tests with real objects prove the layers work together. If your change touches callbacks, middleware, or error handling — you need both.
Simplify as You Go

After completing a cluster of related implementation units (or every 2-3 units), review recently changed files for simplification opportunities — consolidate duplicated patterns, extract shared helpers, and improve code reuse and efficiency. This is especially valuable when using subagents, since each agent works with isolated context and can't see patterns emerging across units.

Don't simplify after every single unit — early patterns may look duplicated but diverge intentionally in later units. Wait for a natural phase boundary or when you notice accumulated complexity.

If a `/simplify` skill or equivalent is available, use it. Otherwise, review the changed files yourself for reuse and consolidation opportunities.
Figma Design Sync (if applicable)

For UI work with Figma designs:
- Implement components following design specs
- Use the `ce-figma-design-sync` agent iteratively to compare the implementation against the design
- Fix visual differences identified
- Repeat until implementation matches design
Frontend Design Guidance (if applicable)

For UI tasks without a Figma design -- where the implementation touches view, template, component, layout, or page files, creates user-visible routes, or the plan contains explicit UI/frontend/design language:
- Load the `ce-frontend-design` skill before implementing
- Follow its detection, guidance, and verification flow
- If the skill produced a verification screenshot, it satisfies Phase 4's screenshot requirement -- no need to capture separately. If the skill fell back to mental review (no browser access), Phase 4's screenshot capture still applies
Track Progress
- Keep the task list updated as you complete tasks
- Note any blockers or unexpected discoveries
- Create new tasks if scope expands
- Keep user informed of major milestones
- When the plan defines U-IDs for Implementation Units, or the plan or origin document carries stable R-IDs (and optionally A/F/AE IDs), reference them in blockers, deferred-work notes, task summaries, and final verification — not routine status updates. U-IDs anchor units across plan edits; R/A/F/AE anchor product intent across the brainstorm-plan handoff. Use the IDs the plan supplies and do not invent ones it does not. This preserves traceability without burying signal under noise.

任务执行循环

按优先级顺序处理每个任务：

while (tasks remain):
  - 将任务标记为进行中
  - 读取计划中引用的文件或阶段0中发现的文件
  - **若单元的工作已存在且符合计划意图**（文件具备预期功能，或当前代码已满足单元的`Verification`标准），则该工作可能已在之前的分支或会话中交付。验证匹配后，将任务标记为完成并继续。请勿静默重新实现。
  - 在代码库中查找类似模式
  - 为正在修改的实现文件查找现有测试文件（测试发现——见下文）
  - 若`delegation_active`为true：切换到Codex委托执行循环
    (见`references/codex-delegation-workflow.md`)
  - 否则：遵循现有约定实现
  - 添加、更新或移除测试以匹配实现变更（见下文测试发现）
  - 运行系统级测试检查（见下文）
  - 变更后运行测试
  - 评估测试覆盖率：该任务是否变更了行为？若是，是否编写或更新了测试？若未添加测试，是否有合理理由（例如纯配置变更，无行为变更）？
  - 将任务标记为完成
  - 评估是否进行增量提交（见下文）

若单元带有 `Execution note`，需遵守该说明。对于测试优先的单元，在实现该单元前先编写失败的测试。对于特性优先的单元，在变更前先捕获现有行为。对于无 `Execution note` 的单元，务实推进。

执行姿态防护措施：

- 测试优先工作时，请勿在同一步骤中编写测试和实现
- 请勿跳过验证新测试在修复或实现功能前是否失败的步骤
- 测试优先工作时，请勿过度实现超出当前行为范围的内容
- 对于琐碎的重命名、纯配置和纯样式工作，可跳过测试优先流程

测试发现 — 在修改文件前，查找其现有测试文件（搜索导入、引用或与实现文件命名模式一致的测试/规范文件）。若计划指定了测试场景或测试文件，从这些开始，然后检查计划可能未列举的额外测试覆盖率。实现文件的变更应伴随相应的测试更新——为新行为添加新测试，为变更行为修改测试，为删除行为移除或更新测试。
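
One part of test discovery, matching test or spec files by naming pattern, can be sketched as a pure function over a list of repository paths. This is a minimal sketch under assumptions: `candidate_test_files` is a hypothetical helper, and the naming patterns are common conventions, not rules stated by the plan. It does not cover the other half of discovery, searching for files that import or reference the implementation.

```python
from pathlib import PurePosixPath


def candidate_test_files(impl_path: str, repo_files: list[str]) -> list[str]:
    """Return repo files whose names follow common test/spec naming for impl_path."""
    stem = PurePosixPath(impl_path).stem  # "user_service" for app/user_service.py
    expected_stems = {
        f"test_{stem}",   # test_user_service.py
        f"{stem}_test",   # user_service_test.go
        f"{stem}_spec",   # user_service_spec.rb
        f"{stem}.test",   # user_service.test.ts
        f"{stem}.spec",   # user_service.spec.ts
    }
    matches = [
        path
        for path in repo_files
        if path != impl_path and PurePosixPath(path).stem in expected_stems
    ]
    return sorted(matches)


files = [
    "app/user_service.py",
    "tests/test_user_service.py",
    "spec/user_service_spec.rb",
    "src/user_service.test.ts",
    "tests/test_billing.py",
]
print(candidate_test_files("app/user_service.py", files))
```

An empty result is a prompt to search further, not proof that no tests exist.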

测试场景完整性 - 在为功能单元编写测试前，检查计划的 `Test scenarios` 是否涵盖该单元适用的所有类别。若某类别缺失或场景模糊（例如"validates correctly"未指定输入和预期结果），在编写测试前根据单元自身上下文补充：

类别	适用场景	缺失时如何推导
正常路径	所有功能单元均适用	从单元的目标和方法中读取核心输入/输出对
边缘情况	当单元有明确边界（输入、状态、并发）时	识别边界值、空/nil输入和并发访问模式
错误/失败路径	当单元有失败模式（验证、外部调用、权限）时	列举单元应拒绝的无效输入、应执行的权限/认证拒绝，以及应处理的下游失败
集成	当单元跨层（回调、中间件、多服务）时	识别跨层链并编写不使用模拟的场景来测试

系统级测试检查 — 在标记任务完成前，暂停并思考：

问题	操作
运行时会触发什么？回调、中间件、观察者、事件处理程序——从你的变更向外追踪两层。	读取你接触的模型的回调、请求链中的中间件、 `after_*` 钩子的实际代码（而非文档）。
我的测试是否覆盖了真实链路？若所有依赖均被模拟，测试仅证明你的逻辑在隔离环境中有效——无法证明交互是否正常。	至少编写一个使用真实对象贯穿完整回调/中间件链的集成测试。交互层不使用模拟。
失败是否会留下孤立状态？若你的代码在调用外部服务前持久化状态（数据库行、缓存、文件），服务失败时会发生什么？重试是否会创建重复数据？	使用真实对象追踪失败路径。若在风险调用前创建了状态，测试失败是否会清理状态或重试是否具有幂等性。
还有哪些接口暴露此功能？混合类、DSL、替代入口点（Agent vs Chat vs ChatMethods）。	在相关类中搜索该方法/行为。若需要保持一致性，立即添加——不要留到后续。
各层的错误策略是否一致？重试中间件 + 应用回退 + 框架错误处理——它们是否冲突或导致重复执行？	列出每层的具体错误类。验证你的捕获列表是否与下层实际抛出的错误匹配。

可跳过场景：无回调、无状态持久化、无并行接口的叶节点变更。若变更为纯添加（新辅助方法、新视图片段），检查仅需10秒，答案为"无触发内容，跳过"。

重点关注场景：任何涉及带回调的模型、带回退/重试的错误处理，或通过多个接口暴露的功能的变更。
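
The orphaned-state question in the table above can be made concrete with a small sketch: state is persisted before a risky external call, cleaned up when the call fails, and keyed so that a retry does not create a duplicate row. All names here (`InMemoryOrders`, `PaymentGatewayError`, `place_order`) are hypothetical stand-ins for real persistence and a real external service.

```python
class PaymentGatewayError(Exception):
    """Raised by the hypothetical external service."""


class InMemoryOrders:
    """Stand-in for a real table, keyed by an idempotency key."""

    def __init__(self):
        self.rows = {}

    def upsert(self, key: str, status: str) -> None:
        self.rows[key] = status

    def delete(self, key: str) -> None:
        self.rows.pop(key, None)


def place_order(orders: InMemoryOrders, key: str, charge) -> None:
    """Persist state, call the external service, clean up if the call fails."""
    orders.upsert(key, "pending")  # state exists before the risky call
    try:
        charge(key)
    except PaymentGatewayError:
        orders.delete(key)  # do not leave an orphaned "pending" row
        raise
    orders.upsert(key, "paid")
```

A test for this path would call `place_order` with a failing `charge`, assert that no row remains, then retry and assert that exactly one row exists. Whether the external charge itself is idempotent is outside this sketch.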

增量提交

完成每个任务后，评估是否创建增量提交：

提交时机...	不提交时机...
逻辑单元完成（模型、服务、组件）	大型单元的一小部分
测试通过 + 有意义的进度	测试失败
即将切换上下文（后端 → 前端）	纯脚手架，无行为变更
即将尝试有风险/不确定的变更	需要"WIP"提交消息

启发式规则："我能否编写描述完整、有价值变更的提交消息？若可以，提交。若消息为'WIP'或'partial X'，等待。"

若计划包含实现单元，将其作为提交边界的起始指南——但需根据实现过程中的发现调整。若单元比预期大，可能需要多次提交；或相关的小单元可一起提交。使用每个单元的目标来指导提交消息。

提交工作流：

bash

# 1. 验证测试通过（使用项目的测试命令）
# 示例：bin/rails test, npm test, pytest, go test, etc.

# 2. 仅暂存与此逻辑单元相关的文件（不要使用`git add .`）
git add <files related to this logical unit>

# 3. 使用符合规范的消息提交
git commit -m "feat(scope): description of this unit"

处理合并冲突：若在变基或合并过程中出现冲突，立即解决。增量提交使冲突解决更简单，因为每个提交都小而集中。

注意：增量提交使用简洁的规范消息，无署名页脚。最终阶段4的提交/PR包含完整署名。

并行子代理模式：当单元以并行子代理运行时，子代理不提交——协调器在整个并行批次完成后处理暂存和提交（见阶段1第4步中的并行子代理约束）。本节的提交指南适用于内联和串行执行，以及协调器在并行批次完成后的提交决策。

遵循现有模式
- 计划应引用类似代码——先读取这些文件
- 完全匹配命名约定
- 尽可能重用现有组件
- 遵循项目编码标准（见AGENTS.md；仅当仓库仍保留兼容性垫片时才使用CLAUDE.md）
- 若有疑问，搜索类似实现
持续测试
- 每次重大变更后运行相关测试
- 不要等到最后才测试
- 立即修复失败
- 为新行为添加新测试，为变更行为更新测试，为删除行为移除测试
- 带模拟的单元测试证明逻辑在隔离环境中有效。带真实对象的集成测试证明各层协同工作。 若你的变更涉及回调、中间件或错误处理——两者都需要。
逐步简化

完成一组相关实现单元后（或每完成2-3个单元），审查最近变更的文件以寻找简化机会——合并重复模式、提取共享辅助方法、提高代码重用性和效率。使用子代理时这一点尤其重要，因为每个代理都在孤立上下文中工作，无法跨单元发现模式。

不要在每个单元完成后都进行简化——早期模式可能看似重复，但在后续单元中可能会有意分化。等待自然的阶段边界或发现累积的复杂性时再进行简化。

若有 `/simplify` 技能或等效功能，可使用。否则，自行审查变更文件以寻找重用和合并机会。
Figma设计同步（如适用）

对于涉及Figma设计的UI工作：
- 遵循设计规范实现组件
- 迭代使用 `ce-figma-design-sync` 代理，将实现与设计进行比较
- 修复识别出的视觉差异
- 重复直到实现与设计匹配
前端设计指南（如适用）

对于无Figma设计的UI任务——实现涉及视图、模板、组件、布局或页面文件，创建用户可见路由，或计划包含明确的UI/前端/设计语言：
- 实现前加载 `ce-frontend-design` 技能
- 遵循其检测、指导和验证流程
- 若技能生成了验证截图，则满足阶段4的截图要求——无需单独捕获。若技能退化为人工审查（无浏览器访问），阶段4的截图捕获仍适用
跟踪进度
- 完成任务后更新任务列表
- 记录任何阻塞或意外发现
- 若范围扩大，创建新任务
- 向用户通报重大里程碑
- 若计划为实现单元定义了U-ID，或计划/origin文档包含稳定的R-ID（以及可选的A/F/AE ID），在阻塞记录、延迟工作记录、任务摘要和最终验证中引用这些ID——常规状态更新无需引用。U-ID在计划编辑后仍能锚定单元；R/A/F/AE在头脑风暴-计划交接后仍能锚定产品意图。使用计划提供的ID，不要自行创建。这样既能保持可追溯性，又不会因过多噪音掩盖信号。

Phase 3-4: Quality Check and Ship It

阶段3-4：质量检查与交付

When all Phase 2 tasks are complete and execution transitions to quality check, read `references/shipping-workflow.md` for the full shipping workflow: quality checks, code review, final validation, PR creation, and notification.

当阶段2的所有任务完成，执行过渡到质量检查时，请阅读 `references/shipping-workflow.md` 获取完整的交付工作流：质量检查、代码审查、最终验证、PR创建和通知。

Codex Delegation Mode

Codex委托模式

When `delegation_active` is true after argument parsing, read `references/codex-delegation-workflow.md` for the complete delegation workflow: pre-checks, batching, prompt template, execution loop, and result classification.
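
For orientation, here is a minimal sketch of how a delegation flag might be derived from `$ARGUMENTS`, using the tokens and phrases listed under Argument Parsing. `parse_delegation` is a hypothetical helper. It returns `None` when nothing matches, leaving the fallback resolution chain to the caller. Letting the last explicit token win, and leaving fuzzy phrases in the remaining text, are assumptions of this sketch, not documented behavior.

```python
import re
from typing import Optional, Tuple

ACTIVATE_PHRASES = ("use codex", "delegate to codex", "codex mode", "delegate mode")
DEACTIVATE_PHRASES = ("no codex", "local mode", "standard mode")


def _mentions(text: str, phrases: Tuple[str, ...]) -> bool:
    """True if any phrase occurs in text on word boundaries."""
    return any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases)


def parse_delegation(arguments: str) -> Tuple[Optional[bool], str]:
    """Return (flag or None, remaining text with explicit tokens stripped)."""
    flag: Optional[bool] = None
    remaining = []
    for word in arguments.split():
        if word == "delegate:codex":
            flag = True
        elif word == "delegate:local":
            flag = False
        else:
            remaining.append(word)
    rest = " ".join(remaining)
    if flag is None:
        lowered = rest.lower()
        if _mentions(lowered, DEACTIVATE_PHRASES):
            flag = False
        elif _mentions(lowered, ACTIVATE_PHRASES):
            flag = True
    return flag, rest


print(parse_delegation("delegate:codex docs/plan.md"))
print(parse_delegation("fix codex converter bugs"))
```

The word-boundary match keeps a bare mention such as "codex converter" or "codex model" from activating delegation.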

参数解析后若 `delegation_active` 为true，请阅读 `references/codex-delegation-workflow.md` 获取完整的委托工作流：预检查、批处理、提示模板、执行循环和结果分类。

Key Principles

核心原则

Start Fast, Execute Faster

快速启动，高效执行

- Get clarification once at the start, then execute
- Don't wait for perfect understanding - ask questions and move
- The goal is to finish the feature, not create perfect process

- 开始时一次性澄清，然后执行
- 不要等待完美理解，提问并推进
- 目标是完成功能，而非创建完美流程

The Plan is Your Guide

计划是你的指南

- Work documents should reference similar code and patterns
- Load those references and follow them
- Don't reinvent - match what exists

- 任务文档应引用类似代码和模式
- 加载这些参考资料并遵循
- 不要重新发明，匹配现有内容

Test As You Go

逐步测试

- Run tests after each change, not at the end
- Fix failures immediately
- Continuous testing prevents big surprises

- 每次变更后运行测试，而非最后才测试
- 立即修复失败
- 持续测试避免重大意外

Quality is Built In

内置质量保障

- Follow existing patterns
- Write tests for new code
- Run linting before pushing
- Review every change: inline for simple additive work, full review for everything else

- 遵循现有模式
- 为新代码编写测试
- 推送前运行代码检查
- 审查每个变更：简单添加工作可内联审查，其他工作需全面审查

Ship Complete Features

交付完整功能

- Mark all tasks completed before moving on
- Don't leave features 80% done
- A finished feature that ships beats a perfect feature that doesn't

- 完成所有任务后再推进
- 不要让功能停留在80%完成的状态
- 交付的完成功能优于未交付的完美功能

Common Pitfalls to Avoid

需避免的常见陷阱

- Analysis paralysis - Don't overthink; read the plan and execute
- Skipping clarifying questions - Ask now, not after building the wrong thing
- Ignoring plan references - The plan has links for a reason
- Testing at the end - Test continuously or suffer later
- Forgetting to track progress - Update task status as you go or lose track of what's done
- 80% done syndrome - Finish the feature; don't move on early
- Skipping review - Every change gets reviewed; only the depth varies
- Re-scoping the plan into human-time phases - The plan's Implementation Units define the scope of execution. Do not estimate human-hours per unit, propose multi-day breakdowns, or ask the user to pick a subset of units for "this session". Agents execute at agent speed, and context-window pressure is addressed by subagent dispatch (Phase 1 Step 4), not by phased sessions. If a plan-file input is genuinely too large for a single execution, say so plainly and suggest the user return to `/ce-plan` to reduce scope; don't invent session phases as a workaround. For bare-prompt input, Phase 0's Large routing already handles oversized work.

- 分析瘫痪 - 不要过度思考，阅读计划并执行
- 跳过澄清问题 - 现在提问，不要等到构建错误内容后
- 忽略计划参考资料 - 计划中的链接是有原因的
- 最后才测试 - 持续测试，否则后续会吃苦头
- 忘记跟踪进度 - 及时更新任务状态，否则会忘记已完成的工作
- 80%完成综合征 - 完成功能，不要提前推进
- 跳过审查 - 每个变更都要审查；仅审查深度不同
- 将计划重新划分为人工时间阶段 - 计划的实现单元定义了执行范围。不要估算每个单元的人工工时，不要提出多日分解，也不要让用户选择"本次会话"要执行的单元子集。代理以代理速度执行，上下文窗口压力通过子代理调度（阶段1第4步）解决，而非分阶段会话。若计划文件输入确实过大，无法一次执行，请直接告知用户并建议返回 `/ce-plan` 缩小范围；不要发明会话阶段作为变通方法。对于纯提示词输入，阶段0的大型路由已处理超大任务。