ce-work-beta
Work Execution Command
Execute work efficiently while maintaining quality and finishing features.
Introduction
This command takes a work document (plan or specification) or a bare prompt describing the work, and executes it systematically. The focus is on shipping complete features by understanding requirements quickly, following existing patterns, and maintaining quality throughout.
Beta rollout note: Invoke `ce-work-beta` manually when you want to trial Codex delegation. During the beta period, planning and workflow handoffs remain pointed at stable `ce-work` to avoid dual-path orchestration complexity.
Input Document
<input_document> #$ARGUMENTS </input_document>
Argument Parsing
Parse `$ARGUMENTS` for the following optional tokens. Strip each recognized token before interpreting the remainder as the plan file path or bare prompt.

| Token | Example | Effect |
|---|---|---|
| `delegate:codex` | `delegate:codex` | Activate Codex delegation mode for plan execution |
| `delegate:local` | `delegate:local` | Deactivate delegation even if enabled in config |

All tokens are optional. When absent, fall back to the resolution chain below.

Fuzzy activation: Also recognize imperative delegation-intent phrases such as "use codex", "delegate to codex", "codex mode", or "delegate mode" as equivalent to `delegate:codex`. A bare mention of "codex" in a prompt (e.g., "fix codex converter bugs") must NOT activate delegation -- only clear delegation intent triggers it.

Fuzzy deactivation: Also recognize phrases such as "no codex", "local mode", "standard mode" as equivalent to `delegate:local`.
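As a rough illustration of the explicit-token handling described above, the following sketch strips `delegate:codex` and `delegate:local` from an argument string and reports what remains. The helper name `parse_delegate_flag` and its `flag|remainder` output format are hypothetical, chosen only for this example. Fuzzy phrases are deliberately left out, since recognizing "clear delegation intent" is a judgment call rather than a string match. If both tokens appear, this sketch lets the last one win, which is an assumption the source does not specify.

```bash
#!/usr/bin/env bash
# Sketch only: parse_delegate_flag is a hypothetical helper, not part of ce-work-beta.
# Prints "<flag>|<remaining arguments>", where <flag> is "codex", "local", or empty.
parse_delegate_flag() {
  local args="$1" flag="" word
  local -a words=() rest=()
  # Split on whitespace without glob expansion (only the first line is read).
  read -r -a words <<< "$args"
  for word in "${words[@]}"; do
    case "$word" in
      delegate:codex) flag="codex" ;;   # explicit activation token
      delegate:local) flag="local" ;;   # explicit deactivation token
      *) rest+=("$word") ;;             # everything else is plan path or prompt
    esac
  done
  printf '%s|%s\n' "$flag" "${rest[*]}"
}
```

Note that a prompt such as "fix codex converter bugs" passes through with an empty flag, which matches the rule that a bare mention of "codex" must not activate delegation.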
Settings Resolution Chain
After extracting tokens from arguments, resolve the delegation state using this precedence chain:
1. Argument flag -- `delegate:codex` or `delegate:local` from the current invocation (highest priority)
2. Config file -- extract settings from the config block below. Value `codex` for `work_delegate` activates delegation; `false` deactivates.
3. Hard default -- `false` (delegation off)

Config (pre-resolved):
!`cat "$(git rev-parse --show-toplevel 2>/dev/null)/.compound-engineering/config.local.yaml" 2>/dev/null || cat "$(dirname "$(git rev-parse --path-format=absolute --git-common-dir 2>/dev/null)")/.compound-engineering/config.local.yaml" 2>/dev/null || echo '__NO_CONFIG__'`

If the block above contains YAML key-value pairs, extract values for the keys listed below.
If it shows `__NO_CONFIG__`, the file does not exist -- all settings fall through to defaults.
If it shows an unresolved command string, read `.compound-engineering/config.local.yaml` from the repo root using the native file-read tool (e.g., Read in Claude Code, read_file in Codex). If the file does not exist, all settings fall through to defaults.
If any setting has an unrecognized value, fall through to the hard default for that setting.

Config keys:
- `work_delegate` -- `codex` or default `false`
- `work_delegate_consent` -- `true` or default `false`
- `work_delegate_sandbox` -- `yolo` (default) or `full-auto`
- `work_delegate_decision` -- `auto` (default) or `ask`
- `work_delegate_model` -- Codex model to use (default `gpt-5.4`). Passthrough: any valid model name accepted.
- `work_delegate_effort` -- `minimal`, `low`, `medium`, `high` (default), or `xhigh`

Store the resolved state for downstream consumption:
- `delegation_active` -- boolean, whether delegation mode is on
- `delegation_source` -- `argument` or `config` or `default` -- how delegation was resolved (used by environment guard to decide notification verbosity)
- `sandbox_mode` -- `yolo` or `full-auto` (from config or default `yolo`)
- `consent_granted` -- boolean (from config `work_delegate_consent`)
- `delegate_model` -- string (from config or default `gpt-5.4`)
- `delegate_effort` -- string (from config or default `high`)
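The precedence chain above (argument flag, then config file, then hard default) might be sketched as follows. The function name `resolve_delegation` and its two positional inputs are assumptions made for illustration; it prints the two stored values `delegation_active` and `delegation_source` separated by a space. Unrecognized config values fall through to the hard default, as the text requires.

```bash
#!/usr/bin/env bash
# Sketch only: resolve_delegation is a hypothetical helper.
# $1 = argument flag ("codex", "local", or empty)
# $2 = work_delegate value from config (empty if missing)
# Prints "<delegation_active> <delegation_source>".
resolve_delegation() {
  local arg_flag="$1" config_value="$2"
  # 1. Argument flag has the highest priority.
  case "$arg_flag" in
    codex) echo "true argument";  return ;;
    local) echo "false argument"; return ;;
  esac
  # 2. Config file value; anything unrecognized is ignored.
  case "$config_value" in
    codex) echo "true config";  return ;;
    false) echo "false config"; return ;;
  esac
  # 3. Hard default: delegation off.
  echo "false default"
}
```

For example, `resolve_delegation local codex` reports `false argument`, because the invocation-level flag overrides a config file that enables delegation.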
Execution Workflow
Phase 0: Input Triage
Determine how to proceed based on what was provided in `<input_document>`.

Plan document (input is a file path to an existing plan or specification) → skip to Phase 1.

Bare prompt (input is a description of work, not a file path):
1. Scan the work area
   - Identify files likely to change based on the prompt
   - Find existing test files for those areas (search for test/spec files that import, reference, or share names with the implementation files)
   - Note local patterns and conventions in the affected areas
2. Assess complexity and route

| Complexity | Signals | Action |
|---|---|---|
| Trivial | 1-2 files, no behavioral change (typo, config, rename) | Proceed to Phase 1 step 2 (environment setup), then implement directly -- no task list, no execution loop. Apply Test Discovery if the change touches behavior-bearing code |
| Small / Medium | Clear scope, under ~10 files | Build a task list from discovery. Proceed to Phase 1 step 2 |
| Large | Cross-cutting, architectural decisions, 10+ files, touches auth/payments/migrations | Inform the user this would benefit from `/ce-brainstorm` or `/ce-plan` to surface edge cases and scope boundaries. Honor their choice. If proceeding, build a task list and continue to Phase 1 step 2 |
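The routing table above is applied with judgment, but its numeric signals can be approximated mechanically. The sketch below is one such approximation, not the command's actual logic: `route_complexity` is a hypothetical helper, and reducing "cross-cutting, architectural, touches auth/payments/migrations" to a single yes/no input is a simplifying assumption.

```bash
#!/usr/bin/env bash
# Sketch only: route_complexity is a hypothetical helper.
# $1 = number of files likely to change
# $2 = "yes" if behavior changes, otherwise "no"
# $3 = "yes" if the work is cross-cutting or touches auth/payments/migrations
route_complexity() {
  local files="$1" behavior="$2" sensitive="$3"
  if [ "$sensitive" = "yes" ] || [ "$files" -ge 10 ]; then
    echo "large"          # suggest /ce-brainstorm or /ce-plan first
  elif [ "$files" -le 2 ] && [ "$behavior" = "no" ]; then
    echo "trivial"        # implement directly, no task list
  else
    echo "small-medium"   # build a task list from discovery
  fi
}
```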
Phase 1: Quick Start
1. Read Plan and Clarify (skip if arriving from Phase 0 with a bare prompt)
   - Read the work document completely
   - Treat the plan as a decision artifact, not an execution script
   - If the plan includes sections such as `Implementation Units`, `Work Breakdown`, `Requirements Trace`, `Files`, `Test Scenarios`, or `Verification`, use those as the primary source material for execution
   - Check for `Execution note` on each implementation unit -- these carry the plan's execution posture signal for that unit (for example, test-first or characterization-first). Note them when creating tasks.
   - Check for a `Deferred to Implementation` or `Implementation-Time Unknowns` section -- these are questions the planner intentionally left for you to resolve during execution. Note them before starting so they inform your approach rather than surprising you mid-task
   - Check for a `Scope Boundaries` section -- these are explicit non-goals. Refer back to them if implementation starts pulling you toward adjacent work
   - Review any references or links provided in the plan
   - If the user explicitly asks for TDD, test-first, or characterization-first execution in this session, honor that request even if the plan has no `Execution note`
   - If anything is unclear or ambiguous, ask clarifying questions now
   - If clarifying questions were needed above, get user approval on the resolved answers. If no clarifications were needed, proceed without a separate approval step -- plan scope is the plan's authority, not something to renegotiate
   - Do not skip this: it is better to ask questions now than to build the wrong thing
   - Do not edit the plan body during execution. The plan is a decision artifact; progress lives in git commits and the task tracker. The only plan mutation during ce-work is the final `status: active → completed` flip at shipping (see `references/shipping-workflow.md` Phase 4 Step 2). Legacy plans may contain `- [ ]` / `- [x]` marks on unit headings -- ignore them as state; per-unit completion is determined during execution by reading the current file state.
2. Setup Environment

   First, check the current branch:

   ```bash
   current_branch=$(git branch --show-current)
   default_branch=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@')
   # Fallback if remote HEAD isn't set
   if [ -z "$default_branch" ]; then
     default_branch=$(git rev-parse --verify origin/main >/dev/null 2>&1 && echo "main" || echo "master")
   fi
   ```

   If already on a feature branch (not the default branch):

   First, check whether the branch name is meaningful -- a name like `feat/crowd-sniff` or `fix/email-validation` tells future readers what the work is about. Auto-generated worktree names (e.g., `worktree-jolly-beaming-raven`) or other opaque names do not.

   If the branch name is meaningless or auto-generated, suggest renaming it before continuing:

   ```bash
   git branch -m <meaningful-name>
   ```

   Derive the new name from the plan title or work description (e.g., `feat/crowd-sniff`). Present the rename as a recommended option alongside continuing as-is.

   Then ask: "Continue working on `[current_branch]`, or create a new branch?"
   - If continuing (with or without rename), proceed to step 3
   - If creating new, follow Option A or B below

   If on the default branch, choose how to proceed:

   Option A: Create a new branch

   ```bash
   git pull origin [default_branch]
   git checkout -b feature-branch-name
   ```

   Use a meaningful name based on the work (e.g., `feat/user-authentication`, `fix/email-validation`).

   Option B: Use a worktree (recommended for parallel development)

   ```bash
   skill: ce-worktree
   # The skill will create a new branch from the default branch in an isolated worktree
   ```

   Option C: Continue on the default branch
   - Requires explicit user confirmation
   - Only proceed after user explicitly says "yes, commit to [default_branch]"
   - Never commit directly to the default branch without explicit permission

   Recommendation: Use worktree if:
   - You want to work on multiple features simultaneously
   - You want to keep the default branch clean while experimenting
   - You plan to switch between branches frequently
Create Task List (skip if Phase 0 already built one, or if Phase 0 routed as Trivial)
- Use the platform's task tracking tool (/
TaskCreate/TaskUpdatein Claude Code,TaskListin Codex, or the equivalent on other harnesses) to break the plan into actionable tasksupdate_plan - Derive tasks from the plan's implementation units, dependencies, files, test targets, and verification criteria
- When the plan defines U-IDs for Implementation Units, preserve the unit's U-ID as a prefix in the task subject (e.g., "U3: Add parser coverage"). This keeps blocker references, deferred-work notes, and final summaries anchored to the same identifier the plan uses, so progress and traceability remain unambiguous across plan edits
- Carry each unit's into the task when present
Execution note - For each unit, read the field before implementing — these point to specific files or conventions to mirror
Patterns to follow - Use each unit's field as the primary "done" signal for that task
Verification - Do not expect the plan to contain implementation code, micro-step TDD instructions, or exact shell commands
- Include dependencies between tasks
- Prioritize based on what needs to be done first
- Include testing and quality check tasks
- Keep tasks specific and completable
- Use the platform's task tracking tool (
4. Choose Execution Strategy

   Delegation routing gate: If `delegation_active` is true AND the input is a plan file (not a bare prompt), read `references/codex-delegation-workflow.md` and follow its Pre-Delegation Checks and Delegation Decision flow. If all checks pass and delegation proceeds, force serial execution and proceed directly to Phase 2 using the workflow's batched execution loop. If any check disables delegation, fall through to the standard strategy table below. If delegation is active but the input is a bare prompt (no plan file), set `delegation_active` to false with a brief note: "Codex delegation requires a plan file -- using standard mode." and continue with the standard strategy selection below.

   After creating the task list, decide how to execute based on the plan's size and dependency structure:

   | Strategy | When to use |
   |---|---|
   | Inline | 1-2 small tasks, or tasks needing user interaction mid-flight. Default for bare-prompt work -- bare prompts rarely produce enough structured context to justify subagent dispatch |
   | Serial subagents | 3+ tasks with dependencies between them. Each subagent gets a fresh context window focused on one unit -- prevents context degradation across many tasks. Requires plan-unit metadata (Goal, Files, Approach, Test scenarios) |
   | Parallel subagents | 3+ tasks that pass the Parallel Safety Check (below). Dispatch independent units simultaneously, run dependent units after their prerequisites complete. Requires plan-unit metadata |

   Parallel Safety Check -- required before choosing parallel dispatch:
   1. Build a file-to-unit mapping from every candidate unit's `Files:` section (Create, Modify, and Test paths)
   2. Check for intersection -- any file path appearing in 2+ units means overlap
   3. If any overlap is found, downgrade to serial subagents. Log the reason (e.g., "Units 2 and 4 share `config/routes.rb` -- using serial dispatch"). Serial subagents still provide context-window isolation without shared-directory risks

   Even with no file overlap, parallel subagents sharing a working directory face git index contention (concurrent staging/committing corrupts the index) and test interference (concurrent test runs pick up each other's in-progress changes). The parallel subagent constraints below mitigate these.

   Subagent dispatch uses your available subagent or task spawning mechanism. For each unit, give the subagent:
   - The full plan file path (for overall context)
   - The specific unit's Goal, Files, Approach, Execution note, Patterns, Test scenarios, and Verification
   - Any resolved deferred questions relevant to that unit
   - Instruction to check whether the unit's test scenarios cover all applicable categories (happy paths, edge cases, error paths, integration) and supplement gaps before writing tests

   Parallel subagent constraints -- when dispatching units in parallel (not serial or inline):
   - Instruct each subagent: "Do not stage files (`git add`), create commits, or run the project test suite. The orchestrator handles testing, staging, and committing after all parallel units complete."
   - These constraints prevent git index contention and test interference between concurrent subagents

   Permission mode: Omit the `mode` parameter when dispatching subagents so the user's configured permission settings apply. Do not pass `mode: "auto"` -- it overrides user-level settings like `bypassPermissions`.

   After each subagent completes (serial mode):
   1. Review the subagent's diff -- verify changes match the unit's scope and `Files:` list
   2. Run the relevant test suite to confirm the tree is healthy
   3. If tests fail, diagnose and fix before proceeding -- do not dispatch dependent units on a broken tree
   4. Update the task list (do not edit the plan body -- progress is carried by the commit)
   5. Dispatch the next unit

   After all parallel subagents in a batch complete:
   1. Wait for every subagent in the current parallel batch to finish before acting on any of their results
   2. Cross-check for discovered file collisions: compare the actual files modified by all subagents in the batch (not just their declared `Files:` lists). Subagents may create or modify files not anticipated during planning -- this is expected, since plans describe what not how. A collision only matters when 2+ subagents in the same batch modified the same file. In a shared working directory, only the last writer's version survives -- the other unit's changes to that file are lost. If a collision is detected: commit all non-colliding files from all units first, then re-run the affected units serially for the shared file so each builds on the other's committed work
   3. For each completed unit, in dependency order: review the diff, run the relevant test suite, stage only that unit's files, and commit with a conventional message derived from the unit's Goal
   4. If tests fail after committing a unit's changes, diagnose and fix before committing the next unit
   5. Update the task list (do not edit the plan body -- progress is carried by the commits just made)
   6. Dispatch the next batch of independent units, or the next dependent unit
Phase 2: Execute
1. Task Execution Loop

   For each task in priority order:

   ```
   while (tasks remain):
     - Mark task as in-progress
     - Read any referenced files from the plan or discovered during Phase 0
     - **If the unit's work is already present and matches the plan's intent** (files exist with the expected capability, or the unit's `Verification` criteria are already satisfied by the current code), the work has likely shipped on a prior branch or session. Verify it matches, mark the task complete, and move on. Do not silently reimplement.
     - Look for similar patterns in codebase
     - Find existing test files for implementation files being changed (Test Discovery -- see below)
     - If delegation_active: branch to the Codex Delegation Execution Loop (see `references/codex-delegation-workflow.md`)
     - Otherwise: implement following existing conventions
     - Add, update, or remove tests to match implementation changes (see Test Discovery below)
     - Run System-Wide Test Check (see below)
     - Run tests after changes
     - Assess testing coverage: did this task change behavior? If yes, were tests written or updated? If no tests were added, is the justification deliberate (e.g., pure config, no behavioral change)?
     - Mark task as completed
     - Evaluate for incremental commit (see below)
   ```

   When a unit carries an `Execution note`, honor it. For test-first units, write the failing test before implementation for that unit. For characterization-first units, capture existing behavior before changing it. For units without an `Execution note`, proceed pragmatically.

   Guardrails for execution posture:
   - Do not write the test and implementation in the same step when working test-first
   - Do not skip verifying that a new test fails before implementing the fix or feature
   - Do not over-implement beyond the current behavior slice when working test-first
   - Skip test-first discipline for trivial renames, pure configuration, and pure styling work

   Test Discovery -- Before implementing changes to a file, find its existing test files (search for test/spec files that import, reference, or share naming patterns with the implementation file). When a plan specifies test scenarios or test files, start there, then check for additional test coverage the plan may not have enumerated. Changes to implementation files should be accompanied by corresponding test updates -- new tests for new behavior, modified tests for changed behavior, removed or updated tests for deleted behavior.

   Test Scenario Completeness -- Before writing tests for a feature-bearing unit, check whether the plan's `Test scenarios` cover all categories that apply to this unit. If a category is missing or scenarios are vague (e.g., "validates correctly" without naming inputs and expected outcomes), supplement from the unit's own context before writing tests:

   | Category | When it applies | How to derive if missing |
   |---|---|---|
   | Happy path | Always for feature-bearing units | Read the unit's Goal and Approach for core input/output pairs |
   | Edge cases | When the unit has meaningful boundaries (inputs, state, concurrency) | Identify boundary values, empty/nil inputs, and concurrent access patterns |
   | Error/failure paths | When the unit has failure modes (validation, external calls, permissions) | Enumerate invalid inputs the unit should reject, permission/auth denials it should enforce, and downstream failures it should handle |
   | Integration | When the unit crosses layers (callbacks, middleware, multi-service) | Identify the cross-layer chain and write a scenario that exercises it without mocks |

   System-Wide Test Check -- Before marking a task done, pause and ask:

   | Question | What to do |
   |---|---|
   | **What fires when this runs?** Callbacks, middleware, observers, event handlers -- trace two levels out from your change. | Read the actual code (not docs) for callbacks on models you touch, middleware in the request chain, `after_*` hooks. |
   | **Do my tests exercise the real chain?** If every dependency is mocked, the test proves your logic works in isolation -- it says nothing about the interaction. | Write at least one integration test that uses real objects through the full callback/middleware chain. No mocks for the layers that interact. |
   | **Can failure leave orphaned state?** If your code persists state (DB row, cache, file) before calling an external service, what happens when the service fails? Does retry create duplicates? | Trace the failure path with real objects. If state is created before the risky call, test that failure cleans up or that retry is idempotent. |
   | **What other interfaces expose this?** Mixins, DSLs, alternative entry points (Agent vs Chat vs ChatMethods). | Grep for the method/behavior in related classes. If parity is needed, add it now -- not as a follow-up. |
   | **Do error strategies align across layers?** Retry middleware + application fallback + framework error handling -- do they conflict or create double execution? | List the specific error classes at each layer. Verify your rescue list matches what the lower layer actually raises. |

   When to skip: Leaf-node changes with no callbacks, no state persistence, no parallel interfaces. If the change is purely additive (new helper method, new view partial), the check takes 10 seconds and the answer is "nothing fires, skip."

   When this matters most: Any change that touches models with callbacks, error handling with fallback/retry, or functionality exposed through multiple interfaces.
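Test Discovery, as described above, is a search for test or spec files that reference or share a name with the implementation file. One possible starting point is sketched below. The helper name `find_related_tests` is hypothetical, and the heuristic is an assumption: it treats any file whose name contains `test` or `spec` as a test file and matches on the implementation file's base name without extension. Projects that keep tests in a `test/` directory under unrelated file names would need a different filter.

```bash
#!/usr/bin/env bash
# Sketch only: find_related_tests is a hypothetical helper.
# $1 = implementation file path, $2 = search root (default: current directory)
# Prints candidate test/spec files, one per line, sorted and de-duplicated.
find_related_tests() {
  local impl="$1" root="${2:-.}" stem
  stem=$(basename "$impl")
  stem="${stem%.*}"   # strip the extension: app/user.rb -> user
  {
    # Test/spec files whose own name contains the stem (shared naming).
    find "$root" -type f \( -name '*test*' -o -name '*spec*' \) \
      -name "*${stem}*" ! -path '*/.git/*'
    # Test/spec files whose contents mention the stem (imports, references).
    find "$root" -type f \( -name '*test*' -o -name '*spec*' \) \
      ! -path '*/.git/*' -exec grep -lF -- "$stem" {} +
  } 2>/dev/null | sort -u
}
```

The content match is case-sensitive and purely textual, so the results are candidates to read, not a definitive coverage list.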
2. Incremental Commits

   After completing each task, evaluate whether to create an incremental commit:

   | Commit when... | Don't commit when... |
   |---|---|
   | Logical unit complete (model, service, component) | Small part of a larger unit |
   | Tests pass + meaningful progress | Tests failing |
   | About to switch contexts (backend → frontend) | Purely scaffolding with no behavior |
   | About to attempt risky/uncertain changes | Would need a "WIP" commit message |

   Heuristic: "Can I write a commit message that describes a complete, valuable change? If yes, commit. If the message would be 'WIP' or 'partial X', wait."

   If the plan has Implementation Units, use them as a starting guide for commit boundaries -- but adapt based on what you find during implementation. A unit might need multiple commits if it's larger than expected, or small related units might land together. Use each unit's Goal to inform the commit message.

   Commit workflow:

   ```bash
   # 1. Verify tests pass (use project's test command)
   # Examples: bin/rails test, npm test, pytest, go test, etc.

   # 2. Stage only files related to this logical unit (not `git add .`)
   git add <files related to this logical unit>

   # 3. Commit with conventional message
   git commit -m "feat(scope): description of this unit"
   ```

   Handling merge conflicts: If conflicts arise during rebasing or merging, resolve them immediately. Incremental commits make conflict resolution easier since each commit is small and focused.

   Note: Incremental commits use clean conventional messages without attribution footers. The final Phase 4 commit/PR includes the full attribution.

   Parallel subagent mode: When units run as parallel subagents, the subagents do not commit -- the orchestrator handles staging and committing after the entire parallel batch completes (see Parallel subagent constraints in Phase 1 Step 4). The commit guidance in this section applies to inline and serial execution, and to the orchestrator's commit decisions after parallel batch completion.
3. Follow Existing Patterns
   - The plan should reference similar code; read those files first
   - Match naming conventions exactly
   - Reuse existing components where possible
   - Follow project coding standards (see AGENTS.md; use CLAUDE.md only if the repo still keeps a compatibility shim)
   - When in doubt, grep for similar implementations
4. Test Continuously
   - Run relevant tests after each significant change
   - Don't wait until the end to test
   - Fix failures immediately
   - Add new tests for new behavior, update tests for changed behavior, remove tests for deleted behavior
   - Unit tests with mocks prove logic in isolation. Integration tests with real objects prove the layers work together. If your change touches callbacks, middleware, or error handling, you need both.
5. Simplify as You Go

   After completing a cluster of related implementation units (or every 2-3 units), review recently changed files for simplification opportunities -- consolidate duplicated patterns, extract shared helpers, and improve code reuse and efficiency. This is especially valuable when using subagents, since each agent works with isolated context and can't see patterns emerging across units.

   Don't simplify after every single unit -- early patterns may look duplicated but diverge intentionally in later units. Wait for a natural phase boundary or when you notice accumulated complexity.

   If a `/simplify` skill or equivalent is available, use it. Otherwise, review the changed files yourself for reuse and consolidation opportunities.
6. Figma Design Sync (if applicable)

   For UI work with Figma designs:
   - Implement components following design specs
   - Use ce-figma-design-sync agent iteratively to compare
   - Fix visual differences identified
   - Repeat until implementation matches design
7. Frontend Design Guidance (if applicable)

   For UI tasks without a Figma design -- where the implementation touches view, template, component, layout, or page files, creates user-visible routes, or the plan contains explicit UI/frontend/design language:
   - Load the `ce-frontend-design` skill before implementing
   - Follow its detection, guidance, and verification flow
   - If the skill produced a verification screenshot, it satisfies Phase 4's screenshot requirement -- no need to capture separately. If the skill fell back to mental review (no browser access), Phase 4's screenshot capture still applies
8. Track Progress
   - Keep the task list updated as you complete tasks
   - Note any blockers or unexpected discoveries
   - Create new tasks if scope expands
   - Keep user informed of major milestones
   - When the plan defines U-IDs for Implementation Units, or the plan or origin document carries stable R-IDs (and optionally A/F/AE IDs), reference them in blockers, deferred-work notes, task summaries, and final verification -- not routine status updates. U-IDs anchor units across plan edits; R/A/F/AE anchor product intent across the brainstorm-plan handoff. Use the IDs the plan supplies and do not invent ones it does not. This preserves traceability without burying signal under noise.
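- Task Execution Loop
  For each task in priority order:

  ```
  while (tasks remain):
    - Mark task as in-progress
    - Read any referenced files from the plan or discovered during Phase 0
    - **If the unit's work is already present and matches the plan's intent**
      (files exist with the expected behavior, or the current code already
      satisfies the unit's `Verification` criteria), the work may have shipped
      on an earlier branch or session. Verify the match, mark the task
      complete, and move on. Do not silently reimplement.
    - Look for similar patterns in the codebase
    - Find existing test files for the implementation files being changed
      (Test Discovery, see below)
    - If `delegation_active` is true: branch to the Codex Delegation
      Execution Loop (see `references/codex-delegation-workflow.md`)
    - Otherwise: implement following existing conventions
    - Add, update, or remove tests to match implementation changes
      (see Test Discovery below)
    - Run the System-Wide Test Check (see below)
    - Run tests after changes
    - Assess test coverage: did this task change behavior? If so, were tests
      written or updated? If no tests were added, is the justification sound
      (for example, a pure config change with no behavior change)?
    - Mark task as completed
    - Evaluate for an incremental commit (see below)
  ```

  When a unit carries an `Execution note`, honor it. For test-first units, write the failing test before implementing that unit. For characterization-first units, capture existing behavior before changing it. For units without an `Execution note`, proceed pragmatically.

  Guardrails for execution posture:
  - Do not write the test and the implementation in the same step when working test-first
  - Do not skip verifying that a new test fails before implementing the fix or feature
  - Do not over-implement beyond the current behavior slice when working test-first
  - Skip test-first discipline for trivial renames, pure configuration, and pure styling work

  Test Discovery: Before changing a file, find its existing test files (search for test/spec files that import, reference, or share a naming pattern with the implementation file). When the plan specifies test scenarios or test files, start there, then check for additional test coverage the plan may not have enumerated. Changes to implementation files should be accompanied by corresponding test updates: new tests for new behavior, modified tests for changed behavior, and removed or updated tests for deleted behavior.

  Test Scenario Completeness: Before writing tests for a feature-bearing unit, check whether the plan's `Test scenarios` cover every category that applies to the unit. If a category is missing or a scenario is vague (for example, "validates correctly" with no inputs or expected outcomes named), supplement it from the unit's own context before writing tests:

  | Category | When it applies | How to derive if missing |
  |---|---|---|
  | Happy path | Always for feature-bearing units | Read the unit's Goal and Approach for core input/output pairs |
  | Edge cases | When the unit has meaningful boundaries (inputs, state, concurrency) | Identify boundary values, empty/nil inputs, and concurrent access patterns |
  | Error/failure paths | When the unit has failure modes (validation, external calls, permissions) | Enumerate invalid inputs the unit should reject, permission/auth denials it should enforce, and downstream failures it should handle |
  | Integration | When the unit crosses layers (callbacks, middleware, multi-service) | Identify the cross-layer chain and write a scenario that exercises it without mocks |

  System-Wide Test Check: Before marking a task done, pause and ask:

  | Question | What to do |
  |---|---|
  | What fires when this runs? Callbacks, middleware, observers, event handlers: trace two levels out from your change. | Read the actual code (not docs) for callbacks on models you touch, middleware in the request chain, and `after_*` hooks. |
  | Do my tests exercise the real chain? If every dependency is mocked, the test proves your logic works in isolation; it says nothing about the interaction. | Write at least one integration test that uses real objects through the full callback/middleware chain. No mocks for the layers that interact. |
  | Can failure leave orphaned state? If your code persists state (DB row, cache, file) before calling an external service, what happens when the service fails? Does retry create duplicates? | Trace the failure path with real objects. If state is created before the risky call, test that failure cleans up or that retry is idempotent. |
  | What other interfaces expose this? Mixins, DSLs, alternative entry points (Agent vs Chat vs ChatMethods). | Grep for the method/behavior in related classes. If parity is needed, add it now, not as a follow-up. |
  | Do error strategies align across layers? Retry middleware + application fallback + framework error handling: do they conflict or create double execution? | List the specific error classes at each layer. Verify your rescue list matches what the lower layer actually raises. |

  When to skip: leaf-node changes with no callbacks, no state persistence, and no parallel interfaces. If the change is purely additive (a new helper method, a new view partial), the check takes 10 seconds and the answer is "nothing fires, skip."

  When this matters most: any change that touches models with callbacks, error handling with fallback/retry, or functionality exposed through multiple interfaces.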
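The naming-pattern half of Test Discovery can be sketched as a small shell helper. This is a minimal illustration, not part of the command itself: the function name `find_tests_for` and the five filename patterns are assumptions chosen to cover common conventions, and the search for files that import or reference the implementation file is not covered here.

```shell
#!/bin/sh
# Hypothetical helper (not part of ce-work-beta): list existing test/spec
# files whose names match an implementation file's base name.
# The naming patterns below are assumptions; adjust them to the project.
find_tests_for() {
  impl_file=$1
  search_root=${2:-.}
  base=$(basename "$impl_file")
  stem=${base%.*}   # strip the last extension: user.rb -> user

  # Match common conventions: user_test.*, user_spec.*, test_user.*,
  # user.test.*, user.spec.*
  find "$search_root" -type f \( \
      -name "${stem}_test.*" -o \
      -name "${stem}_spec.*" -o \
      -name "test_${stem}.*" -o \
      -name "${stem}.test.*" -o \
      -name "${stem}.spec.*" \
    \) | sort
}

# Example call: find_tests_for app/models/user.rb .
```

An empty result does not prove that no tests exist; it only means none follow these naming patterns, so a follow-up grep for imports or references is still worthwhile.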
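- Incremental Commits
  After completing each task, evaluate whether to create an incremental commit:

  | Commit when... | Don't commit when... |
  |---|---|
  | Logical unit complete (model, service, component) | Small part of a larger unit |
  | Tests pass + meaningful progress | Tests failing |
  | About to switch contexts (backend → frontend) | Purely scaffolding with no behavior |
  | About to attempt risky/uncertain changes | Would need a "WIP" commit message |

  Heuristic: "Can I write a commit message that describes a complete, valuable change? If yes, commit. If the message would be 'WIP' or 'partial X', wait."

  If the plan has Implementation Units, use them as a starting guide for commit boundaries, but adapt based on what you find during implementation. A unit might need multiple commits if it is larger than expected, or small related units might land together. Use each unit's Goal to inform the commit message.

  Commit workflow:

  ```bash
  # 1. Verify tests pass (use the project's test command)
  # Examples: bin/rails test, npm test, pytest, go test, etc.

  # 2. Stage only files related to this logical unit (not `git add .`)
  git add <files related to this logical unit>

  # 3. Commit with a conventional message
  git commit -m "feat(scope): description of this unit"
  ```

  Handling merge conflicts: If conflicts arise during rebasing or merging, resolve them immediately. Incremental commits make conflict resolution easier because each commit is small and focused.

  Note: Incremental commits use clean conventional messages without attribution footers. The final Phase 4 commit/PR includes the full attribution.

  Parallel subagent mode: When units run as parallel subagents, the subagents do not commit. The orchestrator handles staging and committing after the entire parallel batch completes (see the parallel subagent constraints in Phase 1 Step 4). The commit guidance in this section applies to inline and serial execution, and to the orchestrator's commit decisions after a parallel batch completes.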
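The commit heuristic above can be expressed as a small check. The following is only a sketch under stated assumptions: `should_commit` is a hypothetical helper, the caller supplies whether tests passed, and the "WIP"/"partial" detection is a simple prefix match on the subject after any conventional `type(scope): ` prefix.

```shell
#!/bin/sh
# Hypothetical helper (not part of ce-work-beta): return 0 when a commit
# looks appropriate, 1 when it is better to wait.
#   $1 = proposed commit message
#   $2 = "yes" if the project's tests passed, anything else otherwise
should_commit() {
  message=$1
  tests_passed=$2

  # Failing tests always mean "wait".
  [ "$tests_passed" = "yes" ] || return 1

  lowered=$(printf '%s' "$message" | tr '[:upper:]' '[:lower:]')
  # Drop a conventional "type(scope): " prefix if one is present.
  subject=${lowered#*: }

  case "$subject" in
    ""|wip|wip[!a-z]*|partial|partial[!a-z]*) return 1 ;;  # WIP or partial work
    *) return 0 ;;
  esac
}

# Example call:
#   should_commit "feat(auth): add token refresh service" yes && echo "commit"
```

A string check like this only catches the obvious cases; whether the message describes a complete, valuable change remains a judgment call.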
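- Follow Existing Patterns
  - The plan should reference similar code; read those files first
  - Match naming conventions exactly
  - Reuse existing components where possible
  - Follow project coding standards (see AGENTS.md; use CLAUDE.md only if the repo still keeps a compatibility shim)
  - When in doubt, grep for similar implementations
- Test Continuously
  - Run relevant tests after each significant change
  - Don't wait until the end to test
  - Fix failures immediately
  - Add new tests for new behavior, update tests for changed behavior, and remove tests for deleted behavior
  - Unit tests with mocks prove logic in isolation. Integration tests with real objects prove the layers work together. If your change touches callbacks, middleware, or error handling, you need both.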
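Running tests after each change depends on knowing the project's test command. As a rough sketch, the helper below guesses it from marker files. The function name `detect_test_command` and the marker-file mapping are assumptions for illustration; the commands themselves are the examples this document lists, and a real project may use something else entirely.

```shell
#!/bin/sh
# Hypothetical helper (not part of ce-work-beta): print a likely test
# command for a project root, or return 1 if nothing is recognized.
# The marker files checked here are assumptions, not a complete list.
detect_test_command() {
  project_root=${1:-.}
  if [ -x "$project_root/bin/rails" ]; then
    echo "bin/rails test"
  elif [ -f "$project_root/package.json" ]; then
    echo "npm test"
  elif [ -f "$project_root/go.mod" ]; then
    echo "go test ./..."
  elif [ -f "$project_root/pytest.ini" ] || [ -f "$project_root/pyproject.toml" ]; then
    echo "pytest"
  else
    return 1   # unknown project type: ask or read the project docs
  fi
}

# Example call: cmd=$(detect_test_command .) && echo "Run: $cmd"
```

When the helper returns 1, the project's own documentation (for example AGENTS.md) is the better source than a guess.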
Phase 3-4: Quality Check and Ship It
When all Phase 2 tasks are complete and execution transitions to quality check, read `references/shipping-workflow.md` for the full shipping workflow: quality checks, code review, final validation, PR creation, and notification.
Codex Delegation Mode
When `delegation_active` is true after argument parsing, read `references/codex-delegation-workflow.md` for the complete delegation workflow: pre-checks, batching, prompt template, execution loop, and result classification.
Key Principles
Start Fast, Execute Faster
- Get clarification once at the start, then execute
- Don't wait for perfect understanding - ask questions and move
- The goal is to finish the feature, not create perfect process
The Plan is Your Guide
- Work documents should reference similar code and patterns
- Load those references and follow them
- Don't reinvent - match what exists
Test As You Go
- Run tests after each change, not at the end
- Fix failures immediately
- Continuous testing prevents big surprises
Quality is Built In
- Follow existing patterns
- Write tests for new code
- Run linting before pushing
- Review every change — inline for simple additive work, full review for everything else
Ship Complete Features
- Mark all tasks completed before moving on
- Don't leave features 80% done
- A finished feature that ships beats a perfect feature that doesn't
Common Pitfalls to Avoid
- Analysis paralysis - Don't overthink, read the plan and execute
- Skipping clarifying questions - Ask now, not after building the wrong thing
- Ignoring plan references - The plan has links for a reason
- Testing at the end - Test continuously or suffer later
- Forgetting to track progress - Update task status as you go or lose track of what's done
- 80% done syndrome - Finish the feature, don't move on early
- Skipping review - Every change gets reviewed; only the depth varies
- Re-scoping the plan into human-time phases - The plan's Implementation Units define the scope of execution. Do not estimate human-hours per unit, propose multi-day breakdowns, or ask the user to pick a subset of units for "this session". Agents execute at agent speed, and context-window pressure is addressed by subagent dispatch (Phase 1 Step 4), not by phased sessions. If a plan-file input is genuinely too large for a single execution, say so plainly and suggest the user return to `/ce-plan` to reduce scope; don't invent session phases as a workaround. For bare-prompt input, Phase 0's Large routing already handles oversized work