executing-plans

Executing Plans

You are an orchestrator. Spawn and coordinate sub-agents to do the actual implementation. Group related tasks by subsystem (e.g., one agent for API routes, another for tests) rather than spawning one agent per task. Each agent re-investigates the codebase, so fewer agents with broader scope means faster execution.

1. Setup

Create a branch for the work unless trivial. Consider git worktrees for isolated environments.
Clarify ambiguity upfront: If the plan has unclear requirements or meaningful tradeoffs, use AskUserQuestion before starting. Present options with descriptions explaining the tradeoffs. Use multiSelect: true for independent features that can be combined; use single-select for mutually exclusive choices. Don't guess when the user can clarify in 10 seconds.
Track progress with tasks: Use TaskCreate to create tasks for each major work item from the plan. Update status with TaskUpdate as work progresses (in_progress when starting, completed when done). This makes execution visible to the user and persists across context compactions.
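As a rough illustration of the single-select versus multi-select rule above, the sketch below builds a question payload in Python. Only AskUserQuestion and multiSelect come from the text; the helper name build_question and the other field names (question, options, label, description) are assumptions made for illustration, not a confirmed schema.

```python
from typing import TypedDict


class Option(TypedDict):
    label: str
    description: str  # should explain the tradeoff of picking this option


def build_question(question: str, options: list[Option], mutually_exclusive: bool) -> dict:
    """Build a hypothetical AskUserQuestion payload.

    Independent features that can be combined get multiSelect=True;
    mutually exclusive choices stay single-select (multiSelect=False).
    """
    if len(options) < 2:
        raise ValueError("offer at least two options, otherwise there is nothing to clarify")
    return {
        "question": question,
        "options": options,
        "multiSelect": not mutually_exclusive,
    }


# Example: the two stores cannot both be chosen, so this is single-select.
storage = build_question(
    "Which session store should the auth work use?",
    [
        {"label": "Redis", "description": "Fast, but adds an infrastructure dependency"},
        {"label": "Postgres", "description": "No new dependency, slower lookups"},
    ],
    mutually_exclusive=True,
)
```

The point of the helper is only that the select mode follows from whether the options exclude each other, so the choice is made once per question rather than guessed.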

2. Group Tasks by Subsystem

Group related tasks so they share agent context: one agent per subsystem, with independent groups running in parallel.
Why grouping matters:
Without: Task 1 (auth/login) → Agent 1 [explores auth/]
         Task 2 (auth/logout) → Agent 2 [explores auth/ again]

With:    Tasks 1-2 (auth/*) → Agent 1 [explores once, executes both]
Signal                   Group together
Same directory prefix    src/auth/* tasks
Same domain/feature      Auth tasks, billing tasks
Plan sections            Tasks under same ## heading
Limits: 3-4 tasks max per group. Split if larger.
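The directory-prefix signal and the size limit can be sketched as a small Python helper. This is a minimal sketch under stated assumptions: the function name group_tasks, the depth parameter, the (task, path) input shape, and the example file paths are all hypothetical, and the cap of 4 takes the upper end of the 3-4 range.

```python
from collections import defaultdict
from pathlib import PurePosixPath

MAX_TASKS_PER_GROUP = 4  # the text caps groups at 3-4 tasks


def group_tasks(tasks: list[tuple[str, str]], depth: int = 2) -> list[list[str]]:
    """Group (task_name, file_path) pairs by directory prefix.

    `depth` is how many leading path components form the prefix,
    e.g. depth=2 maps "src/auth/login.py" to "src/auth".
    """
    by_prefix: dict[str, list[str]] = defaultdict(list)
    for name, path in tasks:
        prefix = "/".join(PurePosixPath(path).parts[:depth])
        by_prefix[prefix].append(name)

    groups: list[list[str]] = []
    for names in by_prefix.values():
        # Split oversized groups into chunks of at most MAX_TASKS_PER_GROUP.
        for start in range(0, len(names), MAX_TASKS_PER_GROUP):
            groups.append(names[start:start + MAX_TASKS_PER_GROUP])
    return groups


tasks = [
    ("Add login endpoint", "src/auth/login.py"),
    ("Add logout endpoint", "src/auth/logout.py"),
    ("Add invoice model", "src/billing/invoice.py"),
]
print(group_tasks(tasks))
# [['Add login endpoint', 'Add logout endpoint'], ['Add invoice model']]
```

Grouping by domain or by plan heading would follow the same shape with a different key function.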
Parallel: Groups touch different subsystems
Group A: src/auth/*    ─┬─ parallel
Group B: src/billing/* ─┘
Sequential: Groups have dependencies
Group A: Create shared types → Group B: Use those types
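One way to decide which groups run in parallel and which run sequentially is to sort them by dependency into waves. The sketch below uses Python's standard graphlib module; the function name execution_waves and the group names are assumptions for illustration.

```python
from graphlib import TopologicalSorter


def execution_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Return groups in waves.

    Groups inside one wave have no unmet dependencies and can run in
    parallel; waves themselves run one after another.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Each key maps a group to the groups it depends on.
deps = {
    "shared-types": set(),
    "auth": {"shared-types"},
    "billing": {"shared-types"},
}
print(execution_waves(deps))
# [['shared-types'], ['auth', 'billing']]
```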

3. Execute

Dispatch sub-agents to complete task groups. Monitor progress and handle issues.
Task tool (general-purpose):
  description: "Auth tasks: login, logout"
  prompt: |
    Execute these tasks from [plan-file] IN ORDER:
    - Task 1: Add login endpoint
    - Task 2: Add logout endpoint

    Use skills: <relevant skills>
    Commit after each task. Report: files changed, test results
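The dispatch template above could be filled in programmatically per group. The following is a sketch only: build_dispatch, its parameters, the plan path, and the skill name in the example are hypothetical, and numbering tasks within the group (rather than using the plan's own task numbers) is an assumption.

```python
def build_dispatch(plan_file: str, group_name: str, tasks: dict[str, str], skills: list[str]) -> dict[str, str]:
    """Assemble the description and prompt for one sub-agent.

    `tasks` maps a short label (used in the description) to the full
    task title (used in the prompt), in execution order.
    """
    task_lines = "\n".join(
        f"- Task {number}: {title}"
        for number, title in enumerate(tasks.values(), start=1)
    )
    prompt = (
        f"Execute these tasks from {plan_file} IN ORDER:\n"
        f"{task_lines}\n\n"
        f"Use skills: {', '.join(skills)}\n"
        "Commit after each task. Report: files changed, test results"
    )
    return {
        "description": f"{group_name} tasks: {', '.join(tasks)}",
        "prompt": prompt,
    }


dispatch = build_dispatch(
    "plans/auth.md",  # hypothetical plan file
    "Auth",
    {"login": "Add login endpoint", "logout": "Add logout endpoint"},
    ["ce:writing-tests"],
)
print(dispatch["description"])
# Auth tasks: login, logout
```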
Architectural fit: Changes should integrate cleanly with existing patterns. If a change feels like it's fighting the architecture, that's a signal to refactor first rather than bolt something on. Don't reinvent wheels when battle-tested libraries exist, but don't reach for a dependency for trivial things either (no lodash just for _.map). The goal is zero tech debt, not "ship now, fix later."
Auto-recovery:
  1. Agent attempts to fix failures (has context)
  2. If can't fix, report failure with error output
  3. Dispatch fix agent with context
  4. Same error twice → stop and ask user
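The recovery steps above can be sketched as a loop that escalates once an error repeats. This is a minimal sketch, not the actual orchestration logic: run_with_recovery, the Runner signature, and the max_attempts cap are assumptions, and the sketch treats any repeated error output (not only consecutive repeats) as "same error twice".

```python
from typing import Callable, Optional

# A runner receives the previous error output (or None on the first try)
# and returns None on success or the new error output on failure.
# Step 1, the agent fixing its own failures, happens inside the runner.
Runner = Callable[[Optional[str]], Optional[str]]


def run_with_recovery(run_agent: Runner, max_attempts: int = 5) -> str:
    """Dispatch agents until success, a repeated error, or the attempt cap.

    Returns "completed" or "ask_user".
    """
    seen_errors: set[str] = set()
    context: Optional[str] = None
    for _ in range(max_attempts):
        error = run_agent(context)
        if error is None:
            return "completed"
        if error in seen_errors:
            return "ask_user"  # same error twice: stop and escalate
        seen_errors.add(error)
        context = error  # hand the error output to the next fix agent
    return "ask_user"


# A fake runner that fails twice with the same error.
errors = iter(["TypeError in login()", "TypeError in login()"])
print(run_with_recovery(lambda context: next(errors)))
# ask_user
```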

4. Verify

All four checks must pass before marking complete:
  1. Automated tests: Run the full test suite. All tests must pass.
  2. Manual verification: Automated tests aren't sufficient. Actually exercise the changes:
    • API changes: Curl endpoints with realistic payloads
    • External integrations: Test against real services to catch rate limiting, format drift, bot detection
    • CLI changes: Run actual commands, verify output
    • Parser changes: Feed real data, not just fixtures
  3. DX quality: During manual testing, watch for friction:
    • Confusing error messages
    • Noisy output (telemetry spam, verbose logging)
    • Inconsistent behavior across similar endpoints
    • Rough edges that technically work but feel bad
    Fix DX issues inline or document for follow-up. Don't ship friction.
  4. Code review (mandatory): After tests pass and manual verification is done, dispatch the ce:code-reviewer agent via Task tool to review the full diff against the base branch. This step is not optional.
    Load relevant domain skills into the agent based on what was implemented. Evaluate which apply and include them in the agent prompt:
    • Skill(ce:architecting-systems) - system design, module boundaries
    • Skill(ce:managing-databases) - database work
    • Skill(ce:handling-errors) - error handling
    • Skill(ce:writing-tests) - test quality
    • Skill(ce:migrating-code) - migrations
    • Skill(ce:optimizing-performance) - performance work
    • Skill(ce:refactoring-code) - refactoring
    Handle the review verdict:
    • Must fix: Fix all Critical and Important issues before marking complete
    • Suggestions: Fix these too unless there's a clear reason not to
    Plan execution is not done until review findings are addressed.
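The completion gate described above can be modeled as a checklist plus review findings. The sketch below is illustrative only: the class and field names are hypothetical, and the severity labels simply mirror the Critical, Important, and Suggestions wording in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    severity: str           # "critical", "important", or "suggestion"
    resolved: bool = False
    skip_reason: str = ""   # a clear reason not to fix; only honored for suggestions


@dataclass
class Verification:
    tests_passed: bool = False
    manually_verified: bool = False
    dx_checked: bool = False
    review_done: bool = False
    findings: list[Finding] = field(default_factory=list)

    def blockers(self) -> list[str]:
        """List everything that still prevents marking the plan complete."""
        checks = {
            "automated tests": self.tests_passed,
            "manual verification": self.manually_verified,
            "DX quality": self.dx_checked,
            "code review": self.review_done,
        }
        problems = [name for name, passed in checks.items() if not passed]
        for finding in self.findings:
            if finding.resolved:
                continue
            must_fix = finding.severity in ("critical", "important")
            # Suggestions may be skipped only with a stated reason.
            if must_fix or not finding.skip_reason:
                problems.append(f"unresolved {finding.severity} finding")
        return problems


state = Verification(
    tests_passed=True,
    manually_verified=True,
    dx_checked=True,
    review_done=True,
    findings=[
        Finding("suggestion", skip_reason="out of scope for this plan"),
        Finding("important"),
    ],
)
print(state.blockers())
# ['unresolved important finding']
```

An empty list from blockers() is the only state in which the plan would be marked complete.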

5. Commit

After verification passes, commit only the changes related to this plan:
  1. Run git status to see all changes
  2. Stage files by name, not with git add -A or git add . (only stage files you modified as part of this plan)
  3. Leave unrelated changes alone - if there are pre-existing staged or unstaged changes that aren't part of this work, don't touch them
  4. Write a commit message that summarizes what was implemented, referencing the plan
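The stage-by-name rule can be enforced with a small guard before calling git. This is a sketch under assumptions: stage_command and the example paths are hypothetical, while the rejected arguments (".", "-A", "--all", "-u", "--update") are existing git add pathspec or option forms that stage broadly.

```python
BLANKET_ARGS = {".", "-A", "--all", "-u", "--update"}


def stage_command(files: list[str]) -> list[str]:
    """Build a `git add` command that stages only the named files."""
    if not files:
        raise ValueError("name at least one file to stage")
    blanket = BLANKET_ARGS.intersection(files)
    if blanket:
        raise ValueError(f"refusing blanket staging: {sorted(blanket)}")
    # "--" ends option parsing, so a file name starting with "-" stays a path.
    return ["git", "add", "--", *files]


print(stage_command(["src/auth/login.py", "src/auth/logout.py"]))
# ['git', 'add', '--', 'src/auth/login.py', 'src/auth/logout.py']
```

The returned list can be passed to subprocess.run(command, check=True) from inside the repository; building the command separately keeps the guard testable without a git checkout.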

6. Cleanup

After committing:
  • Merge branch to main (if using branches)
  • Remove worktree (if using worktrees)
  • Mark plan file as COMPLETED
  • Move the plan file to ./plans/done/ if applicable
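The last two cleanup steps can be sketched with the Python standard library. The text does not say how a plan is marked COMPLETED, so the status line written here is purely an assumption, as are the function name archive_plan and its parameters.

```python
import shutil
from pathlib import Path


def archive_plan(plan_path: Path, done_dir: Path) -> Path:
    """Mark a plan file as completed and move it into done_dir."""
    text = plan_path.read_text(encoding="utf-8")
    # Assumption: completion is recorded as a status line at the top of the file.
    if not text.startswith("Status: COMPLETED"):
        plan_path.write_text("Status: COMPLETED\n\n" + text, encoding="utf-8")
    done_dir.mkdir(parents=True, exist_ok=True)
    target = done_dir / plan_path.name
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    return Path(shutil.move(str(plan_path), str(target)))
```

A call such as archive_plan(Path("plans/auth.md"), Path("plans/done")) would then return the new location of the plan file; the plans/auth.md path is only an example.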