long-task-coordinator


Long Task Coordinator

Keep long-running work recoverable, stateful, and honest.

When to Use This Skill

Use this skill when the work:
  • Spans multiple turns or multiple sessions
  • Involves handoffs to workers, subagents, or background jobs
  • Needs explicit waiting states instead of "still looking" updates
  • Must survive interruption and resume from a durable state file
Skip this skill for small, single-turn tasks. Use planning-with-files when simple planning is enough and recovery logic is not the main concern.

Related Skills

  • planning-with-files: keeps multi-step work organized in files.
  • workflow-orchestrator: chains follow-up skills after milestones.
  • long-task-coordinator: makes long-running work resumable, auditable, and safe to hand off.

Core Rules

1. Create one source of truth

For any real long task, maintain one durable state file. Chat history is not a reliable state store.
The state file should capture at least:
  • Goal
  • Success criteria
  • Current status
  • Current step
  • Completed work
  • Next action
  • Next checkpoint
  • Blockers
  • Active owners or workers
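The field list above could be mirrored in code roughly as follows. This is a minimal sketch, not the skill's actual template: the `TaskState` class, its field names, and the JSON serialization are assumptions made for illustration (the suggested state paths elsewhere in this document end in `.md`, so the real template is probably Markdown).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TaskState:
    """Hypothetical in-memory mirror of the durable state file."""

    goal: str
    success_criteria: List[str]
    status: str = "running"
    current_step: str = ""
    completed_work: List[str] = field(default_factory=list)
    next_action: str = ""
    next_checkpoint: str = ""
    blockers: List[str] = field(default_factory=list)
    active_owners: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Sorted keys keep diffs of the state file stable between rounds.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "TaskState":
        return cls(**json.loads(text))


# Round trip: what is saved is exactly what a later session reads back.
state = TaskState(
    goal="Migrate the billing reports",
    success_criteria=["All reports render", "No data loss"],
    current_step="inventory existing reports",
    next_action="dispatch worker to list report templates",
)
restored = TaskState.from_json(state.to_json())
```

Keeping every field in one serializable object makes it harder for part of the state to live only in chat history.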

2. Separate roles only when needed

Use the smallest role model that fits the task:
  • Origin: owns the goal and acceptance criteria
  • Coordinator: owns state, sequencing, and recovery
  • Worker: executes bounded sub-work
  • Watchdog: checks liveness and recovery only
Simple tasks can collapse these roles into one agent. Long or delegated tasks should make the split explicit.

3. Run every cycle in this order

For each coordination round:
  READ -> RECOVER -> DECIDE -> PERSIST -> REPORT -> END
Do not report conclusions before the state file has been updated.
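One way to make the ordering hard to violate is to route every round through a single function. The sketch below assumes a JSON state file and three caller-supplied callbacks (`recover`, `decide`, `report`); all of these names are hypothetical and not part of the skill.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List


def run_round(
    state_path: Path,
    recover: Callable[[dict], dict],
    decide: Callable[[dict], dict],
    report: Callable[[dict], None],
) -> List[str]:
    """Run one coordination round and return the phases in execution order."""
    trace = []

    state = json.loads(state_path.read_text(encoding="utf-8"))
    trace.append("READ")

    state = recover(state)
    trace.append("RECOVER")

    state = decide(state)
    trace.append("DECIDE")

    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    trace.append("PERSIST")

    # The report callback runs only after the state file has been written.
    report(state)
    trace.append("REPORT")

    trace.append("END")
    return trace


# Demo: the reporter re-reads the file to show it already holds the decision.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo-state.json"
    path.write_text(
        json.dumps({"status": "running", "next_action": "start"}),
        encoding="utf-8",
    )
    reported = []
    trace = run_round(
        path,
        recover=lambda s: s,
        decide=lambda s: {**s, "next_action": "dispatch worker"},
        report=lambda s: reported.append(
            json.loads(path.read_text(encoding="utf-8"))
        ),
    )
```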

4. Treat awaiting-result as a valid state

If a worker or background job was dispatched successfully, the task is not failing just because the result is not back yet.
Valid transitions include:
  • running -> awaiting-result
  • awaiting-result -> running
  • running -> paused
  • running -> complete
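The transitions above can be checked mechanically. This sketch encodes only the four transitions listed; since the text says "include", a real table may allow more (for example into or out of blocked). Treating "stay in the same state" as valid is an assumption of this sketch.

```python
# Only the transitions named in the list above.
VALID_TRANSITIONS = {
    ("running", "awaiting-result"),
    ("awaiting-result", "running"),
    ("running", "paused"),
    ("running", "complete"),
}


def is_valid_transition(old: str, new: str) -> bool:
    """Return True if a round may move the task from `old` to `new`."""
    # Assumption: remaining in the same state (e.g. still awaiting-result)
    # is allowed and is not treated as a failure.
    return old == new or (old, new) in VALID_TRANSITIONS


still_waiting = is_valid_transition("awaiting-result", "awaiting-result")
waiting_to_failed = is_valid_transition("awaiting-result", "failed")
```

Note that there is no edge from awaiting-result to a failure state: a pending result alone never justifies marking the task as failed.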

5. Non-terminal rounds must create real progress

A coordination round is only valid if it does at least one of the following:
  • Dispatches bounded work
  • Consumes new results
  • Updates the current stage or decision
  • Persists a new next step or checkpoint
  • Performs explicit recovery
If nothing changed, do not pretend the task advanced.
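The progress rule can be expressed as a simple predicate. A minimal sketch, assuming the coordinator records what happened during the round in a hypothetical `RoundOutcome` record with one flag per item in the list above:

```python
from dataclasses import dataclass


@dataclass
class RoundOutcome:
    """Hypothetical record of what a single round actually did."""

    dispatched_work: bool = False
    consumed_results: bool = False
    updated_stage_or_decision: bool = False
    persisted_new_next_step: bool = False
    performed_recovery: bool = False


def made_real_progress(outcome: RoundOutcome) -> bool:
    """A round is valid only if at least one flag is set."""
    return any(vars(outcome).values())


idle_round = made_real_progress(RoundOutcome())
useful_round = made_real_progress(RoundOutcome(consumed_results=True))
```

A coordinator could refuse to report "progress" whenever this predicate is false and report the unchanged status instead.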

6. Keep recovery separate from domain work

Recovery answers:
  • Did execution drift from the saved state?
  • Is the expected worker result still pending?
  • Do we need to wait, retry, or re-dispatch?
Domain work answers:
  • What should we build, analyze, or deliver next?
Recover first, then continue domain work.
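Keeping recovery in its own function, with no domain logic inside, is one way to enforce this separation. The sketch below is an assumption-heavy illustration: the timeout-based policy, the parameter names, and the `RecoveryAction` enum are all hypothetical, and it does not model the difference between a retry and a re-dispatch.

```python
from enum import Enum
from typing import Optional


class RecoveryAction(Enum):
    CONTINUE = "continue"        # nothing pending; go on to domain work
    WAIT = "wait"                # result still expected; stay in awaiting-result
    REDISPATCH = "re-dispatch"   # result overdue; hand the work out again


def choose_recovery_action(
    dispatched_at: Optional[float],
    now: float,
    timeout_seconds: float,
    result_arrived: bool,
) -> RecoveryAction:
    """Decide what recovery requires before any domain decision is made."""
    if dispatched_at is None or result_arrived:
        return RecoveryAction.CONTINUE
    if now - dispatched_at < timeout_seconds:
        return RecoveryAction.WAIT
    return RecoveryAction.REDISPATCH


# A worker dispatched 50 seconds ago with a 60 second budget is still pending.
pending = choose_recovery_action(50.0, 100.0, 60.0, result_arrived=False)
overdue = choose_recovery_action(10.0, 100.0, 60.0, result_arrived=False)
```

Only when the result is `CONTINUE` would the coordinator move on to the domain question of what to build, analyze, or deliver next.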

Operating Workflow

Step 1: Decide whether the task needs coordination

Use this skill when at least one of the following is true:
  • The task will outlive the current turn
  • The task will hand off work to another execution unit
  • The task needs checkpoints, polling, or scheduled follow-up
  • The task has enough complexity that loss of state would be expensive

Step 2: Create or load the state file

Prefer a path that is easy to rediscover, such as:
  • docs/<topic>-execution-plan.md
  • docs/<topic>-state.md
  • worklog/<topic>-state.md
If no durable state exists yet, create one from references/workflow.md.
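Rediscovery can be as simple as probing the suggested paths in a fixed order. A minimal sketch: the `find_state_file` helper and its search order are assumptions, and creating a new file from the template is left to the caller because the template's contents are not shown here.

```python
import tempfile
from pathlib import Path
from typing import Optional

# The three path shapes suggested above, tried in this order.
CANDIDATE_PATTERNS = (
    "docs/{topic}-execution-plan.md",
    "docs/{topic}-state.md",
    "worklog/{topic}-state.md",
)


def find_state_file(root: Path, topic: str) -> Optional[Path]:
    """Return the first existing state file for `topic`, or None."""
    for pattern in CANDIDATE_PATTERNS:
        candidate = root / pattern.format(topic=topic)
        if candidate.is_file():
            return candidate
    # No durable state yet: the caller would create one from the template.
    return None


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "worklog").mkdir()
    (root / "worklog" / "demo-state.md").write_text(
        "# demo state\n", encoding="utf-8"
    )
    found = find_state_file(root, "demo")
    found_name = found.name if found else None
    missing = find_state_file(root, "other")
```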

Step 3: Recover before acting

At the start of every new round:
  1. Read the state file
  2. Check whether the recorded next step still makes sense
  3. Confirm whether any delegated work returned
  4. Repair stale assumptions before new action

Step 4: Persist before reporting

After deciding the next action:
  1. Update the state file
  2. Record new status, owners, blockers, and checkpoint
  3. Only then report progress to the user or caller
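The persist-then-report order above can be sketched as follows. The JSON format, the function names, and the write-to-temp-then-rename approach are assumptions for illustration; the report text is built from what was read back from disk, so it cannot drift from the saved state.

```python
import json
import os
import tempfile
from pathlib import Path


def persist_state(path: Path, state: dict) -> None:
    """Write the state to a temp file, then swap it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in one step, so an interrupted round
        # leaves either the old state or the new one, never a partial file.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def persist_then_report(path: Path, state: dict) -> str:
    """Save first, then derive the report from the saved file."""
    persist_state(path, state)
    saved = json.loads(path.read_text(encoding="utf-8"))
    return f"status={saved['status']}; next={saved['next_action']}"


with tempfile.TemporaryDirectory() as tmp:
    message = persist_then_report(
        Path(tmp) / "demo-state.json",
        {"status": "awaiting-result", "next_action": "poll worker at checkpoint"},
    )
```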

Step 5: Close the round honestly

End each round with one of these states:
  • running
  • awaiting-result
  • paused
  • blocked
  • complete
The reported status should match the persisted status exactly.

Output Expectations

When using this skill, produce updates that are grounded in saved state:
  • What status the task is in now
  • What changed this round
  • What is expected next
  • What would unblock or complete the task

Acceptance Criteria

Treat the coordination work as complete only when all relevant items below are true:
  • A durable state file exists in a predictable path
  • The saved status matches the real task state
  • Completed work, next action, and blockers are recorded explicitly
  • Any delegated work has a named owner and a return condition
  • The final report is derived from the persisted state, not from transient reasoning
If the task is not truly complete, end in running, awaiting-result, paused, or blocked rather than pretending the work is done.
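Some of the acceptance criteria can be checked mechanically before a task is marked complete. The sketch below assumes a dict-shaped state with hypothetical keys (`completed_work`, `next_action`, `blockers`, `delegated`, `owner`, `return_condition`). Whether the saved status matches the real task state still needs judgment and is not covered.

```python
from typing import List

# Keys that must be recorded explicitly, even if their value is empty.
REQUIRED_KEYS = ("completed_work", "next_action", "blockers")


def acceptance_gaps(state: dict, state_file_exists: bool) -> List[str]:
    """Return a list of unmet criteria; an empty list means none were found."""
    gaps = []
    if not state_file_exists:
        gaps.append("no durable state file")
    for key in REQUIRED_KEYS:
        if key not in state:
            gaps.append(f"{key} not recorded")
    for item in state.get("delegated", []):
        name = item.get("name", "?")
        if not item.get("owner"):
            gaps.append(f"delegated work {name!r} has no owner")
        if not item.get("return_condition"):
            gaps.append(f"delegated work {name!r} has no return condition")
    return gaps


incomplete = acceptance_gaps(
    {"completed_work": [], "delegated": [{"name": "crawl", "owner": "worker-1"}]},
    state_file_exists=True,
)
```

If the list is non-empty, the round should end in one of the non-terminal states rather than complete.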

Anti-Patterns

Avoid:
  • Reconstructing progress from memory instead of the state file
  • Reporting a conclusion before saving it
  • Marking waiting as failure
  • Ending a round with no new action and no state change
  • Mixing recovery checks with domain decisions in one fuzzy step

References

  • references/workflow.md - Detailed workflow, state template, and recovery checklist