long-task-coordinator

Long Task Coordinator

Keep long-running work recoverable, stateful, and honest.

When to Use This Skill

Use this skill when the work:

- Spans multiple turns or multiple sessions
- Involves handoffs to workers, subagents, or background jobs
- Needs explicit waiting states instead of "still looking" updates
- Must survive interruption and resume from a durable state file

Skip this skill for small, single-turn tasks. Use `planning-with-files` when simple planning is enough and recovery logic is not the main concern.

Core Rules

1. Create one source of truth

For any real long task, maintain one durable state file. Chat history is not a reliable state store.

The state file should capture at least:

- Goal
- Success criteria
- Current status
- Current step
- Completed work
- Next action
- Next checkpoint
- Blockers
- Active owners or workers
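
As an illustration, the fields above could be held in a small record and rendered to a markdown state file. This is a minimal sketch, not a prescribed format: `TaskState`, `render_state`, and the `##` section layout are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TaskState:
    """Hypothetical in-memory form of the durable state file."""
    goal: str
    success_criteria: list[str]
    status: str
    current_step: str
    next_action: str
    next_checkpoint: str
    completed_work: list[str] = field(default_factory=list)
    blockers: list[str] = field(default_factory=list)
    owners: list[str] = field(default_factory=list)


def _bullets(items: list[str]) -> str:
    # Render an explicit "(none)" so an empty section is never ambiguous.
    return "\n".join(f"- {item}" for item in items) if items else "- (none)"


def render_state(state: TaskState) -> str:
    """Render every required field as its own section (assumed layout)."""
    sections = [
        ("Goal", state.goal),
        ("Success criteria", _bullets(state.success_criteria)),
        ("Current status", state.status),
        ("Current step", state.current_step),
        ("Completed work", _bullets(state.completed_work)),
        ("Next action", state.next_action),
        ("Next checkpoint", state.next_checkpoint),
        ("Blockers", _bullets(state.blockers)),
        ("Active owners or workers", _bullets(state.owners)),
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections) + "\n"
```

Writing "(none)" for empty lists keeps the file honest: a reader can tell "no blockers" apart from "blockers were never recorded".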

2. Separate roles only when needed

Use the smallest role model that fits the task:

- Origin: owns the goal and acceptance criteria
- Coordinator: owns state, sequencing, and recovery
- Worker: executes bounded sub-work
- Watchdog: checks liveness and recovery only

Simple tasks can collapse these roles into one agent. Long or delegated tasks should make the split explicit.

3. Run every cycle in this order

For each coordination round:

READ -> RECOVER -> DECIDE -> PERSIST -> REPORT -> END

Do not report conclusions before the state file has been updated.
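
The ordering can be sketched as a single function that takes each phase as a callable. This is only an illustration: `run_round`, the dict-based state, and the in-memory `store` standing in for the state file are all assumptions of the example.

```python
from typing import Callable, Dict, List

State = Dict[str, str]


def run_round(
    read: Callable[[], State],
    recover: Callable[[State], State],
    decide: Callable[[State], State],
    persist: Callable[[State], None],
    report: Callable[[State], str],
) -> str:
    """Run one coordination round in the fixed order, then end."""
    state = read()           # READ
    state = recover(state)   # RECOVER
    state = decide(state)    # DECIDE
    persist(state)           # PERSIST happens before any reporting
    return report(state)     # REPORT, then END


# Usage with an in-memory stand-in for the state file.
store: State = {"status": "running", "next_action": "dispatch worker"}
calls: List[str] = []


def demo_persist(state: State) -> None:
    calls.append("persist")
    store.update(state)


def demo_report(state: State) -> str:
    calls.append("report")
    return f"status: {state['status']}"


message = run_round(
    read=lambda: dict(store),
    recover=lambda s: s,
    decide=lambda s: {**s, "status": "awaiting-result"},
    persist=demo_persist,
    report=demo_report,
)
```

Because `report` only receives the state that was just persisted, the reported conclusion cannot run ahead of the saved one.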

4. Treat `awaiting-result` as a valid state

If a worker or background job was dispatched successfully, the task is not failing just because the result is not back yet.

Valid transitions include:

```
running -> awaiting-result
awaiting-result -> running
running -> paused
running -> complete
```
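
One way to enforce these transitions is a small lookup table. The sketch below encodes only the four transitions listed above; since the list is introduced with "include", a real table may need more entries (for example, ones involving `blocked`). The names `ALLOWED_TRANSITIONS` and `transition` are hypothetical.

```python
# Only the transitions named in this section; extend as the task requires.
ALLOWED_TRANSITIONS = {
    ("running", "awaiting-result"),
    ("awaiting-result", "running"),
    ("running", "paused"),
    ("running", "complete"),
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the move is not in the table."""
    if (current, target) not in ALLOWED_TRANSITIONS:
        raise ValueError(f"transition not allowed: {current} -> {target}")
    return target
```

With this table, a dispatched job that has not returned yet moves the task to `awaiting-result` rather than to any failure state.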

5. Non-terminal rounds must create real progress

A coordination round is only valid if it does at least one of the following:

- Dispatches bounded work
- Consumes new results
- Updates the current stage or decision
- Persists a new next step or checkpoint
- Performs explicit recovery

If nothing changed, do not pretend the task advanced.
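
The validity check can be made mechanical by recording what a round actually did. The following is a sketch under assumptions: `RoundRecord` and its field names are invented for illustration, with one field per condition in the list above.

```python
from dataclasses import dataclass, field


@dataclass
class RoundRecord:
    """Hypothetical log of what one coordination round did."""
    dispatched: list[str] = field(default_factory=list)        # bounded work sent out
    results_consumed: list[str] = field(default_factory=list)  # new results read
    stage_or_decision_changed: bool = False
    next_step_or_checkpoint_changed: bool = False
    recovery_performed: bool = False


def is_valid_round(record: RoundRecord) -> bool:
    """A round counts only if at least one kind of real progress occurred."""
    return bool(
        record.dispatched
        or record.results_consumed
        or record.stage_or_decision_changed
        or record.next_step_or_checkpoint_changed
        or record.recovery_performed
    )
```

A round whose record is entirely empty should be reported as "no change", not as progress.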

6. Keep recovery separate from domain work

Recovery answers:

- Did execution drift from the saved state?
- Is the expected worker result still pending?
- Do we need to wait, retry, or re-dispatch?

Domain work answers:

- What should we build, analyze, or deliver next?

Recover first, then continue domain work.
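
Keeping recovery in its own function makes the separation concrete. The sketch below shows one possible policy and nothing more: the inputs (`worker_alive`, `waited_seconds`, `deadline_seconds`) and the rule for choosing between retry and re-dispatch are assumptions, not part of this skill.

```python
from enum import Enum


class RecoveryAction(Enum):
    PROCEED = "proceed"          # result is back; domain work may continue
    WAIT = "wait"
    RETRY = "retry"
    REDISPATCH = "re-dispatch"


def choose_recovery(
    result_returned: bool,
    worker_alive: bool,
    waited_seconds: float,
    deadline_seconds: float,
) -> RecoveryAction:
    """Answer only the recovery questions; no domain decisions here."""
    if result_returned:
        return RecoveryAction.PROCEED
    if not worker_alive:
        # Assumed policy: a dead worker's job is handed to a new worker.
        return RecoveryAction.REDISPATCH
    if waited_seconds > deadline_seconds:
        # Assumed policy: a live but overdue worker is asked to retry.
        return RecoveryAction.RETRY
    return RecoveryAction.WAIT
```

Domain logic would run only after this function returns `PROCEED`, which keeps the two kinds of question from blurring into one step.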

Operating Workflow

Step 1: Decide whether the task needs coordination

Use this skill when at least one is true:

- The task will outlive the current turn
- The task will hand off work to another execution unit
- The task needs checkpoints, polling, or scheduled follow-up
- The task has enough complexity that loss of state would be expensive

Step 2: Create or load the state file

Prefer a path that is easy to rediscover, such as:

```
docs/<topic>-execution-plan.md
docs/<topic>-state.md
worklog/<topic>-state.md
```

If no durable state exists yet, create one from `references/workflow.md`.
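
A load-or-create helper might look like the sketch below. It checks the three example paths in order and falls back to creating `docs/<topic>-state.md`. The function names, the choice of default path, and the `template` argument are assumptions; the actual template lives in `references/workflow.md` and is not reproduced here.

```python
from pathlib import Path


def candidate_paths(root: Path, topic: str) -> list[Path]:
    """The example locations from this section, in the order listed."""
    return [
        root / "docs" / f"{topic}-execution-plan.md",
        root / "docs" / f"{topic}-state.md",
        root / "worklog" / f"{topic}-state.md",
    ]


def load_or_create_state(root: Path, topic: str, template: str) -> Path:
    """Return an existing state file, or create one from the given template."""
    for path in candidate_paths(root, topic):
        if path.is_file():
            return path
    # Assumed default: create docs/<topic>-state.md when nothing exists yet.
    target = root / "docs" / f"{topic}-state.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(template, encoding="utf-8")
    return target
```

Searching a fixed, short list of paths is what makes the file easy to rediscover in a later session.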

Step 3: Recover before acting

At the start of every new round:

- Read the state file
- Check whether the recorded next step still makes sense
- Confirm whether any delegated work returned
- Repair stale assumptions before new action

Step 4: Persist before reporting

After deciding the next action:

- Update the state file
- Record new status, owners, blockers, and checkpoint
- Only then report progress to the user or caller
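
To make "persist, then report" robust against interruption, the state file can be written atomically before any message is produced. This is a sketch assuming a local filesystem; `persist_then_report` and its message format are hypothetical, while `tempfile.mkstemp`, `os.fsync`, and `os.replace` are standard library calls.

```python
import os
import tempfile
from pathlib import Path


def persist_then_report(path: Path, content: str, status: str) -> str:
    """Write the state file atomically, and only then build the report."""
    # Write to a temporary file in the same directory so the final
    # rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # readers see the old or new file, never half
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    # The report is produced only after the write has succeeded.
    return f"status: {status} (saved to {path.name})"
```

If the write fails, the function raises before returning, so no report can be issued for a state that was never saved.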

Step 5: Close the round honestly

End each round with one of these states:

```
running
awaiting-result
paused
blocked
complete
```

The reported status should match the persisted status exactly.

Output Expectations

When using this skill, produce updates that are grounded in saved state:

- What status the task is in now
- What changed this round
- What is expected next
- What would unblock or complete the task
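
An update with these four parts can be generated directly from the saved state, which keeps the reported status identical to the persisted one. In this sketch the dict keys (`status`, `next_action`, `unblock_condition`) and the function name `build_report` are assumptions made for illustration.

```python
VALID_STATUSES = {"running", "awaiting-result", "paused", "blocked", "complete"}


def build_report(saved: dict, changes: list[str]) -> str:
    """Build the update only from the saved state and this round's changes."""
    status = saved["status"]
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status in state file: {status!r}")
    lines = [
        f"Status: {status}",
        "Changed this round: " + ("; ".join(changes) if changes else "no change"),
        f"Expected next: {saved['next_action']}",
        f"To unblock or complete: {saved['unblock_condition']}",
    ]
    return "\n".join(lines)
```

For example, a saved state with `status` set to `awaiting-result` yields a report whose first line is exactly `Status: awaiting-result`.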

Acceptance Criteria

Treat the coordination work as complete only when all relevant items below are true:

- A durable state file exists in a predictable path
- The saved status matches the real task state
- Completed work, next action, and blockers are recorded explicitly
- Any delegated work has a named owner and a return condition
- The final report is derived from the persisted state, not from transient reasoning

If the task is not truly complete, end in `running`, `awaiting-result`, `paused`, or `blocked` rather than pretending the work is done.
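
Most of these criteria can be checked mechanically before a task is marked `complete`. The sketch below returns the list of unmet criteria. The dict layout (`delegated`, `owner`, `return_condition`, and so on) is assumed for the example, and the last criterion (report derived from persisted state) is a process rule that this check does not cover.

```python
from pathlib import Path


def unmet_criteria(state_path: Path, saved: dict, observed_status: str) -> list[str]:
    """List acceptance criteria that are not yet satisfied."""
    problems = []
    if not state_path.is_file():
        problems.append("state file missing")
    if saved.get("status") != observed_status:
        problems.append("saved status does not match real task state")
    for key in ("completed_work", "next_action", "blockers"):
        if key not in saved:
            problems.append(f"{key} not recorded")
    for job in saved.get("delegated", []):
        if not job.get("owner") or not job.get("return_condition"):
            name = job.get("name", "unnamed")
            problems.append(f"delegated work '{name}' lacks owner or return condition")
    return problems
```

If the returned list is non-empty, the round should end in one of the non-terminal states instead of `complete`.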

Anti-Patterns

Avoid:

- Reconstructing progress from memory instead of the state file
- Reporting a conclusion before saving it
- Marking waiting as failure
- Ending a round with no new action and no state change
- Mixing recovery checks with domain decisions in one fuzzy step

References

- `references/workflow.md`: Detailed workflow, state template, and recovery checklist
