swarm-migrate

/ork:swarm-migrate — Cross-Repo Migration Swarm

One command, N repos, one coordinator, one ledger.

When to use

Use when the same transformation needs to land in 3 or more repos with the same shape (workflow bump, dependency upgrade, codemod, lint-rule introduction, secret rotation, runbook header). Don't use for one-repo work; that's `/ork:implement`. Don't use for novel exploration; that's `/ork:brainstorm`.
This skill exists because the 275-session insights showed 25 sessions burned coordinating PR cascades manually (M164 deploy-migration, M17 yg-mcp-core extraction, @v1 reusable workflow rollout across 14 repos). The pattern was always: pick a repo, branch, apply, push, watch CI, repeat. This automates the repeat.

Inputs

A YAML spec at `swarm-specs/<name>.yaml`:

```yaml
name: bump-actions-checkout-v4
description: "Pin @actions/checkout to v4 across all repos"

# Topology: repos in dependency order. Coordinator only proceeds
# to a downstream repo after every upstream parent has merged green.
repos:
  - path: ~/coding/yonatan-hq/platform
    upstream: []
  - path: ~/coding/yonatan-hq/ventures/jobscraper
    upstream: [platform]   # waits for platform to merge first

# Transformation: applied identically per repo. The agent runs this
# inside the isolated worktree, then verifies with the next field.
# Note: `sed -i ''` is the BSD/macOS form; GNU sed takes -i without the '' argument.
transform:
  type: codemod   # codemod | regex | command
  command: |
    grep -rl 'actions/checkout@v3' .github/workflows |
      xargs sed -i '' 's|actions/checkout@v3|actions/checkout@v4|g'

# Verification: must pass before PR opens. Coordinator skips the repo
# if it fails locally (records skip-reason in ledger).
verify:
  - command: "git diff --quiet"
    expect: nonzero   # must have changes
  - command: "grep -r 'actions/checkout@v3' .github/workflows"
    expect: nonzero   # zero matches = clean

# PR shape: title, body, base branch
pr:
  branch_prefix: chore/bump-checkout-v4
  title: "chore(ci): pin @actions/checkout to v4"
  body_file: swarm-specs/bump-actions-checkout-v4.pr.md
  base: main
  labels: [chore, ci]

# CI gate: coordinator waits for required checks to pass before
# moving downstream. Set to false for dry-run, or in repos without CI.
ci_gate:
  required_checks: ["build", "test"]
  timeout_minutes: 20
  on_failure: pause   # pause | skip | abort

# Limits
max_parallel: 4
abort_on_novel_failure: true
```

How it works

                    ┌──────────────────────────────────┐
                    │      COORDINATOR (you)           │
                    │  reads spec → builds DAG →       │
                    │  writes .swarm-state.json        │
                    └────────────┬─────────────────────┘
              ┌──────────────────┼──────────────────┐
              ▼                  ▼                  ▼
        ┌──────────┐       ┌──────────┐       ┌──────────┐
        │ WORKER A │       │ WORKER B │       │ WORKER C │
        │ (repo 1) │       │ (repo 2) │       │ (repo 3) │
        └────┬─────┘       └────┬─────┘       └────┬─────┘
             │                  │                  │
             └─────────── isolated worktrees ──────┘
             │ each: clone branch, transform,
             │       verify, push, open PR,
             │       wait for CI, report
        ┌─────────────────────────────────────────────┐
        │            .swarm-state.json                │
        │  rolling ledger of {repo, status,           │
        │  pr_url, ci_state, last_action_at}          │
        └─────────────────────────────────────────────┘
Each worker is an `Agent` tool invocation (subagent type `git-operations-engineer` for plumbing or `backend-system-architect` for schema-flavored migrations). The coordinator (you, this skill) reads the ledger between waves and decides whether to release downstream waves or pause.

Phase 1 — Spec validation

Load `<spec-file.yaml>`. Verify:
  • Every `repos[].path` exists and is a git repo (check with `git -C <path> rev-parse`).
  • The `transform.command` returns 0 in a dry-run mode (or `transform.type: codemod` resolves to a known codemod registered in `swarm-specs/codemods/`).
  • Every `upstream` reference points to a declared repo (no dangling deps).
  • `pr.body_file` exists and is non-empty.
If any check fails, abort and print the table of failures. Do NOT proceed.
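
The structural checks above could be sketched roughly as follows. This is only an illustration under stated assumptions: the spec is already parsed into a dict, a repo is named in `upstream` by the last component of its path, and the presence of a `.git` entry stands in for the `git -C <path> rev-parse` check. The function name `validate_spec` is hypothetical, and the dry-run check of `transform.command` is left out.

```python
from pathlib import Path


def validate_spec(spec: dict) -> list[str]:
    """Return human-readable failures; an empty list means the spec passed."""
    failures = []
    repos = spec.get("repos", [])
    # Assumption: `upstream` refers to repos by the last component of their path.
    declared = {Path(repo["path"]).name for repo in repos}

    for repo in repos:
        path = Path(repo["path"]).expanduser()
        # Stand-in for `git -C <path> rev-parse`: a checkout has a .git entry.
        if not (path / ".git").exists():
            failures.append(f"{repo['path']}: not a git repo")
        for parent in repo.get("upstream", []):
            if parent not in declared:
                failures.append(f"{repo['path']}: dangling upstream '{parent}'")

    body = Path(spec.get("pr", {}).get("body_file", ""))
    if not body.is_file() or body.stat().st_size == 0:
        failures.append(f"pr.body_file missing or empty: {body}")
    return failures
```

Collecting every failure before aborting, rather than stopping at the first one, matches the "print the table of failures" behaviour described above.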

Phase 2 — Topology sort

Build a DAG from `upstream` edges. Detect cycles → abort. Group nodes by topological wave (wave 0 = no deps, wave 1 = depends only on wave 0, …). Coordinator releases one wave at a time.
Write `.swarm-state.json` at the root of the repo where the skill is running:

```json
{
  "spec": "swarm-specs/bump-actions-checkout-v4.yaml",
  "started_at": "2026-05-16T17:00:00Z",
  "waves": [
    { "id": 0, "repos": ["platform"] },
    { "id": 1, "repos": ["jobscraper"] }
  ],
  "repos": {
    "platform": { "status": "pending", "pr_url": null, "ci_state": null, "last_action_at": null },
    "jobscraper": { "status": "blocked", "blocked_on": ["platform"], "pr_url": null }
  }
}
```
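
One way to compute the waves is a layered variant of Kahn's algorithm: repeatedly peel off every repo whose parents are all done. The sketch below assumes the `upstream` edges have been collected into a plain dict keyed by repo name; `build_waves` is a hypothetical helper name, not something the skill is known to expose.

```python
def build_waves(upstream: dict[str, list[str]]) -> list[list[str]]:
    """Group repos into topological waves; raise ValueError on a cycle or dangling dep."""
    remaining = {repo: set(parents) for repo, parents in upstream.items()}
    declared = set(remaining)
    for repo, parents in remaining.items():
        unknown = parents - declared
        if unknown:
            raise ValueError(f"{repo} depends on undeclared repos: {sorted(unknown)}")

    waves: list[list[str]] = []
    done: set[str] = set()
    while remaining:
        # A repo is ready once every one of its parents sits in an earlier wave.
        ready = sorted(r for r, parents in remaining.items() if parents <= done)
        if not ready:
            # Nothing can make progress, so the leftover repos form a cycle.
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for repo in ready:
            del remaining[repo]
    return waves


print(build_waves({"platform": [], "jobscraper": ["platform"]}))
# prints [['platform'], ['jobscraper']]
```

Sorting each wave is only there to make the ledger output deterministic between runs.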

Phase 3 — Dispatch wave

For each repo in the current wave, in parallel (bounded by `max_parallel`):
  1. Worktree: create an isolated worktree at `<repo>/../<repo>-swarm-<spec-name>` off `origin/<base>`. Never mutate the live working tree.
  2. Branch: `git checkout -b <branch_prefix>-<short-sha>`.
  3. Transform: run `transform.command` (or apply codemod). Capture stdout to `.swarm-logs/<repo>-transform.log`.
  4. Verify: run each `verify[].command`, assert exit matches `expect`. On mismatch, mark repo `skipped` in ledger with reason, do not push.
  5. Push + PR: push branch, open PR via `gh pr create`. Update ledger with PR URL.
  6. Watch CI: poll `gh pr checks <n>` every 45s up to `ci_gate.timeout_minutes`. Update `ci_state` in ledger on every state transition.
Use the `Agent` tool with `subagent_type: ork:git-operations-engineer` for steps 1–5 to keep main context lean. The coordinator only reads the ledger.

Phase 4 — Wave gate

After every wave, check the ledger:
  • All `green` → release the next wave.
  • Any `pending CI` → keep polling.
  • Any `red CI` → consult `ci_gate.on_failure`:
    • `pause` → halt the swarm, write a summary to `.swarm-state.json`, surface the failing logs, ask the user.
    • `skip` → mark repo `failed-ci`, continue with siblings (but block downstream unless they explicitly don't depend on this repo).
    • `abort` → terminate the swarm, leave open PRs as-is, never merge.
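
The gate can be read as a small pure function over the CI states of one wave. The sketch below is an interpretation, not the skill's actual logic: it assumes the states are the strings `green`, `pending` and `red`, and that a red repo takes priority over a still-pending sibling, which the rules above leave open. `wave_gate` is a hypothetical name.

```python
def wave_gate(ci_states: dict[str, str], on_failure: str = "pause") -> str:
    """Decide the coordinator's next move from the CI state of every repo in a wave."""
    if on_failure not in ("pause", "skip", "abort"):
        raise ValueError(f"unknown on_failure policy: {on_failure!r}")
    states = set(ci_states.values())
    if "red" in states:
        # Assumption: a failure is handled before waiting on pending siblings.
        return on_failure
    if "pending" in states:
        return "poll"
    return "release"


print(wave_gate({"platform": "green", "jobscraper": "pending"}))
# prints poll
```

Keeping the decision separate from the polling loop makes it easy to replay a saved ledger and check what the coordinator would have done.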

Phase 5 — Auto-rebase on conflicts

If a downstream repo's worker hits a merge conflict on rebase (because an upstream merged), the worker:
  1. Re-fetches the upstream's merge commit SHA.
  2. Attempts `git rebase origin/<base>`. If clean → push, ledger update.
  3. If conflicts → mark the conflict files in the ledger, do NOT auto-resolve, surface to the coordinator. Conflicts are the most common place auto-fixers ship broken code.

Phase 6 — Final report

When all waves complete (or the swarm pauses/aborts), emit a single markdown report under `.swarm-logs/<spec-name>-report.md`:

```markdown
# Swarm report: bump-actions-checkout-v4

Completed: 12/14 repos · paused: 2 · duration: 47 min

| repo | status | PR | CI | duration |
|---|---|---|---|---|
| platform | merged | #3456 | green | 8 min |
| jobscraper | merged | #281 | green | 6 min |
| ... | | | | |
| dormant-repo-1 | skipped | | | (no CI runner configured) |
| trading-ai | paused | #99 | red | (novel failure; see logs) |

## Novel failures (escalated)

- trading-ai #99: pyproject lockfile mismatch; see .swarm-logs/trading-ai-ci.log
```

Hard rules

  • Never merge a PR. The swarm opens PRs; humans merge them. Auto-merge can be armed by the user with `gh pr merge --auto` post-swarm if they want.
  • Never force-push. If a worker can't fast-forward, it pauses.
  • Never roam outside the spec's declared `repos[]`, even if a transformation seems like it'd help elsewhere.
  • Always quarantine credentials. Workers run with the user's gh auth; the coordinator never logs tokens, just the URLs.
  • Always respect existing branch protections. If `gh pr create` fails because of required reviewers or other rules, that's a feature, not a bug to work around.

Failure modes you'll actually hit

| Mode | What it looks like | Mitigation |
|---|---|---|
| Stale lockfile | CI red on `npm ci` after dependency bump | Spec includes a `post_transform.command: npm install` step |
| Branch protection blocks PR creation | `gh pr create` exits non-zero | Coordinator marks repo `blocked-by-protection`, surfaces to user |
| Topology cycle | Phase 2 abort | Re-spec the upstream edges |
| Coordinator crash mid-flight | `.swarm-state.json` half-written | Skill is resumable: re-run with same spec, it reads the ledger and skips `merged` / `green` repos |
| Worker subagent hangs | No ledger update for >5 min | Coordinator times out the agent, marks repo `worker-timeout`, surfaces logs |
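
The crash and resume rows can be illustrated with two small helpers. Both are sketches under assumptions: writing through a temporary file and renaming it is one common way to avoid a half-written ledger, and nothing above confirms the skill does it this way; `repos_to_resume` assumes `merged` and `green` appear in each entry's `status` field. Both function names are hypothetical.

```python
import json
import os
import tempfile


def write_ledger(path: str, ledger: dict) -> None:
    """Write the ledger via a temp file so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(ledger, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        # On POSIX the rename is atomic: the old or the new file, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def repos_to_resume(ledger: dict) -> list[str]:
    """Repos that still need work: everything not already merged or green."""
    finished = ("merged", "green")
    return [name for name, entry in ledger["repos"].items()
            if entry.get("status") not in finished]
```

On a re-run, the coordinator would load the ledger, call something like `repos_to_resume`, and dispatch only the repos it returns.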

Related Skills

  • Upstream: `/ork:brainstorm` to design the spec, `/ork:visualize-plan` to ASCII-preview the DAG before dispatch.
  • Downstream: `/ork:verify` per repo after merge, `/status` for org-wide sweep, `/ci-debug` if a worker hits a CI red.
  • Composes with: `/ork:create-pr` (each worker calls into it), `/ork:github-operations` (bulk-update labels/milestones post-swarm).

What this skill does NOT do

  • Does not invent the spec. You write the spec; the skill executes it.
  • Does not perform schema migrations across DBs (use a single-repo skill plus `/ork:database-patterns`).
  • Does not orchestrate production deploys: it opens PRs only; deploy is a separate gate (the platform's deploy-operator).
  • Does not bypass `/ork:create-pr`'s playground-gate rule: each PR body must include a playground reference if the repo enforces it.

Example invocation

```bash
# Dry-run: build the DAG, verify spec, do NOT push or open PRs
/ork:swarm-migrate swarm-specs/bump-actions-checkout-v4.yaml --dry-run

# Live: dispatch up to 4 workers in parallel
/ork:swarm-migrate swarm-specs/bump-actions-checkout-v4.yaml --max-parallel=4

# Resume after pause: same command, the ledger remembers
/ork:swarm-migrate swarm-specs/bump-actions-checkout-v4.yaml
```

Why this exists (one paragraph)

You ran 25 sessions in a single month coordinating cross-repo PRs by hand. The 14-repo @v1 workflow rollout, the M17 yg-mcp-core extraction, the M164 deploy-migration. Every one of those sessions had the same shape: a coordinator (you) holding the DAG in your head, dispatching workers (you, sequentially) in different terminal tabs, hand-rolling a status table in your notes. This skill makes the coordinator a YAML file and the workers parallel subagents. The DAG, the ledger, the auto-rebase, the wave gating — all the bookkeeping you were doing manually — get codified once. You write the spec, you walk away, you come back to a report.