swarm

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Swarm

Swarm

Process many independent items in parallel.
create
builds a table handle;
run
fans work out across rows and merges results back. One row = one unit of work — swarm handles batching automatically.
并行处理多个独立任务。
create
方法创建表格句柄;
run
方法将任务分发至各行并合并结果。一行代表一个工作单元——Swarm 会自动处理批量任务。

Flow

流程

  1. Create. Build a table from a source — files, a glob pattern, or pre-parsed records. One row per item. Returns a handle.
  2. Run. Dispatch an
    instruction
    template across rows. Results are merged back into the table. Returns
    { completed, failed, skipped, failures }
    .
  3. Aggregate. Use
    rows()
    and plain JS to count, filter, or summarize. Do not spawn additional subagents for aggregation.
  4. Retry. Re-run with
    filter: { column: "<col>", exists: false }
    to reprocess only failed rows.
  1. 创建:从源数据(文件、通配符模式或预解析记录)构建表格,每行对应一个任务项,返回句柄。
  2. 运行:将
    instruction
    模板分发至各行,结果合并回表格,返回
    { completed, failed, skipped, failures }
  3. 汇总:使用
    rows()
    和原生JS进行计数、过滤或总结,汇总时不要生成额外的子代理。
  4. 重试:使用
    filter: { column: "<col>", exists: false }
    重新运行,仅处理失败的行。

Choosing a source

选择数据源

glob
/
filePaths
— one file = one row. Use when each file is an independent unit of work. Each row gets
{ id, file }
; the subagent reads the file itself via the
{file}
placeholder.
tasks
— pass pre-built records directly. Use when the data lives inside a file (JSONL, CSV, JSON array). Read and parse the file first inside
eval
, then pass the records. One record = one row — do not group multiple items into a single row.
For small files (under ~500 lines), parse and create in one block:
javascript
const { create } = await import("@/skills/swarm");
const raw = await tools.readFile({ file_path: "/data.jsonl" });
const records = raw.trim().split("\n").map(l => JSON.parse(l));
const table = await create({ tasks: records });
console.log(table);
For large files, read in chunks of 500 lines to avoid truncation:
javascript
const { create } = await import("@/skills/swarm");
let records = [];
let offset = 0;
while (true) {
  const chunk = await tools.readFile({ file_path: "/data.txt", offset, limit: 500 });
  const lines = chunk.split("\n").filter(l => l.trim());
  for (const l of lines) { records.push({ id: `r${records.length}`, text: l }); }
  if (lines.length < 500) break;
  offset += 500;
}
const table = await create({ tasks: records });
console.log(table);
When the file is too large to parse and dispatch in one
eval
call, split across two blocks. Only the block that calls swarm functions needs the import:
javascript
// eval 1: parse only — no swarm import needed
const raw = await tools.readFile({ file_path: "/data.jsonl" });
globalThis.records = raw.trim().split("\n").map(l => JSON.parse(l));
console.log(`Parsed ${globalThis.records.length} records`);
javascript
// eval 2: create and dispatch
const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: globalThis.records });
const result = await run(table.id, {
  instruction: "Classify {text}",
  responseSchema: {
    type: "object",
    properties: { label: { type: "string" } },
    required: ["label"],
  },
});
console.log(result);
Passing
filePaths: ["/data.jsonl"]
would produce a table with one row pointing at the file — not one row per record inside it.
glob
/
filePaths
— 一个文件对应一行。适用于每个文件都是独立工作单元的场景。每行包含
{ id, file }
;子代理通过
{file}
占位符自行读取文件。
tasks
— 直接传入预构建的记录。适用于数据存储在文件(JSONL、CSV、JSON数组)中的场景。先在
eval
内读取并解析文件,再传入记录。一条记录对应一行——不要将多个任务项分组到同一行中。
对于小文件(约500行以内),可在一个代码块中完成解析和创建:
javascript
const { create } = await import("@/skills/swarm");
const raw = await tools.readFile({ file_path: "/data.jsonl" });
const records = raw.trim().split("\n").map(l => JSON.parse(l));
const table = await create({ tasks: records });
console.log(table);
对于大文件,按500行分块读取以避免截断:
javascript
const { create } = await import("@/skills/swarm");
let records = [];
let offset = 0;
while (true) {
  const chunk = await tools.readFile({ file_path: "/data.txt", offset, limit: 500 });
  const lines = chunk.split("\n").filter(l => l.trim());
  for (const l of lines) { records.push({ id: `r${records.length}`, text: l }); }
  if (lines.length < 500) break;
  offset += 500;
}
const table = await create({ tasks: records });
console.log(table);
当文件过大无法在一次
eval
调用中完成解析和分发时,拆分为两个代码块。只有调用Swarm函数的代码块需要导入:
javascript
// eval 1: 仅解析——无需导入Swarm
const raw = await tools.readFile({ file_path: "/data.jsonl" });
globalThis.records = raw.trim().split("\n").map(l => JSON.parse(l));
console.log(`已解析 ${globalThis.records.length} 条记录`);
javascript
// eval 2: 创建并分发
const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: globalThis.records });
const result = await run(table.id, {
  instruction: "Classify {text}",
  responseSchema: {
    type: "object",
    properties: { label: { type: "string" } },
    required: ["label"],
  },
});
console.log(result);
传入
filePaths: ["/data.jsonl"]
会生成一个仅包含一行的表格(指向该文件)——而非文件内每条记录对应一行。

When to use
subagentType

何时使用
subagentType

Omit
subagentType
for classification, extraction, labeling, and any task where a single model call with structured output is sufficient. This is the default and is significantly cheaper and faster — each dispatch is a direct model call, no tools, no iteration.
Set
subagentType
when the task requires tools, file access, or multi-step reasoning. Each dispatch runs a full agentic loop with the named subagent.
javascript
// Direct model call — classification, no tools needed
await run(table.id, {
  instruction: "Classify {text}",
  responseSchema: { type: "object", properties: { label: { type: "string" } }, required: ["label"] },
});

// Subagent — needs to read files and reason over multiple steps
await run(table.id, {
  subagentType: "reviewer",
  instruction: "Review {file} for security issues.",
  responseSchema: { type: "object", properties: { finding: { type: "string" } }, required: ["finding"] },
});
对于分类、提取、标注等仅需单次模型调用并返回结构化输出的任务,无需设置
subagentType
。这是默认模式,成本更低、速度更快——每次调度都是直接的模型调用,无需工具,无需迭代。
当任务需要工具、文件访问或多步推理时,设置
subagentType
。每次调度会运行指定子代理的完整智能体循环。
javascript
// 直接模型调用——分类任务,无需工具
await run(table.id, {
  instruction: "Classify {text}",
  responseSchema: { type: "object", properties: { label: { type: "string" } }, required: ["label"] },
});

// 子代理——需要读取文件并进行多步推理
await run(table.id, {
  subagentType: "reviewer",
  instruction: "Review {file} for security issues.",
  responseSchema: { type: "object", properties: { finding: { type: "string" } }, required: ["finding"] },
});

Instruction + context

指令 + 上下文

instruction
is a per-item template with
{column}
placeholders. Placeholders are resolved by the framework — your column names appear in prompts as references to the values listed alongside, never as raw template syntax. Subagents do the work — do not process items yourself in JS and write the results into rows.
context
is free-form prose prepended to every subagent prompt. Use it for shared background: domain terms, classification rules, examples, etc.
javascript
const { create, run } = await import("@/skills/swarm");

const table = await create({ glob: "src/**/*.ts" });
const r = await run(table.id, {
  subagentType: "reviewer",
  instruction: "Review {file} for security issues. List findings or write 'no issues'.",
  context: "TypeScript Express backend using Prisma ORM. Focus on injection, auth bypass, path traversal.",
  responseSchema: {
    type: "object",
    properties: { review: { type: "string" } },
    required: ["review"],
  },
});
console.log(r);
// → { completed: 45, failed: 2, skipped: 0, failures: [...] }
instruction
是带有
{column}
占位符的逐任务模板。占位符由框架解析——列名会作为对应值的引用出现在提示词中,而非原始模板语法。子代理负责执行任务——不要在JS中自行处理任务项并将结果写入行。
context
是添加到每个子代理提示词开头的自由格式文本。用于提供共享背景信息:领域术语、分类规则、示例等。
javascript
const { create, run } = await import("@/skills/swarm");

const table = await create({ glob: "src/**/*.ts" });
const r = await run(table.id, {
  subagentType: "reviewer",
  instruction: "Review {file} for security issues. List findings or write 'no issues'.",
  context: "TypeScript Express后端,使用Prisma ORM。重点关注注入、权限绕过、路径遍历问题。",
  responseSchema: {
    type: "object",
    properties: { review: { type: "string" } },
    required: ["review"],
  },
});
console.log(r);
// → { completed: 45, failed: 2, skipped: 0, failures: [...] }

Structured output

结构化输出

responseSchema
is required. Schema properties become top-level columns on each row and constrain what subagents can return.
javascript
const { run } = await import("@/skills/swarm");
await run(table.id, {
  instruction: "Classify: {text}",
  responseSchema: {
    type: "object",
    properties: {
      sentiment: { type: "string", enum: ["positive", "negative", "neutral"] },
    },
    required: ["sentiment"],
  },
});
// Row after: { id: "r1", text: "...", sentiment: "positive" }
responseSchema
是必填项。Schema属性会成为每行的顶级列,并限制子代理的返回内容。
javascript
const { run } = await import("@/skills/swarm");
await run(table.id, {
  instruction: "Classify: {text}",
  responseSchema: {
    type: "object",
    properties: {
      sentiment: { type: "string", enum: ["positive", "negative", "neutral"] },
    },
    required: ["sentiment"],
  },
});
// 处理后的数据行:{ id: "r1", text: "...", sentiment: "positive" }

Batching

批量处理

By default, swarm auto-batches to keep total dispatches under 10. For small tables (≤10 rows) each row gets its own subagent call. For larger tables, rows are grouped automatically.
Set
batchSize
to control grouping:
  • Number — uniform batch size for all rows.
    batchSize: 1
    forces per-row dispatch;
    batchSize: 20
    groups in twenties.
  • Function
    (row, rowCount) => number
    . Returns the desired batch size for each row. Rows with the same batch size are grouped together, then chunked. Allows mixed dispatch where some rows go solo and others batch.
javascript
const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: items });

// Complex items get individual attention; simple ones batch together
await run(table.id, {
  instruction: "Analyze {text}",
  responseSchema: {
    type: "object",
    properties: { analysis: { type: "string" } },
    required: ["analysis"],
  },
  batchSize: (row) => (row.token_count > 1000 ? 1 : 10),
});
Batch sizes are clamped to [1, 50] after evaluation.
默认情况下,Swarm会自动批量处理,使总调度次数保持在10次以内。对于小型表格(≤10行),每行单独调用子代理;对于大型表格,行将自动分组。
设置
batchSize
可控制分组方式:
  • 数字——所有行使用统一的批量大小。
    batchSize: 1
    强制逐行调度;
    batchSize: 20
    每20行一组。
  • 函数——
    (row, rowCount) => number
    。返回每行所需的批量大小。具有相同批量大小的行会被分组,然后拆分。支持混合调度,部分行单独处理,部分行批量处理。
javascript
const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: items });

// 复杂任务项单独处理;简单任务项批量处理
await run(table.id, {
  instruction: "Analyze {text}",
  responseSchema: {
    type: "object",
    properties: { analysis: { type: "string" } },
    required: ["analysis"],
  },
  batchSize: (row) => (row.token_count > 1000 ? 1 : 10),
});
批量大小在计算后会被限制在[1, 50]范围内。

Aggregation

汇总

After
run()
, use
rows()
and plain JS — no additional subagents needed.
javascript
const { rows } = await import("@/skills/swarm");
const data = await rows(table.id, { columns: ["sentiment"] });
const counts = {};
data.forEach(r => { counts[r.sentiment] = (counts[r.sentiment] || 0) + 1 });
console.log(counts);
// → { positive: 120, negative: 45, neutral: 35 }
调用
run()
后,使用
rows()
和原生JS进行汇总——无需额外子代理。
javascript
const { rows } = await import("@/skills/swarm");
const data = await rows(table.id, { columns: ["sentiment"] });
const counts = {};
data.forEach(r => { counts[r.sentiment] = (counts[r.sentiment] || 0) + 1 });
console.log(counts);
// → { positive: 120, negative: 45, neutral: 35 }

Chaining passes

链式调用

run
updates the table in place — chain calls to accumulate columns.
javascript
const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: interviews });
await run(table.id, {
  instruction: "Classify sentiment of {text}",
  responseSchema: {
    type: "object",
    properties: { sentiment: { type: "string", enum: ["positive", "negative", "neutral"] } },
    required: ["sentiment"],
  },
});
await run(table.id, {
  filter: { column: "sentiment", equals: "negative" },
  instruction: "Summarize why {text} had negative sentiment.",
  responseSchema: {
    type: "object",
    properties: { summary: { type: "string" } },
    required: ["summary"],
  },
});
run
会就地更新表格——可通过链式调用累积列数据。
javascript
const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: interviews });
await run(table.id, {
  instruction: "Classify sentiment of {text}",
  responseSchema: {
    type: "object",
    properties: { sentiment: { type: "string", enum: ["positive", "negative", "neutral"] } },
    required: ["sentiment"],
  },
});
await run(table.id, {
  filter: { column: "sentiment", equals: "negative" },
  instruction: "Summarize why {text} had negative sentiment.",
  responseSchema: {
    type: "object",
    properties: { summary: { type: "string" } },
    required: ["summary"],
  },
});

Action-only tasks

仅执行动作的任务

When subagents perform actions (write a file, apply a fix) rather than return data, use a simple schema with a status or marker field. The
exists: false
filter still works for retries.
javascript
const { create, run } = await import("@/skills/swarm");
const fixedSchema = {
  type: "object",
  properties: { fixed: { type: "string" } },
  required: ["fixed"],
};
const table = await create({ glob: "src/**/*.ts" });
await run(table.id, {
  subagentType: "fixer",
  instruction: "Add missing JSDoc to all exported functions in {file}.",
  responseSchema: fixedSchema,
});
// retry any that failed
await run(table.id, {
  subagentType: "fixer",
  instruction: "Add missing JSDoc to all exported functions in {file}.",
  responseSchema: fixedSchema,
  filter: { column: "fixed", exists: false },
});
当子代理执行动作(写入文件、修复问题)而非返回数据时,使用包含状态或标记字段的简单Schema。
exists: false
过滤器仍可用于重试。
javascript
const { create, run } = await import("@/skills/swarm");
const fixedSchema = {
  type: "object",
  properties: { fixed: { type: "string" } },
  required: ["fixed"],
};
const table = await create({ glob: "src/**/*.ts" });
await run(table.id, {
  subagentType: "fixer",
  instruction: "Add missing JSDoc to all exported functions in {file}.",
  responseSchema: fixedSchema,
});
// 重试失败的任务
await run(table.id, {
  subagentType: "fixer",
  instruction: "Add missing JSDoc to all exported functions in {file}.",
  responseSchema: fixedSchema,
  filter: { column: "fixed", exists: false },
});

Filtering

过滤规则

javascript
{ column: "status", equals: "done" }
{ column: "status", notEquals: "done" }
{ column: "category", in: ["A", "B"] }
{ column: "result", exists: false }      // not yet processed
{ and: [filter1, filter2] }
{ or: [filter1, filter2] }
javascript
{ column: "status", equals: "done" }
{ column: "status", notEquals: "done" }
{ column: "category", in: ["A", "B"] }
{ column: "result", exists: false }      // 未处理
{ and: [filter1, filter2] }
{ or: [filter1, filter2] }

Technical notes

技术说明

  • Only import
    @/skills/swarm
    in blocks where you call swarm functions.
    Data preparation (reading files, parsing, storing in
    globalThis
    ) does not need the import. Destructure only what you use:
    { create }
    ,
    { run }
    ,
    { create, run }
    , etc.
  • Console output is capped at ~5 KB. Never log raw file contents — log only counts and short samples.
  • readFile
    inside
    eval
    returns raw content — no line-number prefixes.
    Request at most 500 lines per call. For files with more than 500 lines, loop with incrementing
    offset
    .
  • When building a table from a file, read it inside
    eval
    .
    Data read inside the sandbox stays there; it never enters the agent's context window.
  • Never write to
    .swarm/
    directly.
    Always use
    create()
    .
  • Everything the subagent needs must be in
    instruction
    +
    context
    .
    Subagents can't see the agent's context.
  • Row ids must be unique.
    create()
    rejects sources that produce duplicate ids. For
    tasks
    , that's a caller-side responsibility; for
    glob
    /
    filePaths
    , ids are auto-disambiguated by parent directory.
  • Unknown columns fail fast. If
    instruction
    references
    {foo}
    and no matched row provides
    foo
    ,
    run()
    throws before any subagent is dispatched.
  • 仅在调用Swarm函数的代码块中导入
    @/skills/swarm
    。数据准备(读取文件、解析、存储到
    globalThis
    )无需导入。仅解构所需的方法:
    { create }
    { run }
    { create, run }
    等。
  • 控制台输出上限约为5 KB。切勿记录原始文件内容——仅记录计数和简短样本。
  • eval
    内的
    readFile
    返回原始内容——无行号前缀
    。每次调用最多请求500行。对于超过500行的文件,使用递增的
    offset
    循环读取。
  • 从文件构建表格时,在
    eval
    内读取文件
    。沙箱内读取的数据会保留在沙箱中;不会进入智能体的上下文窗口。
  • 切勿直接写入
    .swarm/
    目录
    。始终使用
    create()
    方法。
  • 子代理所需的所有信息必须包含在
    instruction
    +
    context
    。子代理无法查看智能体的上下文。
  • 行ID必须唯一
    create()
    会拒绝生成重复ID的数据源。对于
    tasks
    ,这是调用方的责任;对于
    glob
    /
    filePaths
    ,ID会通过父目录自动去重。
  • 未知列会快速失败。如果
    instruction
    引用
    {foo}
    但没有匹配的行提供
    foo
    run()
    会在调度任何子代理前抛出错误。

API Reference

API参考

create(source)

create(source)

Create a table. Returns a handle
{ id, count, columns }
.
SourceDescription
{ glob: "src/**/*.ts" }
or
{ glob: ["src/**/*.ts", "lib/**/*.ts"] }
Match files by one or more patterns. Columns:
id
,
file
{ filePaths: ["a.ts", "b.ts"] }
Explicit file list. Columns:
id
,
file
{ tasks: [{ id: "t1", text: "..." }] }
Custom rows. Each must have
id
创建表格。返回句柄
{ id, count, columns }
数据源描述
{ glob: "src/**/*.ts" }
{ glob: ["src/**/*.ts", "lib/**/*.ts"] }
通过一个或多个模式匹配文件。列:
id
,
file
{ filePaths: ["a.ts", "b.ts"] }
明确的文件列表。列:
id
,
file
{ tasks: [{ id: "t1", text: "..." }] }
自定义行数据。每行必须包含
id

run(tableId, options)

run(tableId, options)

Dispatch work across rows. Returns
{ completed, failed, skipped, failures }
.
OptionDefaultDescription
instruction
(required)Template with
{column}
placeholders
responseSchema
(required)JSON Schema (
type: "object"
) — properties become row columns
context
Prose prepended to every subagent prompt
filter
Only dispatch matching rows
subagentType
Name of subagent to dispatch to. When set, runs a full agentic loop. When omitted, runs a direct model call
batchSize
autoNumber or
(row, rowCount) => number
. Auto caps dispatches at 10;
1
= per-row; function = per-row sizing
concurrency
10
Max concurrent subagent dispatches (clamped to 1–10)
将任务分发至各行。返回
{ completed, failed, skipped, failures }
选项默认值描述
instruction
必填带有
{column}
占位符的模板
responseSchema
必填JSON Schema(
type: "object"
)——属性会成为行的列
context
添加到每个子代理提示词开头的文本
filter
仅调度匹配条件的行
subagentType
要调度的子代理名称。设置后会运行完整的智能体循环;未设置时运行直接模型调用
batchSize
自动数字或
(row, rowCount) => number
。自动将调度次数限制在10次以内;
1
=逐行调度;函数=按行设置批量大小
concurrency
10
最大并发子代理调度数(限制在1–10之间)

rows(tableId, options?)

rows(tableId, options?)

Retrieve rows. Use for inspection and JS-based aggregation.
OptionDescription
filter
Only return matching rows
columns
Project to specific columns
limit
Max rows returned
获取行数据。用于检查和基于JS的汇总。
选项描述
filter
仅返回匹配条件的行
columns
仅返回指定列
limit
返回的最大行数