multi-agent-patterns
Multi-Agent Architecture Patterns
Multi-agent architectures distribute work across multiple language model instances, each with its own context window. When designed well, this distribution enables capabilities beyond single-agent limits. When designed poorly, it introduces coordination overhead that negates benefits. The critical insight is that sub-agents exist primarily to isolate context, not to anthropomorphize role division.
When to Activate
Activate this skill when:
- Single-agent context limits constrain task complexity
- Tasks decompose naturally into parallel subtasks
- Different subtasks require different tool sets or system prompts
- Building systems that must handle multiple domains simultaneously
- Scaling agent capabilities beyond single-context limits
- Designing production agent systems with multiple specialized components
Core Concepts
Multi-agent systems address single-agent context limitations through distribution. Three dominant patterns exist: supervisor/orchestrator for centralized control, peer-to-peer/swarm for flexible handoffs, and hierarchical for layered abstraction. The critical design principle is context isolation—sub-agents exist primarily to partition context rather than to simulate organizational roles.
Effective multi-agent systems require explicit coordination protocols, consensus mechanisms that avoid sycophancy, and careful attention to failure modes including bottlenecks, divergence, and error propagation.
Detailed Topics
Why Multi-Agent Architectures
The Context Bottleneck
Single agents face inherent ceilings in reasoning capability, context management, and tool coordination. As tasks grow more complex, context windows fill with accumulated history, retrieved documents, and tool outputs. Performance degrades according to predictable patterns: the lost-in-middle effect, attention scarcity, and context poisoning.
Multi-agent architectures address these limitations by partitioning work across multiple context windows. Each agent operates in a clean context focused on its subtask. Results aggregate at a coordination layer without any single context bearing the full burden.
The Token Economics Reality
Multi-agent systems consume significantly more tokens than single-agent approaches. Production data shows:
| Architecture | Token Multiplier | Use Case |
|---|---|---|
| Single agent chat | 1× baseline | Simple queries |
| Single agent with tools | ~4× baseline | Tool-using tasks |
| Multi-agent system | ~15× baseline | Complex research/coordination |
Research on the BrowseComp evaluation found that three factors explain 95% of performance variance: token usage (80% of variance), number of tool calls, and model choice. This validates the multi-agent approach of distributing work across agents with separate context windows to add capacity for parallel reasoning.
Critically, upgrading to a better model often yields larger performance gains than doubling the token budget: Claude Sonnet 4.5 delivered bigger improvements than doubling tokens did on earlier Sonnet versions, and GPT-5.2's thinking mode similarly outperforms raw token increases. This suggests model selection and multi-agent architecture are complementary strategies.
The Parallelization Argument
Many tasks contain parallelizable subtasks that a single agent must execute sequentially. A research task might require searching multiple independent sources, analyzing different documents, or comparing competing approaches. A single agent processes these sequentially, accumulating context with each step.
Multi-agent architectures assign each subtask to a dedicated agent with a fresh context. All agents work simultaneously, then return results to a coordinator. The total real-world time approaches the duration of the longest subtask rather than the sum of all subtasks.
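This fan-out can be sketched with asyncio; `run_subagent` is a stub standing in for a real sub-agent call, not part of any particular framework:

```python
import asyncio

async def run_subagent(subtask: str) -> str:
    """Stub for a sub-agent call; a real system would invoke an LLM here."""
    await asyncio.sleep(0.01)  # simulate model latency
    return f"result for: {subtask}"

async def fan_out(subtasks: list[str]) -> list[str]:
    # Each sub-agent starts from a fresh context containing only its subtask.
    # gather() runs them concurrently, so wall-clock time tracks the slowest
    # subtask rather than the sum of all subtasks.
    return await asyncio.gather(*(run_subagent(t) for t in subtasks))

results = asyncio.run(
    fan_out(["search sources", "analyze docs", "compare approaches"])
)
```

`gather` preserves input order, so the coordinator can match each result back to its subtask by position.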
The Specialization Argument
Different tasks benefit from different agent configurations: different system prompts, different tool sets, different context structures. A general-purpose agent must carry all possible configurations in context. Specialized agents carry only what they need.
Multi-agent architectures enable specialization without combinatorial explosion. The coordinator routes to specialized agents; each agent operates with lean context optimized for its domain.
Architectural Patterns
Pattern 1: Supervisor/Orchestrator
The supervisor pattern places a central agent in control, delegating to specialists and synthesizing results. The supervisor maintains global state and trajectory, decomposes user objectives into subtasks, and routes to appropriate workers.
```text
User Query -> Supervisor -> [Specialist, Specialist, Specialist] -> Aggregation -> Final Output
```

When to use: Complex tasks with clear decomposition, tasks requiring coordination across domains, tasks where human oversight is important.
Advantages: Strict control over workflow, easier to implement human-in-the-loop interventions, ensures adherence to predefined plans.
Disadvantages: Supervisor context becomes bottleneck, supervisor failures cascade to all workers, "telephone game" problem where supervisors paraphrase sub-agent responses incorrectly.
The Telephone Game Problem and Solution
LangGraph benchmarks found supervisor architectures initially performed 50% worse than optimized versions due to the "telephone game" problem where supervisors paraphrase sub-agent responses incorrectly, losing fidelity.
The fix: implement a tool allowing sub-agents to pass responses directly to users:
```python
def forward_message(message: str, to_user: bool = True):
    """
    Forward sub-agent response directly to user without supervisor synthesis.

    Use when:
    - Sub-agent response is final and complete
    - Supervisor synthesis would lose important details
    - Response format must be preserved exactly
    """
    if to_user:
        return {"type": "direct_response", "content": message}
    return {"type": "supervisor_input", "content": message}
```

With this pattern, swarm architectures slightly outperform supervisors because sub-agents respond directly to users, eliminating translation errors.
Implementation note: Implement direct pass-through mechanisms allowing sub-agents to pass responses directly to users rather than through supervisor synthesis when appropriate.
Pattern 2: Peer-to-Peer/Swarm
The peer-to-peer pattern removes central control, allowing agents to communicate directly based on predefined protocols. Any agent can transfer control to any other through explicit handoff mechanisms.
```python
def transfer_to_agent_b():
    return agent_b  # Handoff via function return

agent_a = Agent(
    name="Agent A",
    functions=[transfer_to_agent_b],
)
```

When to use: Tasks requiring flexible exploration, tasks where rigid planning is counterproductive, tasks with emergent requirements that defy upfront decomposition.
Advantages: No single point of failure, scales effectively for breadth-first exploration, enables emergent problem-solving behaviors.
Disadvantages: Coordination complexity increases with agent count, risk of divergence without central state keeper, requires robust convergence constraints.
Implementation note: Define explicit handoff protocols with state passing. Ensure agents can communicate their context needs to receiving agents.
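A handoff with explicit state passing might look like the sketch below; the `Handoff` record and its field names are illustrative assumptions, not a framework API:

```python
from dataclasses import dataclass, field

@dataclass
class Handoff:
    """Explicit handoff record passed between peer agents."""
    target: str  # name of the receiving agent
    task: str    # what the receiver should do
    state: dict = field(default_factory=dict)  # context the receiver needs

def transfer(target: str, task: str, **state) -> Handoff:
    # Package only the context the receiving agent needs, keeping its
    # window free of the sender's accumulated history.
    return Handoff(target=target, task=task, state=state)

h = transfer("verifier", "check claims", sources=["doc1", "doc2"])
```

Keeping the state payload explicit (rather than sharing full transcripts) is what preserves context isolation across the handoff.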
Pattern 3: Hierarchical
Hierarchical structures organize agents into layers of abstraction: strategic, planning, and execution layers. Strategy layer agents define goals and constraints; planning layer agents break goals into actionable plans; execution layer agents perform atomic tasks.
```text
Strategy Layer (Goal Definition) -> Planning Layer (Task Decomposition) -> Execution Layer (Atomic Tasks)
```

When to use: Large-scale projects with clear hierarchical structure, enterprise workflows with management layers, tasks requiring both high-level planning and detailed execution.
Advantages: Mirrors organizational structures, clear separation of concerns, enables different context structures at different levels.
Disadvantages: Coordination overhead between layers, potential for misalignment between strategy and execution, complex error propagation.
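One way to sketch the layering is with each layer as a plain function whose output is the next layer's input; all names and payloads here are illustrative:

```python
def strategy_layer(objective: str) -> dict:
    # Defines goals and constraints; no execution detail lives here.
    return {"goal": objective, "constraints": ["budget", "deadline"]}

def planning_layer(strategy: dict) -> list[str]:
    # Breaks the goal into actionable steps for the execution layer.
    return [f"step 1 for {strategy['goal']}", f"step 2 for {strategy['goal']}"]

def execution_layer(plan: list[str]) -> list[str]:
    # Performs atomic tasks; each step could be a dedicated agent
    # running in its own clean context.
    return [f"done: {step}" for step in plan]

results = execution_layer(planning_layer(strategy_layer("ship report")))
```

The narrow interfaces between layers are the point: each layer sees only its input, not the full context of the layers above it.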
Context Isolation as Design Principle
The primary purpose of multi-agent architectures is context isolation. Each sub-agent operates in a clean context window focused on its subtask without carrying accumulated context from other subtasks.
Isolation Mechanisms
Full context delegation: For complex tasks where the sub-agent needs complete understanding, the planner shares its entire context. The sub-agent has its own tools and instructions but receives full context for its decisions.
Instruction passing: For simple, well-defined subtasks, the planner creates instructions via function call. The sub-agent receives only the instructions needed for its specific task.
File system memory: For complex tasks requiring shared state, agents read and write to persistent storage. The file system serves as the coordination mechanism, avoiding context bloat from shared state passing.
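A minimal sketch of file-system coordination, where agents share state through JSON files instead of context passing; the directory layout and key scheme are assumptions for illustration:

```python
import json
from pathlib import Path

STATE_DIR = Path("shared_state")  # illustrative location for shared memory

def write_state(agent: str, key: str, value) -> None:
    # Each agent persists its findings instead of passing them
    # through another agent's context window.
    STATE_DIR.mkdir(exist_ok=True)
    path = STATE_DIR / f"{key}.json"
    path.write_text(json.dumps({"author": agent, "value": value}))

def read_state(key: str):
    # Other agents read only the keys they need, keeping contexts lean.
    path = STATE_DIR / f"{key}.json"
    return json.loads(path.read_text()) if path.exists() else None

write_state("researcher", "findings", ["source A", "source B"])
```

A real system would add locking or versioning to handle the consistency issues noted below; this sketch shows only the coordination shape.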
Isolation Trade-offs
Full context delegation provides maximum capability but defeats the purpose of sub-agents. Instruction passing maintains isolation but limits sub-agent flexibility. File system memory enables shared state without context passing but introduces latency and consistency challenges.
The right choice depends on task complexity, coordination needs, and acceptable latency.
Consensus and Coordination
The Voting Problem
Simple majority voting treats hallucinations from weak models as equal to reasoning from strong models. Without intervention, multi-agent discussions devolve into consensus on false premises due to inherent bias toward agreement.
Weighted Voting
Weight agent votes by confidence or expertise. Agents with higher confidence or domain expertise carry more weight in final decisions.
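A sketch of confidence-weighted voting over agent answers; the `(answer, weight)` tuple format is an assumed convention:

```python
from collections import defaultdict

def weighted_vote(votes: list[tuple[str, float]]) -> str:
    """Pick the answer with the highest total confidence weight.

    votes: (answer, weight) pairs, where weight encodes confidence or
    domain expertise rather than one-agent-one-vote.
    """
    totals: dict[str, float] = defaultdict(float)
    for answer, weight in votes:
        totals[answer] += weight
    return max(totals, key=totals.get)

# Two low-confidence agents agreeing do not outweigh one expert.
winner = weighted_vote([("A", 0.3), ("A", 0.3), ("B", 0.9)])  # → "B"
```

Note that this still trusts self-reported confidence; calibrating those weights against held-out accuracy is what keeps a confident hallucinator from dominating.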
Debate Protocols
Debate protocols require agents to critique each other's outputs over multiple rounds. Adversarial critique often yields higher accuracy on complex reasoning than collaborative consensus.
Trigger-Based Intervention
Monitor multi-agent interactions for specific behavioral markers. Stall triggers activate when discussions make no progress. Sycophancy triggers detect when agents mimic each other's answers without unique reasoning.
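A sketch of trigger detection over a transcript of agent turns; the heuristics shown are deliberately simple assumptions, stand-ins for whatever markers a production monitor would use:

```python
def detect_triggers(turns: list[str]) -> set[str]:
    """Flag behavioral markers in a multi-agent discussion transcript."""
    triggers = set()
    # Stall trigger: the last three turns are identical, so the
    # discussion is making no progress.
    if len(turns) >= 3 and len(set(turns[-3:])) == 1:
        triggers.add("stall")
    # Sycophancy trigger: a turn that merely endorses the previous
    # answer without adding unique reasoning.
    for prev, cur in zip(turns, turns[1:]):
        if cur.lower().startswith("i agree") and prev in cur:
            triggers.add("sycophancy")
    return triggers
```

When a trigger fires, the coordinator can intervene, for example by injecting a devil's-advocate prompt or terminating the round.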
Framework Considerations
Different frameworks implement these patterns with different philosophies. LangGraph uses graph-based state machines with explicit nodes and edges. AutoGen uses conversational/event-driven patterns with GroupChat. CrewAI uses role-based process flows with hierarchical crew structures.
Practical Guidance
Failure Modes and Mitigations
Failure: Supervisor Bottleneck
The supervisor accumulates context from all workers, becoming susceptible to saturation and degradation.
Mitigation: Implement output schema constraints so workers return only distilled summaries. Use checkpointing to persist supervisor state without carrying full history.
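An output-schema constraint might look like the sketch below, which caps what a worker may return to the supervisor; the field names and character budget are assumptions:

```python
from dataclasses import dataclass

MAX_SUMMARY_CHARS = 500  # assumed per-worker budget

@dataclass
class WorkerResult:
    """Distilled result a worker is allowed to return to the supervisor."""
    task_id: str
    summary: str        # short synthesis, not the raw transcript
    artifact_path: str  # full output lives on disk, out of context

def validate_result(result: WorkerResult) -> WorkerResult:
    # Enforce the budget so the supervisor context cannot saturate.
    if len(result.summary) > MAX_SUMMARY_CHARS:
        raise ValueError(f"summary exceeds {MAX_SUMMARY_CHARS} chars")
    return result
```

The `artifact_path` indirection is the key move: the supervisor holds a pointer to the full output, not the output itself.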
Failure: Coordination Overhead
Agent communication consumes tokens and introduces latency. Complex coordination can negate parallelization benefits.
Mitigation: Minimize communication through clear handoff protocols. Batch results where possible. Use asynchronous communication patterns.
Failure: Divergence
Agents pursuing different goals without central coordination can drift from intended objectives.
Mitigation: Define clear objective boundaries for each agent. Implement convergence checks that verify progress toward shared goals. Use time-to-live limits on agent execution.
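A time-to-live guard on agent execution can be sketched as below; the dual time/step budget and the use of a monotonic clock are design assumptions:

```python
import time

class TTLExceeded(Exception):
    pass

class AgentBudget:
    """Bounds an agent run by wall-clock time and step count."""

    def __init__(self, max_seconds: float, max_steps: int):
        self.deadline = time.monotonic() + max_seconds
        self.steps_left = max_steps

    def tick(self) -> None:
        # Call once per agent step; raises when either limit is hit,
        # preventing a divergent agent from running forever.
        self.steps_left -= 1
        if self.steps_left < 0 or time.monotonic() > self.deadline:
            raise TTLExceeded("agent exceeded its time-to-live")
```

The step budget catches tight loops that burn tokens quickly; the wall-clock deadline catches agents stuck waiting on slow tools.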
Failure: Error Propagation
Errors in one agent's output propagate to downstream agents that consume that output.
Mitigation: Validate agent outputs before passing to consumers. Implement retry logic with circuit breakers. Use idempotent operations where possible.
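A sketch combining output validation with retries and a simple circuit breaker; the retry counts and failure threshold are illustrative:

```python
class CircuitOpen(Exception):
    pass

class GuardedAgent:
    """Validates an agent's output and stops calling it after repeated failures."""

    def __init__(self, agent_fn, validate, max_retries: int = 2, threshold: int = 3):
        self.agent_fn = agent_fn
        self.validate = validate  # returns True if output is safe to pass on
        self.max_retries = max_retries
        self.threshold = threshold
        self.failures = 0

    def call(self, task):
        if self.failures >= self.threshold:
            # Circuit open: stop feeding bad output to downstream agents.
            raise CircuitOpen("agent disabled after repeated invalid outputs")
        for _ in range(self.max_retries + 1):
            out = self.agent_fn(task)
            if self.validate(out):
                self.failures = 0
                return out
        self.failures += 1
        raise ValueError("agent produced invalid output after retries")
```

Raising instead of forwarding a suspect result is the mitigation: downstream agents never consume unvalidated output.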
Examples
Example 1: Research Team Architecture
```text
Supervisor
├── Researcher (web search, document retrieval)
├── Analyzer (data analysis, statistics)
├── Fact-checker (verification, validation)
└── Writer (report generation, formatting)
```

Example 2: Handoff Protocol
```python
def handle_customer_request(request):
    if request.type == "billing":
        return transfer_to(billing_agent)
    elif request.type == "technical":
        return transfer_to(technical_agent)
    elif request.type == "sales":
        return transfer_to(sales_agent)
    else:
        return handle_general(request)
```

Guidelines
指导原则
- Design for context isolation as the primary benefit of multi-agent systems
- Choose architecture pattern based on coordination needs, not organizational metaphor
- Implement explicit handoff protocols with state passing
- Use weighted voting or debate protocols for consensus
- Monitor for supervisor bottlenecks and implement checkpointing
- Validate outputs before passing between agents
- Set time-to-live limits to prevent infinite loops
- Test failure scenarios explicitly
Integration
This skill builds on context-fundamentals and context-degradation. It connects to:
- memory-systems - Shared state management across agents
- tool-design - Tool specialization per agent
- context-optimization - Context partitioning strategies
References
Internal reference:
- Frameworks Reference - Detailed framework implementation patterns
Related skills in this collection:
- context-fundamentals - Context basics
- memory-systems - Cross-agent memory
- context-optimization - Partitioning strategies
External resources:
- LangGraph Documentation - Multi-agent patterns and state management
- AutoGen Framework - GroupChat and conversational patterns
- CrewAI Documentation - Hierarchical agent processes
- Research on Multi-Agent Coordination - Survey of multi-agent systems
Skill Metadata
Created: 2025-12-20
Last Updated: 2025-12-20
Author: Agent Skills for Context Engineering Contributors
Version: 1.0.0