backend-integrator


Backend Integrator


This skill provides the complete checklist and patterns for integrating a new LLM backend into MassGen. A full integration touches ~15 files across the codebase.

When to Use This Skill


  • Adding a new LLM provider/backend
  • Auditing an existing backend for missing integration points
  • Understanding what files to modify when extending backend capabilities

Integration Architecture


Backend Type Decision:
  Stateless + OpenAI-compatible API   → subclass ChatCompletionsBackend
  Stateless + custom API              → subclass CustomToolAndMCPBackend
  Stateless + Response API format     → subclass ResponseBackend
  Stateful CLI wrapper (like Codex, Gemini CLI) → subclass LLMBackend directly
  Stateful SDK wrapper (like Claude Code, Copilot) → subclass LLMBackend directly

Complete Checklist


Phase 1: Core Implementation (3 files)


1.1 Backend Class


File: `massgen/backend/<name>.py`
Choose a base class:
  • `LLMBackend` — bare minimum, you handle everything
  • `CustomToolAndMCPBackend` — adds MCP + custom tool support (most common)
  • `ChatCompletionsBackend` — for OpenAI-compatible APIs (inherits from above)
  • `ResponseBackend` — for OpenAI Response API format
Required methods:

```python
async def stream_with_tools(self, messages, tools, **kwargs) -> AsyncGenerator[StreamChunk, None]:
    """Main streaming method. Yield StreamChunks."""

def get_provider_name(self) -> str:
    """Return provider name string (e.g., 'OpenAI', 'Codex')."""

def get_filesystem_support(self) -> FilesystemSupport:
    """Return NONE, NATIVE, or MCP."""
```
StreamChunk types to yield:

| Type | When | Key fields |
|---|---|---|
| `"content"` | Text output | `content="..."` |
| `"tool_calls"` | Tool invocation | `tool_calls=[{id, name, arguments}]` |
| `"reasoning"` | Thinking/reasoning delta | `reasoning_delta="..."` |
| `"reasoning_done"` | Reasoning complete | `reasoning_text="..."` |
| `"reasoning_summary"` | Reasoning summary delta | `reasoning_summary_delta="..."` |
| `"reasoning_summary_done"` | Reasoning summary complete | `reasoning_summary_text="..."` |
| `"complete_message"` | Full assistant message | `complete_message={...}` |
| `"complete_response"` | Raw API response | `response={...}` |
| `"done"` | Stream complete | `usage={prompt_tokens, completion_tokens, total_tokens}` |
| `"error"` | Error occurred | `error="..."` |
| `"agent_status"` | Status update | `status="...", detail="..."` |
| `"backend_status"` | Backend-level status | `status="...", detail="..."` |
| `"compression_status"` | Compression event | `status="...", detail="..."` |
| `"hook_execution"` | Hook ran | `hook_info={...}, tool_call_id="..."` |

Common fields on all chunks: `source` (agent/orchestrator ID), `display` (bool, default True).
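The chunk flow above can be sketched as follows. This is an illustrative skeleton only — it uses a stand-in `StreamChunk` dataclass and canned deltas in place of a real provider API call; a real backend would import MassGen's `StreamChunk` and stream from its client.

```python
# Sketch of the stream_with_tools chunk flow (stand-in StreamChunk; a real
# backend would call the provider API and yield massgen's StreamChunk).
import asyncio
from dataclasses import dataclass, field


@dataclass
class StreamChunk:
    type: str
    content: str = ""
    usage: dict = field(default_factory=dict)


async def stream_with_tools(messages, tools, **kwargs):
    # Yield "content" chunks as text deltas arrive...
    for delta in ["Hello", ", world"]:
        yield StreamChunk(type="content", content=delta)
    # ...then a final "done" chunk carrying usage totals.
    yield StreamChunk(
        type="done",
        usage={"prompt_tokens": 10, "completion_tokens": 2, "total_tokens": 12},
    )


async def main():
    return [chunk async for chunk in stream_with_tools([], [])]


chunks = asyncio.run(main())
print([c.type for c in chunks])
```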
Token tracking — call one of:

```python
self._update_token_usage_from_api_response(usage_dict, model)  # If API returns usage
self._estimate_token_usage(messages, response_text, model)     # Fallback
```
Timing — call in `stream_with_tools`:

```python
self.start_api_call_timing(self.model)        # Before API call
self.record_first_token()                     # On first content chunk
self.end_api_call_timing(success=True/False)  # After completion
```
For stateful backends (CLI/SDK wrappers), also implement:

```python
def is_stateful(self) -> bool: return True
async def clear_history(self) -> None: ...
async def reset_state(self) -> None: ...
```
Compression support — inherit `StreamingBufferMixin` and call:

```python
self._clear_streaming_buffer(**kwargs)        # Start of stream
self._finalize_streaming_buffer(agent_id=id)  # End of stream
```

1.2 Formatter (if needed)


File: `massgen/formatter/<name>_formatter.py`
Only needed if the API uses a non-standard message/tool format (not OpenAI chat completions format). Subclass `FormatterBase` and implement `format_messages()`, `format_tools()`, and `format_mcp_tools()`.
Existing formatters:
  • `_claude_formatter.py` — Anthropic Messages API
  • `_gemini_formatter.py` — Gemini API
  • `_chat_completions_formatter.py` — OpenAI/generic (reuse for compatible APIs)
  • `_response_formatter.py` — OpenAI Response API format
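To illustrate the formatter's job, here is a minimal sketch of a message conversion. `HypotheticalFormatter` and its target schema are invented for illustration — they are not MassGen code and not any real provider format.

```python
# Sketch: convert OpenAI-style chat messages into a hypothetical provider
# format that wants the system prompt pulled out of the turn list.
class HypotheticalFormatter:
    def format_messages(self, messages):
        # Input: [{"role": ..., "content": ...}, ...] (OpenAI chat format)
        system = [m["content"] for m in messages if m["role"] == "system"]
        turns = [
            {"speaker": m["role"], "text": m["content"]}
            for m in messages
            if m["role"] != "system"
        ]
        return {"system": "\n".join(system), "turns": turns}


msgs = [
    {"role": "system", "content": "Be terse."},
    {"role": "user", "content": "Hi"},
]
converted = HypotheticalFormatter().format_messages(msgs)
print(converted)
```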

1.3 API Params Handler (if needed)


File: `massgen/api_params_handler/<name>_api_params_handler.py`
Only needed if the backend calls an HTTP API and needs to filter/transform YAML config params before passing them to the API. Subclass `APIParamsHandlerBase`.
CLI/SDK wrappers (Codex, Claude Code) typically don't need this — they build commands directly.

Phase 2: Registration (4 files)


2.1 Backend `__init__.py`


File: `massgen/backend/__init__.py`

```python
from .your_backend import YourBackend
```

Also add the class name to `__all__` so it is exported.

2.2 CLI Backend Mapping


File: `massgen/cli.py`
Add to the `create_backend()` function:

```python
elif backend_type == "your_backend":
    api_key = kwargs.get("api_key") or os.getenv("YOUR_API_KEY")
    if not api_key:
        raise ConfigurationError(
            _api_key_error_message("YourBackend", "YOUR_API_KEY", config_path)
        )
    return YourBackend(api_key=api_key, **kwargs)
```
For CLI-based backends that don't need API keys, skip the key check.

2.3 Capabilities Registry


File: `massgen/backend/capabilities.py`
Add an entry to `BACKEND_CAPABILITIES`:

```python
"your_backend": BackendCapabilities(
    backend_type="your_backend",
    provider_name="YourProvider",
    supported_capabilities={"mcp", "web_search", ...},
    builtin_tools=["web_search"],  # Provider-native tools
    filesystem_support="mcp",      # "none", "mcp", or "native"
    models=["model-a", "model-b"], # Newest first
    default_model="model-a",
    env_var="YOUR_API_KEY",        # Or None
    notes="...",
    model_release_dates={"model-a": "2025-06"},
    base_url="https://api.example.com/v1",  # If applicable
)
```

2.4 Config Validator (if needed)


File: `massgen/config_validator.py`
Add backend-specific validation to `_validate_backend()` if there are special rules (e.g., required params, incompatible combinations).
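A backend-specific rule might look like the following sketch. The function name, config keys, and rules are hypothetical — they only illustrate the two kinds of checks mentioned above (required params and incompatible combinations).

```python
# Sketch of backend-specific validation (illustrative names and rules,
# not MassGen's actual _validate_backend code).
def validate_your_backend(config: dict) -> list[str]:
    errors = []
    # Required-param rule
    if not config.get("model"):
        errors.append("your_backend requires 'model'")
    # Incompatible-combination rule (hypothetical params)
    if config.get("enable_web_search") and config.get("offline_mode"):
        errors.append("enable_web_search cannot be combined with offline_mode")
    return errors


errs = validate_your_backend({"offline_mode": True, "enable_web_search": True})
print(errs)
```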

Phase 3: Token Management (1 file)


3.1 Pricing


File: `massgen/token_manager/token_manager.py`
Check LiteLLM first — only add to `PROVIDER_PRICING` if the model is NOT in the LiteLLM database. The provider name must match `get_provider_name()` exactly (case-sensitive).
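As a sketch of how such a fallback entry is consumed, the structure below assumes per-million-token pricing; the prices and the `estimate_cost` helper are made up for illustration, not MassGen's actual `PROVIDER_PRICING` schema.

```python
# Illustrative fallback pricing table (USD per 1M tokens; values invented).
# The top-level key must match get_provider_name() exactly (case-sensitive).
PROVIDER_PRICING = {
    "YourProvider": {
        "model-a": {"input": 1.00, "output": 4.00},
    }
}


def estimate_cost(provider, model, prompt_tokens, completion_tokens):
    p = PROVIDER_PRICING[provider][model]
    return (prompt_tokens * p["input"] + completion_tokens * p["output"]) / 1_000_000


cost = estimate_cost("YourProvider", "model-a", 1000, 500)
print(round(cost, 6))  # → 0.003
```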

Phase 4: Excluded Params (2 files, if adding new YAML params)


4.1 Base Class Exclusions


File: `massgen/backend/base.py` -> `get_base_excluded_config_params()`

4.2 API Params Handler Exclusions


File: `massgen/api_params_handler/_api_params_handler_base.py` -> `get_base_excluded_params()`
Both must stay in sync. Add any new framework-level YAML params that should NOT be passed to the provider API.
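The exclusion amounts to a set-difference filter applied to the YAML config before the API call. The set contents and function below are illustrative, not the actual excluded-param lists.

```python
# Sketch: framework-level YAML params are stripped before hitting the provider API.
# The excluded names here are examples, not the real MassGen exclusion lists.
BASE_EXCLUDED = {"cwd", "custom_tools", "mcp_servers", "enable_multimodal_tools"}


def filter_api_params(config: dict, extra_excluded=()) -> dict:
    excluded = BASE_EXCLUDED | set(extra_excluded)
    return {k: v for k, v in config.items() if k not in excluded}


cfg = {"model": "model-a", "temperature": 0.2, "cwd": "/workspace", "mcp_servers": []}
filtered = filter_api_params(cfg)
print(filtered)  # → {'model': 'model-a', 'temperature': 0.2}
```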

Phase 5: Authentication


5.1 API Key Backends (standard)


Most backends use API keys. Set `env_var` in `capabilities.py` and add the key check in `cli.py`:

```python
api_key = kwargs.get("api_key") or os.getenv("YOUR_API_KEY")
if not api_key:
    raise ConfigurationError(...)
return YourBackend(api_key=api_key, **kwargs)
```

5.2 OAuth / Subscription Auth (CLI/SDK wrappers)


For backends that support OAuth (like Codex, Claude Code), implement a multi-tier auth cascade:
```python
def __init__(self, api_key=None, **kwargs):
    # Tier 1: Explicit API key
    self.api_key = api_key or os.getenv("YOUR_API_KEY")
    # Tier 2: Check cached OAuth tokens
    self.use_oauth = not bool(self.api_key)
    self.auth_file = Path.home() / ".your_cli" / "auth.json"

async def _ensure_authenticated(self):
    if self.api_key:
        os.environ["YOUR_API_KEY"] = self.api_key
        return
    if self._has_cached_credentials():
        return
    # Tier 3: Initiate OAuth flow
    await self._initiate_oauth_flow()

async def _initiate_oauth_flow(self, use_device_flow=False):
    """Wrap the CLI's login command."""
    cmd = [self._cli_path, "login"]
    if use_device_flow:
        cmd.append("--device-auth")  # For headless/SSH
    proc = await asyncio.create_subprocess_exec(*cmd, ...)
```

In `cli.py`, skip the API key check — delegate auth to the backend:

```python
elif backend_type == "your_backend":
    # Auth handled by backend (API key or OAuth)
    return YourBackend(**kwargs)
```
In `capabilities.py`, set `env_var` to the API key name (optional, not required for OAuth):

```python
env_var="YOUR_API_KEY",  # Optional - can use OAuth instead
notes="Works with subscription auth (via `your-cli login`) or YOUR_API_KEY."
```
NOTE: We only offer OAuth through SDKs or programmatic command-line usage that the provider officially supports. Find the provider-sanctioned way of using OAuth for each backend to ensure we support it correctly (e.g., use the Claude Agent SDK, NOT Claude Code with API spoofing).

5.3 No Auth (local inference)


For local servers (LM Studio, vLLM, SGLang), set `env_var=None` in capabilities. No key check in `cli.py`.
Reference implementations:
  • `codex.py` — Full OAuth with browser + device code flows
  • `claude_code.py` — 3-tier: `CLAUDE_CODE_API_KEY` -> `ANTHROPIC_API_KEY` -> subscription login
  • `lmstudio` / `vllm` / `sglang` — No auth (`env_var=None`)

Phase 6: Custom Tools & MCP


6.1 Standard Path (API backends)


If inheriting from `CustomToolAndMCPBackend`, MCP and custom tools work automatically:
  • MCP servers from the YAML `mcp_servers` config
  • Custom tools from the `custom_tools` / `custom_tools_path` config
  • Filesystem MCP injected by `FilesystemManager` when `cwd` is set
  • Tool execution handled by `ToolManager`

6.2 Multimodal Tools (all backend types)


When `enable_multimodal_tools: true` is set, the backend must register the `read_media` and `generate_media` custom tools. Use the shared helper in `massgen/backend/base.py`:

```python
from .base import get_multimodal_tool_definitions
```

In `__init__`:

```python
enable_multimodal = self.config.get("enable_multimodal_tools", False) or kwargs.get("enable_multimodal_tools", False)
if enable_multimodal:
    custom_tools.extend(get_multimodal_tool_definitions())
```

- **API backends** (`CustomToolAndMCPBackend` subclasses): handled automatically — base class calls `get_multimodal_tool_definitions()` and registers via `_register_custom_tools()`
- **CLI/SDK backends** (`LLMBackend` subclasses like Codex, Claude Code): must do this explicitly in `__init__`, then wrap as MCP server (see §6.4)

**Important**: Always use `get_multimodal_tool_definitions()` — never inline the tool dicts. This keeps the definitions in one place.

6.3 CLI/SDK Wrappers — MCP Servers


These backends must configure the CLI/SDK's own MCP system:
Codex: Write a project-scoped `.codex/config.toml` in the workspace (the `-C` dir). Codex reads this automatically. Convert MassGen's `mcp_servers` list to TOML format:

```toml
[mcp_servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"]
```

Claude Code: Pass the `mcp_servers` dict directly to SDK options. The SDK handles server lifecycle:

```python
options = {"mcp_servers": {"filesystem": {"command": "npx", "args": [...]}}}
```
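The list-to-TOML conversion for Codex can be sketched as below. This is a hand-rolled emitter for illustration — the entry shape (`name`/`command`/`args` keys) is an assumption, and a real implementation might prefer a TOML library.

```python
# Sketch: convert MassGen-style mcp_servers entries into Codex config.toml sections.
# The input dict shape is assumed for illustration.
def mcp_servers_to_toml(servers: list[dict]) -> str:
    lines = []
    for s in servers:
        lines.append(f'[mcp_servers.{s["name"]}]')
        lines.append(f'command = "{s["command"]}"')
        args = ", ".join(f'"{a}"' for a in s.get("args", []))
        lines.append(f"args = [{args}]")
        lines.append("")
    return "\n".join(lines)


servers = [{"name": "filesystem", "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"]}]
toml_text = mcp_servers_to_toml(servers)
print(toml_text)
```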

6.4 CLI/SDK Wrappers — Custom Tools


Custom tools need special handling since the LLM runs inside an external process:
Preferred: Wrap as an MCP server (what `claude_code.py` does):
  1. Load custom tools via `ToolManager` (schemas + executors)
  2. Create an MCP server that exposes each tool
  3. Register the MCP server with the CLI/SDK
  4. When the LLM calls the MCP tool, the wrapper executes via `ToolManager`
claude_code.py pattern (simplified):

```python
def _create_sdk_mcp_server_from_custom_tools(self):
    tool_schemas = self._custom_tool_manager.fetch_tool_schemas()
    mcp_tools = []
    for schema in tool_schemas:
        name = schema["function"]["name"]

        async def wrapper(args, tool_name=name):
            return await self._execute_massgen_custom_tool(tool_name, args)

        mcp_tools.append(tool(name=name, ...)(wrapper))
    return create_sdk_mcp_server(name="massgen_custom_tools", tools=mcp_tools)
```

**For CLI wrappers without SDK MCP** (like Codex):
Use the shared `massgen/mcp_tools/custom_tools_server.py` utility. It creates a standalone fastmcp server that wraps ToolManager tools via stdio transport.

```python
from ..mcp_tools.custom_tools_server import build_server_config, write_tool_specs

def _setup_custom_tools_mcp(self, custom_tools):
    # 1. Write tool specs JSON to workspace
    specs_path = Path(self.cwd) / ".codex" / "custom_tool_specs.json"
    write_tool_specs(custom_tools, specs_path)

    # 2. Get MCP server config (fastmcp run ... --tool-specs ...)
    server_config = build_server_config(
        tool_specs_path=specs_path,
        allowed_paths=[self.cwd],
        agent_id="my_backend",
    )

    # 3. Add to mcp_servers list (written to workspace config before launch)
    self.mcp_servers.append(server_config)
```
The server is launched by the CLI as a subprocess and connects via stdio. Clean up the specs file in `reset_state()`.
Reference: `codex.py` uses this pattern. `custom_tools_server.py` also provides `build_server_config()`, which returns a ready-to-use MCP server dict.
Fallback: System prompt injection
  • Describe tools as JSON schema in the system prompt
  • Parse structured output (JSON blocks) for tool calls
  • Only use when MCP wrapping isn't feasible

6.5 Custom Tool YAML Config

6.5 自定义工具YAML配置

```yaml
backend:
  type: your_backend
  custom_tools:
    - path: "massgen/tool/_basic"          # Directory with TOOL.md
      function: "two_num_tool"
    - path: "path/to/tool.py"              # Single file
      function: ["func_a", "func_b"]
      preset_args: [{"timeout": 30}, {}]
  custom_tools_path: "massgen/tool/"       # Auto-discover from directory
  auto_discover_custom_tools: true         # Discover from registry
```
MassGen workflow tools (new_answer, vote, etc.): Always injected via system prompt — these are coordination-level, not executable tools. See §6.7 for full details.

6.7 Workflow Tool Integration (vote, new_answer, etc.)


Workflow tools are NOT native function-calling tools — they're injected into the system prompt and parsed from the model's text output. This pattern is shared between the Claude Code and Codex backends.
Shared helpers in `massgen/backend/base.py`:

```python
from .base import build_workflow_instructions, parse_workflow_tool_calls, extract_structured_response
```

| Helper | Purpose |
|---|---|
| `build_workflow_instructions(tools) → str` | Filters tools to workflow tools; returns instruction text with usage examples. Returns `""` if no workflow tools are present. |
| `parse_workflow_tool_calls(text) → List[Dict]` | Extracts JSON tool calls from text output. Returns the standard format: `{id, type, function: {name, arguments}}` |
| `extract_structured_response(text) → Optional[Dict]` | Low-level: extracts `{"tool_name": "...", "arguments": {...}}` from text. Tries fenced `json` blocks → regex → brace-matching → line-by-line. |


**For API backends**: Workflow tools are passed as normal function tools — no injection needed.

**For CLI/SDK backends** (Codex, Claude Code): The model can't receive native function tool definitions, so:

1. **Build instructions**: Call `build_workflow_instructions(tools)` to get the instruction text
2. **Inject into system prompt**: Append to whatever mechanism the backend uses for system prompts
3. **Accumulate text output**: Track all `content` chunks during streaming
4. **Parse after streaming**: Call `parse_workflow_tool_calls(accumulated_text)` to extract tool calls
5. **Yield as done chunk**: `yield StreamChunk(type="done", tool_calls=workflow_tool_calls)`

**Codex-specific**: Instructions go into `AGENTS.md` at the workspace root (Codex auto-reads this). The orchestrator's system message is also extracted from the messages and included, since Codex only receives a single user prompt via the CLI — the orchestrator's system message would otherwise be lost. This approach was adopted because the developer-instructions approach wasn't working.

**Claude Code-specific**: Instructions are built by `_build_system_prompt_with_workflow_tools()` which also adds tool calling sections, then passed as `system_prompt` to the SDK.

It remains an open question whether typical workflow tool parsing works (e.g., Claude Code) or whether MCP tools are needed (e.g., Codex).
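Steps 3-5 of the CLI/SDK flow can be sketched as follows. The `StreamChunk` stand-in, the synchronous generator, and the minimal `parse_workflow_tool_calls` stub are all simplifications for illustration — the real method is async and the real parser is more robust.

```python
# Sketch: accumulate text chunks during streaming, parse workflow tool calls
# afterward, and yield them on the "done" chunk. Stand-in types throughout.
import json
import re
from dataclasses import dataclass, field


@dataclass
class StreamChunk:
    type: str
    content: str = ""
    tool_calls: list = field(default_factory=list)


def parse_workflow_tool_calls(text):
    # Minimal stand-in: find one {"tool_name": ..., "arguments": ...} object.
    m = re.search(r'\{"tool_name".*\}', text)
    if not m:
        return []
    data = json.loads(m.group(0))
    return [{"id": "call_0", "type": "function",
             "function": {"name": data["tool_name"], "arguments": data["arguments"]}}]


def stream(cli_chunks):
    accumulated = []
    for text in cli_chunks:                                   # 3. accumulate
        accumulated.append(text)
        yield StreamChunk(type="content", content=text)
    calls = parse_workflow_tool_calls("".join(accumulated))   # 4. parse after streaming
    yield StreamChunk(type="done", tool_calls=calls)          # 5. yield as done chunk


chunks = list(stream(['I vote for agent 1. {"tool_name": "vote", ',
                      '"arguments": {"agent_id": "a1"}}']))
print(chunks[-1].tool_calls[0]["function"]["name"])
```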


6.7.1 MCP-Based vs Text-Based Workflow Tools


When deciding between MCP-based and text-based workflow tools for CLI/SDK backends:
MCP-based approach (attempted but not always reliable):
  • Register workflow tools as an MCP server
  • Pass to the SDK/CLI's MCP system
  • Caveat: MCP tool naming varies by SDK. Claude Code prefixes with `mcp__{server_name}__`:

```python
# If the server is "massgen_workflow_tools", tool names become:
# "new_answer" → "mcp__massgen_workflow_tools__new_answer"
# "vote" → "mcp__massgen_workflow_tools__vote"
```

  • When using MCP workflow tools, use `build_workflow_mcp_instructions()` with the appropriate prefix:

```python
instructions = build_workflow_mcp_instructions(
    tools,
    mcp_prefix="mcp__massgen_workflow_tools__"
)
```

Text-based approach (current fallback for Claude Code):
  • Inject workflow tool instructions into the system prompt
  • Parse JSON tool calls from the model's text output using `parse_workflow_tool_calls()`
  • More reliable but requires robust parsing
Current status: Claude Code uses text-based workflow tools (JSON parsing) because MCP-based approach was unreliable. Codex uses text-based as well via AGENTS.md instructions.

6.6 Provider-Native Tools: Keep vs. Override


CLI/SDK-based agents (Codex, Claude Code) come with their own built-in tools (file editing, shell execution, web search, sub-agents, etc.). When integrating, decide which to keep and which to override with MassGen equivalents. Document these decisions in `get_tool_category_overrides()` on the `NativeToolBackendMixin` — see §6.8 and Phase 10.2 for details.
General rule: Prefer the provider's native tools unless MassGen has a specific reason to override. Native tools are optimized for the provider's model and avoid extra MCP overhead.
Override when MassGen adds value:
  • Sub-agents/Task tools: Override with MassGen's sub-agents — theirs can't participate in MassGen coordination (voting, consensus, intelligence sharing)
  • Filesystem tools: May override if MassGen needs path permission enforcement or workspace isolation beyond what the provider offers
Keep provider-native:
  • File read/write/edit: Provider tools are model-optimized
  • Shell/command execution: Provider sandboxing is already configured
  • Web search: Provider integration is usually seamless
Implementation: Use the provider's tool filtering mechanism to disable specific tools:

Codex: via `.codex/config.toml`:

```toml
[mcp_servers.some_server]
disabled_tools = ["task", "sub_agent"]
```

Claude Code: via the SDK's `disallowed_tools` param:

```python
all_params["disallowed_tools"] = [
    "Read", "Write", "Edit", "MultiEdit",  # Use MassGen's MCP filesystem
    "Bash", "BashOutput", "KillShell",     # Use MassGen's execute_command
    "LS", "Grep", "Glob",                  # Use MassGen's filesystem tools
    "TodoWrite",                           # MassGen has its own task tracking
    "NotebookEdit", "NotebookRead", "ExitPlanMode",
    # Security restrictions
    "Bash(rm*)", "Bash(sudo*)", "Bash(su*)", "Bash(chmod*)", "Bash(chown*)",
]
```

Conditionally keep web tools:

```python
if not enable_web_search:
    disallowed_tools.extend(["WebSearch", "WebFetch"])
```

**Key patterns**:
- Claude Code SDK supports glob patterns for Bash restrictions: `"Bash(rm*)"`, `"Bash(sudo*)"`
- Check `if "disallowed_tools" not in all_params` to allow user override via YAML config
- Codex uses `disabled_tools` per MCP server in `.codex/config.toml`; for built-in tools, Codex doesn't have a disable mechanism — use `--full-auto` to auto-approve instead

**Document in capabilities.py** which native tools the backend provides:
```python
builtin_tools=["file_edit", "shell", "web_search", "sub_agent"],
notes="Native tools: file ops, shell, web search, sub-agents. Override sub-agents with MassGen's."
```
if not enable_web_search:
    disallowed_tools.extend(["WebSearch", "WebFetch"])

**关键模式**:
- Claude Code SDK支持Bash限制的通配符模式: `"Bash(rm*)"`、`"Bash(sudo*)"`
- 检查`if "disallowed_tools" not in all_params`以允许用户通过YAML配置覆盖
- Codex在`.codex/config.toml`中针对每个MCP服务使用`disabled_tools`配置;对于内置工具,Codex没有禁用机制,使用`--full-auto`自动审批替代

**在capabilities.py中记录**后端提供的原生工具:
```python
builtin_tools=["file_edit", "shell", "web_search", "sub_agent"],
notes="原生工具: 文件操作、Shell、网页搜索、子Agent。子Agent使用MassGen实现覆盖。"
```

6.8 NativeToolBackendMixin

6.8 NativeToolBackendMixin

File:
massgen/backend/native_tool_mixin.py
For backends with built-in tools (CLI/SDK wrappers), use
NativeToolBackendMixin
to standardize tool filtering and hook integration:
python
from .native_tool_mixin import NativeToolBackendMixin

class YourBackend(NativeToolBackendMixin, LLMBackend):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.__init_native_tool_mixin__()
        # Optional: initialize hook adapter
        self._init_native_hook_adapter(
            "massgen.mcp_tools.native_hook_adapters.YourBackendAdapter"
        )

    def get_disallowed_tools(self, config):
        """Return native tools to disable (MassGen has equivalents)."""
        return ["NativeTool1", "NativeTool2"]
Mixin provides:
MethodPurpose
get_disallowed_tools(config)
Abstract — declare which native tools to disable
get_tool_category_overrides()
Abstract — declare which MCP categories to skip/override
supports_native_hooks()
Check if hook adapter is available
get_native_hook_adapter()
Get the adapter instance
set_native_hooks_config(config)
Set MassGen hooks in native format
_init_native_hook_adapter(path)
Initialize adapter by import path
Reference implementations:
  • claude_code.py
    — disables most native tools (Read, Write, Bash, etc.) in favor of MassGen MCP
  • codex.py
    — keeps all native tools (MassGen skips attaching MCP equivalents instead)
文件:
massgen/backend/native_tool_mixin.py
对于带有内置工具的后端(CLI/SDK封装),使用
NativeToolBackendMixin
标准化工具过滤和Hook集成:
python
from .native_tool_mixin import NativeToolBackendMixin

class YourBackend(NativeToolBackendMixin, LLMBackend):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.__init_native_tool_mixin__()
        # 可选:初始化Hook适配器
        self._init_native_hook_adapter(
            "massgen.mcp_tools.native_hook_adapters.YourBackendAdapter"
        )

    def get_disallowed_tools(self, config):
        """返回需要禁用的原生工具(MassGen有等价实现)。"""
        return ["NativeTool1", "NativeTool2"]
Mixin提供的方法:
方法用途
get_disallowed_tools(config)
抽象方法 — 声明需要禁用的原生工具
get_tool_category_overrides()
抽象方法 — 声明需要跳过/覆盖的MCP类别
supports_native_hooks()
检查是否有可用的Hook适配器
get_native_hook_adapter()
获取适配器实例
set_native_hooks_config(config)
设置原生格式的MassGen Hook
_init_native_hook_adapter(path)
通过导入路径初始化适配器
参考实现:
  • claude_code.py
    — 禁用大多数原生工具(Read、Write、Bash等),使用MassGen MCP替代
  • codex.py
    — 保留所有原生工具,MassGen改为不挂载等价MCP工具
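The `_init_native_hook_adapter(path)` helper above resolves an adapter class from a dotted import path. A minimal, self-contained sketch of that resolution step (the helper name and a stdlib class stand in for the real adapter here):

```python
import importlib

def init_adapter_by_path(path: str):
    # Split "pkg.module.ClassName" into module path and class name,
    # import the module, and instantiate the class.
    module_path, class_name = path.rsplit(".", 1)
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls()

# Demonstration with a stdlib class standing in for a real adapter class:
adapter = init_adapter_by_path("json.JSONDecoder")
```

Lazy import by path keeps backends loadable even when a given adapter's dependencies are not installed.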

Phase 7: Testing (2+ files)

阶段7:测试(2+个文件)

7.1 Capabilities Test

7.1 能力测试

Automatically tested by
massgen/tests/test_backend_capabilities.py
once you add to capabilities.py.
bash
uv run pytest massgen/tests/test_backend_capabilities.py -v
添加到capabilities.py后,
massgen/tests/test_backend_capabilities.py
会自动测试。
bash
uv run pytest massgen/tests/test_backend_capabilities.py -v

7.2 Integration Test

7.2 集成测试

Create:
massgen/tests/test_<name>_integration.py
or add to cross-backend test scripts.
Cross-backend test pattern:
python
BACKEND_CONFIGS = {
    "claude": {"type": "claude", "model": "claude-haiku-4-5-20251001"},
    "openai": {"type": "openai", "model": "gpt-4o-mini"},
    "your_backend": {"type": "your_backend", "model": "model-a"},
}
创建:
massgen/tests/test_<name>_integration.py
或添加到跨后端测试脚本。
跨后端测试模式:
python
BACKEND_CONFIGS = {
    "claude": {"type": "claude", "model": "claude-haiku-4-5-20251001"},
    "openai": {"type": "openai", "model": "gpt-4o-mini"},
    "your_backend": {"type": "your_backend", "model": "model-a"},
}
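Before a cross-backend test spins up real backends, it helps to filter `BACKEND_CONFIGS` down to the backends that can actually run in the current environment. A small sketch (the `env` key is an illustrative addition, not part of the real script's schema):

```python
BACKEND_CONFIGS = {
    "claude": {"type": "claude", "model": "claude-haiku-4-5-20251001", "env": "ANTHROPIC_API_KEY"},
    "openai": {"type": "openai", "model": "gpt-4o-mini", "env": "OPENAI_API_KEY"},
}

def runnable_backends(environ: dict) -> list:
    # Keep only backends whose API key is present in the environment.
    return [name for name, cfg in BACKEND_CONFIGS.items() if environ.get(cfg["env"])]

available = runnable_backends({"OPENAI_API_KEY": "sk-test"})
```

Skipping (rather than failing) backends with missing credentials keeps the test usable both locally and in CI.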

7.3 Hook Firing Test (CLI/SDK backends) ⚠️ REQUIRED

7.3 Hook触发测试(CLI/SDK后端)⚠️ 必选

Script:
scripts/test_hook_backends.py
This is mandatory for every new CLI/SDK backend. The script verifies that the full hook pipeline actually fires during real streaming — not just that the hook data structures are correct.
Two modes:
Unit mode (no API calls — runs in CI):
bash
uv run python scripts/test_hook_backends.py --backend your_backend
Verifies: backend stores
GeneralHookManager
,
MidStreamInjectionHook
returns injection content,
HighPriorityTaskReminderHook
fires, combined hooks aggregate correctly.
E2E mode (real API calls):
bash
uv run python scripts/test_hook_backends.py --backend your_backend --e2e
uv run python scripts/test_hook_backends.py --backend your_backend --e2e --verbose
Verifies the complete live flow: PreToolUse hook fires → tool executes → PostToolUse hook fires → injection content is fed back to the model and acknowledged.
Adding your backend: Add an entry to
BACKEND_CONFIGS
in the script:
python
"your_backend": {
    "type": "your_backend",
    "model": "your-model",
    "description": "Your Backend description",
    "api_style": "openai",  # or "anthropic", "gemini"
},
Currently covered:
claude
,
openai
,
gemini
(native SDK),
openrouter
,
grok
Not yet covered:
codex
,
claude_code
,
copilot
,
gemini_cli
— these use file-based IPC or SDK-native hooks and need separate E2E hook verification.
Why this matters: Unit tests verify that hook data structures are correct. This script verifies that hooks actually fire during a real streaming turn. A backend can pass all unit tests and still silently drop hooks during live execution.
脚本:
scripts/test_hook_backends.py
所有新的CLI/SDK后端都必须通过该测试,脚本验证完整的Hook流水线在真实流式传输过程中确实会触发,而不仅仅是Hook数据结构正确。
两种模式:
单元模式(无API调用 — 可在CI中运行):
bash
uv run python scripts/test_hook_backends.py --backend your_backend
验证项: 后端存储了
GeneralHookManager
MidStreamInjectionHook
返回注入内容、
HighPriorityTaskReminderHook
触发、组合Hook正确聚合。
E2E模式(真实API调用):
bash
uv run python scripts/test_hook_backends.py --backend your_backend --e2e
uv run python scripts/test_hook_backends.py --backend your_backend --e2e --verbose
验证完整的实时流程: PreToolUse Hook触发 → 工具执行 → PostToolUse Hook触发 → 注入内容反馈给模型并被确认。
添加你的后端: 在脚本的
BACKEND_CONFIGS
中添加条目:
python
"your_backend": {
    "type": "your_backend",
    "model": "your-model",
    "description": "Your Backend描述",
    "api_style": "openai",  # 或"anthropic"、"gemini"
},
当前已覆盖:
claude
openai
gemini
(原生SDK)、
openrouter
grok
未覆盖:
codex
claude_code
copilot
gemini_cli
— 这些使用基于文件的IPC或SDK原生Hook,需要单独的E2E Hook验证。
重要性: 单元测试仅验证Hook数据结构正确,该脚本验证Hook在真实流式交互过程中确实会触发。后端可能通过所有单元测试,但在实际运行时仍可能静默丢失Hook。

7.4 Sandbox / Path Permission Test (CLI/SDK backends) ⚠️ REQUIRED

7.4 沙箱/路径权限测试(CLI/SDK后端)⚠️ 必选

Script:
scripts/test_native_tools_sandbox.py
This is mandatory for every new CLI/SDK backend with built-in file/shell tools. It runs real agents against a real filesystem with unique secrets and verifies that permission boundaries are actually enforced — not just that the enforcement code exists.
脚本:
scripts/test_native_tools_sandbox.py
所有带有内置文件/Shell工具的新CLI/SDK后端都必须通过该测试,它会使用真实Agent操作真实文件系统,通过唯一密钥验证权限边界确实被强制执行,而不仅仅是权限控制代码存在。

Test your new backend

测试新后端

uv run python scripts/test_native_tools_sandbox.py --backend your_backend
uv run python scripts/test_native_tools_sandbox.py --backend your_backend

Use LLM judge to detect subtle leakage

使用LLM裁判检测细微泄露

uv run python scripts/test_native_tools_sandbox.py --backend your_backend --llm-judge

The script verifies the full permission matrix:

| Zone | Expected reads | Expected writes |
|------|---------------|-----------------|
| Workspace (cwd) | ✅ allowed | ✅ allowed |
| Writable context path | ✅ allowed | ✅ allowed |
| Read-only context path | ✅ allowed | ❌ blocked |
| Outside all contexts | depends on backend | ❌ blocked |
| Parent directory | depends on backend | ❌ blocked |
| `/tmp` | ✅ allowed | depends on backend |

It uses **unique secrets** (UUIDs written to each zone's files) to detect unauthorized reads even when the operation "fails" — if the secret string appears in the model's response, there's a leak regardless of error messages.

**Adding your backend**: Add an entry to `BACKEND_CONFIGS` in the script:
```python
"your_backend": {
    "module": "massgen.backend.your_backend",
    "class": "YourBackend",
    "model": "your-model",
    "blocks_reads_outside": True,   # Does your enforcement block reads?
    "blocks_tmp_writes": True,      # Does your enforcement block /tmp writes?
},
```
Currently covered:
claude_code
,
codex
Not yet covered:
copilot
,
gemini_cli
— must be added when those backends are used in production.
Why this matters: Permission callback and hook unit tests verify the enforcement logic in isolation. This script verifies that enforcement actually stops a live agent from accessing restricted paths. A backend can have correct hook logic and still allow unauthorized access if the hook isn't wired into the right execution path.
uv run python scripts/test_native_tools_sandbox.py --backend your_backend --llm-judge

脚本验证完整的权限矩阵:

| 区域 | 预期读权限 | 预期写权限 |
|------|---------------|-----------------|
| 工作区(cwd) | ✅ 允许 | ✅ 允许 |
| 可写上下文路径 | ✅ 允许 | ✅ 允许 |
| 只读上下文路径 | ✅ 允许 | ❌ 阻止 |
| 所有上下文外路径 | 依后端而定 | ❌ 阻止 |
| 父目录 | 依后端而定 | ❌ 阻止 |
| `/tmp` | ✅ 允许 | 依后端而定 |

它使用**唯一密钥**(写入每个区域文件的UUID)检测未授权读操作,即使操作“失败” — 如果密钥字符串出现在模型响应中,无论错误消息如何,都说明存在泄露。

**添加你的后端**: 在脚本的`BACKEND_CONFIGS`中添加条目:
```python
"your_backend": {
    "module": "massgen.backend.your_backend",
    "class": "YourBackend",
    "model": "your-model",
    "blocks_reads_outside": True,   # 你的权限控制是否阻止外部读?
    "blocks_tmp_writes": True,      # 你的权限控制是否阻止/tmp写操作?
},
```
当前已覆盖:
claude_code
codex
未覆盖:
copilot
gemini_cli
— 这些后端生产环境使用前必须添加测试。
重要性: 权限回调和Hook单元测试仅孤立地验证权限控制逻辑,该脚本验证权限控制确实能阻止实时Agent访问受限路径。后端可能有正确的Hook逻辑,但如果Hook没有接入正确的执行路径,仍可能允许未授权访问。
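The unique-secret technique described above is easy to model: plant a fresh UUID in each zone's files, then scan the model's response for it. A minimal sketch (the secret format is illustrative, not the script's exact one):

```python
import uuid

def plant_secret(zone: str) -> str:
    # One unique secret per filesystem zone; if this string ever appears
    # in a response, that zone was read.
    return f"SECRET-{zone}-{uuid.uuid4().hex}"

def leaked(secret: str, model_response: str) -> bool:
    # Leak detection is pure substring presence, so it still works when
    # the tool call itself reported an error.
    return secret in model_response

secret = plant_secret("readonly")
```

This is why error messages alone are not trusted: a backend can return "permission denied" while the content has already leaked into context.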

7.5 Config Validation

7.5 配置校验

bash
uv run python scripts/validate_all_configs.py
bash
uv run python scripts/validate_all_configs.py

Phase 8: Config Examples (2+ files)

阶段8:配置示例(2+个文件)

8.1 Single Agent

8.1 单Agent

Create:
massgen/configs/providers/<name>/single_<name>.yaml
创建:
massgen/configs/providers/<name>/single_<name>.yaml

8.2 Multi-Agent / Tools

8.2 多Agent / 工具

Create:
massgen/configs/providers/<name>/<name>_with_tools.yaml
创建:
massgen/configs/providers/<name>/<name>_with_tools.yaml

Phase 9: Documentation (3+ files)

阶段9:文档(3+个文件)

9.1 YAML Schema

9.1 YAML Schema

File:
docs/source/reference/yaml_schema.rst
— document backend-specific params
文件:
docs/source/reference/yaml_schema.rst
— 记录后端专属参数

9.2 Backend Tables

9.2 后端表格

bash
uv run python docs/scripts/generate_backend_tables.py
bash
uv run python docs/scripts/generate_backend_tables.py

9.3 User Guide

9.3 用户指南

File:
docs/source/user_guide/backends.rst
— add section for new backend
文件:
docs/source/user_guide/backends.rst
— 为新后端添加章节

Phase 10: System Prompt Considerations

阶段10:系统提示注意事项

The system prompt is assembled by
massgen/system_message_builder.py
using priority-based sections from
massgen/system_prompt_sections.py
. Different sections are conditionally included based on backend capabilities and config flags.
系统提示由
massgen/system_message_builder.py
基于
massgen/system_prompt_sections.py
的优先级章节组装,根据后端能力和配置标志条件包含不同章节。

10.1 Section Categories

10.1 章节分类

Always included (all backends):
SectionPurpose
AgentIdentitySection
WHO the agent is
EvaluationSection
vote/new_answer coordination primitives
CoreBehaviorsSection
Action bias, parallel execution
OutputFirstVerificationSection
Quality iteration loop
SkillsSection
openskills read
via command execution
MemorySection
Decision documentation
WorkspaceStructureSection
Workspace paths
ProjectInstructionsSection
CLAUDE.md/AGENTS.md
SubagentSection
MassGen subagent delegation
BroadcastCommunicationSection
Inter-agent communication
PostEvaluationSection
submit/restart
FileSearchSection
rg/sg universal CLI tools
TaskContextSection
CONTEXT.md creation
NoveltyPressureSection
Novelty/diversity pressure across agents
ChangedocSection
Change documentation
Conditional on config flags:
SectionGate
TaskPlanningSection
enable_task_planning=True
EvolvingSkillsSection
auto_discover_custom_tools=True
AND
enable_task_planning=True
PlanningModeSection
planning_mode_enabled=True
CodeBasedToolsSection
enable_code_based_tools=True
(CodeAct paradigm)
CommandExecutionSection
enable_mcp_command_line=True
DecompositionSection
Decomposition mode active
MultimodalToolsSection
enable_multimodal_tools=True
Model-specific:
SectionGate
GPT5GuidanceSection
GPT-5 models only
GrokGuidanceSection
Grok models only
Adapted for native backends:
SectionAdaptation
FilesystemOperationsSection
has_native_tools=True
→ generic tool language
FilesystemBestPracticesSection
Comparison tool language adjusted
始终包含(所有后端):
章节用途
AgentIdentitySection
Agent身份定义
EvaluationSection
vote/new_answer协调原语
CoreBehaviorsSection
行为倾向、并行执行
OutputFirstVerificationSection
质量迭代循环
SkillsSection
通过命令执行
openskills read
MemorySection
决策记录
WorkspaceStructureSection
工作区路径
ProjectInstructionsSection
CLAUDE.md/AGENTS.md
SubagentSection
MassGen子Agent委托
BroadcastCommunicationSection
Agent间通信
PostEvaluationSection
submit/restart
FileSearchSection
rg/sg通用CLI工具
TaskContextSection
CONTEXT.md创建
NoveltyPressureSection
Agent间的新颖性/多样性要求
ChangedocSection
变更记录
依赖配置标志:
章节开关
TaskPlanningSection
enable_task_planning=True
EvolvingSkillsSection
auto_discover_custom_tools=True
enable_task_planning=True
PlanningModeSection
planning_mode_enabled=True
CodeBasedToolsSection
enable_code_based_tools=True
(CodeAct范式)
CommandExecutionSection
enable_mcp_command_line=True
DecompositionSection
分解模式激活
MultimodalToolsSection
enable_multimodal_tools=True
模型专属:
章节开关
GPT5GuidanceSection
仅GPT-5模型
GrokGuidanceSection
仅Grok模型
针对原生后端适配:
章节适配逻辑
FilesystemOperationsSection
has_native_tools=True
→ 使用通用工具表述
FilesystemBestPracticesSection
调整比较工具的表述
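The gating logic in the tables above can be sketched as a simple flag-driven assembly. This is an illustrative reduction of `system_message_builder.py`, not its actual code, and the section list is truncated:

```python
def build_sections(flags: dict) -> list:
    # Always-on sections first (truncated), then config-gated ones.
    sections = ["AgentIdentitySection", "EvaluationSection", "CoreBehaviorsSection"]
    if flags.get("enable_task_planning"):
        sections.append("TaskPlanningSection")
    # EvolvingSkillsSection requires BOTH flags, per the table.
    if flags.get("enable_task_planning") and flags.get("auto_discover_custom_tools"):
        sections.append("EvolvingSkillsSection")
    if flags.get("planning_mode_enabled"):
        sections.append("PlanningModeSection")
    return sections

planning_only = build_sections({"enable_task_planning": True})
```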

10.2
tool_category_overrides

10.2
tool_category_overrides

File:
massgen/backend/native_tool_mixin.py
The
get_tool_category_overrides()
abstract method on
NativeToolBackendMixin
declares which MassGen tool categories a backend handles natively. Each native backend implements this method:
python
@abstractmethod
def get_tool_category_overrides(self) -> Dict[str, str]:
    # "skip"     = backend has native equivalent
    # "override" = attach MassGen's version, disable native
    # (absent)   = attach MCP normally
Categories:
filesystem
,
command_execution
,
file_search
,
web_search
,
planning
,
subagents
Currently used by
system_message_builder.py
to:
  • Pass
    has_native_tools=True
    to filesystem sections when
    filesystem: "skip"
  • Document which tool categories each backend handles natively
Note: MCP server injection filtering (which MCP servers to attach) is handled individually by each CLI/SDK backend (
codex.py
,
claude_code.py
). The
get_tool_category_overrides()
method serves as documentation and controls system prompt section behavior. Non-mixin backends (API-based) return
{}
by default via
system_message_builder.py
's fallback.
文件:
massgen/backend/native_tool_mixin.py
NativeToolBackendMixin
的抽象方法
get_tool_category_overrides()
声明后端原生支持的MassGen工具类别,每个原生后端都需要实现该方法:
python
@abstractmethod
def get_tool_category_overrides(self) -> Dict[str, str]:
    # "skip"     = 后端有原生等价实现
    # "override" = 挂载MassGen版本,禁用原生实现
    # (未声明)   = 正常挂载MCP
类别:
filesystem
command_execution
file_search
web_search
planning
subagents
当前
system_message_builder.py
使用该配置:
  • filesystem: "skip"
    时,给文件系统章节传递
    has_native_tools=True
    参数
  • 记录每个后端原生处理的工具类别
注意: MCP服务注入过滤(挂载哪些MCP服务)由每个CLI/SDK后端单独处理(
codex.py
claude_code.py
),
get_tool_category_overrides()
方法作为文档说明,同时控制系统提示章节行为。非Mixin后端(基于API的)通过
system_message_builder.py
的回退逻辑默认返回空字典。
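A self-contained sketch of how a Codex-style declaration and the builder's fallback interact (class and helper names here are hypothetical; only the `{"category": "skip"}` contract comes from the text above):

```python
from typing import Dict

class CodexStyleOverrides:
    # Hypothetical backend declaring Codex-like behavior: native tools
    # cover these categories, so MassGen skips attaching MCP equivalents.
    def get_tool_category_overrides(self) -> Dict[str, str]:
        return {"filesystem": "skip", "command_execution": "skip", "web_search": "skip"}

def has_native_filesystem(backend) -> bool:
    # Mirrors the builder's fallback: API-based backends simply lack the
    # method, so they default to {} and get the full MCP treatment.
    getter = getattr(backend, "get_tool_category_overrides", lambda: {})
    return getter().get("filesystem") == "skip"

native = has_native_filesystem(CodexStyleOverrides())
plain = has_native_filesystem(object())
```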

10.3 Path Permission Integration

10.3 路径权限集成

When
filesystem: "skip"
, the backend's native tools handle file ops — but they may not enforce MassGen's granular per-path permissions from
PathPermissionManager
(PPM). Each backend addresses this differently:
Two-layer defense (recommended for SDK backends with permission callbacks):
  1. Layer 1 — Permission callback (coarse gate): Extract paths from the SDK's permission request, validate against PPM. Fail-open if no path extractable (defer to Layer 2).
  2. Layer 2 — PreToolUse hook (fine-grained): The orchestrator auto-registers
    PathPermissionManagerHook
    as
    PRE_TOOL_USE
    for non-Claude-Code native backends. Full
    toolName
    /
    toolArgs
    available for precise validation.
Backend-specific approaches:
BackendPPM Strategy
Claude CodeOwn
add_dirs
sandboxing — PPM hook skipped by orchestrator
CopilotTwo-layer defense (permission callback + PPM PreToolUse hook). In Docker mode, container provides isolation + PPM as defense-in-depth
CodexNo native hook support — relies on Docker/workspace isolation
API backendsPPM enforced via
ToolManager
automatically
Reference:
copilot.py
(
_build_permission_callback
) and
orchestrator.py
(
_setup_native_hooks_for_agent
) for the two-layer pattern. See
massgen/filesystem_manager/_path_permission_manager.py
for PPM internals.
filesystem: "skip"
时,后端的原生工具处理文件操作,但它们可能不会强制执行MassGen的
PathPermissionManager
(PPM)的细粒度路径权限,每个后端的处理方式不同:
双层防御(推荐给支持权限回调的SDK后端):
  1. 第一层 — 权限回调(粗粒度关口): 从SDK的权限请求中提取路径,通过PPM验证。如果无法提取路径则放行(fail-open),交由第二层处理。
  2. 第二层 — PreToolUse Hook(细粒度控制): 编排器会为非Claude Code的原生后端自动注册
    PathPermissionManagerHook
    作为
    PRE_TOOL_USE
    Hook,可获取完整的
    toolName
    /
    toolArgs
    进行精确验证。
后端专属方案:
后端PPM策略
Claude Code自有
add_dirs
沙箱 — 编排器跳过PPM Hook
Copilot双层防御(权限回调 + PPM PreToolUse Hook)。Docker模式下,容器提供隔离 + PPM作为深度防御
Codex无原生Hook支持 — 依赖Docker/工作区隔离
API后端
ToolManager
自动强制执行PPM
参考:
copilot.py
_build_permission_callback
)和
orchestrator.py
_setup_native_hooks_for_agent
)的双层模式实现,PPM内部实现参见
massgen/filesystem_manager/_path_permission_manager.py
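The Layer-1 callback described above reduces to: extract a path if you can, validate it against allowed roots, and fail open when no path is extractable so Layer 2 decides. A minimal sketch under those assumptions (the allowed-roots list stands in for PPM state; argument keys are illustrative):

```python
from pathlib import Path

ALLOWED_WRITE_ROOTS = [Path("/workspace")]  # stand-in for PPM's permitted paths

def permission_callback(tool_name: str, tool_args: dict) -> str:
    # Layer 1: coarse gate. Deny writes clearly outside allowed roots;
    # allow (fail-open) when no path is extractable, deferring the
    # fine-grained decision to the Layer-2 PreToolUse hook.
    raw = tool_args.get("path") or tool_args.get("file_path")
    if raw is None:
        return "allow"
    p = Path(raw)
    if any(root == p or root in p.parents for root in ALLOWED_WRITE_ROOTS):
        return "allow"
    return "deny"
```

Fail-open is safe only because Layer 2 exists; a single-layer design would need to fail closed instead.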

Phase 11: Hooks

阶段11:Hook

MassGen supports hooks for intercepting tool execution and other lifecycle events. Hooks run transparently (not documented in the system prompt).
Note that some backends may not support hooks (e.g., Codex). In this case, hooks will only be applied where possible within MassGen's tool execution framework.
MassGen支持Hook拦截工具执行和其他生命周期事件,Hook透明运行(不在系统提示中说明)。
注意部分后端可能不支持Hook(如Codex),这种情况下Hook仅会在MassGen的工具执行框架中尽可能应用。

11.1 Hook Types and Built-in Hooks

11.1 Hook类型和内置Hook

Hook types (
HookType
enum in
massgen/mcp_tools/hooks.py
):
HookTypeTrigger
PRE_TOOL_USE
Before tool execution
POST_TOOL_USE
After tool execution
Built-in hook classes (
PatternHook
subclasses — registered by the orchestrator):
Hook ClassTypePurpose
PathPermissionManagerHook
PreToolUseEnforce per-path read/write permissions (in
_path_permission_manager.py
)
RoundTimeoutPreHook
PreToolUseEnforce round time limits
HumanInputHook
PreToolUseInject human input into agent flow
MidStreamInjectionHook
PostToolUseInject coordination messages (voting reminders, etc.)
SubagentCompleteHook
PostToolUseNotify when subagent completes
BackgroundToolCompleteHook
PostToolUseNotify when background tool completes
HighPriorityTaskReminderHook
PostToolUseRemind agent of current task
MediaCallLedgerHook
PostToolUseTrack media generation calls
RoundTimeoutPostHook
PostToolUseEnforce round time limits
Hook类型
massgen/mcp_tools/hooks.py
中的
HookType
枚举):
HookType触发时机
PRE_TOOL_USE
工具执行前
POST_TOOL_USE
工具执行后
内置Hook类
PatternHook
子类 — 由编排器注册):
Hook类类型用途
PathPermissionManagerHook
PreToolUse执行路径级读/写权限(定义在
_path_permission_manager.py
RoundTimeoutPreHook
PreToolUse执行轮次时间限制
HumanInputHook
PreToolUse将人类输入注入到Agent流程
MidStreamInjectionHook
PostToolUse注入协调消息(投票提醒等)
SubagentCompleteHook
PostToolUse子Agent完成时通知
BackgroundToolCompleteHook
PostToolUse后台工具完成时通知
HighPriorityTaskReminderHook
PostToolUse提醒Agent当前任务
MediaCallLedgerHook
PostToolUse跟踪媒体生成调用
RoundTimeoutPostHook
PostToolUse执行轮次时间限制
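The hooks above follow a common shape: a tool-name pattern plus an async `execute(event)` returning a decision. A toy PreToolUse-style hook, loosely modeled on a `PatternHook` (the event and result dicts are illustrative, not MassGen's exact schema):

```python
import asyncio
import fnmatch

class BlockDangerousBash:
    # Match a tool-name pattern and veto matching calls.
    matcher = "Bash*"

    async def execute(self, event: dict) -> dict:
        if not fnmatch.fnmatch(event["tool_name"], self.matcher):
            return {"decision": "allow"}
        command = event["tool_input"].get("command", "")
        if command.startswith(("rm ", "sudo ")):
            return {"decision": "deny", "reason": "blocked by policy"}
        return {"decision": "allow"}

result = asyncio.run(BlockDangerousBash().execute(
    {"tool_name": "Bash", "tool_input": {"command": "rm -rf /tmp/x"}}
))
```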

11.2 Hook Architecture

11.2 Hook架构

GeneralHookManager (massgen/mcp_tools/hooks/)
    ├── Registers global and per-agent hooks
    ├── Executes hooks in order for MCP-based backends
    └── For native backends → NativeHookAdapter
            ├── ClaudeCodeNativeHookAdapter
            │   └── Converts to Claude Agent SDK HookMatcher format
            └── (future) CodexNativeHookAdapter
GeneralHookManager (massgen/mcp_tools/hooks/)
    ├── 注册全局和单Agent Hook
    ├── 为基于MCP的后端按顺序执行Hook
    └── 针对原生后端 → NativeHookAdapter
            ├── ClaudeCodeNativeHookAdapter
            │   └── 转换为Claude Agent SDK HookMatcher格式
            └── (未来)CodexNativeHookAdapter
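The registration/dispatch split in the diagram can be modeled in a few lines. This is a toy reduction of `GeneralHookManager` (names and shapes are illustrative): global hooks apply to every agent, per-agent hooks only to their own.

```python
from collections import defaultdict

class MiniHookManager:
    def __init__(self):
        self._global = defaultdict(list)                       # hook_type -> [hook]
        self._per_agent = defaultdict(lambda: defaultdict(list))

    def register(self, hook_type, hook, agent_id=None):
        if agent_id is None:
            self._global[hook_type].append(hook)
        else:
            self._per_agent[agent_id][hook_type].append(hook)

    def get_hooks(self, agent_id, hook_type):
        # Globals first, then agent-specific, preserving registration order.
        return self._global[hook_type] + self._per_agent[agent_id][hook_type]

mgr = MiniHookManager()
mgr.register("PRE_TOOL_USE", "ppm_hook")                       # global
mgr.register("PRE_TOOL_USE", "reminder_hook", agent_id="agent_a")
```

For native backends, the adapter layer consumes this same registry and re-emits it in the SDK's own hook format.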

11.3 Native Hook Adapters — How to Implement

11.3 原生Hook适配器 — 实现方法

For backends with native tool execution (CLI/SDK wrappers), MassGen hooks need to be bridged into the backend's hook system. This is done via
NativeHookAdapter
(base class in
massgen/mcp_tools/native_hook_adapters/base.py
).
Step 1: Create adapter subclass:
对于带有原生工具执行的后端(CLI/SDK封装),需要将MassGen Hook桥接到后端的Hook系统,通过
NativeHookAdapter
实现(基类位于
massgen/mcp_tools/native_hook_adapters/base.py
)。
步骤1: 创建适配器子类:

massgen/mcp_tools/native_hook_adapters/your_backend_adapter.py

massgen/mcp_tools/native_hook_adapters/your_backend_adapter.py

from .base import NativeHookAdapter
from ..hooks import HookType  # assumed import path for the HookType enum

class YourBackendNativeHookAdapter(NativeHookAdapter):
    def supports_hook_type(self, hook_type):
        return hook_type in (HookType.PRE_TOOL_USE, HookType.POST_TOOL_USE)

    def convert_hook_to_native(self, hook, hook_type, context_factory=None):
        """Wrap MassGen hook as your backend's native hook format."""
        async def native_hook(tool_name, tool_args, context):
            event = self.create_hook_event_from_native(
                {"tool_name": tool_name, "tool_input": tool_args},
                hook_type, context_factory() if context_factory else {},
            )
            result = await hook.execute(event)
            return self.convert_hook_result_to_native(result, hook_type)
        return {"pattern": hook.matcher, "handler": native_hook}

    def build_native_hooks_config(self, hook_manager, agent_id=None, context_factory=None):
        """Convert all registered hooks to native config."""
        config = {"PreToolUse": [], "PostToolUse": []}
        for hook_type, hooks in hook_manager.get_hooks(agent_id).items():
            for hook in hooks:
                native = self.convert_hook_to_native(hook, hook_type, context_factory)
                config[hook_type.value].append(native)
        return config

    def merge_native_configs(self, *configs):
        """Merge permission hooks + MassGen hooks + user hooks."""
        merged = {"PreToolUse": [], "PostToolUse": []}
        for config in configs:
            for key in merged:
                merged[key].extend(config.get(key, []))
        return merged

**Step 2**: Initialize in backend `__init__`:
```python
from ..mcp_tools.native_hook_adapters import YourBackendNativeHookAdapter
self._native_hook_adapter = YourBackendNativeHookAdapter()
```
Step 3: Implement the backend interface methods:
python
def supports_native_hooks(self) -> bool:
    return self._native_hook_adapter is not None

def get_native_hook_adapter(self):
    return self._native_hook_adapter

def set_native_hooks_config(self, config):
    self._massgen_hooks_config = config
Step 4: Merge hooks when launching the backend process. The orchestrator calls
set_native_hooks_config()
with converted hooks; your backend merges with its own permission hooks when building options.
from .base import NativeHookAdapter
from ..hooks import HookType  # HookType枚举的导入路径(推测)

class YourBackendNativeHookAdapter(NativeHookAdapter):
    def supports_hook_type(self, hook_type):
        return hook_type in (HookType.PRE_TOOL_USE, HookType.POST_TOOL_USE)

    def convert_hook_to_native(self, hook, hook_type, context_factory=None):
        """将MassGen Hook封装为后端原生Hook格式。"""
        async def native_hook(tool_name, tool_args, context):
            event = self.create_hook_event_from_native(
                {"tool_name": tool_name, "tool_input": tool_args},
                hook_type, context_factory() if context_factory else {},
            )
            result = await hook.execute(event)
            return self.convert_hook_result_to_native(result, hook_type)
        return {"pattern": hook.matcher, "handler": native_hook}

    def build_native_hooks_config(self, hook_manager, agent_id=None, context_factory=None):
        """将所有注册的Hook转换为原生配置。"""
        config = {"PreToolUse": [], "PostToolUse": []}
        for hook_type, hooks in hook_manager.get_hooks(agent_id).items():
            for hook in hooks:
                native = self.convert_hook_to_native(hook, hook_type, context_factory)
                config[hook_type.value].append(native)
        return config

    def merge_native_configs(self, *configs):
        """合并权限Hook + MassGen Hook + 用户Hook。"""
        merged = {"PreToolUse": [], "PostToolUse": []}
        for config in configs:
            for key in merged:
                merged[key].extend(config.get(key, []))
        return merged

**步骤2**: 在后端`__init__`中初始化:
```python
from ..mcp_tools.native_hook_adapters import YourBackendNativeHookAdapter
self._native_hook_adapter = YourBackendNativeHookAdapter()
```
步骤3: 实现后端接口方法:
python
def supports_native_hooks(self) -> bool:
    return self._native_hook_adapter is not None

def get_native_hook_adapter(self):
    return self._native_hook_adapter

def set_native_hooks_config(self, config):
    self._massgen_hooks_config = config
步骤4: 启动后端进程时合并Hook。编排器会调用
set_native_hooks_config()
传递转换后的Hook,后端构建选项时将其与自身的权限Hook合并。

11.4 Reference

11.4 参考

  • Hook framework:
    massgen/mcp_tools/hooks/
  • Adapter base:
    massgen/mcp_tools/native_hook_adapters/base.py
  • Claude Code adapter:
    massgen/mcp_tools/native_hook_adapters/claude_code_adapter.py
  • Copilot adapter:
    massgen/mcp_tools/native_hook_adapters/copilot_adapter.py
  • Path permissions:
    massgen/filesystem_manager/_path_permission_manager.py
  • Orchestrator hook setup:
    massgen/orchestrator.py
    (search for
    native_hook
    )
  • Hook框架:
    massgen/mcp_tools/hooks/
  • 适配器基类:
    massgen/mcp_tools/native_hook_adapters/base.py
  • Claude Code适配器:
    massgen/mcp_tools/native_hook_adapters/claude_code_adapter.py
  • Copilot适配器:
    massgen/mcp_tools/native_hook_adapters/copilot_adapter.py
  • 路径权限:
    massgen/filesystem_manager/_path_permission_manager.py
  • 编排器Hook设置:
    massgen/orchestrator.py
    (搜索
    native_hook)

Common Patterns

通用模式

CLI Wrapper Backend (like Codex)

CLI封装后端(如Codex)

__init__: find CLI binary, set config defaults
stream_with_tools: build command -> spawn subprocess -> parse JSONL stdout -> yield StreamChunks
MCP: write project-scoped config file in workspace before launch
Custom tools: wrap as MCP server or inject via system prompt
Cleanup: remove workspace config on reset_state/clear_history
__init__: 查找CLI二进制文件,设置配置默认值
stream_with_tools: 构建命令 → 启动子进程 → 解析JSONL标准输出 → 返回StreamChunks
MCP: 启动前在工作区写入项目级配置文件
自定义工具: 封装为MCP服务或通过系统提示注入
清理: reset_state/clear_history时删除工作区配置
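The "parse JSONL stdout → yield StreamChunks" step above can be sketched independently of any CLI. This is a minimal reduction, assuming hypothetical `text` / `tool_call` event types rather than any real CLI's schema:

```python
import json

def parse_jsonl_events(lines):
    # Turn a CLI's JSONL stdout into (kind, payload) tuples, the shape a
    # stream_with_tools loop would wrap into StreamChunks.
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") == "text":
            yield ("content", event["text"])
        elif event.get("type") == "tool_call":
            yield ("tool_call", event["name"])
        # Unknown event types are skipped rather than raising.

chunks = list(parse_jsonl_events([
    '{"type": "text", "text": "hello"}',
    '{"type": "tool_call", "name": "shell"}',
]))
```

In the real backend this generator would consume the subprocess's stdout incrementally instead of a list.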

Docker Execution for CLI Backends

CLI后端的Docker执行

If a CLI-based backend doesn't support Docker natively, run the CLI inside the MassGen container:
  1. Install the CLI in the Docker image (add to
    npm install -g
    in Dockerfile) or via
    command_line_docker_packages.preinstall.npm
  2. In
    __init__
    , detect
    command_line_execution_mode: "docker"
    and skip host CLI lookup
  3. In
    stream_with_tools
    , use
    container.client.api.exec_create()
    +
    exec_start(stream=True)
    to run the CLI inside the container
  4. If the CLI has its own sandbox, disable it — the container provides isolation
  5. If the CLI needs host credentials, mount them read-only via
    _build_credential_mounts()
IMPORTANT: Some CLI backends require
command_line_docker_network_mode: bridge
when running in Docker mode. This allows the container to make outbound network requests to APIs. Add config validation to enforce this:
如果基于CLI的后端原生不支持Docker,将CLI运行在MassGen容器内:
  1. 在Docker镜像中安装CLI(在Dockerfile中添加
    npm install -g
    )或通过
    command_line_docker_packages.preinstall.npm
    安装
  2. __init__
    中检测
    command_line_execution_mode: "docker"
    ,跳过宿主机CLI查找
  3. stream_with_tools
    中使用
    container.client.api.exec_create()
    +
    exec_start(stream=True)
    在容器内运行CLI
  4. 如果CLI有自有沙箱,禁用它 — 容器提供隔离能力
  5. 如果CLI需要宿主机凭证,通过
    _build_credential_mounts()
    以只读方式挂载
重要: 部分CLI后端在Docker模式下需要
command_line_docker_network_mode: bridge
配置,允许容器对外发起API网络请求。添加配置校验强制该要求:

In config_validator.py

在config_validator.py中

if backend_type == "your_backend":
    execution_mode = backend_config.get("command_line_execution_mode")
    if execution_mode == "docker":
        if "command_line_docker_network_mode" not in backend_config:
            result.add_error(
                "YourBackend in Docker mode requires 'command_line_docker_network_mode'",
                f"{location}.command_line_docker_network_mode",
                "Add 'command_line_docker_network_mode: bridge' (required for network access)",
            )

**Reference**: `codex.py` implements this pattern with `_stream_docker()` / `_stream_local()` branching.
if backend_type == "your_backend":
    execution_mode = backend_config.get("command_line_execution_mode")
    if execution_mode == "docker":
        if "command_line_docker_network_mode" not in backend_config:
            result.add_error(
                "Docker模式下的YourBackend需要配置'command_line_docker_network_mode'",
                f"{location}.command_line_docker_network_mode",
                "添加'command_line_docker_network_mode: bridge'(网络访问必需)",
            )

**参考**: `codex.py`实现了该模式,通过`_stream_docker()` / `_stream_local()`分支处理。

Docker Execution for SDK Backends (like Copilot)

SDK后端的Docker执行(如Copilot)

SDK-based backends (in-process Python clients) cannot run inside Docker — the SDK stays on host. Instead, Docker isolates file/shell operations by:
  1. Disable built-in tools in Docker mode via
    excluded_tools
    so the SDK doesn't use its native file/shell tools
  2. Route all file/shell ops through MCP servers that execute inside the Docker container
  3. Point
    working_directory
    to the Docker-mounted workspace path
基于SDK的后端(进程内Python客户端)无法运行在Docker内部,SDK保留在宿主机,Docker通过以下方式隔离文件/Shell操作:
  1. Docker模式下通过
    excluded_tools
    禁用内置工具,避免SDK使用原生文件/Shell工具
  2. 所有文件/Shell操作路由到Docker容器内执行的MCP服务
  3. working_directory
    指向Docker挂载的工作区路径

In init:

在__init__中:

self._docker_execution = (
    kwargs.get("command_line_execution_mode") == "docker"
    or self.config.get("command_line_execution_mode") == "docker"
)

Property to check if Docker is actually active:

检查Docker是否实际激活的属性:

@property
def _is_docker_mode(self) -> bool:
    if not self._docker_execution:
        return False
    if not self.filesystem_manager:
        return False
    dm = getattr(self.filesystem_manager, "docker_manager", None)
    if dm is None:
        return False
    agent_id = self.config.get("agent_id")
    if agent_id and dm.get_container(agent_id):
        return True
    return False

In get_disallowed_tools():

在get_disallowed_tools()中:

def get_disallowed_tools(self, config):
    if self._docker_execution:
        return [
            "editFile", "createFile", "deleteFile", "readFile",
            "listDirectory", "runShellCommand", "shellCommand",
        ]
    return []

In stream_with_tools(), merge Docker-excluded tools into session config:

在stream_with_tools()中,合并Docker排除工具到会话配置:

```python
if self._docker_execution:
    docker_excluded = self.get_disallowed_tools(self.config)
    if docker_excluded:
        excluded_tools = list(set((excluded_tools or []) + docker_excluded))
```

**Key differences from CLI Docker pattern**:
- SDK stays on host (no `exec_create` / `exec_start`)
- Built-in tools disabled via SDK session config (`excluded_tools`), not by running inside container
- MCP servers still execute inside container (same as CLI pattern)
- PPM permission callback still active as defense-in-depth

**Config validation**: same pattern; require `command_line_docker_network_mode`:
```python
if backend_type == "copilot":
    execution_mode = backend_config.get("command_line_execution_mode")
    if execution_mode == "docker":
        if "command_line_docker_network_mode" not in backend_config:
            result.add_error(...)
```
Reference: `copilot.py` implements this pattern.

**与CLI Docker模式的关键差异**:
- SDK保留在宿主机(无`exec_create` / `exec_start`调用)
- 内置工具通过SDK会话配置(`excluded_tools`)禁用,而非运行在容器内
- MCP服务仍在容器内执行(与CLI模式相同)
- PPM权限回调仍作为深度防御生效

**配置校验**: 相同模式,要求`command_line_docker_network_mode`:
```python
if backend_type == "copilot":
    execution_mode = backend_config.get("command_line_execution_mode")
    if execution_mode == "docker":
        if "command_line_docker_network_mode" not in backend_config:
            result.add_error(...)
```
参考: `copilot.py`实现了该模式。

Docker MCP Path Resolution

Docker MCP路径解析

MCP server configs are built by the orchestrator on the host with absolute host file paths (e.g., `fastmcp run /host/path/massgen/mcp_tools/planning/_server.py:create_server`). Inside Docker, these paths don't exist unless explicitly mounted.

Solution: `_docker_manager.py` bind-mounts the `massgen/` package directory into the container at the same host path (read-only). This makes host-path-based MCP configs work as-is inside the container, using the latest source from the host rather than stale pip-installed modules.
MCP服务配置由宿主机上的编排器使用宿主机绝对路径构建(如`fastmcp run /host/path/massgen/mcp_tools/planning/_server.py:create_server`)。在Docker内部,这些路径不存在,除非显式挂载。

解决方案: `_docker_manager.py`将`massgen/`包目录以相同的宿主机路径绑定挂载到容器内(只读),使得基于宿主机路径的MCP配置在容器内无需修改即可生效,使用宿主机上的最新源码而非过时的pip安装模块。
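The same-path bind mount can be sketched with a docker-py style `volumes` mapping (a minimal illustration under assumptions: the `massgen_mount` helper and the use of docker-py's volume format are hypothetical, not `_docker_manager.py`'s actual API):

```python
from pathlib import Path

def massgen_mount(host_pkg_dir: str) -> dict:
    """Bind-mount the host massgen/ package into the container at the SAME
    absolute path, read-only (docker-py ``volumes`` format). Hypothetical
    helper; the real _docker_manager.py API may differ."""
    host_pkg = str(Path(host_pkg_dir).resolve())
    return {host_pkg: {"bind": host_pkg, "mode": "ro"}}

# Because host path == container path, a host-built MCP command such as
#   fastmcp run /host/path/massgen/mcp_tools/planning/_server.py:create_server
# resolves inside the container unchanged.
```

Mounting at the identical path is the design choice that lets host-generated configs pass through without any path rewriting.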

Shell Sandboxing for Backends with Native Tools

带有原生工具的后端的Shell沙箱

This section applies only to backends that bring their own shell/file tools (e.g., CLI wrappers with built-in command execution). If a backend delegates all tool execution to MassGen's MCP servers, MassGen's `PathPermissionManager` handles sandboxing automatically.

When the backend has its own tools that can access the filesystem or run commands, you must map MassGen's permission model to the backend's sandbox mechanism:

| MassGen concept | Expected access |
|---|---|
| Workspace | Write |
| Temp workspaces | Read-only |
| Read context paths | Read-only |
| Write context paths | Write (must be explicitly granted in backend sandbox config) |

The workspace is typically writable by default, and read access to the full filesystem is usually allowed. The key task is ensuring write context paths are added to the backend's writable allowlist (e.g., `sandbox_workspace_write.writable_roots` for Codex, `allowed_directories` for Claude Code).

Docker mode: skip backend sandbox config; the container IS the sandbox, so grant full access inside the container.
本节仅适用于自带Shell/文件工具的后端(如内置命令执行能力的CLI封装)。如果后端将所有工具执行委托给MassGen的MCP服务,MassGen的`PathPermissionManager`会自动处理沙箱。

当后端自有工具可以访问文件系统或运行命令时,必须将MassGen的权限模型映射到后端的沙箱机制:

| MassGen概念 | 预期访问权限 |
|---|---|
| 工作区 | 可写 |
| 临时工作区 | 只读 |
| 读上下文路径 | 只读 |
| 写上下文路径 | 可写(必须显式添加到后端沙箱的可写允许列表) |

工作区通常默认可写,全文件系统读权限通常允许。核心任务是确保写上下文路径被添加到后端的可写允许列表(如Codex的`sandbox_workspace_write.writable_roots`、Claude Code的`allowed_directories`)。

Docker模式: 跳过后端沙箱配置,容器本身就是沙箱,在容器内部授予完全访问权限。
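The permission mapping above can be sketched as a small config builder (a hedged illustration: the `context_paths` shape and the Codex-style `sandbox_workspace_write.writable_roots` output key are assumptions about structure, not MassGen's actual API):

```python
def build_sandbox_config(workspace: str, context_paths: list) -> dict:
    """Sketch: translate MassGen's permission model into a Codex-style
    sandbox config. Only writes need explicit allowlisting; broad read
    access is assumed to be granted by the backend sandbox itself."""
    writable = [workspace]  # workspace is writable by default
    for cp in context_paths:
        if cp.get("permission") == "write":
            writable.append(cp["path"])  # write context paths must be granted
    # read context paths and temp workspaces need no entry: read-only access
    # falls out of the sandbox's default read policy
    return {"sandbox_workspace_write": {"writable_roots": writable}}
```

For example, a workspace of `/ws` plus one write context path `/data/out` yields `writable_roots == ["/ws", "/data/out"]`, while read-only context paths add nothing.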

SDK Wrapper Backend (like Claude Code, Copilot)

SDK封装后端(如Claude Code、Copilot)

__init__: import SDK, configure options, detect Docker mode
stream_with_tools: call SDK with messages -> iterate events -> yield StreamChunks
MCP: pass mcp_servers dict to SDK options
Custom tools: create SDK MCP server from tool definitions
State: SDK manages conversation state internally
Docker: SDK stays on host; disable built-in file/shell tools via excluded_tools;
        route file/shell ops through MCP servers in container (see §Docker Execution for SDK Backends)
__init__: 导入SDK,配置选项,检测Docker模式
stream_with_tools: 调用SDK传递消息 → 遍历事件 → 返回StreamChunks
MCP: 将mcp_servers字典传递给SDK选项
自定义工具: 从工具定义创建SDK MCP服务
状态: SDK内部管理会话状态
Docker: SDK保留在宿主机;通过excluded_tools禁用内置文件/Shell工具;
        文件/Shell操作路由到容器内的MCP服务(参见§SDK后端的Docker执行)
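The "call SDK → iterate events → yield StreamChunks" loop above can be sketched as follows (the event dict shape, the simplified `StreamChunk`, and the fake SDK stream are stand-ins, not the real MassGen or SDK types):

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, AsyncIterable

@dataclass
class StreamChunk:  # simplified stand-in for MassGen's StreamChunk
    type: str
    content: str = ""

async def stream_sdk_events(
    events: AsyncIterable,
) -> AsyncGenerator[StreamChunk, None]:
    """Translate SDK streaming events into StreamChunks, one by one."""
    async for event in events:
        if event["type"] == "text":
            yield StreamChunk(type="content", content=event["text"])
        elif event["type"] == "done":
            yield StreamChunk(type="done")

async def _demo() -> list:
    async def fake_sdk():  # stands in for the SDK's event iterator
        for e in ({"type": "text", "text": "hi"}, {"type": "done"}):
            yield e
    return [chunk async for chunk in stream_sdk_events(fake_sdk())]

chunks = asyncio.run(_demo())
```

A real `stream_with_tools` wraps this loop, passing `mcp_servers` and session options to the SDK client before iterating.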

API Backend (like Claude, Gemini)

API后端(如Claude、Gemini)

Subclass CustomToolAndMCPBackend or ChatCompletionsBackend
Implement formatter if non-standard API format
Implement API params handler to filter/transform config -> API params
MCP + custom tools handled automatically by base class
继承CustomToolAndMCPBackend或ChatCompletionsBackend
如果API格式非标准则实现格式化器
实现API参数处理器,过滤/转换配置到API参数
基类自动处理MCP + 自定义工具
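The API params handler's filter/transform role can be sketched as an allowlist pass (a hedged sketch: the key set is illustrative, since each real handler defines its own set for its provider's API):

```python
# Hypothetical allowlist; a real handler's set depends on the provider's API.
ALLOWED_API_PARAMS = {"model", "temperature", "max_tokens", "top_p", "stream"}

def build_api_params(config: dict) -> dict:
    """Drop MassGen-only YAML keys (agent ids, workspace paths, ...) and
    keep only parameters the provider's HTTP API accepts."""
    return {k: v for k, v in config.items() if k in ALLOWED_API_PARAMS}
```

This is why the handler is only required for HTTP API backends: SDK and CLI wrappers consume config through their own option objects instead.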

Reference Backends by Complexity

按复杂度排序的参考后端

| Backend | Lines | Type | Good reference for |
|---|---|---|---|
| grok.py | ~81 | Subclass of ChatCompletions | Minimal API backend |
| chat_completions.py | ~1150 | OpenAI-compatible | Standard API backend |
| response.py | ~1600 | Response API | Reasoning models |
| claude.py | ~1830 | Custom API | Full-featured API |
| copilot.py | ~1250 | SDK wrapper | SDK-based + PPM permission callback + Docker mode |
| codex.py | ~2090 | CLI wrapper | CLI-based + Docker |
| gemini.py | ~2460 | Custom API | Non-OpenAI API |
| claude_code.py | ~3530 | SDK wrapper | Full-featured stateful |
| 后端 | 代码行数 | 类型 | 适合参考的场景 |
|---|---|---|---|
| grok.py | ~81 | ChatCompletions子类 | 最简API后端 |
| chat_completions.py | ~1150 | 兼容OpenAI | 标准API后端 |
| response.py | ~1600 | Response API | 推理模型 |
| claude.py | ~1830 | 自定义API | 全功能API |
| copilot.py | ~1250 | SDK封装 | 基于SDK + PPM权限回调 + Docker模式 |
| codex.py | ~2090 | CLI封装 | 基于CLI + Docker |
| gemini.py | ~2460 | 自定义API | 非OpenAI API |
| claude_code.py | ~3530 | SDK封装 | 全功能有状态后端 |

File Summary

文件汇总

| # | File | Required? | Purpose |
|---|---|---|---|
| 1 | massgen/backend/<name>.py | Yes | Core backend |
| 2 | massgen/formatter/<name>_formatter.py | If non-standard API | Message/tool formatting |
| 3 | massgen/api_params_handler/<name>_handler.py | If HTTP API | Param filtering |
| 4 | massgen/backend/__init__.py | Yes | Export |
| 5 | massgen/cli.py | Yes | Backend creation |
| 6 | massgen/backend/capabilities.py | Yes | Model registry |
| 7 | massgen/config_validator.py | If special rules | Validation |
| 8 | massgen/token_manager/token_manager.py | If not in LiteLLM | Pricing |
| 9 | massgen/backend/base.py | If new YAML params | Excluded params |
| 10 | massgen/api_params_handler/_base.py | If new YAML params | Excluded params |
| 11 | massgen/tests/test_* | Yes | Tests |
| 12 | massgen/configs/providers/<name>/ | Yes | Example configs |
| 13 | docs/source/reference/yaml_schema.rst | Yes | Schema docs |
| 14 | docs/source/user_guide/backends.rst | Yes | User guide |
| 15 | docs/scripts/generate_backend_tables.py | Run it | Regenerate tables |
| # | 文件 | 是否必需 | 用途 |
|---|---|---|---|
| 1 | massgen/backend/<name>.py | 是 | 核心后端实现 |
| 2 | massgen/formatter/<name>_formatter.py | API非标准时必需 | 消息/工具格式化 |
| 3 | massgen/api_params_handler/<name>_handler.py | HTTP API时必需 | 参数过滤 |
| 4 | massgen/backend/__init__.py | 是 | 导出 |
| 5 | massgen/cli.py | 是 | 后端创建 |
| 6 | massgen/backend/capabilities.py | 是 | 模型注册表 |
| 7 | massgen/config_validator.py | 有特殊规则时必需 | 校验 |
| 8 | massgen/token_manager/token_manager.py | 不在LiteLLM中时必需 | 定价 |
| 9 | massgen/backend/base.py | 新增YAML参数时必需 | 排除参数 |
| 10 | massgen/api_params_handler/_base.py | 新增YAML参数时必需 | 排除参数 |
| 11 | massgen/tests/test_* | 是 | 测试 |
| 12 | massgen/configs/providers/<name>/ | 是 | 示例配置 |
| 13 | docs/source/reference/yaml_schema.rst | 是 | Schema文档 |
| 14 | docs/source/user_guide/backends.rst | 是 | 用户指南 |
| 15 | docs/scripts/generate_backend_tables.py | 运行即可 | 重新生成表格 |