backend-integrator


Backend Integrator


This skill provides the complete checklist and patterns for integrating a new LLM backend into MassGen. A full integration touches ~15 files across the codebase.

When to Use This Skill


  • Adding a new LLM provider/backend
  • Auditing an existing backend for missing integration points
  • Understanding what files to modify when extending backend capabilities

Integration Architecture


Backend Type Decision:
  Stateless + OpenAI-compatible API   → subclass ChatCompletionsBackend
  Stateless + custom API              → subclass CustomToolAndMCPBackend
  Stateless + Response API format     → subclass ResponseBackend
  Stateful CLI wrapper (like Codex, Gemini CLI) → subclass LLMBackend directly
  Stateful SDK wrapper (like Claude Code, Copilot) → subclass LLMBackend directly

Complete Checklist


Phase 1: Core Implementation (3 files)


1.1 Backend Class


File: `massgen/backend/<name>.py`
Choose a base class:
  • `LLMBackend` — bare minimum, you handle everything
  • `CustomToolAndMCPBackend` — adds MCP + custom tool support (most common)
  • `ChatCompletionsBackend` — for OpenAI-compatible APIs (inherits from above)
  • `ResponseBackend` — for OpenAI Response API format
Required methods:

```python
async def stream_with_tools(self, messages, tools, **kwargs) -> AsyncGenerator[StreamChunk, None]:
    """Main streaming method. Yield StreamChunks."""

def get_provider_name(self) -> str:
    """Return provider name string (e.g., 'OpenAI', 'Codex')."""

def get_filesystem_support(self) -> FilesystemSupport:
    """Return NONE, NATIVE, or MCP."""
```
StreamChunk types to yield:

| Type | When | Key fields |
|---|---|---|
| `"content"` | Text output | `content="..."` |
| `"tool_calls"` | Tool invocation | `tool_calls=[{id, name, arguments}]` |
| `"reasoning"` | Thinking/reasoning delta | `reasoning_delta="..."` |
| `"reasoning_done"` | Reasoning complete | `reasoning_text="..."` |
| `"reasoning_summary"` | Reasoning summary delta | `reasoning_summary_delta="..."` |
| `"reasoning_summary_done"` | Reasoning summary complete | `reasoning_summary_text="..."` |
| `"complete_message"` | Full assistant message | `complete_message={...}` |
| `"complete_response"` | Raw API response | `response={...}` |
| `"done"` | Stream complete | `usage={prompt_tokens, completion_tokens, total_tokens}` |
| `"error"` | Error occurred | `error="..."` |
| `"agent_status"` | Status update | `status="...", detail="..."` |
| `"backend_status"` | Backend-level status | `status="...", detail="..."` |
| `"compression_status"` | Compression event | `status="...", detail="..."` |
| `"hook_execution"` | Hook ran | `hook_info={...}, tool_call_id="..."` |

Common fields on all chunks: `source` (agent/orchestrator ID), `display` (bool, default True).
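The chunk flow above can be sketched as follows. This is an illustrative skeleton only — it uses a stand-in `StreamChunk` dataclass and canned deltas in place of a real provider API call; a real backend would import MassGen's `StreamChunk` and stream from its client.

```python
# Sketch of the stream_with_tools chunk flow (stand-in StreamChunk; a real
# backend would call the provider API and yield massgen's StreamChunk).
import asyncio
from dataclasses import dataclass, field


@dataclass
class StreamChunk:
    type: str
    content: str = ""
    usage: dict = field(default_factory=dict)


async def stream_with_tools(messages, tools, **kwargs):
    # Yield "content" chunks as text deltas arrive...
    for delta in ["Hello", ", world"]:
        yield StreamChunk(type="content", content=delta)
    # ...then a final "done" chunk carrying usage totals.
    yield StreamChunk(
        type="done",
        usage={"prompt_tokens": 10, "completion_tokens": 2, "total_tokens": 12},
    )


async def main():
    return [chunk async for chunk in stream_with_tools([], [])]


chunks = asyncio.run(main())
print([c.type for c in chunks])
```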
Token tracking — call one of:

```python
self._update_token_usage_from_api_response(usage_dict, model)  # If API returns usage
self._estimate_token_usage(messages, response_text, model)     # Fallback
```
Timing — call in `stream_with_tools`:

```python
self.start_api_call_timing(self.model)        # Before API call
self.record_first_token()                     # On first content chunk
self.end_api_call_timing(success=True/False)  # After completion
```
For stateful backends (CLI/SDK wrappers), also implement:

```python
def is_stateful(self) -> bool: return True
async def clear_history(self) -> None: ...
async def reset_state(self) -> None: ...
```
Compression support — inherit `StreamingBufferMixin` and call:

```python
self._clear_streaming_buffer(**kwargs)        # Start of stream
self._finalize_streaming_buffer(agent_id=id)  # End of stream
```

1.2 Formatter (if needed)


File: `massgen/formatter/<name>_formatter.py`
Only needed if the API uses a non-standard message/tool format (not OpenAI chat completions format). Subclass `FormatterBase` and implement `format_messages()`, `format_tools()`, and `format_mcp_tools()`.
Existing formatters:
  • `_claude_formatter.py` — Anthropic Messages API
  • `_gemini_formatter.py` — Gemini API
  • `_chat_completions_formatter.py` — OpenAI/generic (reuse for compatible APIs)
  • `_response_formatter.py` — OpenAI Response API format
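To illustrate the formatter's job, here is a minimal sketch of a message conversion. `HypotheticalFormatter` and its target schema are invented for illustration — they are not MassGen code and not any real provider format.

```python
# Sketch: convert OpenAI-style chat messages into a hypothetical provider
# format that wants the system prompt pulled out of the turn list.
class HypotheticalFormatter:
    def format_messages(self, messages):
        # Input: [{"role": ..., "content": ...}, ...] (OpenAI chat format)
        system = [m["content"] for m in messages if m["role"] == "system"]
        turns = [
            {"speaker": m["role"], "text": m["content"]}
            for m in messages
            if m["role"] != "system"
        ]
        return {"system": "\n".join(system), "turns": turns}


msgs = [
    {"role": "system", "content": "Be terse."},
    {"role": "user", "content": "Hi"},
]
converted = HypotheticalFormatter().format_messages(msgs)
print(converted)
```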

1.3 API Params Handler (if needed)


File: `massgen/api_params_handler/<name>_api_params_handler.py`
Only needed if the backend calls an HTTP API and needs to filter/transform YAML config params before passing them to the API. Subclass `APIParamsHandlerBase`.
CLI/SDK wrappers (Codex, Claude Code) typically don't need this — they build commands directly.

Phase 2: Registration (4 files)


2.1 Backend `__init__.py`


File: `massgen/backend/__init__.py`

```python
from .your_backend import YourBackend
```

Also add the class name to `__all__` so it is exported.

2.2 CLI Backend Mapping


File: `massgen/cli.py`
Add to the `create_backend()` function:

```python
elif backend_type == "your_backend":
    api_key = kwargs.get("api_key") or os.getenv("YOUR_API_KEY")
    if not api_key:
        raise ConfigurationError(
            _api_key_error_message("YourBackend", "YOUR_API_KEY", config_path)
        )
    return YourBackend(api_key=api_key, **kwargs)
```
For CLI-based backends that don't need API keys, skip the key check.

2.3 Capabilities Registry


File: `massgen/backend/capabilities.py`
Add an entry to `BACKEND_CAPABILITIES`:

```python
"your_backend": BackendCapabilities(
    backend_type="your_backend",
    provider_name="YourProvider",
    supported_capabilities={"mcp", "web_search", ...},
    builtin_tools=["web_search"],  # Provider-native tools
    filesystem_support="mcp",      # "none", "mcp", or "native"
    models=["model-a", "model-b"], # Newest first
    default_model="model-a",
    env_var="YOUR_API_KEY",        # Or None
    notes="...",
    model_release_dates={"model-a": "2025-06"},
    base_url="https://api.example.com/v1",  # If applicable
)
```

2.4 Config Validator (if needed)


File: `massgen/config_validator.py`
Add backend-specific validation to `_validate_backend()` if there are special rules (e.g., required params, incompatible combinations).
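A backend-specific rule might look like the following sketch. The function name, config keys, and rules are hypothetical — they only illustrate the two kinds of checks mentioned above (required params and incompatible combinations).

```python
# Sketch of backend-specific validation (illustrative names and rules,
# not MassGen's actual _validate_backend code).
def validate_your_backend(config: dict) -> list[str]:
    errors = []
    # Required-param rule
    if not config.get("model"):
        errors.append("your_backend requires 'model'")
    # Incompatible-combination rule (hypothetical params)
    if config.get("enable_web_search") and config.get("offline_mode"):
        errors.append("enable_web_search cannot be combined with offline_mode")
    return errors


errs = validate_your_backend({"offline_mode": True, "enable_web_search": True})
print(errs)
```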

Phase 3: Token Management (1 file)


3.1 Pricing


File: `massgen/token_manager/token_manager.py`
Check LiteLLM first — only add to `PROVIDER_PRICING` if the model is NOT in the LiteLLM database. The provider name must match `get_provider_name()` exactly (case-sensitive).
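As a sketch of how such a fallback entry is consumed, the structure below assumes per-million-token pricing; the prices and the `estimate_cost` helper are made up for illustration, not MassGen's actual `PROVIDER_PRICING` schema.

```python
# Illustrative fallback pricing table (USD per 1M tokens; values invented).
# The top-level key must match get_provider_name() exactly (case-sensitive).
PROVIDER_PRICING = {
    "YourProvider": {
        "model-a": {"input": 1.00, "output": 4.00},
    }
}


def estimate_cost(provider, model, prompt_tokens, completion_tokens):
    p = PROVIDER_PRICING[provider][model]
    return (prompt_tokens * p["input"] + completion_tokens * p["output"]) / 1_000_000


cost = estimate_cost("YourProvider", "model-a", 1000, 500)
print(round(cost, 6))  # → 0.003
```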

Phase 4: Excluded Params (2 files, if adding new YAML params)


4.1 Base Class Exclusions


File: `massgen/backend/base.py` -> `get_base_excluded_config_params()`

4.2 API Params Handler Exclusions


File: `massgen/api_params_handler/_api_params_handler_base.py` -> `get_base_excluded_params()`
Both must stay in sync. Add any new framework-level YAML params that should NOT be passed to the provider API.
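The exclusion amounts to a set-difference filter applied to the YAML config before the API call. The set contents and function below are illustrative, not the actual excluded-param lists.

```python
# Sketch: framework-level YAML params are stripped before hitting the provider API.
# The excluded names here are examples, not the real MassGen exclusion lists.
BASE_EXCLUDED = {"cwd", "custom_tools", "mcp_servers", "enable_multimodal_tools"}


def filter_api_params(config: dict, extra_excluded=()) -> dict:
    excluded = BASE_EXCLUDED | set(extra_excluded)
    return {k: v for k, v in config.items() if k not in excluded}


cfg = {"model": "model-a", "temperature": 0.2, "cwd": "/workspace", "mcp_servers": []}
filtered = filter_api_params(cfg)
print(filtered)  # → {'model': 'model-a', 'temperature': 0.2}
```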

Phase 5: Authentication


5.1 API Key Backends (standard)


Most backends use API keys. Set `env_var` in `capabilities.py` and add the key check in `cli.py`:

```python
api_key = kwargs.get("api_key") or os.getenv("YOUR_API_KEY")
if not api_key:
    raise ConfigurationError(...)
return YourBackend(api_key=api_key, **kwargs)
```

5.2 OAuth / Subscription Auth (CLI/SDK wrappers)


For backends that support OAuth (like Codex, Claude Code), implement a multi-tier auth cascade:
```python
def __init__(self, api_key=None, **kwargs):
    # Tier 1: Explicit API key
    self.api_key = api_key or os.getenv("YOUR_API_KEY")
    # Tier 2: Check cached OAuth tokens
    self.use_oauth = not bool(self.api_key)
    self.auth_file = Path.home() / ".your_cli" / "auth.json"

async def _ensure_authenticated(self):
    if self.api_key:
        os.environ["YOUR_API_KEY"] = self.api_key
        return
    if self._has_cached_credentials():
        return
    # Tier 3: Initiate OAuth flow
    await self._initiate_oauth_flow()

async def _initiate_oauth_flow(self, use_device_flow=False):
    """Wrap the CLI's login command."""
    cmd = [self._cli_path, "login"]
    if use_device_flow:
        cmd.append("--device-auth")  # For headless/SSH
    proc = await asyncio.create_subprocess_exec(*cmd, ...)
```

In `cli.py`, skip the API key check — delegate auth to the backend:

```python
elif backend_type == "your_backend":
    # Auth handled by backend (API key or OAuth)
    return YourBackend(**kwargs)
```
In `capabilities.py`, set `env_var` to the API key name (optional, not required for OAuth):

```python
env_var="YOUR_API_KEY",  # Optional - can use OAuth instead
notes="Works with subscription auth (via `your-cli login`) or YOUR_API_KEY."
```
NOTE: We only offer OAuth through SDKs or programmatic command-line usage that the provider officially supports. Find the provider-sanctioned way of using OAuth for each backend to ensure we support it correctly (e.g., use the Claude Agent SDK, NOT Claude Code with API spoofing).

5.3 No Auth (local inference)


For local servers (LM Studio, vLLM, SGLang), set `env_var=None` in capabilities. No key check in `cli.py`.
Reference implementations:
  • `codex.py` — Full OAuth with browser + device code flows
  • `claude_code.py` — 3-tier: `CLAUDE_CODE_API_KEY` -> `ANTHROPIC_API_KEY` -> subscription login
  • `lmstudio` / `vllm` / `sglang` — No auth (`env_var=None`)

Phase 6: Custom Tools & MCP


6.1 Standard Path (API backends)


If inheriting from `CustomToolAndMCPBackend`, MCP and custom tools work automatically:
  • MCP servers from the YAML `mcp_servers` config
  • Custom tools from the `custom_tools` / `custom_tools_path` config
  • Filesystem MCP injected by `FilesystemManager` when `cwd` is set
  • Tool execution handled by `ToolManager`

6.2 Multimodal Tools (all backend types)


When `enable_multimodal_tools: true` is set, the backend must register the `read_media` and `generate_media` custom tools. Use the shared helper in `massgen/backend/base.py`:

```python
from .base import get_multimodal_tool_definitions
```

In `__init__`:

```python
enable_multimodal = self.config.get("enable_multimodal_tools", False) or kwargs.get("enable_multimodal_tools", False)
if enable_multimodal:
    custom_tools.extend(get_multimodal_tool_definitions())
```

- **API backends** (`CustomToolAndMCPBackend` subclasses): handled automatically — base class calls `get_multimodal_tool_definitions()` and registers via `_register_custom_tools()`
- **CLI/SDK backends** (`LLMBackend` subclasses like Codex, Claude Code): must do this explicitly in `__init__`, then wrap as MCP server (see §6.4)

**Important**: Always use `get_multimodal_tool_definitions()` — never inline the tool dicts. This keeps the definitions in one place.

6.3 CLI/SDK Wrappers — MCP Servers


These backends must configure the CLI/SDK's own MCP system:
Codex: Write a project-scoped `.codex/config.toml` in the workspace (the `-C` dir). Codex reads this automatically. Convert MassGen's `mcp_servers` list to TOML format:

```toml
[mcp_servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"]
```

Claude Code: Pass the `mcp_servers` dict directly to SDK options. The SDK handles server lifecycle:

```python
options = {"mcp_servers": {"filesystem": {"command": "npx", "args": [...]}}}
```
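The list-to-TOML conversion for Codex can be sketched as below. This is a hand-rolled emitter for illustration — the entry shape (`name`/`command`/`args` keys) is an assumption, and a real implementation might prefer a TOML library.

```python
# Sketch: convert MassGen-style mcp_servers entries into Codex config.toml sections.
# The input dict shape is assumed for illustration.
def mcp_servers_to_toml(servers: list[dict]) -> str:
    lines = []
    for s in servers:
        lines.append(f'[mcp_servers.{s["name"]}]')
        lines.append(f'command = "{s["command"]}"')
        args = ", ".join(f'"{a}"' for a in s.get("args", []))
        lines.append(f"args = [{args}]")
        lines.append("")
    return "\n".join(lines)


servers = [{"name": "filesystem", "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"]}]
toml_text = mcp_servers_to_toml(servers)
print(toml_text)
```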

6.4 CLI/SDK Wrappers — Custom Tools


Custom tools need special handling since the LLM runs inside an external process:
Preferred: Wrap as an MCP server (what `claude_code.py` does):
  1. Load custom tools via `ToolManager` (schemas + executors)
  2. Create an MCP server that exposes each tool
  3. Register the MCP server with the CLI/SDK
  4. When the LLM calls the MCP tool, the wrapper executes via `ToolManager`
claude_code.py pattern (simplified):

```python
def _create_sdk_mcp_server_from_custom_tools(self):
    tool_schemas = self._custom_tool_manager.fetch_tool_schemas()
    mcp_tools = []
    for schema in tool_schemas:
        name = schema["function"]["name"]

        async def wrapper(args, tool_name=name):
            return await self._execute_massgen_custom_tool(tool_name, args)

        mcp_tools.append(tool(name=name, ...)(wrapper))
    return create_sdk_mcp_server(name="massgen_custom_tools", tools=mcp_tools)
```

**For CLI wrappers without SDK MCP** (like Codex):
Use the shared `massgen/mcp_tools/custom_tools_server.py` utility. It creates a standalone fastmcp server that wraps ToolManager tools via stdio transport.

```python
from ..mcp_tools.custom_tools_server import build_server_config, write_tool_specs

def _setup_custom_tools_mcp(self, custom_tools):
    # 1. Write tool specs JSON to workspace
    specs_path = Path(self.cwd) / ".codex" / "custom_tool_specs.json"
    write_tool_specs(custom_tools, specs_path)

    # 2. Get MCP server config (fastmcp run ... --tool-specs ...)
    server_config = build_server_config(
        tool_specs_path=specs_path,
        allowed_paths=[self.cwd],
        agent_id="my_backend",
    )

    # 3. Add to mcp_servers list (written to workspace config before launch)
    self.mcp_servers.append(server_config)
```
The server is launched by the CLI as a subprocess and connects via stdio. Clean up the specs file in `reset_state()`.
Reference: `codex.py` uses this pattern. `custom_tools_server.py` also provides `build_server_config()`, which returns a ready-to-use MCP server dict.
Fallback: System prompt injection
  • Describe tools as JSON schema in the system prompt
  • Parse structured output (JSON blocks) for tool calls
  • Only use when MCP wrapping isn't feasible

6.5 Custom Tool YAML Config

6.5 自定义工具YAML配置

```yaml
backend:
  type: your_backend
  custom_tools:
    - path: "massgen/tool/_basic"          # Directory with TOOL.md
      function: "two_num_tool"
    - path: "path/to/tool.py"              # Single file
      function: ["func_a", "func_b"]
      preset_args: [{"timeout": 30}, {}]
  custom_tools_path: "massgen/tool/"       # Auto-discover from directory
  auto_discover_custom_tools: true         # Discover from registry
```
MassGen workflow tools (new_answer, vote, etc.): Always injected via system prompt — these are coordination-level, not executable tools. See §6.7 for full details.

6.7 Workflow Tool Integration (vote, new_answer, etc.)


Workflow tools are NOT native function-calling tools — they're injected into the system prompt and parsed from the model's text output. This pattern is shared between the Claude Code and Codex backends.
Shared helpers in `massgen/backend/base.py`:

```python
from .base import build_workflow_instructions, parse_workflow_tool_calls, extract_structured_response
```

| Helper | Purpose |
|---|---|
| `build_workflow_instructions(tools) → str` | Filters tools to workflow tools; returns instruction text with usage examples. Returns `""` if no workflow tools are present. |
| `parse_workflow_tool_calls(text) → List[Dict]` | Extracts JSON tool calls from text output. Returns the standard format: `{id, type, function: {name, arguments}}` |
| `extract_structured_response(text) → Optional[Dict]` | Low-level: extracts `{"tool_name": "...", "arguments": {...}}` from text. Tries fenced `json` blocks → regex → brace-matching → line-by-line. |


**For API backends**: Workflow tools are passed as normal function tools — no injection needed.

**For CLI/SDK backends** (Codex, Claude Code): The model can't receive native function tool definitions, so:

1. **Build instructions**: Call `build_workflow_instructions(tools)` to get the instruction text
2. **Inject into system prompt**: Append to whatever mechanism the backend uses for system prompts
3. **Accumulate text output**: Track all `content` chunks during streaming
4. **Parse after streaming**: Call `parse_workflow_tool_calls(accumulated_text)` to extract tool calls
5. **Yield as done chunk**: `yield StreamChunk(type="done", tool_calls=workflow_tool_calls)`

**Codex-specific**: Instructions go into `AGENTS.md` at the workspace root (Codex auto-reads this). The orchestrator's system message is also extracted from the messages and included, since Codex only receives a single user prompt via the CLI — the orchestrator's system message would otherwise be lost. This approach was adopted because the developer-instructions approach wasn't working.

**Claude Code-specific**: Instructions are built by `_build_system_prompt_with_workflow_tools()` which also adds tool calling sections, then passed as `system_prompt` to the SDK.

It remains an open question whether typical workflow tool parsing works (e.g., Claude Code) or whether MCP tools are needed (e.g., Codex).
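Steps 3-5 of the CLI/SDK flow can be sketched as follows. The `StreamChunk` stand-in, the synchronous generator, and the minimal `parse_workflow_tool_calls` stub are all simplifications for illustration — the real method is async and the real parser is more robust.

```python
# Sketch: accumulate text chunks during streaming, parse workflow tool calls
# afterward, and yield them on the "done" chunk. Stand-in types throughout.
import json
import re
from dataclasses import dataclass, field


@dataclass
class StreamChunk:
    type: str
    content: str = ""
    tool_calls: list = field(default_factory=list)


def parse_workflow_tool_calls(text):
    # Minimal stand-in: find one {"tool_name": ..., "arguments": ...} object.
    m = re.search(r'\{"tool_name".*\}', text)
    if not m:
        return []
    data = json.loads(m.group(0))
    return [{"id": "call_0", "type": "function",
             "function": {"name": data["tool_name"], "arguments": data["arguments"]}}]


def stream(cli_chunks):
    accumulated = []
    for text in cli_chunks:                                   # 3. accumulate
        accumulated.append(text)
        yield StreamChunk(type="content", content=text)
    calls = parse_workflow_tool_calls("".join(accumulated))   # 4. parse after streaming
    yield StreamChunk(type="done", tool_calls=calls)          # 5. yield as done chunk


chunks = list(stream(['I vote for agent 1. {"tool_name": "vote", ',
                      '"arguments": {"agent_id": "a1"}}']))
print(chunks[-1].tool_calls[0]["function"]["name"])
```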


6.7.1 MCP-Based vs Text-Based Workflow Tools


When deciding between MCP-based and text-based workflow tools for CLI/SDK backends:
MCP-based approach (attempted but not always reliable):
  • Register workflow tools as an MCP server
  • Pass to the SDK/CLI's MCP system
  • Caveat: MCP tool naming varies by SDK. Claude Code prefixes with `mcp__{server_name}__`:

```python
# If the server is "massgen_workflow_tools", tool names become:
# "new_answer" → "mcp__massgen_workflow_tools__new_answer"
# "vote" → "mcp__massgen_workflow_tools__vote"
```

  • When using MCP workflow tools, use `build_workflow_mcp_instructions()` with the appropriate prefix:

```python
instructions = build_workflow_mcp_instructions(
    tools,
    mcp_prefix="mcp__massgen_workflow_tools__"
)
```

Text-based approach (current fallback for Claude Code):
  • Inject workflow tool instructions into the system prompt
  • Parse JSON tool calls from the model's text output using `parse_workflow_tool_calls()`
  • More reliable but requires robust parsing
Current status: Claude Code uses text-based workflow tools (JSON parsing) because MCP-based approach was unreliable. Codex uses text-based as well via AGENTS.md instructions.

6.6 Provider-Native Tools: Keep vs. Override


CLI/SDK-based agents (Codex, Claude Code) come with their own built-in tools (file editing, shell execution, web search, sub-agents, etc.). When integrating, decide which to keep and which to override with MassGen equivalents. Document these decisions in `get_tool_category_overrides()` on the `NativeToolBackendMixin` — see §6.8 and Phase 10.2 for details.
General rule: Prefer the provider's native tools unless MassGen has a specific reason to override. Native tools are optimized for the provider's model and avoid extra MCP overhead.
Override when MassGen adds value:
  • Sub-agents/Task tools: Override with MassGen's sub-agents — theirs can't participate in MassGen coordination (voting, consensus, intelligence sharing)
  • Filesystem tools: May override if MassGen needs path permission enforcement or workspace isolation beyond what the provider offers
Keep provider-native:
  • File read/write/edit: Provider tools are model-optimized
  • Shell/command execution: Provider sandboxing is already configured
  • Web search: Provider integration is usually seamless
Implementation: Use the provider's tool filtering mechanism to disable specific tools:

Codex: via `.codex/config.toml`:

```toml
[mcp_servers.some_server]
disabled_tools = ["task", "sub_agent"]
```

Claude Code: via the SDK's `disallowed_tools` param:

```python
all_params["disallowed_tools"] = [
    "Read", "Write", "Edit", "MultiEdit",  # Use MassGen's MCP filesystem
    "Bash", "BashOutput", "KillShell",     # Use MassGen's execute_command
    "LS", "Grep", "Glob",                  # Use MassGen's filesystem tools
    "TodoWrite",                           # MassGen has its own task tracking
    "NotebookEdit", "NotebookRead", "ExitPlanMode",
    # Security restrictions
    "Bash(rm*)", "Bash(sudo*)", "Bash(su*)", "Bash(chmod*)", "Bash(chown*)",
]
```

Conditionally keep web tools:

```python
if not enable_web_search:
    disallowed_tools.extend(["WebSearch", "WebFetch"])
```

**Key patterns**:
- Claude Code SDK supports glob patterns for Bash restrictions: `"Bash(rm*)"`, `"Bash(sudo*)"`
- Check `if "disallowed_tools" not in all_params` to allow user override via YAML config
- Codex uses `disabled_tools` per MCP server in `.codex/config.toml`; for built-in tools, Codex doesn't have a disable mechanism — use `--full-auto` to auto-approve instead

**Document in capabilities.py** which native tools the backend provides:
```python
builtin_tools=["file_edit", "shell", "web_search", "sub_agent"],
notes="Native tools: file ops, shell, web search, sub-agents. Override sub-agents with MassGen's."
```
if not enable_web_search:
    disallowed_tools.extend(["WebSearch", "WebFetch"])

**关键模式**:
- Claude Code SDK支持Bash限制的通配符模式: `"Bash(rm*)"`、`"Bash(sudo*)"`
- 检查`if "disallowed_tools" not in all_params`以允许用户通过YAML配置覆盖
- Codex在`.codex/config.toml`中针对每个MCP服务使用`disabled_tools`配置;对于内置工具,Codex没有禁用机制,使用`--full-auto`自动审批替代

**在capabilities.py中记录**后端提供的原生工具:
```python
builtin_tools=["file_edit", "shell", "web_search", "sub_agent"],
notes="原生工具: 文件操作、Shell、网页搜索、子Agent。子Agent使用MassGen实现覆盖。"
```

6.8 NativeToolBackendMixin

6.8 NativeToolBackendMixin

File:
massgen/backend/native_tool_mixin.py
For backends with built-in tools (CLI/SDK wrappers), use
NativeToolBackendMixin
to standardize tool filtering and hook integration:
python
from .native_tool_mixin import NativeToolBackendMixin

class YourBackend(NativeToolBackendMixin, LLMBackend):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.__init_native_tool_mixin__()
        # Optional: initialize hook adapter
        self._init_native_hook_adapter(
            "massgen.mcp_tools.native_hook_adapters.YourBackendAdapter"
        )

    def get_disallowed_tools(self, config):
        """Return native tools to disable (MassGen has equivalents)."""
        return ["NativeTool1", "NativeTool2"]
Mixin provides:
MethodPurpose
get_disallowed_tools(config)
Abstract — declare which native tools to disable
get_tool_category_overrides()
Abstract — declare which MCP categories to skip/override
supports_native_hooks()
Check if hook adapter is available
get_native_hook_adapter()
Get the adapter instance
set_native_hooks_config(config)
Set MassGen hooks in native format
_init_native_hook_adapter(path)
Initialize adapter by import path
Reference implementations:
  • claude_code.py
    — disables most native tools (Read, Write, Bash, etc.) in favor of MassGen MCP
  • codex.py
    — keeps all native tools (MassGen skips attaching MCP equivalents instead)
文件:
massgen/backend/native_tool_mixin.py
对于带有内置工具的后端(CLI/SDK封装),使用
NativeToolBackendMixin
标准化工具过滤和Hook集成:
python
from .native_tool_mixin import NativeToolBackendMixin

class YourBackend(NativeToolBackendMixin, LLMBackend):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.__init_native_tool_mixin__()
        # 可选:初始化Hook适配器
        self._init_native_hook_adapter(
            "massgen.mcp_tools.native_hook_adapters.YourBackendAdapter"
        )

    def get_disallowed_tools(self, config):
        """返回需要禁用的原生工具(MassGen有等价实现)。"""
        return ["NativeTool1", "NativeTool2"]
Mixin提供的方法:
方法用途
get_disallowed_tools(config)
抽象方法 — 声明需要禁用的原生工具
get_tool_category_overrides()
抽象方法 — 声明需要跳过/覆盖的MCP类别
supports_native_hooks()
检查是否有可用的Hook适配器
get_native_hook_adapter()
获取适配器实例
set_native_hooks_config(config)
设置原生格式的MassGen Hook
_init_native_hook_adapter(path)
通过导入路径初始化适配器
参考实现:
  • claude_code.py
    — 禁用大多数原生工具(Read、Write、Bash等),使用MassGen MCP替代
  • codex.py
    — 保留所有原生工具,MassGen改为不挂载等价MCP工具
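The `_init_native_hook_adapter(path)` helper above resolves an adapter class from a dotted import path. A minimal, self-contained sketch of that resolution step (the helper name and a stdlib class stand in for the real adapter here):

```python
import importlib

def init_adapter_by_path(path: str):
    # Split "pkg.module.ClassName" into module path and class name,
    # import the module, and instantiate the class.
    module_path, class_name = path.rsplit(".", 1)
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls()

# Demonstration with a stdlib class standing in for a real adapter class:
adapter = init_adapter_by_path("json.JSONDecoder")
```

Lazy import by path keeps backends loadable even when a given adapter's dependencies are not installed.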

Phase 7: Testing (2+ files)

阶段7:测试(2+个文件)

7.1 Capabilities Test

7.1 能力测试

Automatically tested by
massgen/tests/test_backend_capabilities.py
once you add to capabilities.py.
bash
uv run pytest massgen/tests/test_backend_capabilities.py -v
添加到capabilities.py后,
massgen/tests/test_backend_capabilities.py
会自动测试。
bash
uv run pytest massgen/tests/test_backend_capabilities.py -v

7.2 Integration Test

7.2 集成测试

Create:
massgen/tests/test_<name>_integration.py
or add to cross-backend test scripts.
Cross-backend test pattern:
python
BACKEND_CONFIGS = {
    "claude": {"type": "claude", "model": "claude-haiku-4-5-20251001"},
    "openai": {"type": "openai", "model": "gpt-4o-mini"},
    "your_backend": {"type": "your_backend", "model": "model-a"},
}
创建:
massgen/tests/test_<name>_integration.py
或添加到跨后端测试脚本。
跨后端测试模式:
python
BACKEND_CONFIGS = {
    "claude": {"type": "claude", "model": "claude-haiku-4-5-20251001"},
    "openai": {"type": "openai", "model": "gpt-4o-mini"},
    "your_backend": {"type": "your_backend", "model": "model-a"},
}
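Before a cross-backend test spins up real backends, it helps to filter `BACKEND_CONFIGS` down to the backends that can actually run in the current environment. A small sketch (the `env` key is an illustrative addition, not part of the real script's schema):

```python
BACKEND_CONFIGS = {
    "claude": {"type": "claude", "model": "claude-haiku-4-5-20251001", "env": "ANTHROPIC_API_KEY"},
    "openai": {"type": "openai", "model": "gpt-4o-mini", "env": "OPENAI_API_KEY"},
}

def runnable_backends(environ: dict) -> list:
    # Keep only backends whose API key is present in the environment.
    return [name for name, cfg in BACKEND_CONFIGS.items() if environ.get(cfg["env"])]

available = runnable_backends({"OPENAI_API_KEY": "sk-test"})
```

Skipping (rather than failing) backends with missing credentials keeps the test usable both locally and in CI.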

7.3 Hook Firing Test (CLI/SDK backends) ⚠️ REQUIRED

7.3 Hook触发测试(CLI/SDK后端)⚠️ 必选

Script:
scripts/test_hook_backends.py
This is mandatory for every new CLI/SDK backend. The script verifies that the full hook pipeline actually fires during real streaming — not just that the hook data structures are correct.
Two modes:
Unit mode (no API calls — runs in CI):
bash
uv run python scripts/test_hook_backends.py --backend your_backend
Verifies: backend stores
GeneralHookManager
,
MidStreamInjectionHook
returns injection content,
HighPriorityTaskReminderHook
fires, combined hooks aggregate correctly.
E2E mode (real API calls):
bash
uv run python scripts/test_hook_backends.py --backend your_backend --e2e
uv run python scripts/test_hook_backends.py --backend your_backend --e2e --verbose
Verifies the complete live flow: PreToolUse hook fires → tool executes → PostToolUse hook fires → injection content is fed back to the model and acknowledged.
Adding your backend: Add an entry to
BACKEND_CONFIGS
in the script:
python
"your_backend": {
    "type": "your_backend",
    "model": "your-model",
    "description": "Your Backend description",
    "api_style": "openai",  # or "anthropic", "gemini"
},
Currently covered:
claude
,
openai
,
gemini
(native SDK),
openrouter
,
grok
Not yet covered:
codex
,
claude_code
,
copilot
,
gemini_cli
— these use file-based IPC or SDK-native hooks and need separate E2E hook verification.
Why this matters: Unit tests verify that hook data structures are correct. This script verifies that hooks actually fire during a real streaming turn. A backend can pass all unit tests and still silently drop hooks during live execution.
脚本:
scripts/test_hook_backends.py
所有新的CLI/SDK后端都必须通过该测试,脚本验证完整的Hook流水线在真实流式传输过程中确实会触发,而不仅仅是Hook数据结构正确。
两种模式:
单元模式(无API调用 — 可在CI中运行):
bash
uv run python scripts/test_hook_backends.py --backend your_backend
验证项: 后端存储了
GeneralHookManager
MidStreamInjectionHook
返回注入内容、
HighPriorityTaskReminderHook
触发、组合Hook正确聚合。
E2E模式(真实API调用):
bash
uv run python scripts/test_hook_backends.py --backend your_backend --e2e
uv run python scripts/test_hook_backends.py --backend your_backend --e2e --verbose
验证完整的实时流程: PreToolUse Hook触发 → 工具执行 → PostToolUse Hook触发 → 注入内容反馈给模型并被确认。
添加你的后端: 在脚本的
BACKEND_CONFIGS
中添加条目:
python
"your_backend": {
    "type": "your_backend",
    "model": "your-model",
    "description": "Your Backend描述",
    "api_style": "openai",  # 或"anthropic"、"gemini"
},
当前已覆盖:
claude
openai
gemini
(原生SDK)、
openrouter
grok
未覆盖:
codex
claude_code
copilot
gemini_cli
— 这些使用基于文件的IPC或SDK原生Hook,需要单独的E2E Hook验证。
重要性: 单元测试仅验证Hook数据结构正确,该脚本验证Hook在真实流式交互过程中确实会触发。后端可能通过所有单元测试,但在实际运行时仍可能静默丢失Hook。

7.4 Sandbox / Path Permission Test (CLI/SDK backends) ⚠️ REQUIRED

7.4 沙箱/路径权限测试(CLI/SDK后端)⚠️ 必选

Script:
scripts/test_native_tools_sandbox.py
This is mandatory for every new CLI/SDK backend with built-in file/shell tools. It runs real agents against a real filesystem with unique secrets and verifies that permission boundaries are actually enforced — not just that the enforcement code exists.
脚本:
scripts/test_native_tools_sandbox.py
所有带有内置文件/Shell工具的新CLI/SDK后端都必须通过该测试,它会使用真实Agent操作真实文件系统,通过唯一密钥验证权限边界确实被强制执行,而不仅仅是权限控制代码存在。

Test your new backend

测试新后端

uv run python scripts/test_native_tools_sandbox.py --backend your_backend
uv run python scripts/test_native_tools_sandbox.py --backend your_backend

Use LLM judge to detect subtle leakage

使用LLM裁判检测细微泄露

uv run python scripts/test_native_tools_sandbox.py --backend your_backend --llm-judge

The script verifies the full permission matrix:

| Zone | Expected reads | Expected writes |
|------|---------------|-----------------|
| Workspace (cwd) | ✅ allowed | ✅ allowed |
| Writable context path | ✅ allowed | ✅ allowed |
| Read-only context path | ✅ allowed | ❌ blocked |
| Outside all contexts | depends on backend | ❌ blocked |
| Parent directory | depends on backend | ❌ blocked |
| `/tmp` | ✅ allowed | depends on backend |

It uses **unique secrets** (UUIDs written to each zone's files) to detect unauthorized reads even when the operation "fails" — if the secret string appears in the model's response, there's a leak regardless of error messages.

**Adding your backend**: Add an entry to `BACKEND_CONFIGS` in the script:
```python
"your_backend": {
    "module": "massgen.backend.your_backend",
    "class": "YourBackend",
    "model": "your-model",
    "blocks_reads_outside": True,   # Does your enforcement block reads?
    "blocks_tmp_writes": True,      # Does your enforcement block /tmp writes?
},
```
Currently covered:
claude_code
,
codex
Not yet covered:
copilot
,
gemini_cli
— must be added when those backends are used in production.
Why this matters: Permission callback and hook unit tests verify the enforcement logic in isolation. This script verifies that enforcement actually stops a live agent from accessing restricted paths. A backend can have correct hook logic and still allow unauthorized access if the hook isn't wired into the right execution path.
uv run python scripts/test_native_tools_sandbox.py --backend your_backend --llm-judge

脚本验证完整的权限矩阵:

| 区域 | 预期读权限 | 预期写权限 |
|------|---------------|-----------------|
| 工作区(cwd) | ✅ 允许 | ✅ 允许 |
| 可写上下文路径 | ✅ 允许 | ✅ 允许 |
| 只读上下文路径 | ✅ 允许 | ❌ 阻止 |
| 所有上下文外路径 | 依后端而定 | ❌ 阻止 |
| 父目录 | 依后端而定 | ❌ 阻止 |
| `/tmp` | ✅ 允许 | 依后端而定 |

它使用**唯一密钥**(写入每个区域文件的UUID)检测未授权读操作,即使操作“失败” — 如果密钥字符串出现在模型响应中,无论错误消息如何,都说明存在泄露。

**添加你的后端**: 在脚本的`BACKEND_CONFIGS`中添加条目:
```python
"your_backend": {
    "module": "massgen.backend.your_backend",
    "class": "YourBackend",
    "model": "your-model",
    "blocks_reads_outside": True,   # 你的权限控制是否阻止外部读?
    "blocks_tmp_writes": True,      # 你的权限控制是否阻止/tmp写操作?
},
```
当前已覆盖:
claude_code
codex
未覆盖:
copilot
gemini_cli
— 这些后端生产环境使用前必须添加测试。
重要性: 权限回调和Hook单元测试仅孤立地验证权限控制逻辑,该脚本验证权限控制确实能阻止实时Agent访问受限路径。后端可能有正确的Hook逻辑,但如果Hook没有接入正确的执行路径,仍可能允许未授权访问。
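The unique-secret technique described above is easy to model: plant a fresh UUID in each zone's files, then scan the model's response for it. A minimal sketch (the secret format is illustrative, not the script's exact one):

```python
import uuid

def plant_secret(zone: str) -> str:
    # One unique secret per filesystem zone; if this string ever appears
    # in a response, that zone was read.
    return f"SECRET-{zone}-{uuid.uuid4().hex}"

def leaked(secret: str, model_response: str) -> bool:
    # Leak detection is pure substring presence, so it still works when
    # the tool call itself reported an error.
    return secret in model_response

secret = plant_secret("readonly")
```

This is why error messages alone are not trusted: a backend can return "permission denied" while the content has already leaked into context.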

7.5 Config Validation

7.5 配置校验

bash
uv run python scripts/validate_all_configs.py
bash
uv run python scripts/validate_all_configs.py

Phase 8: Config Examples (2+ files)

阶段8:配置示例(2+个文件)

8.1 Single Agent

8.1 单Agent

Create:
massgen/configs/providers/<name>/single_<name>.yaml
创建:
massgen/configs/providers/<name>/single_<name>.yaml

8.2 Multi-Agent / Tools

8.2 多Agent / 工具

Create:
massgen/configs/providers/<name>/<name>_with_tools.yaml
创建:
massgen/configs/providers/<name>/<name>_with_tools.yaml

Phase 9: Documentation (3+ files)

阶段9:文档(3+个文件)

9.1 YAML Schema

9.1 YAML Schema

File:
docs/source/reference/yaml_schema.rst
— document backend-specific params
文件:
docs/source/reference/yaml_schema.rst
— 记录后端专属参数

9.2 Backend Tables

9.2 后端表格

bash
uv run python docs/scripts/generate_backend_tables.py
bash
uv run python docs/scripts/generate_backend_tables.py

9.3 User Guide

9.3 用户指南

File:
docs/source/user_guide/backends.rst
— add section for new backend
文件:
docs/source/user_guide/backends.rst
— 为新后端添加章节

Phase 10: System Prompt Considerations

阶段10:系统提示注意事项

The system prompt is assembled by
massgen/system_message_builder.py
using priority-based sections from
massgen/system_prompt_sections.py
. Different sections are conditionally included based on backend capabilities and config flags.
系统提示由
massgen/system_message_builder.py
基于
massgen/system_prompt_sections.py
的优先级章节组装,根据后端能力和配置标志条件包含不同章节。

10.1 Section Categories

10.1 章节分类

Always included (all backends):
SectionPurpose
AgentIdentitySection
WHO the agent is
EvaluationSection
vote/new_answer coordination primitives
CoreBehaviorsSection
Action bias, parallel execution
OutputFirstVerificationSection
Quality iteration loop
SkillsSection
openskills read
via command execution
MemorySection
Decision documentation
WorkspaceStructureSection
Workspace paths
ProjectInstructionsSection
CLAUDE.md/AGENTS.md
SubagentSection
MassGen subagent delegation
BroadcastCommunicationSection
Inter-agent communication
PostEvaluationSection
submit/restart
FileSearchSection
rg/sg universal CLI tools
TaskContextSection
CONTEXT.md creation
NoveltyPressureSection
Novelty/diversity pressure across agents
ChangedocSection
Change documentation
Conditional on config flags:
SectionGate
TaskPlanningSection
enable_task_planning=True
EvolvingSkillsSection
auto_discover_custom_tools=True
AND
enable_task_planning=True
PlanningModeSection
planning_mode_enabled=True
CodeBasedToolsSection
enable_code_based_tools=True
(CodeAct paradigm)
CommandExecutionSection
enable_mcp_command_line=True
DecompositionSection
Decomposition mode active
MultimodalToolsSection
enable_multimodal_tools=True
Model-specific:
SectionGate
GPT5GuidanceSection
GPT-5 models only
GrokGuidanceSection
Grok models only
Adapted for native backends:
SectionAdaptation
FilesystemOperationsSection
has_native_tools=True
→ generic tool language
FilesystemBestPracticesSection
Comparison tool language adjusted
始终包含(所有后端):
章节用途
AgentIdentitySection
Agent身份定义
EvaluationSection
vote/new_answer协调原语
CoreBehaviorsSection
行为倾向、并行执行
OutputFirstVerificationSection
质量迭代循环
SkillsSection
通过命令执行
openskills read
MemorySection
决策记录
WorkspaceStructureSection
工作区路径
ProjectInstructionsSection
CLAUDE.md/AGENTS.md
SubagentSection
MassGen子Agent委托
BroadcastCommunicationSection
Agent间通信
PostEvaluationSection
submit/restart
FileSearchSection
rg/sg通用CLI工具
TaskContextSection
CONTEXT.md创建
NoveltyPressureSection
Agent间的新颖性/多样性要求
ChangedocSection
变更记录
依赖配置标志:
章节开关
TaskPlanningSection
enable_task_planning=True
EvolvingSkillsSection
auto_discover_custom_tools=True
enable_task_planning=True
PlanningModeSection
planning_mode_enabled=True
CodeBasedToolsSection
enable_code_based_tools=True
(CodeAct范式)
CommandExecutionSection
enable_mcp_command_line=True
DecompositionSection
分解模式激活
MultimodalToolsSection
enable_multimodal_tools=True
模型专属:
章节开关
GPT5GuidanceSection
仅GPT-5模型
GrokGuidanceSection
仅Grok模型
针对原生后端适配:
章节适配逻辑
FilesystemOperationsSection
has_native_tools=True
→ 使用通用工具表述
FilesystemBestPracticesSection
调整比较工具的表述
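The gating logic in the tables above can be sketched as a simple flag-driven assembly. This is an illustrative reduction of `system_message_builder.py`, not its actual code, and the section list is truncated:

```python
def build_sections(flags: dict) -> list:
    # Always-on sections first (truncated), then config-gated ones.
    sections = ["AgentIdentitySection", "EvaluationSection", "CoreBehaviorsSection"]
    if flags.get("enable_task_planning"):
        sections.append("TaskPlanningSection")
    # EvolvingSkillsSection requires BOTH flags, per the table.
    if flags.get("enable_task_planning") and flags.get("auto_discover_custom_tools"):
        sections.append("EvolvingSkillsSection")
    if flags.get("planning_mode_enabled"):
        sections.append("PlanningModeSection")
    return sections

planning_only = build_sections({"enable_task_planning": True})
```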

10.2
tool_category_overrides

10.2
tool_category_overrides

File:
massgen/backend/native_tool_mixin.py
The
get_tool_category_overrides()
abstract method on
NativeToolBackendMixin
declares which MassGen tool categories a backend handles natively. Each native backend implements this method:
python
@abstractmethod
def get_tool_category_overrides(self) -> Dict[str, str]:
    # "skip"     = backend has native equivalent
    # "override" = attach MassGen's version, disable native
    # (absent)   = attach MCP normally
Categories:
filesystem
,
command_execution
,
file_search
,
web_search
,
planning
,
subagents
Currently used by
system_message_builder.py
to:
  • Pass
    has_native_tools=True
    to filesystem sections when
    filesystem: "skip"
  • Document which tool categories each backend handles natively
Note: MCP server injection filtering (which MCP servers to attach) is handled individually by each CLI/SDK backend (
codex.py
,
claude_code.py
). The
get_tool_category_overrides()
method serves as documentation and controls system prompt section behavior. Non-mixin backends (API-based) return
{}
by default via
system_message_builder.py
's fallback.
文件:
massgen/backend/native_tool_mixin.py
NativeToolBackendMixin
的抽象方法
get_tool_category_overrides()
声明后端原生支持的MassGen工具类别,每个原生后端都需要实现该方法:
python
@abstractmethod
def get_tool_category_overrides(self) -> Dict[str, str]:
    # "skip"     = 后端有原生等价实现
    # "override" = 挂载MassGen版本,禁用原生实现
    # (未声明)   = 正常挂载MCP
类别:
filesystem
command_execution
file_search
web_search
planning
subagents
当前
system_message_builder.py
使用该配置:
  • filesystem: "skip"
    时,给文件系统章节传递
    has_native_tools=True
    参数
  • 记录每个后端原生处理的工具类别
注意: MCP服务注入过滤(挂载哪些MCP服务)由每个CLI/SDK后端单独处理(
codex.py
claude_code.py
),
get_tool_category_overrides()
方法作为文档说明,同时控制系统提示章节行为。非Mixin后端(基于API的)通过
system_message_builder.py
的回退逻辑默认返回空字典。
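A self-contained sketch of how a Codex-style declaration and the builder's fallback interact (class and helper names here are hypothetical; only the `{"category": "skip"}` contract comes from the text above):

```python
from typing import Dict

class CodexStyleOverrides:
    # Hypothetical backend declaring Codex-like behavior: native tools
    # cover these categories, so MassGen skips attaching MCP equivalents.
    def get_tool_category_overrides(self) -> Dict[str, str]:
        return {"filesystem": "skip", "command_execution": "skip", "web_search": "skip"}

def has_native_filesystem(backend) -> bool:
    # Mirrors the builder's fallback: API-based backends simply lack the
    # method, so they default to {} and get the full MCP treatment.
    getter = getattr(backend, "get_tool_category_overrides", lambda: {})
    return getter().get("filesystem") == "skip"

native = has_native_filesystem(CodexStyleOverrides())
plain = has_native_filesystem(object())
```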

10.3 Path Permission Integration

10.3 路径权限集成

When
filesystem: "skip"
, the backend's native tools handle file ops — but they may not enforce MassGen's granular per-path permissions from
PathPermissionManager
(PPM). Each backend addresses this differently:
Two-layer defense (recommended for SDK backends with permission callbacks):
  1. Layer 1 — Permission callback (coarse gate): Extract paths from the SDK's permission request, validate against PPM. Fail-open if no path extractable (defer to Layer 2).
  2. Layer 2 — PreToolUse hook (fine-grained): The orchestrator auto-registers
    PathPermissionManagerHook
    as
    PRE_TOOL_USE
    for non-Claude-Code native backends. Full
    toolName
    /
    toolArgs
    available for precise validation.
Backend-specific approaches:
BackendPPM Strategy
Claude CodeOwn
add_dirs
sandboxing — PPM hook skipped by orchestrator
CopilotTwo-layer defense (permission callback + PPM PreToolUse hook). In Docker mode, container provides isolation + PPM as defense-in-depth
CodexNo native hook support — relies on Docker/workspace isolation
API backendsPPM enforced via
ToolManager
automatically
Reference:
copilot.py
(
_build_permission_callback
) and
orchestrator.py
(
_setup_native_hooks_for_agent
) for the two-layer pattern. See
massgen/filesystem_manager/_path_permission_manager.py
for PPM internals.
filesystem: "skip"
时,后端的原生工具处理文件操作,但它们可能不会强制执行MassGen的
PathPermissionManager
(PPM)的细粒度路径权限,每个后端的处理方式不同:
双层防御(推荐给支持权限回调的SDK后端):
  1. 第一层 — 权限回调(粗粒度关口): 从SDK的权限请求中提取路径,通过PPM验证。如果无法提取路径则放行(fail-open),交由第二层处理。
  2. 第二层 — PreToolUse Hook(细粒度控制): 编排器会为非Claude Code的原生后端自动注册
    PathPermissionManagerHook
    作为
    PRE_TOOL_USE
    Hook,可获取完整的
    toolName
    /
    toolArgs
    进行精确验证。
后端专属方案:
后端PPM策略
Claude Code自有
add_dirs
沙箱 — 编排器跳过PPM Hook
Copilot双层防御(权限回调 + PPM PreToolUse Hook)。Docker模式下,容器提供隔离 + PPM作为深度防御
Codex无原生Hook支持 — 依赖Docker/工作区隔离
API后端
ToolManager
自动强制执行PPM
参考:
copilot.py
_build_permission_callback
)和
orchestrator.py
_setup_native_hooks_for_agent
)的双层模式实现,PPM内部实现参见
massgen/filesystem_manager/_path_permission_manager.py
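The Layer-1 callback described above reduces to: extract a path if you can, validate it against allowed roots, and fail open when no path is extractable so Layer 2 decides. A minimal sketch under those assumptions (the allowed-roots list stands in for PPM state; argument keys are illustrative):

```python
from pathlib import Path

ALLOWED_WRITE_ROOTS = [Path("/workspace")]  # stand-in for PPM's permitted paths

def permission_callback(tool_name: str, tool_args: dict) -> str:
    # Layer 1: coarse gate. Deny writes clearly outside allowed roots;
    # allow (fail-open) when no path is extractable, deferring the
    # fine-grained decision to the Layer-2 PreToolUse hook.
    raw = tool_args.get("path") or tool_args.get("file_path")
    if raw is None:
        return "allow"
    p = Path(raw)
    if any(root == p or root in p.parents for root in ALLOWED_WRITE_ROOTS):
        return "allow"
    return "deny"
```

Fail-open is safe only because Layer 2 exists; a single-layer design would need to fail closed instead.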

Phase 11: Hooks

阶段11:Hook

MassGen supports hooks for intercepting tool execution and other lifecycle events. Hooks run transparently (not documented in the system prompt).
Note that some backends may not support hooks (e.g., Codex). In this case, hooks will only be applied where possible within MassGen's tool execution framework.
MassGen支持Hook拦截工具执行和其他生命周期事件,Hook透明运行(不在系统提示中说明)。
注意部分后端可能不支持Hook(如Codex),这种情况下Hook仅会在MassGen的工具执行框架中尽可能应用。

11.1 Hook Types and Built-in Hooks

11.1 Hook类型和内置Hook

Hook types (
HookType
enum in
massgen/mcp_tools/hooks.py
):
HookTypeTrigger
PRE_TOOL_USE
Before tool execution
POST_TOOL_USE
After tool execution
Built-in hook classes (
PatternHook
subclasses — registered by the orchestrator):
Hook ClassTypePurpose
PathPermissionManagerHook
PreToolUseEnforce per-path read/write permissions (in
_path_permission_manager.py
)
RoundTimeoutPreHook
PreToolUseEnforce round time limits
HumanInputHook
PreToolUseInject human input into agent flow
MidStreamInjectionHook
PostToolUseInject coordination messages (voting reminders, etc.)
SubagentCompleteHook
PostToolUseNotify when subagent completes
BackgroundToolCompleteHook
PostToolUseNotify when background tool completes
HighPriorityTaskReminderHook
PostToolUseRemind agent of current task
MediaCallLedgerHook
PostToolUseTrack media generation calls
RoundTimeoutPostHook
PostToolUseEnforce round time limits
Hook类型
massgen/mcp_tools/hooks.py
中的
HookType
枚举):
HookType触发时机
PRE_TOOL_USE
工具执行前
POST_TOOL_USE
工具执行后
内置Hook类
PatternHook
子类 — 由编排器注册):
Hook类类型用途
PathPermissionManagerHook
PreToolUse执行路径级读/写权限(定义在
_path_permission_manager.py
RoundTimeoutPreHook
PreToolUse执行轮次时间限制
HumanInputHook
PreToolUse将人类输入注入到Agent流程
MidStreamInjectionHook
PostToolUse注入协调消息(投票提醒等)
SubagentCompleteHook
PostToolUse子Agent完成时通知
BackgroundToolCompleteHook
PostToolUse后台工具完成时通知
HighPriorityTaskReminderHook
PostToolUse提醒Agent当前任务
MediaCallLedgerHook
PostToolUse跟踪媒体生成调用
RoundTimeoutPostHook
PostToolUse执行轮次时间限制
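The hooks above follow a common shape: a tool-name pattern plus an async `execute(event)` returning a decision. A toy PreToolUse-style hook, loosely modeled on a `PatternHook` (the event and result dicts are illustrative, not MassGen's exact schema):

```python
import asyncio
import fnmatch

class BlockDangerousBash:
    # Match a tool-name pattern and veto matching calls.
    matcher = "Bash*"

    async def execute(self, event: dict) -> dict:
        if not fnmatch.fnmatch(event["tool_name"], self.matcher):
            return {"decision": "allow"}
        command = event["tool_input"].get("command", "")
        if command.startswith(("rm ", "sudo ")):
            return {"decision": "deny", "reason": "blocked by policy"}
        return {"decision": "allow"}

result = asyncio.run(BlockDangerousBash().execute(
    {"tool_name": "Bash", "tool_input": {"command": "rm -rf /tmp/x"}}
))
```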

11.2 Hook Architecture

11.2 Hook架构

GeneralHookManager (massgen/mcp_tools/hooks/)
    ├── Registers global and per-agent hooks
    ├── Executes hooks in order for MCP-based backends
    └── For native backends → NativeHookAdapter
            ├── ClaudeCodeNativeHookAdapter
            │   └── Converts to Claude Agent SDK HookMatcher format
            └── (future) CodexNativeHookAdapter
GeneralHookManager (massgen/mcp_tools/hooks/)
    ├── 注册全局和单Agent Hook
    ├── 为基于MCP的后端按顺序执行Hook
    └── 针对原生后端 → NativeHookAdapter
            ├── ClaudeCodeNativeHookAdapter
            │   └── 转换为Claude Agent SDK HookMatcher格式
            └── (未来)CodexNativeHookAdapter
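The registration/dispatch split in the diagram can be modeled in a few lines. This is a toy reduction of `GeneralHookManager` (names and shapes are illustrative): global hooks apply to every agent, per-agent hooks only to their own.

```python
from collections import defaultdict

class MiniHookManager:
    def __init__(self):
        self._global = defaultdict(list)                       # hook_type -> [hook]
        self._per_agent = defaultdict(lambda: defaultdict(list))

    def register(self, hook_type, hook, agent_id=None):
        if agent_id is None:
            self._global[hook_type].append(hook)
        else:
            self._per_agent[agent_id][hook_type].append(hook)

    def get_hooks(self, agent_id, hook_type):
        # Globals first, then agent-specific, preserving registration order.
        return self._global[hook_type] + self._per_agent[agent_id][hook_type]

mgr = MiniHookManager()
mgr.register("PRE_TOOL_USE", "ppm_hook")                       # global
mgr.register("PRE_TOOL_USE", "reminder_hook", agent_id="agent_a")
```

For native backends, the adapter layer consumes this same registry and re-emits it in the SDK's own hook format.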

11.3 Native Hook Adapters — How to Implement

11.3 原生Hook适配器 — 实现方法

For backends with native tool execution (CLI/SDK wrappers), MassGen hooks need to be bridged into the backend's hook system. This is done via
NativeHookAdapter
(base class in
massgen/mcp_tools/native_hook_adapters/base.py
).
Step 1: Create adapter subclass:
对于带有原生工具执行的后端(CLI/SDK封装),需要将MassGen Hook桥接到后端的Hook系统,通过
NativeHookAdapter
实现(基类位于
massgen/mcp_tools/native_hook_adapters/base.py
)。
步骤1: 创建适配器子类:

massgen/mcp_tools/native_hook_adapters/your_backend_adapter.py

massgen/mcp_tools/native_hook_adapters/your_backend_adapter.py

from .base import NativeHookAdapter
from ..hooks import HookType  # assumed import path for the HookType enum

class YourBackendNativeHookAdapter(NativeHookAdapter):
    def supports_hook_type(self, hook_type):
        return hook_type in (HookType.PRE_TOOL_USE, HookType.POST_TOOL_USE)

    def convert_hook_to_native(self, hook, hook_type, context_factory=None):
        """Wrap MassGen hook as your backend's native hook format."""
        async def native_hook(tool_name, tool_args, context):
            event = self.create_hook_event_from_native(
                {"tool_name": tool_name, "tool_input": tool_args},
                hook_type, context_factory() if context_factory else {},
            )
            result = await hook.execute(event)
            return self.convert_hook_result_to_native(result, hook_type)
        return {"pattern": hook.matcher, "handler": native_hook}

    def build_native_hooks_config(self, hook_manager, agent_id=None, context_factory=None):
        """Convert all registered hooks to native config."""
        config = {"PreToolUse": [], "PostToolUse": []}
        for hook_type, hooks in hook_manager.get_hooks(agent_id).items():
            for hook in hooks:
                native = self.convert_hook_to_native(hook, hook_type, context_factory)
                config[hook_type.value].append(native)
        return config

    def merge_native_configs(self, *configs):
        """Merge permission hooks + MassGen hooks + user hooks."""
        merged = {"PreToolUse": [], "PostToolUse": []}
        for config in configs:
            for key in merged:
                merged[key].extend(config.get(key, []))
        return merged

**Step 2**: Initialize in backend `__init__`:
```python
from ..mcp_tools.native_hook_adapters import YourBackendNativeHookAdapter
self._native_hook_adapter = YourBackendNativeHookAdapter()
```
Step 3: Implement the backend interface methods:
python
def supports_native_hooks(self) -> bool:
    return self._native_hook_adapter is not None

def get_native_hook_adapter(self):
    return self._native_hook_adapter

def set_native_hooks_config(self, config):
    self._massgen_hooks_config = config
Step 4: Merge hooks when launching the backend process. The orchestrator calls
set_native_hooks_config()
with converted hooks; your backend merges with its own permission hooks when building options.
from .base import NativeHookAdapter
from ..hooks import HookType  # HookType枚举的导入路径(推测)

class YourBackendNativeHookAdapter(NativeHookAdapter):
    def supports_hook_type(self, hook_type):
        return hook_type in (HookType.PRE_TOOL_USE, HookType.POST_TOOL_USE)

    def convert_hook_to_native(self, hook, hook_type, context_factory=None):
        """将MassGen Hook封装为后端原生Hook格式。"""
        async def native_hook(tool_name, tool_args, context):
            event = self.create_hook_event_from_native(
                {"tool_name": tool_name, "tool_input": tool_args},
                hook_type, context_factory() if context_factory else {},
            )
            result = await hook.execute(event)
            return self.convert_hook_result_to_native(result, hook_type)
        return {"pattern": hook.matcher, "handler": native_hook}

    def build_native_hooks_config(self, hook_manager, agent_id=None, context_factory=None):
        """将所有注册的Hook转换为原生配置。"""
        config = {"PreToolUse": [], "PostToolUse": []}
        for hook_type, hooks in hook_manager.get_hooks(agent_id).items():
            for hook in hooks:
                native = self.convert_hook_to_native(hook, hook_type, context_factory)
                config[hook_type.value].append(native)
        return config

    def merge_native_configs(self, *configs):
        """合并权限Hook + MassGen Hook + 用户Hook。"""
        merged = {"PreToolUse": [], "PostToolUse": []}
        for config in configs:
            for key in merged:
                merged[key].extend(config.get(key, []))
        return merged

**步骤2**: 在后端`__init__`中初始化:
```python
from ..mcp_tools.native_hook_adapters import YourBackendNativeHookAdapter
self._native_hook_adapter = YourBackendNativeHookAdapter()
```
步骤3: 实现后端接口方法:
python
def supports_native_hooks(self) -> bool:
    return self._native_hook_adapter is not None

def get_native_hook_adapter(self):
    return self._native_hook_adapter

def set_native_hooks_config(self, config):
    self._massgen_hooks_config = config
步骤4: 启动后端进程时合并Hook。编排器会调用
set_native_hooks_config()
传递转换后的Hook,后端构建选项时将其与自身的权限Hook合并。

11.4 Reference

11.4 参考

  • Hook framework:
    massgen/mcp_tools/hooks/
  • Adapter base:
    massgen/mcp_tools/native_hook_adapters/base.py
  • Claude Code adapter:
    massgen/mcp_tools/native_hook_adapters/claude_code_adapter.py
  • Copilot adapter:
    massgen/mcp_tools/native_hook_adapters/copilot_adapter.py
  • Path permissions:
    massgen/filesystem_manager/_path_permission_manager.py
  • Orchestrator hook setup:
    massgen/orchestrator.py
    (search for
    native_hook
    )
  • Hook框架:
    massgen/mcp_tools/hooks/
  • 适配器基类:
    massgen/mcp_tools/native_hook_adapters/base.py
  • Claude Code适配器:
    massgen/mcp_tools/native_hook_adapters/claude_code_adapter.py
  • Copilot适配器:
    massgen/mcp_tools/native_hook_adapters/copilot_adapter.py
  • 路径权限:
    massgen/filesystem_manager/_path_permission_manager.py
  • 编排器Hook设置:
    massgen/orchestrator.py
    (搜索
    native_hook)

Common Patterns

通用模式

CLI Wrapper Backend (like Codex)

CLI封装后端(如Codex)

__init__: find CLI binary, set config defaults
stream_with_tools: build command -> spawn subprocess -> parse JSONL stdout -> yield StreamChunks
MCP: write project-scoped config file in workspace before launch
Custom tools: wrap as MCP server or inject via system prompt
Cleanup: remove workspace config on reset_state/clear_history
__init__: 查找CLI二进制文件,设置配置默认值
stream_with_tools: 构建命令 → 启动子进程 → 解析JSONL标准输出 → 返回StreamChunks
MCP: 启动前在工作区写入项目级配置文件
自定义工具: 封装为MCP服务或通过系统提示注入
清理: reset_state/clear_history时删除工作区配置
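The "parse JSONL stdout → yield StreamChunks" step above can be sketched independently of any CLI. This is a minimal reduction, assuming hypothetical `text` / `tool_call` event types rather than any real CLI's schema:

```python
import json

def parse_jsonl_events(lines):
    # Turn a CLI's JSONL stdout into (kind, payload) tuples, the shape a
    # stream_with_tools loop would wrap into StreamChunks.
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") == "text":
            yield ("content", event["text"])
        elif event.get("type") == "tool_call":
            yield ("tool_call", event["name"])
        # Unknown event types are skipped rather than raising.

chunks = list(parse_jsonl_events([
    '{"type": "text", "text": "hello"}',
    '{"type": "tool_call", "name": "shell"}',
]))
```

In the real backend this generator would consume the subprocess's stdout incrementally instead of a list.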

Docker Execution for CLI Backends

CLI后端的Docker执行

If a CLI-based backend doesn't support Docker natively, run the CLI inside the MassGen container:
  1. Install the CLI in the Docker image (add to
    npm install -g
    in Dockerfile) or via
    command_line_docker_packages.preinstall.npm
  2. In
    __init__
    , detect
    command_line_execution_mode: "docker"
    and skip host CLI lookup
  3. In
    stream_with_tools
    , use
    container.client.api.exec_create()
    +
    exec_start(stream=True)
    to run the CLI inside the container
  4. If the CLI has its own sandbox, disable it — the container provides isolation
  5. If the CLI needs host credentials, mount them read-only via
    _build_credential_mounts()
IMPORTANT: Some CLI backends require
command_line_docker_network_mode: bridge
when running in Docker mode. This allows the container to make outbound network requests to APIs. Add config validation to enforce this:
如果基于CLI的后端原生不支持Docker,将CLI运行在MassGen容器内:
  1. 在Docker镜像中安装CLI(在Dockerfile中添加
    npm install -g
    )或通过
    command_line_docker_packages.preinstall.npm
    安装
  2. __init__
    中检测
    command_line_execution_mode: "docker"
    ,跳过宿主机CLI查找
  3. stream_with_tools
    中使用
    container.client.api.exec_create()
    +
    exec_start(stream=True)
    在容器内运行CLI
  4. 如果CLI有自有沙箱,禁用它 — 容器提供隔离能力
  5. 如果CLI需要宿主机凭证,通过
    _build_credential_mounts()
    以只读方式挂载
重要: 部分CLI后端在Docker模式下需要
command_line_docker_network_mode: bridge
配置,允许容器对外发起API网络请求。添加配置校验强制该要求:

In config_validator.py

在config_validator.py中

if backend_type == "your_backend":
    execution_mode = backend_config.get("command_line_execution_mode")
    if execution_mode == "docker":
        if "command_line_docker_network_mode" not in backend_config:
            result.add_error(
                "YourBackend in Docker mode requires 'command_line_docker_network_mode'",
                f"{location}.command_line_docker_network_mode",
                "Add 'command_line_docker_network_mode: bridge' (required for network access)",
            )

**Reference**: `codex.py` implements this pattern with `_stream_docker()` / `_stream_local()` branching.
if backend_type == "your_backend":
    execution_mode = backend_config.get("command_line_execution_mode")
    if execution_mode == "docker":
        if "command_line_docker_network_mode" not in backend_config:
            result.add_error(
                "Docker模式下的YourBackend需要配置'command_line_docker_network_mode'",
                f"{location}.command_line_docker_network_mode",
                "添加'command_line_docker_network_mode: bridge'(网络访问必需)",
            )

**参考**: `codex.py`实现了该模式,通过`_stream_docker()` / `_stream_local()`分支处理。

Docker Execution for SDK Backends (like Copilot)

SDK后端的Docker执行(如Copilot)

SDK-based backends (in-process Python clients) cannot run inside Docker — the SDK stays on host. Instead, Docker isolates file/shell operations by:
  1. Disable built-in tools in Docker mode via
    excluded_tools
    so the SDK doesn't use its native file/shell tools
  2. Route all file/shell ops through MCP servers that execute inside the Docker container
  3. Point
    working_directory
    to the Docker-mounted workspace path
基于SDK的后端(进程内Python客户端)无法运行在Docker内部,SDK保留在宿主机,Docker通过以下方式隔离文件/Shell操作:
  1. Docker模式下通过
    excluded_tools
    禁用内置工具,避免SDK使用原生文件/Shell工具
  2. 所有文件/Shell操作路由到Docker容器内执行的MCP服务
  3. working_directory
    指向Docker挂载的工作区路径

In init:

在__init__中:

self._docker_execution = (
    kwargs.get("command_line_execution_mode") == "docker"
    or self.config.get("command_line_execution_mode") == "docker"
)

Property to check if Docker is actually active:

检查Docker是否实际激活的属性:

@property
def _is_docker_mode(self) -> bool:
    if not self._docker_execution:
        return False
    if not self.filesystem_manager:
        return False
    dm = getattr(self.filesystem_manager, "docker_manager", None)
    if dm is None:
        return False
    agent_id = self.config.get("agent_id")
    if agent_id and dm.get_container(agent_id):
        return True
    return False

In get_disallowed_tools():

在get_disallowed_tools()中:

def get_disallowed_tools(self, config):
    if self._docker_execution:
        return [
            "editFile", "createFile", "deleteFile", "readFile",
            "listDirectory", "runShellCommand", "shellCommand",
        ]
    return []

In stream_with_tools(), merge Docker-excluded tools into session config:

在stream_with_tools()中,合并Docker排除工具到会话配置:

```python
if self._docker_execution:
    docker_excluded = self.get_disallowed_tools(self.config)
    if docker_excluded:
        excluded_tools = list(set((excluded_tools or []) + docker_excluded))
```

**Key differences from CLI Docker pattern**:
- SDK stays on host (no `exec_create` / `exec_start`)
- Built-in tools disabled via SDK session config (`excluded_tools`), not by running inside container
- MCP servers still execute inside container (same as CLI pattern)
- PPM permission callback still active as defense-in-depth

**Config validation**: same pattern; require `command_line_docker_network_mode`:
```python
if backend_type == "copilot":
    execution_mode = backend_config.get("command_line_execution_mode")
    if execution_mode == "docker":
        if "command_line_docker_network_mode" not in backend_config:
            result.add_error(...)
```
Reference: `copilot.py` implements this pattern.

**与CLI Docker模式的关键差异**:
- SDK保留在宿主机(无`exec_create` / `exec_start`调用)
- 内置工具通过SDK会话配置(`excluded_tools`)禁用,而非运行在容器内
- MCP服务仍在容器内执行(与CLI模式相同)
- PPM权限回调仍作为深度防御生效

**配置校验**: 相同模式,要求`command_line_docker_network_mode`:
```python
if backend_type == "copilot":
    execution_mode = backend_config.get("command_line_execution_mode")
    if execution_mode == "docker":
        if "command_line_docker_network_mode" not in backend_config:
            result.add_error(...)
```
参考: `copilot.py`实现了该模式。

Docker MCP Path Resolution

Docker MCP路径解析

MCP server configs are built by the orchestrator on the host with absolute host file paths (e.g., `fastmcp run /host/path/massgen/mcp_tools/planning/_server.py:create_server`). Inside Docker, these paths don't exist unless explicitly mounted.

Solution: `_docker_manager.py` bind-mounts the `massgen/` package directory into the container at the same host path (read-only). This makes host-path-based MCP configs work as-is inside the container, using the latest source from the host rather than stale pip-installed modules.
MCP服务配置由宿主机上的编排器使用宿主机绝对路径构建(如`fastmcp run /host/path/massgen/mcp_tools/planning/_server.py:create_server`)。在Docker内部,这些路径不存在,除非显式挂载。

解决方案: `_docker_manager.py`将`massgen/`包目录以相同的宿主机路径绑定挂载到容器内(只读),使得基于宿主机路径的MCP配置在容器内无需修改即可生效,使用宿主机上的最新源码而非过时的pip安装模块。
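The same-path bind mount can be sketched with a docker-py style `volumes` mapping (a minimal illustration under assumptions: the `massgen_mount` helper and the use of docker-py's volume format are hypothetical, not `_docker_manager.py`'s actual API):

```python
from pathlib import Path

def massgen_mount(host_pkg_dir: str) -> dict:
    """Bind-mount the host massgen/ package into the container at the SAME
    absolute path, read-only (docker-py ``volumes`` format). Hypothetical
    helper; the real _docker_manager.py API may differ."""
    host_pkg = str(Path(host_pkg_dir).resolve())
    return {host_pkg: {"bind": host_pkg, "mode": "ro"}}

# Because host path == container path, a host-built MCP command such as
#   fastmcp run /host/path/massgen/mcp_tools/planning/_server.py:create_server
# resolves inside the container unchanged.
```

Mounting at the identical path is the design choice that lets host-generated configs pass through without any path rewriting.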

Shell Sandboxing for Backends with Native Tools

带有原生工具的后端的Shell沙箱

This section applies only to backends that bring their own shell/file tools (e.g., CLI wrappers with built-in command execution). If a backend delegates all tool execution to MassGen's MCP servers, MassGen's `PathPermissionManager` handles sandboxing automatically.

When the backend has its own tools that can access the filesystem or run commands, you must map MassGen's permission model to the backend's sandbox mechanism:

| MassGen concept | Expected access |
|---|---|
| Workspace | Write |
| Temp workspaces | Read-only |
| Read context paths | Read-only |
| Write context paths | Write (must be explicitly granted in backend sandbox config) |

The workspace is typically writable by default, and read access to the full filesystem is usually allowed. The key task is ensuring write context paths are added to the backend's writable allowlist (e.g., `sandbox_workspace_write.writable_roots` for Codex, `allowed_directories` for Claude Code).

Docker mode: skip backend sandbox config; the container IS the sandbox, so grant full access inside the container.
本节仅适用于自带Shell/文件工具的后端(如内置命令执行能力的CLI封装)。如果后端将所有工具执行委托给MassGen的MCP服务,MassGen的`PathPermissionManager`会自动处理沙箱。

当后端自有工具可以访问文件系统或运行命令时,必须将MassGen的权限模型映射到后端的沙箱机制:

| MassGen概念 | 预期访问权限 |
|---|---|
| 工作区 | 可写 |
| 临时工作区 | 只读 |
| 读上下文路径 | 只读 |
| 写上下文路径 | 可写(必须显式添加到后端沙箱的可写允许列表) |

工作区通常默认可写,全文件系统读权限通常允许。核心任务是确保写上下文路径被添加到后端的可写允许列表(如Codex的`sandbox_workspace_write.writable_roots`、Claude Code的`allowed_directories`)。

Docker模式: 跳过后端沙箱配置,容器本身就是沙箱,在容器内部授予完全访问权限。
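The permission mapping above can be sketched as a small config builder (a hedged illustration: the `context_paths` shape and the Codex-style `sandbox_workspace_write.writable_roots` output key are assumptions about structure, not MassGen's actual API):

```python
def build_sandbox_config(workspace: str, context_paths: list) -> dict:
    """Sketch: translate MassGen's permission model into a Codex-style
    sandbox config. Only writes need explicit allowlisting; broad read
    access is assumed to be granted by the backend sandbox itself."""
    writable = [workspace]  # workspace is writable by default
    for cp in context_paths:
        if cp.get("permission") == "write":
            writable.append(cp["path"])  # write context paths must be granted
    # read context paths and temp workspaces need no entry: read-only access
    # falls out of the sandbox's default read policy
    return {"sandbox_workspace_write": {"writable_roots": writable}}
```

For example, a workspace of `/ws` plus one write context path `/data/out` yields `writable_roots == ["/ws", "/data/out"]`, while read-only context paths add nothing.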

SDK Wrapper Backend (like Claude Code, Copilot)

SDK封装后端(如Claude Code、Copilot)

__init__: import SDK, configure options, detect Docker mode
stream_with_tools: call SDK with messages -> iterate events -> yield StreamChunks
MCP: pass mcp_servers dict to SDK options
Custom tools: create SDK MCP server from tool definitions
State: SDK manages conversation state internally
Docker: SDK stays on host; disable built-in file/shell tools via excluded_tools;
        route file/shell ops through MCP servers in container (see §Docker Execution for SDK Backends)
__init__: 导入SDK,配置选项,检测Docker模式
stream_with_tools: 调用SDK传递消息 → 遍历事件 → 返回StreamChunks
MCP: 将mcp_servers字典传递给SDK选项
自定义工具: 从工具定义创建SDK MCP服务
状态: SDK内部管理会话状态
Docker: SDK保留在宿主机;通过excluded_tools禁用内置文件/Shell工具;
        文件/Shell操作路由到容器内的MCP服务(参见§SDK后端的Docker执行)
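The "call SDK → iterate events → yield StreamChunks" loop above can be sketched as follows (the event dict shape, the simplified `StreamChunk`, and the fake SDK stream are stand-ins, not the real MassGen or SDK types):

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, AsyncIterable

@dataclass
class StreamChunk:  # simplified stand-in for MassGen's StreamChunk
    type: str
    content: str = ""

async def stream_sdk_events(
    events: AsyncIterable,
) -> AsyncGenerator[StreamChunk, None]:
    """Translate SDK streaming events into StreamChunks, one by one."""
    async for event in events:
        if event["type"] == "text":
            yield StreamChunk(type="content", content=event["text"])
        elif event["type"] == "done":
            yield StreamChunk(type="done")

async def _demo() -> list:
    async def fake_sdk():  # stands in for the SDK's event iterator
        for e in ({"type": "text", "text": "hi"}, {"type": "done"}):
            yield e
    return [chunk async for chunk in stream_sdk_events(fake_sdk())]

chunks = asyncio.run(_demo())
```

A real `stream_with_tools` wraps this loop, passing `mcp_servers` and session options to the SDK client before iterating.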

API Backend (like Claude, Gemini)

API后端(如Claude、Gemini)

Subclass CustomToolAndMCPBackend or ChatCompletionsBackend
Implement formatter if non-standard API format
Implement API params handler to filter/transform config -> API params
MCP + custom tools handled automatically by base class
继承CustomToolAndMCPBackend或ChatCompletionsBackend
如果API格式非标准则实现格式化器
实现API参数处理器,过滤/转换配置到API参数
基类自动处理MCP + 自定义工具
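The API params handler's filter/transform role can be sketched as an allowlist pass (a hedged sketch: the key set is illustrative, since each real handler defines its own set for its provider's API):

```python
# Hypothetical allowlist; a real handler's set depends on the provider's API.
ALLOWED_API_PARAMS = {"model", "temperature", "max_tokens", "top_p", "stream"}

def build_api_params(config: dict) -> dict:
    """Drop MassGen-only YAML keys (agent ids, workspace paths, ...) and
    keep only parameters the provider's HTTP API accepts."""
    return {k: v for k, v in config.items() if k in ALLOWED_API_PARAMS}
```

This is why the handler is only required for HTTP API backends: SDK and CLI wrappers consume config through their own option objects instead.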

Reference Backends by Complexity

按复杂度排序的参考后端

| Backend | Lines | Type | Good reference for |
|---|---|---|---|
| grok.py | ~81 | Subclass of ChatCompletions | Minimal API backend |
| chat_completions.py | ~1150 | OpenAI-compatible | Standard API backend |
| response.py | ~1600 | Response API | Reasoning models |
| claude.py | ~1830 | Custom API | Full-featured API |
| copilot.py | ~1250 | SDK wrapper | SDK-based + PPM permission callback + Docker mode |
| codex.py | ~2090 | CLI wrapper | CLI-based + Docker |
| gemini.py | ~2460 | Custom API | Non-OpenAI API |
| claude_code.py | ~3530 | SDK wrapper | Full-featured stateful |
| 后端 | 代码行数 | 类型 | 适合参考的场景 |
|---|---|---|---|
| grok.py | ~81 | ChatCompletions子类 | 最简API后端 |
| chat_completions.py | ~1150 | 兼容OpenAI | 标准API后端 |
| response.py | ~1600 | Response API | 推理模型 |
| claude.py | ~1830 | 自定义API | 全功能API |
| copilot.py | ~1250 | SDK封装 | 基于SDK + PPM权限回调 + Docker模式 |
| codex.py | ~2090 | CLI封装 | 基于CLI + Docker |
| gemini.py | ~2460 | 自定义API | 非OpenAI API |
| claude_code.py | ~3530 | SDK封装 | 全功能有状态后端 |

File Summary

文件汇总

| # | File | Required? | Purpose |
|---|---|---|---|
| 1 | massgen/backend/<name>.py | Yes | Core backend |
| 2 | massgen/formatter/<name>_formatter.py | If non-standard API | Message/tool formatting |
| 3 | massgen/api_params_handler/<name>_handler.py | If HTTP API | Param filtering |
| 4 | massgen/backend/__init__.py | Yes | Export |
| 5 | massgen/cli.py | Yes | Backend creation |
| 6 | massgen/backend/capabilities.py | Yes | Model registry |
| 7 | massgen/config_validator.py | If special rules | Validation |
| 8 | massgen/token_manager/token_manager.py | If not in LiteLLM | Pricing |
| 9 | massgen/backend/base.py | If new YAML params | Excluded params |
| 10 | massgen/api_params_handler/_base.py | If new YAML params | Excluded params |
| 11 | massgen/tests/test_* | Yes | Tests |
| 12 | massgen/configs/providers/<name>/ | Yes | Example configs |
| 13 | docs/source/reference/yaml_schema.rst | Yes | Schema docs |
| 14 | docs/source/user_guide/backends.rst | Yes | User guide |
| 15 | docs/scripts/generate_backend_tables.py | Run it | Regenerate tables |
| # | 文件 | 是否必需 | 用途 |
|---|---|---|---|
| 1 | massgen/backend/<name>.py | 是 | 核心后端实现 |
| 2 | massgen/formatter/<name>_formatter.py | API非标准时必需 | 消息/工具格式化 |
| 3 | massgen/api_params_handler/<name>_handler.py | HTTP API时必需 | 参数过滤 |
| 4 | massgen/backend/__init__.py | 是 | 导出 |
| 5 | massgen/cli.py | 是 | 后端创建 |
| 6 | massgen/backend/capabilities.py | 是 | 模型注册表 |
| 7 | massgen/config_validator.py | 有特殊规则时必需 | 校验 |
| 8 | massgen/token_manager/token_manager.py | 不在LiteLLM中时必需 | 定价 |
| 9 | massgen/backend/base.py | 新增YAML参数时必需 | 排除参数 |
| 10 | massgen/api_params_handler/_base.py | 新增YAML参数时必需 | 排除参数 |
| 11 | massgen/tests/test_* | 是 | 测试 |
| 12 | massgen/configs/providers/<name>/ | 是 | 示例配置 |
| 13 | docs/source/reference/yaml_schema.rst | 是 | Schema文档 |
| 14 | docs/source/user_guide/backends.rst | 是 | 用户指南 |
| 15 | docs/scripts/generate_backend_tables.py | 运行即可 | 重新生成表格 |