hosted-agents

Hosted Agent Infrastructure

Hosted agents run in remote sandboxed environments rather than on local machines. When designed well, they provide unlimited concurrency, consistent execution environments, and multiplayer collaboration. The critical insight is that session speed should be limited only by model provider time-to-first-token, with all infrastructure setup completed before the user starts their session.

When to Activate

Activate this skill when:

Building background coding agents that run independently of user devices
Designing sandboxed execution environments for agent workloads
Implementing multiplayer agent sessions with shared state
Creating multi-client agent interfaces (Slack, Web, Chrome extensions)
Scaling agent infrastructure beyond local machine constraints
Building systems where agents spawn sub-agents for parallel work

Core Concepts

Hosted agents address the fundamental limitations of local agent execution: resource contention, environment inconsistency, and single-user constraints. By moving agent execution to remote sandboxed environments, teams gain unlimited concurrency, reproducible environments, and collaborative workflows.

The architecture consists of three layers: sandbox infrastructure for isolated execution, API layer for state management and client coordination, and client interfaces for user interaction across platforms. Each layer has specific design requirements that enable the system to scale.

Detailed Topics

Sandbox Infrastructure

The Core Challenge

Spinning up full development environments quickly is the primary technical challenge. Users expect near-instant session starts, but development environments require cloning repositories, installing dependencies, and running build steps.

Image Registry Pattern

Pre-build environment images on a regular cadence (every 30 minutes works well). Each image contains:

Cloned repository at a known commit
All runtime dependencies installed
Initial setup and build commands completed
Cached files from running app and test suite once

When starting a session, spin up a sandbox from the most recent image. The repository is at most 30 minutes out of date, making synchronization with the latest code much faster.
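
As a rough illustration of picking the image to boot from, the sketch below selects the newest image and flags one that has outlived the build cadence. The `EnvImage` record, its fields, and both helper names are assumptions made for this example, not part of any particular registry's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

BUILD_CADENCE = timedelta(minutes=30)  # cadence suggested in the text


@dataclass(frozen=True)
class EnvImage:
    image_id: str        # hypothetical registry identifier
    commit: str          # commit the repository was cloned at
    built_at: datetime   # when the image build finished


def latest_image(images: list[EnvImage]) -> EnvImage:
    """Return the most recently built image."""
    if not images:
        raise LookupError("no pre-built images available")
    return max(images, key=lambda img: img.built_at)


def is_stale(image: EnvImage, now: datetime) -> bool:
    """True when the image is older than one build cadence."""
    return now - image.built_at > BUILD_CADENCE
```

A stale result would typically mean a scheduled build failed, so the session can still start from the image but needs a longer git sync.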

Snapshot and Restore

Take filesystem snapshots at key points:

After initial image build (base snapshot)
When agent finishes making changes (session snapshot)
Before sandbox exit for potential follow-up

This enables instant restoration for follow-up prompts without re-running setup.
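
One way to picture the restore decision is a lookup that prefers a session snapshot and falls back to the base snapshot. The `SnapshotStore` class below is an in-memory stand-in invented for this sketch; a real sandbox provider would supply its own snapshot identifiers and calls.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotStore:
    """In-memory stand-in for a sandbox provider's snapshot bookkeeping (hypothetical)."""
    base_snapshot: str  # taken after the initial image build
    session_snapshots: dict[str, str] = field(default_factory=dict)

    def save_session(self, session_id: str, snapshot_id: str) -> None:
        # Called when the agent finishes changes or just before the sandbox exits.
        self.session_snapshots[session_id] = snapshot_id

    def restore_point(self, session_id: str) -> str:
        # Follow-up prompts resume from the session snapshot if one exists;
        # brand-new sessions start from the base snapshot.
        return self.session_snapshots.get(session_id, self.base_snapshot)
```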

Git Configuration for Background Agents

Since git operations are not tied to a specific user during image builds:

Generate GitHub app installation tokens for repository access during clone
Update git config's `user.name` and `user.email` when committing and pushing changes
Use the prompting user's identity for commits, not the app identity
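
The two identities can be kept apart with a pair of small helpers, sketched below. The function names are assumptions for illustration; the `x-access-token` clone URL form and the `git config` keys are the standard GitHub and git ones.

```python
def clone_url(owner: str, repo: str, installation_token: str) -> str:
    """HTTPS clone URL authenticated with a GitHub App installation token."""
    return f"https://x-access-token:{installation_token}@github.com/{owner}/{repo}.git"


def identity_commands(user_name: str, user_email: str) -> list[list[str]]:
    """git invocations that set the commit identity to the prompting user."""
    return [
        ["git", "config", "user.name", user_name],
        ["git", "config", "user.email", user_email],
    ]
```

Each command list can be passed to `subprocess.run(cmd, cwd=repo_dir, check=True)` inside the sandbox before the agent commits.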

Warm Pool Strategy

Maintain a pool of pre-warmed sandboxes for high-volume repositories:

Sandboxes are ready before users start sessions
Expire and recreate pool entries as new image builds complete
Start warming sandbox as soon as user begins typing (predictive warm-up)
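
A minimal sketch of such a pool follows, assuming a single repository and a placeholder `_boot` method where the real sandbox provider call would go. The class and method names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sandbox:
    sandbox_id: str
    image_id: Optional[str]


class WarmPool:
    """Keeps ready sandboxes for one repository, all built from the current image."""

    def __init__(self, target_size: int) -> None:
        self.target_size = target_size
        self.current_image: Optional[str] = None
        self._ready: deque = deque()
        self._counter = 0

    def _boot(self) -> Sandbox:
        # Placeholder for the real (slow) provider call that starts a sandbox.
        self._counter += 1
        return Sandbox(f"sb-{self._counter}", self.current_image)

    def refill(self) -> None:
        while len(self._ready) < self.target_size:
            self._ready.append(self._boot())

    def on_new_image(self, image_id: str) -> None:
        # Expire entries built from the old image, then refill from the new one.
        self.current_image = image_id
        self._ready.clear()
        self.refill()

    def acquire(self) -> Sandbox:
        # Hand out a warm sandbox if one exists; otherwise boot on demand.
        sandbox = self._ready.popleft() if self._ready else self._boot()
        self.refill()
        return sandbox
```

In a real system `refill` would run in the background so that `acquire` never waits on a boot; it is synchronous here only to keep the sketch short.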

Agent Framework Selection

Server-First Architecture

Choose an agent framework structured as a server first, with TUI and desktop apps as clients. This enables:

Multiple custom clients without duplicating agent logic
Consistent behavior across all interaction surfaces
Plugin systems for extending functionality
Event-driven architectures for real-time updates

Code as Source of Truth

Select frameworks where the agent can read its own source code to understand behavior. This is underrated in AI development: having the code as source of truth prevents hallucination about the agent's own capabilities.

Plugin System Requirements

The framework should support plugins that:

Listen to tool execution events (e.g., `tool.execute.before`)
Block or modify tool calls conditionally
Inject context or state at runtime
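
The three capabilities above can be sketched with a small hook dispatcher. Only the event name `tool.execute.before` comes from the text; `PluginHost`, `ToolCall`, and the example policy are assumptions for illustration and do not reflect any specific framework's plugin API.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ToolCall:
    name: str
    args: dict


# A hook returns a (possibly modified) ToolCall, or None to block the call.
Hook = Callable[[ToolCall], Optional[ToolCall]]


@dataclass
class PluginHost:
    hooks: dict = field(default_factory=dict)  # event name -> list of hooks

    def on(self, event: str, hook: Hook) -> None:
        self.hooks.setdefault(event, []).append(hook)

    def run_before(self, call: ToolCall) -> Optional[ToolCall]:
        # Run every registered hook in order; the first None blocks the call.
        for hook in self.hooks.get("tool.execute.before", []):
            result = hook(call)
            if result is None:
                return None
            call = result
        return call


def block_force_push(call: ToolCall) -> Optional[ToolCall]:
    # Example policy: refuse shell commands that force-push.
    if call.name == "shell" and "push --force" in call.args.get("command", ""):
        return None
    return call
```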

Speed Optimizations

Predictive Warm-Up

Start warming the sandbox as soon as a user begins typing their prompt:

Clone latest changes in parallel with user typing
Run initial setup before user hits enter
For fast spin-up, sandbox can be ready before user finishes typing

Parallel File Reading

Allow the agent to start reading files immediately, even if sync from latest base branch is not complete:

In large repositories, incoming prompts rarely modify recently-changed files
Agent can research immediately without waiting for git sync
Block file edits (not reads) until synchronization completes
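
The read/write asymmetry can be expressed as a gate around the file tools. The sketch below uses a `threading.Event` to hold edits until sync finishes; the `SyncGate` class and its method names are hypothetical.

```python
import threading
from typing import Optional


class SyncGate:
    """Lets reads through immediately; holds edits until git sync finishes."""

    def __init__(self) -> None:
        self._synced = threading.Event()

    def mark_synced(self) -> None:
        # Called by the background job once the base branch is up to date.
        self._synced.set()

    def read_file(self, path: str) -> str:
        # Reads never wait, so the agent can start researching right away.
        with open(path, encoding="utf-8") as f:
            return f.read()

    def write_file(self, path: str, content: str, timeout: Optional[float] = None) -> bool:
        # Block until sync completes; return False if the timeout expires first.
        if not self._synced.wait(timeout):
            return False
        with open(path, "w", encoding="utf-8") as f:
            f.write(content)
        return True
```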

Maximize Build-Time Work

Move everything possible to the image build step:

Full dependency installation
Database schema setup
Initial app and test suite runs (populates caches)
Build-time duration is invisible to users

Self-Spawning Agents

Agent-Spawned Sessions

Create tools that allow agents to spawn new sessions:

Research tasks across different repositories
Parallel subtask execution for large changes
Multiple smaller PRs from one major task

Frontier models are capable of containing themselves. The tools should:

Start a new session with specified parameters
Read status of any session (check-in capability)
Continue main work while sub-sessions run in parallel
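
The tool surface described above could look roughly like the following. `SessionManager`, `spawn_session`, and `get_status` are names invented for this sketch, and the in-memory dictionary stands in for whatever session store the API layer actually uses.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    session_id: str
    repo: str
    prompt: str
    parent_id: Optional[str]
    status: str = "running"  # "running", "done", or "failed"


class SessionManager:
    def __init__(self) -> None:
        self.sessions: dict = {}
        self._ids = itertools.count(1)

    def spawn_session(self, repo: str, prompt: str, parent_id: Optional[str] = None) -> str:
        """Tool: start a new session and return its id without waiting for it."""
        sid = f"s{next(self._ids)}"
        self.sessions[sid] = Session(sid, repo, prompt, parent_id)
        return sid

    def get_status(self, session_id: str) -> str:
        """Tool: check in on any session."""
        return self.sessions[session_id].status

    def children(self, parent_id: str) -> list:
        """Ids of the sub-sessions a given session has spawned."""
        return [s.session_id for s in self.sessions.values() if s.parent_id == parent_id]
```

Because `spawn_session` returns immediately, the parent agent can keep working and poll `get_status` when it needs a sub-session's result.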

Prompt Engineering for Self-Spawning

Engineer prompts to guide when agents spawn sub-sessions:

Research tasks that require cross-repository exploration
Breaking monolithic changes into smaller PRs
Parallel exploration of different approaches

API Layer

Per-Session State Isolation

Each session requires its own isolated state storage:

Dedicated database per session (SQLite per session works well)
No session can impact another's performance
Handles hundreds of concurrent sessions
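
A minimal sketch of the SQLite-per-session idea, using Python's standard `sqlite3` module with one database file per session. The `events` table layout and the helper names are assumptions for illustration, and the session ID is assumed to be a trusted, filename-safe value.

```python
import sqlite3
from pathlib import Path


def open_session_db(root: Path, session_id: str) -> sqlite3.Connection:
    """Open (creating if needed) the database file that belongs to one session."""
    root.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(root / f"{session_id}.sqlite3")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " kind TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return conn


def append_event(conn: sqlite3.Connection, kind: str, payload: str) -> int:
    """Append one event and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO events (kind, payload) VALUES (?, ?)", (kind, payload)
        )
    return cur.lastrowid
```

Since every session writes to its own file, a busy session cannot hold locks that slow down another one.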

Real-Time Streaming

Agent work involves high-frequency updates:

Token streaming from model providers
Tool execution status updates
File change notifications

WebSocket connections with hibernation APIs reduce compute costs during idle periods while maintaining open connections.

Synchronization Across Clients

Build a single state system that synchronizes across:

Chat interfaces
Slack bots
Chrome extensions
Web interfaces
VS Code instances

All changes sync to the session state, enabling seamless client switching.
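
One way to read "a single state system" is an append-only event log per session that every client subscribes to. The `SessionHub` class below is a hypothetical, in-process sketch; in production the listeners would be WebSocket connections rather than plain callables.

```python
from typing import Callable

Event = dict
Listener = Callable[[Event], None]


class SessionHub:
    """Single source of truth for one session; every client subscribes to it."""

    def __init__(self) -> None:
        self.log: list = []
        self._listeners: dict = {}

    def subscribe(self, client_id: str, listener: Listener, since: int = 0) -> None:
        # Replay history first so a client that joins late sees the same state.
        for event in self.log[since:]:
            listener(event)
        self._listeners[client_id] = listener

    def unsubscribe(self, client_id: str) -> None:
        self._listeners.pop(client_id, None)

    def publish(self, event: Event) -> None:
        # Record the event, then fan it out to every connected client.
        self.log.append(event)
        for listener in self._listeners.values():
            listener(event)
```

Switching from Slack to the web interface is then just a new `subscribe` call with `since=0`.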

Multiplayer Support

Why Multiplayer Matters

Multiplayer enables:

Teaching non-engineers to use AI effectively
Live QA sessions with multiple team members
Real-time PR review with immediate changes
Collaborative debugging sessions

Implementation Requirements

Data model must not tie sessions to single authors
Pass authorship info to each prompt
Attribute code changes to the prompting user
Share session links for instant collaboration

With proper synchronization architecture, multiplayer support is nearly free to add.
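
A data model along these lines keeps authorship on each prompt instead of on the session. The class and field names below are assumptions chosen for the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Author:
    name: str
    email: str


@dataclass
class Prompt:
    text: str
    author: Author  # authorship travels with every prompt


@dataclass
class SharedSession:
    session_id: str
    prompts: list = field(default_factory=list)  # note: no single owner field

    def add_prompt(self, text: str, author: Author) -> Prompt:
        prompt = Prompt(text, author)
        self.prompts.append(prompt)
        return prompt

    def participants(self) -> set:
        return {p.author for p in self.prompts}

    def commit_author(self) -> Author:
        # Changes are attributed to whoever sent the most recent prompt.
        return self.prompts[-1].author
```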

Authentication and Authorization

User-Based Commits

Use GitHub authentication to:

Obtain user tokens for PR creation
Open PRs on behalf of the user (not the app)
Prevent users from approving their own changes

Sandbox-to-API Flow

Sandbox pushes changes (updating git user config)
Sandbox sends event to API with branch name and session ID
API uses user's GitHub token to create PR
GitHub webhooks notify API of PR events
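
Steps two and three of this flow might be sketched as follows: the sandbox reports a push, and the API turns it into a pull request call made with the user's token. The `PushEvent` shape and `build_pr_request` helper are hypothetical; the endpoint and body fields follow GitHub's REST API for creating a pull request. The function only builds the request and leaves sending it to whatever HTTP client the API layer uses.

```python
from dataclasses import dataclass


@dataclass
class PushEvent:
    session_id: str  # which session produced the branch
    branch: str      # branch the sandbox just pushed


def build_pr_request(event: PushEvent, owner: str, repo: str, base: str,
                     title: str, user_token: str) -> dict:
    """Describe the GitHub API call that opens a PR as the prompting user."""
    return {
        "method": "POST",
        "url": f"https://api.github.com/repos/{owner}/{repo}/pulls",
        "headers": {
            # The user's token, not the app's, so the PR is opened on the user's behalf.
            "Authorization": f"Bearer {user_token}",
            "Accept": "application/vnd.github+json",
        },
        "json": {"title": title, "head": event.branch, "base": base},
    }
```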

Client Implementations

Slack Integration

Slack is the most effective distribution channel for internal adoption:

Creates virality loop as team members see others using it
No syntax required, natural chat interface
Classify repository from message, thread context, and channel name

Build a classifier to determine which repository to work in:

Fast model with descriptions of available repositories
Include hints for common repositories
Allow "unknown" option for ambiguous cases
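
The classifier can be as simple as a prompt builder plus a strict parser that falls back to "unknown". The prompt wording, function names, and repository catalog below are illustrative assumptions; the call to the fast model itself is left out. Repository names are assumed to be lowercase keys.

```python
def build_classifier_prompt(message: str, thread_context: str, channel: str,
                            repos: dict) -> str:
    """Assemble the prompt for a fast model; the wording here is illustrative."""
    catalog = "\n".join(f"- {name}: {desc}" for name, desc in repos.items())
    return (
        "Pick the repository this request is about.\n"
        f"Repositories:\n{catalog}\n"
        f"Channel: {channel}\n"
        f"Thread context: {thread_context}\n"
        f"Message: {message}\n"
        "Answer with exactly one repository name, or 'unknown' if unclear."
    )


def parse_choice(model_output: str, repos: dict) -> str:
    """Map the model's reply onto a known repository, else 'unknown'."""
    choice = model_output.strip().strip("'\"").lower()
    return choice if choice in repos else "unknown"
```

Treating anything outside the catalog as "unknown" lets the bot ask the user which repository they meant instead of guessing.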

Web Interface

Core features:

Works on desktop and mobile
Real-time streaming of agent work
Hosted VS Code instance running inside sandbox
Streamed desktop view for visual verification
Before/after screenshots for PRs

Statistics page showing:

Sessions resulting in merged PRs (primary metric)
Usage over time
Live "humans prompting" count (prompts in last 5 minutes)

Chrome Extension

For non-engineering users:

Sidebar chat interface with screenshot tool
DOM and React internals extraction instead of raw images
Reduces token usage while maintaining precision
Distribute via managed device policy (bypasses Chrome Web Store)

Practical Guidance

Follow-Up Message Handling

Decide how to handle messages sent during execution:

Queue approach: Messages wait until current prompt completes
Insert approach: Messages are processed immediately

Queueing is simpler to manage and lets users send thoughts on next steps while the agent works. Build a mechanism to stop the agent mid-execution when needed.
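
A sketch of the queue approach with a stop flag, assuming the agent loop checks `stop_requested` between tool calls. `PromptQueue` and its methods are hypothetical names.

```python
from collections import deque
from typing import Optional


class PromptQueue:
    """Holds follow-up messages until the current prompt completes."""

    def __init__(self) -> None:
        self._pending: deque = deque()
        self.stop_requested = False

    def submit(self, text: str) -> None:
        # Messages sent during execution wait here instead of interrupting.
        self._pending.append(text)

    def request_stop(self) -> None:
        # The agent loop is expected to check this flag between tool calls.
        self.stop_requested = True

    def next_prompt(self) -> Optional[str]:
        """Called by the agent loop once the current prompt has finished or stopped."""
        self.stop_requested = False
        return self._pending.popleft() if self._pending else None
```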

Metrics That Matter

Track metrics that indicate real value:

Sessions resulting in merged PRs (primary success metric)
Time from session start to first model response
PR approval rate and revision count
Agent-written code percentage across repositories
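
The first two metrics can be computed from per-session records, as in the sketch below. The `SessionRecord` fields are assumptions; timestamps are plain seconds to keep the example self-contained.

```python
import statistics
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionRecord:
    started_at: float                   # seconds since some epoch
    first_response_at: Optional[float]  # None if the model never responded
    pr_merged: bool


def merged_pr_rate(records: list) -> float:
    """Fraction of sessions that ended in a merged PR."""
    if not records:
        return 0.0
    return sum(r.pr_merged for r in records) / len(records)


def median_time_to_first_response(records: list) -> Optional[float]:
    """Median seconds from session start to first model response."""
    latencies = [
        r.first_response_at - r.started_at
        for r in records
        if r.first_response_at is not None
    ]
    return statistics.median(latencies) if latencies else None
```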

Adoption Strategy

Internal adoption patterns that work:

Work in public spaces (Slack channels) for visibility
Let the product create virality loops
Don't force usage over existing tools
Build to people's needs, not hypothetical requirements

Guidelines

Pre-build environment images on regular cadence (30 minutes is a good default)
Start warming sandboxes when users begin typing, not when they submit
Allow file reads before git sync completes; block only writes
Structure agent framework as server-first with clients as thin wrappers
Isolate state per session to prevent cross-session interference
Attribute commits to the user who prompted, not the app
Track merged PRs as primary success metric
Build for multiplayer from the start; it is nearly free with proper sync architecture

Integration

This skill builds on multi-agent-patterns for agent coordination and tool-design for agent-tool interfaces. It connects to:

multi-agent-patterns - Self-spawning agents follow supervisor patterns
tool-design - Building tools for agent spawning and status checking
context-optimization - Managing context across distributed sessions
filesystem-context - Using filesystem for session state and artifacts

References

Internal reference:

Infrastructure Patterns - Detailed implementation patterns

Related skills in this collection:

multi-agent-patterns - Coordination patterns for self-spawning agents
tool-design - Designing tools for hosted environments
context-optimization - Managing context in distributed systems

External resources:

Ramp - Why We Built Our Own Background Agent
Modal Sandboxes - Cloud sandbox infrastructure
Cloudflare Durable Objects - Per-session state management
OpenCode - Server-first agent framework

Skill Metadata

Created: 2026-01-12
Last Updated: 2026-01-12
Author: Agent Skills for Context Engineering Contributors
Version: 1.0.0
