hosted-agents
Hosted Agent Infrastructure
Hosted agents run in remote sandboxed environments rather than on local machines. When designed well, they provide unlimited concurrency, consistent execution environments, and multiplayer collaboration. The critical insight is that session speed should be limited only by model provider time-to-first-token, with all infrastructure setup completed before the user starts their session.
When to Activate
Activate this skill when:
- Building background coding agents that run independently of user devices
- Designing sandboxed execution environments for agent workloads
- Implementing multiplayer agent sessions with shared state
- Creating multi-client agent interfaces (Slack, Web, Chrome extensions)
- Scaling agent infrastructure beyond local machine constraints
- Building systems where agents spawn sub-agents for parallel work
Core Concepts
Hosted agents address the fundamental limitations of local agent execution: resource contention, environment inconsistency, and single-user constraints. By moving agent execution to remote sandboxed environments, teams gain unlimited concurrency, reproducible environments, and collaborative workflows.
The architecture consists of three layers: sandbox infrastructure for isolated execution, API layer for state management and client coordination, and client interfaces for user interaction across platforms. Each layer has specific design requirements that enable the system to scale.
Detailed Topics
Sandbox Infrastructure
The Core Challenge
Spinning up full development environments quickly is the primary technical challenge. Users expect near-instant session starts, but development environments require cloning repositories, installing dependencies, and running build steps.
Image Registry Pattern
Pre-build environment images on a regular cadence (every 30 minutes works well). Each image contains:
- Cloned repository at a known commit
- All runtime dependencies installed
- Initial setup and build commands completed
- Cached files from running app and test suite once
When starting a session, spin up a sandbox from the most recent image. The repository is at most 30 minutes out of date, making synchronization with the latest code much faster.
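The selection step above can be sketched as follows. This is a minimal illustration, not a registry client: the `EnvImage` record, its field names, and the helper functions are assumptions made for the example, and the 30-minute cadence is the default suggested above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class EnvImage:
    image_id: str        # hypothetical registry identifier
    commit_sha: str      # commit the repository was cloned at
    built_at: datetime   # when the image build finished


def newest_image(images: list[EnvImage]) -> Optional[EnvImage]:
    """Return the most recently built image, or None if none exist yet."""
    return max(images, key=lambda img: img.built_at, default=None)


def is_stale(image: EnvImage, now: datetime,
             cadence: timedelta = timedelta(minutes=30)) -> bool:
    """True when the image is older than one build cadence."""
    return now - image.built_at > cadence


now = datetime(2026, 1, 12, 12, 0, tzinfo=timezone.utc)
images = [
    EnvImage("img-a", "1111111", now - timedelta(minutes=50)),
    EnvImage("img-b", "2222222", now - timedelta(minutes=10)),
]
chosen = newest_image(images)
```

A stale result from `is_stale` is a signal that the image build job has fallen behind; the session can still start from that image and sync more commits.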
Snapshot and Restore
Take filesystem snapshots at key points:
- After initial image build (base snapshot)
- When agent finishes making changes (session snapshot)
- Before sandbox exit for potential follow-up
This enables instant restoration for follow-up prompts without re-running setup.
Git Configuration for Background Agents
Since git operations are not tied to a specific user during image builds:
- Generate GitHub app installation tokens for repository access during clone
- Update git config's user.name and user.email when committing and pushing changes
- Use the prompting user's identity for commits, not the app identity
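As a rough sketch of the identity switch, the commands below set the repository-local git identity before a commit. `git config user.name` and `git config user.email` are standard git commands; the function names and the idea of calling them from a Python helper inside the sandbox are assumptions for illustration.

```python
import subprocess


def git_identity_commands(user_name: str, user_email: str) -> list[list[str]]:
    """Build the git commands that set the repository-local commit identity."""
    return [
        ["git", "config", "user.name", user_name],
        ["git", "config", "user.email", user_email],
    ]


def apply_git_identity(repo_dir: str, user_name: str, user_email: str) -> None:
    """Run the commands inside repo_dir; raises CalledProcessError on failure."""
    for cmd in git_identity_commands(user_name, user_email):
        subprocess.run(cmd, cwd=repo_dir, check=True)


commands = git_identity_commands("Ada Example", "ada@example.com")
```

Keeping command construction separate from execution makes the identity logic easy to test without a git checkout.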
Warm Pool Strategy
Maintain a pool of pre-warmed sandboxes for high-volume repositories:
- Sandboxes are ready before users start sessions
- Expire and recreate pool entries as new image builds complete
- Start warming sandbox as soon as user begins typing (predictive warm-up)
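One way to picture the pool is the in-memory sketch below. The `Sandbox` record, the `create` callback, and the `WarmPool` class are hypothetical; a real pool would also tear down expired sandboxes through the sandbox provider rather than just dropping references.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Sandbox:
    sandbox_id: str
    image_id: str


class WarmPool:
    def __init__(self, target_size: int, create: Callable[[str], Sandbox]):
        self._target = target_size
        self._create = create              # assumed: image_id -> running Sandbox
        self._ready: deque[Sandbox] = deque()
        self._image_id: Optional[str] = None

    def on_new_image(self, image_id: str) -> None:
        """Expire entries built from older images, then refill from the new one."""
        self._image_id = image_id
        self._ready = deque(s for s in self._ready if s.image_id == image_id)
        self._refill()

    def _refill(self) -> None:
        while self._image_id is not None and len(self._ready) < self._target:
            self._ready.append(self._create(self._image_id))

    def acquire(self) -> Sandbox:
        """Hand out a warm sandbox if available, otherwise create one on demand."""
        if self._image_id is None:
            raise RuntimeError("no image has been built yet")
        sandbox = self._ready.popleft() if self._ready else self._create(self._image_id)
        self._refill()
        return sandbox

    def ready_count(self) -> int:
        return len(self._ready)
```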
Agent Framework Selection
Server-First Architecture
Choose an agent framework structured as a server first, with TUI and desktop apps as clients. This enables:
- Multiple custom clients without duplicating agent logic
- Consistent behavior across all interaction surfaces
- Plugin systems for extending functionality
- Event-driven architectures for real-time updates
Code as Source of Truth
Select frameworks where the agent can read its own source code to understand behavior. This is underrated in AI development: having the code as the source of truth helps prevent hallucination about the agent's own capabilities.
Plugin System Requirements
The framework should support plugins that:
- Listen to tool execution events (e.g., tool.execute.before)
- Block or modify tool calls conditionally
- Inject context or state at runtime
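The plugin requirements above can be illustrated with a small event bus. This is a generic sketch and not the API of any particular framework: only the event name tool.execute.before comes from the text, while `PluginBus`, `ToolCall`, and the hook signature are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToolCall:
    tool: str
    args: dict


# A hook returns a (possibly modified) call, or None to block it.
Hook = Callable[[ToolCall], Optional[ToolCall]]


class PluginBus:
    def __init__(self) -> None:
        self._hooks: dict[str, list[Hook]] = {}

    def on(self, event: str, hook: Hook) -> None:
        self._hooks.setdefault(event, []).append(hook)

    def run_before(self, call: ToolCall) -> Optional[ToolCall]:
        """Pass the call through every 'tool.execute.before' hook in order."""
        for hook in self._hooks.get("tool.execute.before", []):
            result = hook(call)
            if result is None:
                return None          # a plugin blocked the call
            call = result            # a plugin may have rewritten the call
        return call


def block_force_push(call: ToolCall) -> Optional[ToolCall]:
    """Example plugin: refuse shell commands that force-push."""
    if call.tool == "bash" and "push --force" in call.args.get("command", ""):
        return None
    return call


bus = PluginBus()
bus.on("tool.execute.before", block_force_push)
```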
Speed Optimizations
Predictive Warm-Up
Start warming the sandbox as soon as a user begins typing their prompt:
- Clone latest changes in parallel with user typing
- Run initial setup before user hits enter
- For fast spin-up, sandbox can be ready before user finishes typing
Parallel File Reading
Allow the agent to start reading files immediately, even if sync from latest base branch is not complete:
- In large repositories, incoming prompts rarely modify recently-changed files
- Agent can research immediately without waiting for git sync
- Block file edits (not reads) until synchronization completes
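A minimal sketch of the read/write asymmetry, assuming the agent's file tools go through one gate object and the sync job calls `mark_synced()` when it finishes. The `SyncGate` class and its method names are hypothetical.

```python
import threading
from typing import Optional


class SyncGate:
    """Reads pass through immediately; writes wait until git sync completes."""

    def __init__(self) -> None:
        self._synced = threading.Event()

    def mark_synced(self) -> None:
        self._synced.set()

    def read_file(self, path: str) -> str:
        # Reads never wait: the agent can research against the slightly old tree.
        with open(path, encoding="utf-8") as f:
            return f.read()

    def write_file(self, path: str, content: str,
                   timeout: Optional[float] = None) -> None:
        # Event.wait returns False if the timeout expires before sync is marked.
        if not self._synced.wait(timeout):
            raise TimeoutError("git sync has not completed; write blocked")
        with open(path, "w", encoding="utf-8") as f:
            f.write(content)
```

With `timeout=None` a write simply blocks until the sync finishes, which matches the behavior described above.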
Maximize Build-Time Work
Move everything possible to the image build step:
- Full dependency installation
- Database schema setup
- Initial app and test suite runs (populates caches)
- Build-time duration is invisible to users
Self-Spawning Agents
Agent-Spawned Sessions
Create tools that allow agents to spawn new sessions:
- Research tasks across different repositories
- Parallel subtask execution for large changes
- Multiple smaller PRs from one major task
Frontier models are capable of containing themselves, so they can be trusted not to spawn sessions without bound. The tools should:
- Start a new session with specified parameters
- Read status of any session (check-in capability)
- Continue main work while sub-sessions run in parallel
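The two tools can be sketched as plain functions over a session registry. Everything here is illustrative: `SessionTools`, `spawn_session`, and `session_status` are hypothetical names, and a real implementation would start a sandbox instead of only recording the request.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Session:
    session_id: str
    repo: str
    prompt: str
    status: str = "running"


class SessionTools:
    """Tool surface exposed to the agent for spawning and checking sessions."""

    def __init__(self) -> None:
        self._sessions: dict[str, Session] = {}

    def spawn_session(self, repo: str, prompt: str) -> str:
        """Start a new session and return its ID without waiting for it."""
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = Session(session_id, repo, prompt)
        return session_id

    def session_status(self, session_id: str) -> str:
        """Check-in capability: read the status of any session."""
        return self._sessions[session_id].status

    def mark_finished(self, session_id: str) -> None:
        # Assumed to be called by the infrastructure when the sub-session ends.
        self._sessions[session_id].status = "finished"
```

Because `spawn_session` returns immediately, the parent agent can keep working and poll `session_status` when it needs the result.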
Prompt Engineering for Self-Spawning
Engineer prompts to guide when agents spawn sub-sessions:
- Research tasks that require cross-repository exploration
- Breaking monolithic changes into smaller PRs
- Parallel exploration of different approaches
API Layer
Per-Session State Isolation
Each session requires its own isolated state storage:
- Dedicated database per session (SQLite per session works well)
- No session can impact another's performance
- Handles hundreds of concurrent sessions
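A small sketch of the SQLite-per-session idea using Python's standard `sqlite3` module. The file layout, table schema, and function names are assumptions; the session ID is treated as a trusted identifier here, so a real system should validate it before using it in a file name.

```python
import sqlite3
from pathlib import Path


def open_session_db(root: Path, session_id: str) -> sqlite3.Connection:
    """Open (or create) the database file that belongs to exactly one session."""
    root.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(root / f"{session_id}.sqlite3")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "kind TEXT NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    return conn


def append_event(conn: sqlite3.Connection, kind: str, payload: str) -> None:
    # 'with conn' commits on success and rolls back on error.
    with conn:
        conn.execute("INSERT INTO events (kind, payload) VALUES (?, ?)", (kind, payload))


def count_events(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Since each session has its own file, a heavy write load in one session cannot lock or slow the store of another.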
Real-Time Streaming
Agent work involves high-frequency updates:
- Token streaming from model providers
- Tool execution status updates
- File change notifications
WebSocket connections with hibernation APIs reduce compute costs during idle periods while maintaining open connections.
Synchronization Across Clients
Build a single state system that synchronizes across:
- Chat interfaces
- Slack bots
- Chrome extensions
- Web interfaces
- VS Code instances
All changes sync to the session state, enabling seamless client switching.
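The single state system can be pictured as one event log per session that every client subscribes to. The sketch below is in-process and synchronous for clarity; `SessionHub` and its `send` callbacks are hypothetical stand-ins for WebSocket connections.

```python
from typing import Callable

Send = Callable[[dict], None]


class SessionHub:
    """One ordered event log per session, fanned out to every connected client."""

    def __init__(self) -> None:
        self._events: list[dict] = []
        self._clients: dict[str, Send] = {}

    def connect(self, client_id: str, send: Send) -> None:
        """Register a client and replay history so it catches up immediately."""
        self._clients[client_id] = send
        for event in self._events:
            send(event)

    def disconnect(self, client_id: str) -> None:
        self._clients.pop(client_id, None)

    def publish(self, event: dict) -> None:
        """Append to the log, then deliver to all currently connected clients."""
        self._events.append(event)
        for send in self._clients.values():
            send(event)
```

Replaying the log on connect is what makes client switching seamless: a user who opens the web interface after starting in Slack sees the same history.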
Multiplayer Support
Why Multiplayer Matters
Multiplayer enables:
- Teaching non-engineers to use AI effectively
- Live QA sessions with multiple team members
- Real-time PR review with immediate changes
- Collaborative debugging sessions
Implementation Requirements
- Data model must not tie sessions to single authors
- Pass authorship info to each prompt
- Attribute code changes to the prompting user
- Share session links for instant collaboration
With proper synchronization architecture, multiplayer support is nearly free to add.
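A sketch of a data model that satisfies these requirements, with authorship stored on each prompt instead of on the session. The class and field names are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Author:
    name: str
    email: str


@dataclass(frozen=True)
class Prompt:
    author: Author
    text: str


@dataclass
class SharedSession:
    session_id: str
    prompts: list[Prompt] = field(default_factory=list)  # no single owner field

    def add_prompt(self, author: Author, text: str) -> None:
        self.prompts.append(Prompt(author, text))

    def commit_author(self) -> str:
        """Git-style author string for changes made in response to the latest prompt."""
        author = self.prompts[-1].author
        return f"{author.name} <{author.email}>"

    def participants(self) -> set[str]:
        return {p.author.email for p in self.prompts}
```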
Authentication and Authorization
User-Based Commits
Use GitHub authentication to:
- Obtain user tokens for PR creation
- Open PRs on behalf of the user (not the app)
- Prevent users from approving their own changes
Sandbox-to-API Flow
- Sandbox pushes changes (updating git user config)
- Sandbox sends event to API with branch name and session ID
- API uses user's GitHub token to create PR
- GitHub webhooks notify API of PR events
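The third step of the flow might look like the sketch below, which builds (but does not send) a request to GitHub's REST endpoint for creating a pull request, `POST /repos/{owner}/{repo}/pulls`. The function name and the shape of its arguments are assumptions; token lookup and error handling are left out.

```python
import json
import urllib.request


def build_pr_request(owner: str, repo: str, branch: str, base: str,
                     title: str, user_token: str) -> urllib.request.Request:
    """Build the create-PR request using the prompting user's token."""
    body = json.dumps({"title": title, "head": branch, "base": base}).encode("utf-8")
    return urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/pulls",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {user_token}",
            "Accept": "application/vnd.github+json",
        },
    )


# Values below are placeholders for illustration only.
req = build_pr_request("acme", "web-app", "agent/fix-login", "main",
                       "Fix login redirect", "USER_TOKEN_PLACEHOLDER")
```

Passing the request to `urllib.request.urlopen` would send it; because the token belongs to the user, the PR is opened on the user's behalf and not as the app.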
Client Implementations
Slack Integration
The most effective distribution channel for internal adoption:
- Creates virality loop as team members see others using it
- No syntax required, natural chat interface
- Classify repository from message, thread context, and channel name
Build a classifier to determine which repository to work in:
- Fast model with descriptions of available repositories
- Include hints for common repositories
- Allow "unknown" option for ambiguous cases
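A rough sketch of the classifier's two halves: building the prompt for a fast model and parsing its answer with an "unknown" fallback. The prompt wording and function names are assumptions, thread context is omitted for brevity, and the model call itself is left out.

```python
def build_classifier_prompt(message: str, channel: str, repos: dict[str, str]) -> str:
    """repos maps repository name -> short description (and any hints)."""
    lines = [f"- {name}: {description}" for name, description in repos.items()]
    return (
        "Pick the repository this request is about.\n"
        "Answer with exactly one repository name, or 'unknown' if unsure.\n\n"
        "Repositories:\n" + "\n".join(lines) + "\n\n"
        f"Slack channel: #{channel}\n"
        f"Message: {message}\n"
    )


def parse_choice(model_output: str, repos: dict[str, str]) -> str:
    """Accept only known repository names; anything else becomes 'unknown'."""
    choice = model_output.strip()
    return choice if choice in repos else "unknown"


repos = {
    "web-app": "Customer dashboard (React). Hint: anything about the UI.",
    "api": "Backend services and database migrations.",
}
prompt = build_classifier_prompt("The login button is misaligned", "frontend-bugs", repos)
```

Treating any unrecognized answer as "unknown" lets the bot ask the user which repository they meant instead of guessing.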
Web Interface
Core features:
- Works on desktop and mobile
- Real-time streaming of agent work
- Hosted VS Code instance running inside sandbox
- Streamed desktop view for visual verification
- Before/after screenshots for PRs
Statistics page showing:
- Sessions resulting in merged PRs (primary metric)
- Usage over time
- Live "humans prompting" count (prompts in last 5 minutes)
Chrome Extension
For non-engineering users:
- Sidebar chat interface with screenshot tool
- DOM and React internals extraction instead of raw images
- Reduces token usage while maintaining precision
- Distribute via managed device policy (bypasses Chrome Web Store)
Practical Guidance
Follow-Up Message Handling
Decide how to handle messages sent during execution:
- Queue approach: Messages wait until current prompt completes
- Insert approach: Messages are processed immediately
Queueing is simpler to manage and lets users send thoughts on next steps while the agent works. Build a mechanism to stop the agent mid-execution when needed.
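The queue approach with a stop mechanism can be sketched as a small inbox object. `SessionInbox` and its method names are hypothetical; the agent loop is assumed to check `stop_requested` between tool calls.

```python
from collections import deque
from typing import Optional


class SessionInbox:
    def __init__(self) -> None:
        self._pending: deque[str] = deque()
        self.busy = False
        self.stop_requested = False

    def submit(self, message: str) -> Optional[str]:
        """Return the message to run now, or None if it was queued."""
        if self.busy:
            self._pending.append(message)
            return None
        self.busy = True
        return message

    def finish_current(self) -> Optional[str]:
        """Call when the running prompt ends; returns the next queued message."""
        self.stop_requested = False
        if self._pending:
            return self._pending.popleft()
        self.busy = False
        return None

    def request_stop(self) -> None:
        """Ask the agent loop to halt the current prompt at its next check."""
        self.stop_requested = True
```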
Metrics That Matter
Track metrics that indicate real value:
- Sessions resulting in merged PRs (primary success metric)
- Time from session start to first model response
- PR approval rate and revision count
- Agent-written code percentage across repositories
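Two of these metrics reduce to simple arithmetic over session records. The sketch below assumes a hypothetical `SessionRecord` shape; how the records are collected is out of scope.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class SessionRecord:
    session_id: str
    pr_merged: bool
    seconds_to_first_response: Optional[float]  # None if the model never responded


def merged_pr_rate(records: list[SessionRecord]) -> float:
    """Fraction of sessions that resulted in a merged PR."""
    if not records:
        return 0.0
    return sum(r.pr_merged for r in records) / len(records)


def median_time_to_first_response(records: list[SessionRecord]) -> Optional[float]:
    """Median seconds from session start to first model response."""
    times = [r.seconds_to_first_response for r in records
             if r.seconds_to_first_response is not None]
    return median(times) if times else None


records = [
    SessionRecord("a", True, 1.0),
    SessionRecord("b", False, 3.0),
    SessionRecord("c", True, None),
    SessionRecord("d", False, 2.0),
]
```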
Adoption Strategy
Internal adoption patterns that work:
- Work in public spaces (Slack channels) for visibility
- Let the product create virality loops
- Don't force usage over existing tools
- Build to people's needs, not hypothetical requirements
Guidelines
- Pre-build environment images on regular cadence (30 minutes is a good default)
- Start warming sandboxes when users begin typing, not when they submit
- Allow file reads before git sync completes; block only writes
- Structure agent framework as server-first with clients as thin wrappers
- Isolate state per session to prevent cross-session interference
- Attribute commits to the user who prompted, not the app
- Track merged PRs as primary success metric
- Build for multiplayer from the start; it is nearly free with proper sync architecture
Integration
This skill builds on multi-agent-patterns for agent coordination and tool-design for agent-tool interfaces. It connects to:
- multi-agent-patterns - Self-spawning agents follow supervisor patterns
- tool-design - Building tools for agent spawning and status checking
- context-optimization - Managing context across distributed sessions
- filesystem-context - Using filesystem for session state and artifacts
References
Internal reference:
- Infrastructure Patterns - Detailed implementation patterns
Related skills in this collection:
- multi-agent-patterns - Coordination patterns for self-spawning agents
- tool-design - Designing tools for hosted environments
- context-optimization - Managing context in distributed systems
External resources:
- Ramp - Why We Built Our Own Background Agent
- Modal Sandboxes - Cloud sandbox infrastructure
- Cloudflare Durable Objects - Per-session state management
- OpenCode - Server-first agent framework
Skill Metadata
Created: 2026-01-12
Last Updated: 2026-01-12
Author: Agent Skills for Context Engineering Contributors
Version: 1.0.0