massgen-develops-massgen
MassGen Develops MassGen
This skill provides guidance for using MassGen to develop and improve itself. Choose the appropriate workflow based on what you're testing.
Two Workflows
- Automation Mode - Test backend functionality, coordination logic, agent responses
- Visual Evaluation - Test terminal display, colors, layout, UX
Workflow 1: Automation Mode
Use this to test functionality without visual inspection. Ideal for programmatic testing.
Running MassGen with Automation
Run MassGen in the background (exact mechanism depends on your tooling):
```bash
uv run massgen --automation --config massgen/configs/basic/multi/two_agents_gemini.yaml "What is 2+2?"
```
For MassGen agents: Use the `start_background_shell` MCP tool.
For Claude Code: Use the Bash tool's `run_in_background` parameter.
Why Automation Mode
| Feature | Benefit |
|---|---|
| Clean output | ~10 parseable lines vs 3,000+ ANSI codes |
| LOG_DIR printed | First line shows log directory path |
| status.json | Real-time monitoring file |
| Exit codes | 0=success, 1=config, 2=execution, 3=timeout, 4=interrupted |
| Workspace isolation | Safe parallel execution |
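The exit codes in the table can be turned into a small lookup for scripts that launch MassGen. This is a minimal sketch: `EXIT_CODE_MEANINGS` and `describe_exit_code` are hypothetical helper names, not part of MassGen, and the labels are copied from the table.

```python
# Labels copied from the "Exit codes" row of the table above.
EXIT_CODE_MEANINGS = {
    0: "success",
    1: "config",
    2: "execution",
    3: "timeout",
    4: "interrupted",
}


def describe_exit_code(code: int) -> str:
    """Return a label for an automation-mode exit code (hypothetical helper)."""
    return EXIT_CODE_MEANINGS.get(code, f"unknown ({code})")
```

A caller would typically pass the `returncode` of the finished process and branch on anything other than `"success"`.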
Expected Output
```
LOG_DIR: .massgen/massgen_logs/log_20251120_143022_123456
STATUS: .massgen/massgen_logs/log_20251120_143022_123456/status.json
🤖 Multi-Agent Mode
Agents: gemini-2.5-pro1, gemini-2.5-pro2
Question: What is 2+2?
============================================================
QUESTION: What is 2+2?
[Coordination in progress - monitor status.json for real-time updates]
WINNER: gemini-2.5-pro1
DURATION: 33.4s
ANSWER_PREVIEW: The answer is 4.
COMPLETED: 2 agents, 35.2s total
```
Parse `LOG_DIR` from the first line to find the log directory.
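Extracting `LOG_DIR` and the other labelled lines from that output could look like the sketch below. It assumes the `KEY: value` layout of the sample output; `parse_automation_header` is a hypothetical helper name, not a MassGen API.

```python
def parse_automation_header(output: str) -> dict:
    """Collect the labelled `KEY: value` lines from automation-mode output."""
    wanted = {"LOG_DIR", "STATUS", "WINNER", "DURATION", "ANSWER_PREVIEW", "COMPLETED"}
    found = {}
    for line in output.splitlines():
        # Split at the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            # Keep the first occurrence of each key.
            found.setdefault(key.strip(), value.strip())
    return found
```

The returned `LOG_DIR` value is the directory to pass to any monitoring script.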
Monitoring Progress
Read the status.json file (updated every 2 seconds):
```bash
cat .massgen/massgen_logs/log_20251120_143022_123456/status.json
```
Key fields:
```json
{
  "coordination": {
    "completion_percentage": 65,
    "phase": "enforcement"
  },
  "results": {
    "winner": null // null = running, "agent_id" = done
  },
  "agents": {
    "agent_a": {
      "status": "streaming",
      "error": null
    }
  }
}
```
Agent status values: `waiting`, `streaming`, `answered`, `voted`, `completed`, `error`
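The key fields can be reduced to a compact summary for polling. This is a sketch that relies only on the fields listed in this section; `summarize_status` and `load_status` are hypothetical helper names, and the real file may contain more fields than are used here.

```python
import json
from pathlib import Path


def summarize_status(status: dict) -> dict:
    """Reduce a parsed status.json to the key fields described in this section."""
    coordination = status.get("coordination", {})
    winner = status.get("results", {}).get("winner")
    agents = status.get("agents", {})
    return {
        "done": winner is not None,  # a null winner means the run is still going
        "winner": winner,
        "percent": coordination.get("completion_percentage", 0),
        "phase": coordination.get("phase", "unknown"),
        "failed_agents": sorted(
            name for name, info in agents.items()
            if info.get("status") == "error" or info.get("error")
        ),
    }


def load_status(log_dir: str) -> dict:
    """Read [log_dir]/status.json and summarize it."""
    with open(Path(log_dir) / "status.json") as f:
        return summarize_status(json.load(f))
```

Polling `load_status` until `done` is true, or until `failed_agents` is non-empty, avoids parsing the terminal output.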
Reading Results
After completion (exit code 0):
```bash
# Read final answer
cat [log_dir]/final/[winner]/answer.txt
```
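The same read can be done from Python once the log directory and winner are known. A minimal sketch, assuming the `[log_dir]/final/[winner]/answer.txt` layout given in this section; `read_final_answer` is a hypothetical helper name.

```python
from pathlib import Path


def read_final_answer(log_dir: str, winner: str) -> str:
    """Return the text of [log_dir]/final/[winner]/answer.txt."""
    answer_path = Path(log_dir) / "final" / winner / "answer.txt"
    return answer_path.read_text(encoding="utf-8").strip()
```

The `winner` argument is the agent ID reported on the `WINNER:` line or in `results.winner` of status.json.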
Timing Expectations
- Standard tasks: 2-10 minutes
- Complex/meta tasks: 10-30 minutes
- Check if stuck: Read status.json - if `completion_percentage` increases, it's working
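The stuck check can be made mechanical by sampling `completion_percentage` at intervals and testing whether it rose. The sketch below is one possible heuristic; the helper name `looks_stuck` and the default window of 5 samples are assumptions, not values from MassGen.

```python
def looks_stuck(samples: list[int], window: int = 5) -> bool:
    """True if completion_percentage never rose within the last `window` samples."""
    if len(samples) < window:
        return False  # not enough history to judge
    recent = samples[-window:]
    # No value in the window exceeds the window's first value: no progress.
    return max(recent) <= recent[0]
```

Given the timing expectations above, sampling about once a minute and using a generous window is a safer choice than reacting to a few identical readings.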
Advanced: Multiple Background Monitors
You can create multiple background monitoring tasks that run independently alongside the main MassGen process. Each monitor can track different aspects and write to separate log files for later inspection.
Approach
Create small Python scripts that run in background shells. Each script:
- Monitors a specific aspect (tokens, errors, progress, coordination, etc.)
- Writes timestamped data to its own log file
- Runs in a loop with `sleep()` intervals
- Can be checked anytime without blocking the main task
Example Monitor Scripts
Token Usage Monitor (`token_monitor.py`):
```python
import json, time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])  # Pass LOG_DIR as argument
while True:
    if (log_dir / "status.json").exists():
        with open(log_dir / "status.json") as f:
            data = json.load(f)
        with open("token_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
            # These keys are not among the key fields listed earlier; .get() falls back to 0
            log.write(f"Tokens: {data.get('total_tokens_used', 0)}\n")
            log.write(f"Cost: ${data.get('total_cost', 0):.4f}\n\n")
    time.sleep(5)
```
Error Monitor (`error_monitor.py`):
```python
import time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])
while True:
    if log_dir.exists():
        with open("error_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
            errors = []
            # Rescan every *.log file in LOG_DIR on each pass
            for logfile in log_dir.glob("*.log"):
                with open(logfile) as f:
                    for line in f:
                        if any(x in line.lower() for x in ['error', 'warning', 'failed']):
                            errors.append(line.strip())
            # Record only the five most recent matches
            log.write('\n'.join(errors[-5:]) if errors else "No errors\n")
            log.write("\n")
    time.sleep(5)
```
Progress Monitor (`progress_monitor.py`):
```python
import json, time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])
while True:
    if (log_dir / "status.json").exists():
        with open(log_dir / "status.json") as f:
            data = json.load(f)
        with open("progress_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
            # completion_percentage sits under "coordination" in the key fields
            progress = data.get('coordination', {}).get('completion_percentage', 0)
            # 'streaming' is the in-progress value among the documented agent statuses
            active = sum(1 for a in data.get('agents', {}).values()
                         if a.get('status') == 'streaming')
            log.write(f"Progress: {progress}% Active agents: {active}\n\n")
    time.sleep(5)
```
Coordination Monitor (`coordination_monitor.py`):
```python
import json, time, sys
from pathlib import Path

log_dir = Path(sys.argv[1])
while True:
    if (log_dir / "status.json").exists():
        with open(log_dir / "status.json") as f:
            data = json.load(f)
        coord = data.get('coordination', {})
        with open("coordination_monitor.log", "a") as log:
            log.write(f"=== {time.strftime('%H:%M:%S')} ===\n")
            log.write(f"Phase: {coord.get('phase', 'unknown')}\n")
            log.write(f"Round: {coord.get('round', 0)}\n")
            log.write(f"Total answers: {coord.get('total_answers', 0)}\n\n")
    time.sleep(5)
```
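The monitor scripts call `json.load` directly, which would raise if a read happened to land while status.json was being rewritten. Whether MassGen replaces the file atomically is not stated here, so the sketch below is a defensive assumption rather than a known requirement; `read_status` and its `retries` and `delay` parameters are hypothetical.

```python
import json
import time
from pathlib import Path


def read_status(log_dir, retries=3, delay=0.2):
    """Return parsed status.json, or None if it is missing or never parses cleanly."""
    status_path = Path(log_dir) / "status.json"
    for _ in range(retries):
        try:
            with open(status_path) as f:
                return json.load(f)
        except FileNotFoundError:
            return None  # the run has not created the file yet
        except json.JSONDecodeError:
            time.sleep(delay)  # possibly caught mid-write; try again shortly
    return None
```

A monitor loop can call `read_status(log_dir)` and skip the iteration when it returns `None`, so one bad read does not end the monitor.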
Workflow
- Launch main task, parse the LOG_DIR from output
- Create monitor scripts as needed (write Python files)
- Launch monitors in background shells: `python3 token_monitor.py [LOG_DIR] &`
- Check monitor logs anytime by reading the .log files
- When complete, kill monitor processes and analyze logs
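When the launcher is itself a Python script, starting and stopping the monitors can be done with `subprocess` instead of shell job control. This is a sketch under that assumption; `start_monitors`, `stop_monitors`, and `grace_seconds` are hypothetical names.

```python
import subprocess
import sys


def start_monitors(scripts, log_dir):
    """Start each monitor script as a background process with LOG_DIR as its argument."""
    return [subprocess.Popen([sys.executable, script, log_dir]) for script in scripts]


def stop_monitors(procs, grace_seconds=5):
    """Ask every monitor to terminate, then kill any that does not exit in time."""
    for proc in procs:
        proc.terminate()
    for proc in procs:
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
```

For example, `procs = start_monitors(["token_monitor.py", "error_monitor.py"], log_dir)` after parsing `LOG_DIR`, then `stop_monitors(procs)` once the main run has finished.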
Custom Monitors
Create monitors for any metric you want to track:
- Model-specific performance metrics
- Memory/context usage patterns
- Real-time cost accumulation
- Answer quality trends
- Agent coordination patterns
- Specific error categories
Benefits:
- Non-blocking inspection of specific metrics on demand
- Historical data captured for post-run analysis
- Independent monitoring streams for different aspects
- Easy to add new monitors without modifying configs
Workflow 2: Visual Evaluation
Use this to analyze and improve MassGen's terminal display quality. Requires tools from `custom_tools/_multimodal_tools/`.
Important: This workflow records the rich terminal display, so the actual recording does NOT use `--automation` mode. However, you should ALWAYS pre-test with `--automation` first.
Prerequisites
You should have these tools available in your workspace:
- `run_massgen_with_recording` - Records terminal sessions as video
- `understand_video` - Analyzes video frames with GPT-4.1 vision
Step 0: Pre-Test with Automation (REQUIRED)
Before recording the video, verify the config works and API keys are valid:
```bash
# Start with --automation to verify everything works
uv run massgen --automation --config [config_path] "[question]"
```
**Wait 30-60 seconds** (enough to verify API keys, config parsing, tool initialization), then kill the process.
**Why this is critical:**
- Detects config errors before wasting recording time
- Validates API keys are present and working
- Ensures tools initialize correctly
- Prevents recording a broken session
**If the automation test fails**, fix the issues before proceeding to recording.
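The wait-then-kill pre-test can be scripted so it gives a pass or fail answer. The sketch below treats "still running after the window" as a pass and any early non-zero exit as a failure, which is an interpretation of the step rather than MassGen behavior; `pretest` and `wait_seconds` are hypothetical names.

```python
import subprocess


def pretest(cmd, wait_seconds=45):
    """Run `cmd` for up to `wait_seconds`; return True if it did not fail in that window."""
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    try:
        exit_code = proc.wait(timeout=wait_seconds)
    except subprocess.TimeoutExpired:
        # Still running after the window: startup got past the early checks, so stop it.
        proc.kill()
        proc.wait()
        return True
    return exit_code == 0  # exited early: only a clean exit counts as a pass
```

A call such as `pretest(["uv", "run", "massgen", "--automation", "--config", config_path, question])` would run before recording. Surviving the window is only a heuristic, so status.json is still worth a look when a run behaves oddly.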
Step 1: Record a MassGen Session
After the automation pre-test succeeds, record the visual session:
```python
from custom_tools._multimodal_tools.run_massgen_with_recording import run_massgen_with_recording

result = await run_massgen_with_recording(
    config_path="massgen/configs/basic/multi/two_agents_gemini.yaml",
    question="What is 2+2?",
    output_format="mp4",  # ALWAYS use mp4 for maximum compatibility
    timeout_seconds=120,
    width=1920,
    height=1080
)
```
Format recommendation: Always use `"mp4"` for maximum compatibility. GIF and WebM are supported but MP4 is preferred.
The recording captures: Rich terminal display with colors, status indicators, coordination visualization (WITHOUT `--automation` flag).
Step 2: Analyze the Recording
Use `understand_video` to analyze the MP4 recording. Call it at least once; call it several times to analyze different aspects:
```python
from custom_tools._multimodal_tools.understand_video import understand_video

# Overall UX evaluation
ux_eval = await understand_video(
    video_path=result["video_path"],  # The MP4 file from Step 1
    prompt="Evaluate the overall terminal display quality, clarity, and usability",
    num_frames=12
)

# Focused on coordination
coordination_eval = await understand_video(
    video_path=result["video_path"],
    prompt="How clearly does the display show agent coordination phases and voting?",
    num_frames=8
)

# Status indicators
status_eval = await understand_video(
    video_path=result["video_path"],
    prompt="Are status indicators (streaming, answered, voted) clear and visually distinct?",
    num_frames=8
)
```
**Key points:**
- The recording tool saves the video to workspace - use that path for analysis
- You can call `understand_video` multiple times on the same video with different prompts
- Each call focuses on a specific aspect (UX, coordination, status, colors, etc.)
Evaluation Criteria
When analyzing terminal displays, assess:
- Visual Clarity - Contrast, colors, font rendering, ANSI handling, spacing
- Information Organization - Layout, content density, streaming display, scroll handling
- Status Indicators - Agent states, progress tracking, phase transitions, winner selection
- User Experience - Real-time feedback, error visibility, cognitive load, information hierarchy
Output Format Recommendations
Default to MP4 - Maximum compatibility and quality.
| Format | Use Case | Notes |
|---|---|---|
| MP4 | Default - use for everything | Best quality, universally supported, ideal for detailed analysis |
| GIF | Smaller file size, easy embedding | Lower quality, larger files than expected, avoid unless size-constrained |
| WebM | Modern web publishing | Good quality, not universally supported |
Rule of thumb: Use MP4 unless you have a specific reason not to.
Frame Count Guidelines
| Frames | Use Case |
|---|---|
| 4-8 | Quick evaluation |
| 8-12 | Standard evaluation |
| 12-16+ | Detailed analysis |
Which Configs to Test
Model Selection Guidelines
Default to mid-tier models when generating configs or running experiments. These provide the best balance of cost, speed, and capability for development and testing.
CRITICAL: Always check model recency based on TODAY'S DATE. Models older than 6-12 months should be considered outdated.
How to Select Models
**Step 1: Read backend files to check release dates**
```bash
# Check Gemini models and their release dates
grep -A 5 "model.*2." massgen/backend/gemini.py
# Check OpenAI models and their release dates
grep -A 5 "model.*gpt" massgen/backend/openai.py
# Check Claude models and their release dates
grep -A 5 "model.*claude" massgen/backend/claude.py
```
**Step 2: Check token costs**
```bash
cat docs/source/reference/token_budget.rst | grep -A 3 "gemini\|gpt\|claude"
```
**Step 3: Compare release dates against today's date**
- Calculate months since release: (today's year-month) - (model release year-month)
- If > 12 months: Model is outdated
- If 6-12 months: Model is aging, prefer newer if available
- If < 6 months: Model is current
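The month arithmetic in Step 3 can be sketched as a pair of functions. The thresholds come from the list above, while `months_since` and `classify_model_age` are hypothetical helper names. Treating exactly 6 and exactly 12 months as "aging" is one reading of the "6-12 months" range.

```python
from datetime import date


def months_since(release: date, today: date) -> int:
    """Whole-month difference between two year-month pairs (days are ignored)."""
    return (today.year - release.year) * 12 + (today.month - release.month)


def classify_model_age(release: date, today: date) -> str:
    """Apply the Step 3 thresholds to a release date."""
    months = months_since(release, today)
    if months > 12:
        return "outdated"
    if months >= 6:
        return "aging"
    return "current"
```

Called as `classify_model_age(release_date, date.today())`, with the release date taken from the backend file found in Step 1.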
Model Selection Examples
✅ GOOD (Recent, mid-tier patterns):
- Gemini: `gemini-2.5-pro`, `gemini-2.5-flash` (2.x series, 2025)
- OpenAI: `gpt-5-mini`, `gpt-4o-mini` (GPT-5 generation)
- Claude: `claude-sonnet-4-*` (4.x series, 2025)
⚠️ BAD (Outdated patterns - check dates!):
- ❌ `gpt-4o` (2024 release - likely >12 months old)
- ❌ `gpt-4-turbo` (2023-2024 era)
- ❌ `gemini-1.5-pro` (1.x series deprecated by 2.x)
- ❌ `claude-3.5-sonnet` (3.x series when 4.x exists)
Selection criteria:
- Recency: Released within last 6-12 months (ALWAYS check backend files for dates)
- Mid-range pricing: Not top-tier (expensive) or bottom-tier (cheap)
- General availability: Stable release, not experimental/preview/alpha
- Version numbers: Higher major versions are newer (gemini-2.x > gemini-1.x, gpt-5 > gpt-4, claude-4 > claude-3)
When to deviate:
- Premium models: Testing model ceiling capabilities (e.g., `gpt-5`, `claude-opus-4`, `gemini-3-pro`)
- Budget models: Cost optimization experiments (e.g., `gpt-5-mini`, `gemini-2.5-flash`)
- Legacy testing: Validating backwards compatibility with older models
Generating a Config (Agent-Friendly)
Use `--generate-config` for programmatic config generation:
```bash
# WORKFLOW:
# 1. Read backend file to find recent mid-tier models
# 2. Verify release date (< 12 months old)
# 3. Check pricing tier (mid-range)
# 4. Use model in --config-model flag

# Example: Generate 2-agent config
# (the model name is an example - always verify this is current!)
massgen --generate-config ./test_config.yaml \
  --config-backend gemini \
  --config-model gemini-2.5-pro \
  --config-agents 2 \
  --config-docker

# With context path
# (the model name is an example - always verify this is current!)
massgen --generate-config ./test_config.yaml \
  --config-backend openai \
  --config-model gpt-5-mini \
  --config-context-path /path/to/project
```
**IMPORTANT**: The model names shown above are EXAMPLES. Always check backend files for current models based on today's date.
This creates a full-featured config with code-based tools, skills, and task planning enabled.
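An agent that assembles this command programmatically can build the argument list first and pass it to `subprocess.run`. The sketch uses only the flags shown in this section; `build_generate_config_cmd` is a hypothetical helper name, and the model names passed to it remain examples to verify.

```python
def build_generate_config_cmd(output_path, backend, model, agents=None,
                              docker=False, context_path=None):
    """Assemble the argv for `massgen --generate-config` from the flags in this section."""
    cmd = ["massgen", "--generate-config", output_path,
           "--config-backend", backend,
           "--config-model", model]
    if agents is not None:
        cmd += ["--config-agents", str(agents)]
    if docker:
        cmd.append("--config-docker")
    if context_path:
        cmd += ["--config-context-path", context_path]
    return cmd
```

Passing a list avoids shell quoting and line-continuation problems, which matters when the context path contains spaces.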
Testing Specific Features
Modify the generated config to enable/disable features:
Code execution:
```yaml
agents:
  - backend:
      enable_mcp_command_line: true
      command_line_execution_mode: "docker"
```
Custom tools:
```yaml
agents:
  - backend:
      enable_code_based_tools: true
      auto_discover_custom_tools: true
```
Different models per agent:
```yaml
agents:
  - backend: {type: "gemini", model: "gemini-2.5-pro"}
  - backend: {type: "openai", model: "gpt-5-mini"}
```
Common parameters: `enable_code_based_tools`, `enable_mcp_command_line`, `command_line_execution_mode`, `auto_discover_custom_tools`, `timeout_settings`
Docker Considerations
Automatic Docker Detection
MassGen automatically detects when running inside a Docker container. If a config has `command_line_execution_mode: "docker"`, MassGen will:
- Detect the container environment (via `/.dockerenv`)
- Automatically switch to `"local"` execution mode
- Log: "Already running inside Docker container - switching to local execution mode"
Why this works: The outer container already provides isolation. Running "locally" within that container is safe and sandboxed.
No manual configuration needed - configs with Docker mode just work when run inside containers.
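The detection rule can be illustrated with a few lines of Python. This is a sketch of the described behavior, not MassGen's actual implementation; `resolve_execution_mode` and its `dockerenv` parameter are hypothetical, and the parameter exists only so the check can be exercised outside a container.

```python
from pathlib import Path


def resolve_execution_mode(configured_mode, dockerenv=Path("/.dockerenv")):
    """Return "local" when "docker" mode is requested from inside a container."""
    if configured_mode == "docker" and Path(dockerenv).exists():
        return "local"  # the outer container already provides isolation
    return configured_mode
```

Outside a container, the marker file is normally absent and a `"docker"` setting is returned unchanged.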
Tradeoffs
When auto-switching to local execution:
- ✅ Still sandboxed from host
- ✅ All features work (VHS, MassGen, tools are in container)
- ⚠️ No per-execution isolation between tool calls
- ⚠️ State persists within container session
Reference Files
- Status file docs: `docs/source/reference/status_file.rst`
- Terminal evaluation docs: `docs/source/user_guide/terminal_evaluation.rst`
- Example configs: `massgen/configs/basic/`, `massgen/configs/meta/`
- Recording tool: `massgen/tool/_multimodal_tools/run_massgen_with_recording.py`
- Video analysis tool: `massgen/tool/_multimodal_tools/understand_video.py`