multimedia-backend-integrator
Multimedia Backend Integrator
Reference guide for adding new media generation backends to MassGen's unified `generate_media` tool.
Architecture Overview
_base.py -- Registration: API keys, default models, priority lists
_selector.py -- Auto-selection logic: picks best backend by key + priority
_image.py -- Image backends: OpenAI, Google (Gemini/Imagen), Grok, OpenRouter
_video.py -- Video backends: Grok, Google Veo, OpenAI Sora
_audio.py -- Audio backends: ElevenLabs, OpenAI TTS
generate_media.py -- Entry point: routing, validation, batch mode, image-to-image
Complete Checklist: Adding a New Backend
1. Registration (`_base.py`)
- Add to `BACKEND_API_KEYS`: map backend name to env var(s)
- Add to `DEFAULT_MODELS`: map backend name to `{MediaType: model_name}` for each supported type
- Add to `BACKEND_PRIORITY`: insert at correct position per media type
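The registration step can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `_base.py`: the `MediaType` enum, the dictionary shapes, the `NEW_BACKEND_API_KEY` variable, the model names, the `env` parameter on `has_api_key`, and the `select_backend` helper are all assumptions made for the example. It shows why the position in `BACKEND_PRIORITY` matters: the first backend in the list with an available key is the one that gets picked.

```python
import os
from enum import Enum


class MediaType(Enum):  # hypothetical stand-in for the project's media type enum
    IMAGE = "image"
    VIDEO = "video"
    AUDIO = "audio"


# Backend name -> environment variable(s) that may hold its API key.
BACKEND_API_KEYS = {
    "openai": ["OPENAI_API_KEY"],
    "new_backend": ["NEW_BACKEND_API_KEY"],  # hypothetical env var name
}

# Backend name -> {MediaType: model_name} for each supported type.
# Model names here are placeholders, not real model identifiers.
DEFAULT_MODELS = {
    "openai": {MediaType.IMAGE: "example-openai-image-model"},
    "new_backend": {MediaType.IMAGE: "example-new-image-model"},
}

# Media type -> backends in preference order; earlier entries win.
BACKEND_PRIORITY = {
    MediaType.IMAGE: ["new_backend", "openai"],
}


def has_api_key(backend, env=None):
    """Return True if any of the backend's env vars is set and non-empty."""
    env = os.environ if env is None else env
    return any(env.get(var) for var in BACKEND_API_KEYS.get(backend, []))


def select_backend(media_type, env=None):
    """Pick the first backend in priority order that has a key and a model."""
    for backend in BACKEND_PRIORITY.get(media_type, []):
        if has_api_key(backend, env) and media_type in DEFAULT_MODELS.get(backend, {}):
            return backend
    return None
```

With only `OPENAI_API_KEY` set, this sketch selects `openai`; once `NEW_BACKEND_API_KEY` is also set, `new_backend` wins because it appears first in the priority list.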
2. Implementation (`_image.py` / `_video.py` / `_audio.py`)
- Add `import` for SDK at module top
- Implement `_generate_{media}_{backend}(config) -> GenerationResult`
- Check API key first, return error result if missing
- Create SDK client with API key
- Map `config.*` fields to SDK parameters
- Handle continuation (if applicable); see Continuation Store Patterns
- Write output bytes to `config.output_path`
- Return `GenerationResult` with metadata
- Wrap in try/except, log errors
3. Dispatcher Update
- Add `elif backend == "new_backend":` in the media type's `generate_{media}()` function
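A minimal sketch of the dispatcher change, assuming a simplified `generate_image(backend, config)` signature. The real `generate_{media}()` functions may take different arguments or be async, and the two `_generate_image_*` functions here are placeholders.

```python
def _generate_image_openai(config):
    return "openai-result"  # placeholder for an existing backend


def _generate_image_new_backend(config):
    return "new-backend-result"  # placeholder for the new backend


def generate_image(backend, config):
    """Route to the backend-specific generator (simplified, hypothetical signature)."""
    if backend == "openai":
        return _generate_image_openai(config)
    elif backend == "new_backend":  # the branch this checklist step adds
        return _generate_image_new_backend(config)
    else:
        raise ValueError(f"Unsupported image backend: {backend}")
```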
4. Image-to-Image Support (`generate_media.py`)
- Add backend name to the `selected_backend not in (...)` check in `_generate_single_with_input_images`
- Add fallback: `elif has_api_key("new_backend"):` in the auto-selection chain
- Update error message to mention new backend + env var
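The three changes in this step can be sketched together. This is an illustration only: the allowlist contents, the helper name `pick_image_to_image_backend`, the env var names in the messages, and passing `has_api_key` as a parameter are assumptions, not the actual body of `_generate_single_with_input_images`.

```python
# Illustrative allowlist; the real one lists whichever backends support input images.
IMAGE_TO_IMAGE_BACKENDS = ("openai", "new_backend")


def pick_image_to_image_backend(selected_backend, has_api_key):
    """Validate an explicit backend, or fall back through the auto-selection chain."""
    if selected_backend is not None:
        if selected_backend not in IMAGE_TO_IMAGE_BACKENDS:
            raise ValueError(
                f"Backend '{selected_backend}' does not support image-to-image; "
                f"use one of: {', '.join(IMAGE_TO_IMAGE_BACKENDS)}"
            )
        return selected_backend
    if has_api_key("openai"):
        return "openai"
    elif has_api_key("new_backend"):  # the fallback this checklist step adds
        return "new_backend"
    # The error message names the new backend and its (hypothetical) env var.
    raise ValueError(
        "No image-to-image backend available. "
        "Set OPENAI_API_KEY or NEW_BACKEND_API_KEY."
    )
```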
5. Documentation
- `TOOL.md`: Add env var to frontmatter, backend to tables, keywords
- `generate_media.py` docstring: Update `backend_type` list and Supported Backends
6. Tests
- Backend registration tests (API keys, default models, priority order)
- Auto-selection tests (with only this backend's key, with multiple keys)
- SDK call verification (correct params passed through)
- Output file written correctly
- Continuation flow (if applicable)
- Error handling (missing key, API errors)
- Parameter mapping (aspect_ratio, size, duration)
- Update existing tests that assert priority list length/contents
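A few of the test cases listed above could be written along these lines. The registries and `select_backend` are hypothetical stand-ins defined inline so the sketch is self-contained; real tests would import the project's own modules. `patch.dict(os.environ, ..., clear=True)` isolates each test from keys present in the developer's environment.

```python
import os
from unittest.mock import patch

# Hypothetical registries standing in for the ones in _base.py.
BACKEND_API_KEYS = {
    "openai": ["OPENAI_API_KEY"],
    "new_backend": ["NEW_BACKEND_API_KEY"],
}
BACKEND_PRIORITY = {"image": ["openai", "new_backend"]}


def select_backend(media_type):
    """Simplified selector: first backend in priority order with a key set."""
    for backend in BACKEND_PRIORITY[media_type]:
        if any(os.environ.get(var) for var in BACKEND_API_KEYS[backend]):
            return backend
    return None


def test_new_backend_is_registered():
    assert "new_backend" in BACKEND_API_KEYS
    assert "new_backend" in BACKEND_PRIORITY["image"]


def test_selected_when_only_its_key_is_set():
    with patch.dict(os.environ, {"NEW_BACKEND_API_KEY": "k"}, clear=True):
        assert select_backend("image") == "new_backend"


def test_priority_order_with_multiple_keys():
    env = {"OPENAI_API_KEY": "a", "NEW_BACKEND_API_KEY": "b"}
    with patch.dict(os.environ, env, clear=True):
        assert select_backend("image") == "openai"


def test_nothing_selected_without_keys():
    with patch.dict(os.environ, {}, clear=True):
        assert select_backend("image") is None
```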
Continuation Store Patterns
Each backend that supports iterative editing needs a continuation mechanism:
| Backend | Store Type | Key Format | What's Stored | How Continuation Works |
|---|---|---|---|---|
| OpenAI | Stateless (server-side) | | Nothing locally | Pass |
| Gemini | | | (client, chat) tuples | Reuse chat object for |
| Grok | | | Base64 strings | Pass stored base64 as |
Store Pattern Template
```python
from __future__ import annotations

import uuid
from collections import OrderedDict
from typing import Any


class _NewBackendStore:
    def __init__(self, max_items: int = 50):
        self._store: OrderedDict[str, Any] = OrderedDict()
        self._max = max_items

    def save(self, data: Any) -> str:
        store_id = f"prefix_{uuid.uuid4().hex[:12]}"
        if len(self._store) >= self._max:
            # Evict the oldest entry (insertion order; get() does not refresh recency)
            self._store.popitem(last=False)
        self._store[store_id] = data
        return store_id

    def get(self, store_id: str) -> Any | None:
        return self._store.get(store_id)


_store = _NewBackendStore()
```
Common Pitfalls
- Missing from priority list: Backend works when explicitly specified but is never auto-selected
- Sync vs async: Some SDKs are sync-only; wrap in `asyncio.to_thread()` if needed
- Ephemeral URLs: Some APIs return temporary URLs; always prefer base64 or download immediately
- Falsy duration: `duration or default` treats `0` as falsy; use `if duration is not None`
- Existing test breakage: Adding to priority list changes auto-selection; update existing tests that clear env vars
- Image-to-image gating: The `_generate_single_with_input_images` function has a backend allowlist
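Two of these pitfalls are easy to show in a few lines. In the sketch below, `DEFAULT_DURATION` and `blocking_sdk_call` are hypothetical; `asyncio.to_thread()` is the standard-library call (Python 3.9+) that runs a blocking function in a worker thread so the event loop is not stalled.

```python
import asyncio
import time

DEFAULT_DURATION = 5  # hypothetical default, in seconds


def resolve_duration_buggy(duration):
    # `or` falls through on any falsy value, so an explicit 0 is lost.
    return duration or DEFAULT_DURATION


def resolve_duration(duration):
    # Only substitute the default when no value was given at all.
    return duration if duration is not None else DEFAULT_DURATION


def blocking_sdk_call(prompt):
    """Stand-in for a sync-only SDK method."""
    time.sleep(0.01)
    return f"result for {prompt}".encode()


async def generate_async(prompt):
    # Run the blocking call in a worker thread instead of on the event loop.
    return await asyncio.to_thread(blocking_sdk_call, prompt)
```

Here `resolve_duration_buggy(0)` silently returns the default, while `resolve_duration(0)` preserves the caller's value.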
Reference Files
| File | Purpose |
|---|---|
| `_base.py` | API keys, default models, priorities |
| `_selector.py` | Backend auto-selection logic |
| `_image.py` | Image generation backends |
| `_video.py` | Video generation backends |
| `_audio.py` | Audio generation backends |
| `generate_media.py` | Entry point and routing |
| `TOOL.md` | User-facing documentation |
| | Reference: Grok backend tests |
| | Reference: Grok selection tests |
| | Reference: image selection tests |