zoom-rtms

Zoom Realtime Media Streams (RTMS)

Background reference for live Zoom media pipelines. Prefer `build-zoom-bot` first, then use this skill for stream types, capabilities, and RTMS-specific implementation constraints.


Expert guidance for accessing live audio, video, transcript, chat, and screen share data from Zoom meetings, webinars, Video SDK sessions, and Zoom Contact Center Voice in real time. RTMS uses a WebSocket-based protocol with open standards and does not require a meeting bot to capture the media plane.

Read This First (Critical)


RTMS is primarily a backend media ingestion service.
  • Your backend receives and processes live media: audio, video, screen share, chat, transcript.
  • RTMS is not a frontend UI SDK by itself.
  • Processing is event-triggered: backend waits for RTMS start webhook events before stream handling begins.
Optional architecture (common):
  • Add a Zoom App SDK frontend for in-client UI/controls.
  • Stream backend RTMS outputs to frontend via WebSocket (or SSE, gRPC, queue workers, etc.).
Use RTMS for media/data plane, and use frontend frameworks/Zoom Apps for presentation + user interactions.
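The optional architecture above (backend ingests RTMS media, frontend only displays results) can be sketched with a small Server-Sent Events fan-out hub. This is a minimal sketch under stated assumptions: the `SseHub` class and its method names are hypothetical and not part of any Zoom SDK, and `res` stands for anything with a `write()` method, such as a Node `http.ServerResponse` that has already sent `Content-Type: text/event-stream`.

```javascript
// Hypothetical fan-out hub for pushing processed RTMS output to browsers.
// Nothing here is a Zoom API; it only formats and distributes SSE frames.
class SseHub {
  constructor() {
    this.clients = new Set();
  }

  // Register a writable client; returns a function that unregisters it.
  subscribe(res) {
    this.clients.add(res);
    return () => this.clients.delete(res);
  }

  // Send one named event with a JSON payload to every subscribed client.
  publish(eventName, data) {
    const frame = `event: ${eventName}\ndata: ${JSON.stringify(data)}\n\n`;
    for (const res of this.clients) res.write(frame);
    return frame;
  }
}
```

A transcript callback in the backend could then call `hub.publish('transcript', { user, text })`, while the frontend listens with `EventSource`.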
Official Documentation: https://developers.zoom.us/docs/rtms/
SDK Reference (JS): https://zoom.github.io/rtms/js/
SDK Reference (Python): https://zoom.github.io/rtms/py/
Sample Repository: https://github.com/zoom/rtms-samples

Quick Links


New to RTMS? Follow this path:
  1. Connection Architecture - Two-phase WebSocket design
  2. SDK Quickstart - Fastest way to receive media (recommended)
  3. Manual WebSocket - Full protocol control without SDK
  4. Media Types - Audio, video, transcript, chat, screen share
Complete Implementation:
  • RTMS Bot - End-to-end bot implementation guide
Reference:
  • Lifecycle Flow - Complete webhook-to-streaming flow
  • Data Types - All enums and constants
  • Webhooks - Event subscription details
  • Environment Variables - credential modes and runtime knobs
  • Quickstart Notes - Secondary quickstart guide
  • Integrated Index - see the section below in this file
Having issues?
  • Connection fails -> Common Issues
  • Duplicate connections -> Webhook Gotchas
  • No audio/video -> Media Configuration
  • Start with preflight checks -> 5-Minute Runbook

Supported Products

| Product | Webhook Event | Payload ID | App Type |
| --- | --- | --- | --- |
| Meetings | `meeting.rtms_started` / `meeting.rtms_stopped` | `meeting_uuid` | General App |
| Webinars | `webinar.rtms_started` / `webinar.rtms_stopped` | `meeting_uuid` (same!) | General App |
| Video SDK | `session.rtms_started` / `session.rtms_stopped` | `session_id` | Video SDK App |
| Zoom Contact Center Voice | Product-specific RTMS/ZCC Voice events | Product-specific stream/session identifiers | Contact Center / approved RTMS integration |

Once connected, the core signaling/media socket model is shared across products. Meetings, webinars, and Video SDK sessions use the familiar start/stop webhooks. Zoom Contact Center Voice adds its own RTMS/ZCC Voice event family and should be treated as the same transport model with product-specific event payloads.

RTMS Overview


RTMS is a data pipeline that gives your app access to live media from Zoom meetings, webinars, and Video SDK sessions without participant bots. Instead of having automated clients join meetings, use RTMS to collect media data directly from Zoom's infrastructure.

What RTMS Provides

| Media Type | Format | Use Cases |
| --- | --- | --- |
| Audio | PCM (L16), G.711, G.722, Opus | Transcription, voice analysis, recording |
| Video | H.264, JPG, PNG | Recording, AI vision, thumbnails, active participant selection |
| Screen Share | H.264, JPG, PNG | Content capture, slide extraction |
| Transcript | JSON text | Meeting notes, search, compliance |
| Chat | JSON text | Archive, sentiment analysis |

March 2026 Protocol Changes

  • Zoom Contact Center Voice support: RTMS now covers Contact Center Voice audio and transcript scenarios.
  • Transcript Language Identification control: transcript media handshakes now support `src_language` and `enable_lid`. Default behavior is LID enabled. Set `enable_lid: false` to force a fixed language.
  • Single individual video stream subscription: RTMS can now stream one participant's camera feed at a time when `data_opt` is set to `VIDEO_SINGLE_INDIVIDUAL_STREAM`.
  • Graceful client-initiated shutdown: backends can send `STREAM_CLOSE_REQ` over the signaling socket and wait for `STREAM_CLOSE_RESP`.
  • Media keep-alive tolerance increased: media socket keep-alive timeout is now 65 seconds, not 35.

Two Approaches

| Approach | Best For | Complexity |
| --- | --- | --- |
| SDK (`@zoom/rtms`) | Most use cases | Low - handles WebSocket complexity |
| Manual WebSocket | Custom protocols, other languages | High - full protocol implementation |

Prerequisites


  • Node.js 20.3.0+ (24 LTS recommended) for JavaScript SDK
  • Python 3.10+ for Python SDK
  • Zoom General App (for meetings/webinars) or Video SDK App (for Video SDK) with RTMS feature enabled
  • Webhook endpoint for RTMS events
  • Server to receive WebSocket streams
Need RTMS access? Post in the Zoom Developer Forum requesting RTMS access, and describe your use case.

Quick Start (SDK - Recommended)

```javascript
import rtms from "@zoom/rtms";

// RTMS start events across products (stop events are not handled in this snippet)
const RTMS_EVENTS = ["meeting.rtms_started", "webinar.rtms_started", "session.rtms_started"];

// Handle webhook events
rtms.onWebhookEvent(({ event, payload }) => {
  if (!RTMS_EVENTS.includes(event)) return;

  // One client per stream
  const client = new rtms.Client();

  client.onAudioData((data, timestamp, metadata) => {
    console.log(`Audio from ${metadata.userName}: ${data.length} bytes`);
  });

  client.onTranscriptData((data, timestamp, metadata) => {
    const text = data.toString('utf8');
    console.log(`${metadata.userName}: ${text}`);
  });

  client.onJoinConfirm((reason) => {
    console.log(`Joined session: ${reason}`);
  });

  // SDK handles all WebSocket connections automatically
  // Accepts both meeting_uuid and session_id transparently
  client.join(payload);
});
```

Quick Start (Manual WebSocket)


For full control or non-SDK languages, implement the two-phase WebSocket protocol:
```javascript
const express = require('express');
const WebSocket = require('ws');
const crypto = require('crypto');

// Credentials come from the environment (see Manual Implementation Variables)
const CLIENT_ID = process.env.ZOOM_CLIENT_ID;
const CLIENT_SECRET = process.env.ZOOM_CLIENT_SECRET;

const app = express();
app.use(express.json());  // Parse JSON webhook bodies into req.body

const RTMS_EVENTS = ['meeting.rtms_started', 'webinar.rtms_started', 'session.rtms_started'];

// 1. Generate signature
// For meetings/webinars: uses meeting_uuid. For Video SDK: uses session_id.
function generateSignature(clientId, idValue, streamId, clientSecret) {
  const message = `${clientId},${idValue},${streamId}`;
  return crypto.createHmac('sha256', clientSecret).update(message).digest('hex');
}

// 2. Handle webhook
app.post('/webhook', (req, res) => {
  res.status(200).send();  // CRITICAL: Respond immediately!

  const { event, payload } = req.body;
  if (RTMS_EVENTS.includes(event)) {
    connectToRTMS(payload);
  }
});

// 3. Connect to signaling WebSocket
function connectToRTMS(payload) {
  const { server_urls, rtms_stream_id } = payload;
  // meeting_uuid for meetings/webinars, session_id for Video SDK
  const idValue = payload.meeting_uuid || payload.session_id;
  const signature = generateSignature(CLIENT_ID, idValue, rtms_stream_id, CLIENT_SECRET);

  const signalingWs = new WebSocket(server_urls);

  signalingWs.on('open', () => {
    signalingWs.send(JSON.stringify({
      msg_type: 1,  // Handshake request
      protocol_version: 1,
      meeting_uuid: idValue,
      rtms_stream_id,
      signature,
      media_type: 9  // AUDIO(1) | TRANSCRIPT(8)
    }));
  });

  // ... handle responses, connect to media WebSocket
}

app.listen(8080);  // Port is an arbitrary choice for this sketch
```
See: Manual WebSocket Guide for complete implementation.

Media Type Bitmask


Combine types with bitwise OR:
| Type | Value | Description |
| --- | --- | --- |
| Audio | 1 | PCM audio samples |
| Video | 2 | H.264/JPG video frames |
| Screen Share | 4 | Separate from video! |
| Transcript | 8 | Real-time speech-to-text |
| Chat | 16 | In-meeting chat messages |
| All | 32 | All media types |

Example: Audio + Transcript = `1 | 8` = `9`
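The bitmask arithmetic above can be sketched as a pair of small helpers. The flag values are the ones listed in the table; the names `MediaType`, `combineMediaTypes`, and `includesMediaType` are illustrative, and treating `ALL` (32) as "every type requested" is an assumption of this sketch. Note that `ALL` is its own flag value: OR-ing the five individual flags gives 31, not 32.

```javascript
// Flag values from the media type table; helper names are illustrative.
const MediaType = Object.freeze({
  AUDIO: 1,
  VIDEO: 2,
  SCREEN_SHARE: 4,
  TRANSCRIPT: 8,
  CHAT: 16,
  ALL: 32,
});

// Combine any number of flags with bitwise OR.
function combineMediaTypes(...flags) {
  return flags.reduce((mask, flag) => mask | flag, 0);
}

// Check whether a mask requests a given type.
// Assumption: a mask containing ALL (32) implies every individual type.
function includesMediaType(mask, flag) {
  return (mask & MediaType.ALL) !== 0 || (mask & flag) !== 0;
}

const mask = combineMediaTypes(MediaType.AUDIO, MediaType.TRANSCRIPT);
console.log(mask); // 9
```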

Critical Gotchas

| Issue | Solution |
| --- | --- |
| Only 1 connection allowed | New connections kick out existing ones. Track active sessions! |
| Respond 200 immediately | If webhook delays, Zoom retries creating duplicate connections |
| Heartbeat mandatory | Respond to msg_type 12 with msg_type 13, or connection dies |
| Reconnection is YOUR job | RTMS doesn't auto-reconnect. Media keep-alive tolerance is now about 65s; signaling remains around 60s |
| Transcript language drift | Use `src_language` plus `enable_lid: false` when you want fixed-language transcription instead of automatic language switching |
| Single participant video only | `VIDEO_SINGLE_INDIVIDUAL_STREAM` supports one participant at a time. A new `VIDEO_SUBSCRIPTION_REQ` overrides the previous selection |
| Graceful close is explicit now | Use `STREAM_CLOSE_REQ` / `STREAM_CLOSE_RESP` when your backend wants to terminate the stream cleanly |
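The heartbeat rule in the table (answer msg_type 12 with msg_type 13) might be handled with a small function like the one below. This is a sketch, not the official implementation: the function name `answerKeepAlive` is hypothetical, and echoing the request's `timestamp` field in the response is an assumption that should be checked against the protocol reference.

```javascript
// Message type values come from the gotchas table.
const KEEP_ALIVE_REQ = 12;
const KEEP_ALIVE_RESP = 13;

// `send` is any function that transmits a string, e.g. (s) => ws.send(s).
// Returns true when the message was a keep-alive request and was answered.
function answerKeepAlive(rawMessage, send) {
  let msg;
  try {
    msg = JSON.parse(rawMessage.toString());
  } catch {
    return false; // Not JSON; leave it to other handlers
  }
  if (!msg || msg.msg_type !== KEEP_ALIVE_REQ) return false;

  // Assumption: the response echoes the request's timestamp field.
  send(JSON.stringify({ msg_type: KEEP_ALIVE_RESP, timestamp: msg.timestamp }));
  return true;
}
```

With the `ws` package this could be wired as `signalingWs.on('message', (data) => answerKeepAlive(data, (s) => signalingWs.send(s)))`, and likewise on the media socket.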

Environment Variables


SDK Environment Variables

```bash
# Required - Authentication
ZM_RTMS_CLIENT=your_client_id          # Zoom OAuth Client ID
ZM_RTMS_SECRET=your_client_secret      # Zoom OAuth Client Secret

# Optional - Webhook server
ZM_RTMS_PORT=8080                      # Default: 8080
ZM_RTMS_PATH=/webhook                  # Default: /

# Optional - Logging
ZM_RTMS_LOG_LEVEL=info                 # error, warn, info, debug, trace
ZM_RTMS_LOG_FORMAT=progressive         # progressive or json
ZM_RTMS_LOG_ENABLED=true
```

Manual Implementation Variables

```bash
ZOOM_CLIENT_ID=your_client_id
ZOOM_CLIENT_SECRET=your_client_secret
ZOOM_SECRET_TOKEN=your_webhook_token   # For webhook validation
```
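As a rough illustration of what "webhook validation" with `ZOOM_SECRET_TOKEN` can look like, the sketch below follows Zoom's general webhook verification scheme as commonly documented: an `x-zm-signature` header of the form `v0=<hex HMAC>` computed over `v0:<timestamp>:<raw body>`, with the timestamp taken from `x-zm-request-timestamp`, plus the `endpoint.url_validation` challenge reply. None of these details appear in this document, so treat the header names, message format, and function names as assumptions to verify against the official webhook documentation.

```javascript
const crypto = require('crypto');

// HMAC-SHA256 helper returning a hex digest.
function hmacHex(secret, message) {
  return crypto.createHmac('sha256', secret).update(message).digest('hex');
}

// Check a signature header value against the raw (unparsed) request body.
// Assumed format: "v0=" + HMAC-SHA256(secretToken, "v0:<timestamp>:<rawBody>").
function isValidZoomSignature({ rawBody, timestamp, signature, secretToken }) {
  const expected = Buffer.from(`v0=${hmacHex(secretToken, `v0:${timestamp}:${rawBody}`)}`);
  const received = Buffer.from(String(signature));
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}

// Build the reply body for an endpoint.url_validation challenge (assumed shape).
function urlValidationResponse(plainToken, secretToken) {
  return { plainToken, encryptedToken: hmacHex(secretToken, plainToken) };
}
```

The comparison uses `crypto.timingSafeEqual` rather than `===` so that the check does not leak how many leading characters matched.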

Zoom App Setup


For Meetings and Webinars (General App)


  1. Go to marketplace.zoom.us -> Develop -> Build App
  2. Choose General App -> User-Managed
  3. Features -> Access -> Enable Event Subscription
  4. Add Events -> Search "rtms" -> Select:
    • meeting.rtms_started
    • meeting.rtms_stopped
    • webinar.rtms_started (if using webinars)
    • webinar.rtms_stopped (if using webinars)
  5. Scopes -> Add Scopes -> Search "rtms" -> Add:
    • meeting:read:meeting_audio
    • meeting:read:meeting_video
    • meeting:read:meeting_transcript
    • meeting:read:meeting_chat
    • webinar:read:webinar_audio (if using webinars)
    • webinar:read:webinar_video (if using webinars)
    • webinar:read:webinar_transcript (if using webinars)
    • webinar:read:webinar_chat (if using webinars)

For Video SDK (Video SDK App)


  1. Go to marketplace.zoom.us -> Develop -> Build App
  2. Choose Video SDK App
  3. Use your SDK Key and SDK Secret (not OAuth Client ID/Secret)
  4. Add Events:
    • session.rtms_started
    • session.rtms_stopped

Sample Repositories


Official Samples

| Repository | Description |
| --- | --- |
| rtms-samples | RTMSManager, boilerplates, AI samples |
| rtms-quickstart-js | JavaScript SDK quickstart |
| rtms-quickstart-py | Python SDK quickstart |
| rtms-sdk-cpp | C++ SDK |
| zoom-rtms | Main SDK repository |

AI Integration Samples

| Sample | Description |
| --- | --- |
| rtms-meeting-assistant-starter-kit | AI meeting assistant with summaries |
| arlo-meeting-assistant | Production meeting assistant with DB |
| videosdk-rtms-transcribe-audio | Whisper transcription |

Complete Documentation


Concepts


  • Connection Architecture - Two-phase WebSocket design
  • Lifecycle Flow - Webhook to streaming flow

Examples


  • SDK Quickstart - Using @zoom/rtms SDK
  • Manual WebSocket - Raw protocol implementation
  • RTMS Bot - Complete bot implementation guide
  • AI Integration - Transcription and analysis patterns

References


  • Media Types - Audio, video, transcript, chat, screen share
  • Data Types - All enums and constants
  • Connection - WebSocket protocol details
  • Webhooks - Event subscription

Troubleshooting


  • Common Issues - FAQ and solutions

Resources

Need help? Start with the Integrated Index section below for complete navigation.

Integrated Index

This section was migrated from SKILL.md.
RTMS provides real-time access to live audio, video, transcript, chat, and screen share from Zoom meetings, webinars, and Video SDK sessions.

Critical Positioning


Treat RTMS as a backend service for receiving and processing media streams.
  • Backend role: ingest audio/video/share/chat/transcript, run AI/analytics, persist/forward data.
  • Optional frontend role: Zoom App SDK or web dashboard that consumes processed stream data from backend transport (WebSocket/SSE/other).
  • Kickoff model: backend waits for RTMS start webhook events, then starts stream processing.
Do not model RTMS as a frontend-only SDK.

Quick Start Path


If you're new to RTMS, follow this order:
  1. Run preflight checks first -> RUNBOOK.md
  2. Understand the architecture -> concepts/connection-architecture.md
    • Two-phase WebSocket: Signaling + Media
    • Why RTMS doesn't use bots
  3. Choose your approach -> SDK or Manual
    • SDK (recommended): examples/sdk-quickstart.md
    • Manual WebSocket: examples/manual-websocket.md
  4. Understand the lifecycle -> concepts/lifecycle-flow.md
    • Webhook -> Signaling -> Media -> Streaming
  5. Configure media types -> references/media-types.md
    • Audio, video, transcript, chat, screen share
  6. Troubleshoot issues -> troubleshooting/common-issues.md
    • Connection problems, duplicate webhooks, missing data


Documentation Structure


rtms/
├── SKILL.md                           # Main skill overview and navigation guide (this file)
├── concepts/                          # Core architectural patterns
│   ├── connection-architecture.md     # Two-phase WebSocket design
│   └── lifecycle-flow.md              # Webhook to streaming flow
├── examples/                          # Complete working code
│   ├── sdk-quickstart.md              # Using @zoom/rtms SDK
│   ├── manual-websocket.md            # Raw protocol implementation
│   ├── rtms-bot.md                    # Complete RTMS bot implementation
│   └── ai-integration.md              # Transcription and analysis
├── references/                        # Reference documentation
│   ├── media-types.md                 # Audio, video, transcript, chat, share
│   ├── data-types.md                  # All enums and constants
│   ├── connection.md                  # WebSocket protocol details
│   └── webhooks.md                    # Event subscription
└── troubleshooting/                   # Problem solving guides
    └── common-issues.md               # FAQ and solutions


By Use Case


I want to get meeting transcripts


  1. SDK Quickstart - Fastest approach
  2. Media Types - Transcript configuration
  3. AI Integration - Whisper, Deepgram, AssemblyAI

I want to record meetings


  1. Media Types - Audio + Video configuration
  2. SDK Quickstart - Receiving media
  3. AI Integration - Gap-filled recording

I want to build an AI meeting assistant


  1. AI Integration - Complete patterns
  2. SDK Quickstart - Media ingestion
  3. Lifecycle Flow - Event handling

I want to build a complete RTMS bot


  1. RTMS Bot - Complete implementation guide
  2. Lifecycle Flow - Webhook to streaming flow
  3. Connection Architecture - Two-phase design

I need full protocol control


  1. Manual WebSocket - START HERE
  2. Connection Architecture - Two-phase design
  3. Data Types - All message types and enums
  4. Connection - Protocol details

I'm getting connection errors


  1. Common Issues - Diagnostic checklist
  2. Connection Architecture - Verify flow
  3. Webhooks - Validation and timing

I want to understand the architecture


  1. Connection Architecture - Two-phase WebSocket
  2. Lifecycle Flow - Complete flow diagram
  3. Data Types - Protocol constants


By Product


I'm building for Zoom Meetings


  • Standard RTMS setup. Webhook event: `meeting.rtms_started`. Uses General App with OAuth.
  • Start with SDK Quickstart or Manual WebSocket.

I'm building for Zoom Webinars


  • Same as meetings, but webhook event is `webinar.rtms_started`. Payload still uses `meeting_uuid` (NOT `webinar_uuid`).
  • Add webinar scopes and event subscriptions. See Webhooks.
  • Only panelist streams are confirmed available. Attendee streams may not be available individually.

I'm building for Zoom Video SDK


  • Webhook event: `session.rtms_started`. Payload uses `session_id` (NOT `meeting_uuid`).
  • Requires a Video SDK App with SDK Key/Secret (not OAuth Client ID/Secret).
  • Once connected, the protocol is identical to meetings.
  • See Webhooks for payload details.
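The per-product differences described in this section (event name and payload ID field) can be captured in one lookup. This is a minimal sketch: the event names and ID fields come from this document, while `describeStartEvent`, the `product` labels, and the returned object shape are hypothetical.

```javascript
// Start events and the payload field that identifies the meeting/session.
const START_EVENTS = new Map([
  ['meeting.rtms_started', { product: 'meeting', idField: 'meeting_uuid' }],
  ['webinar.rtms_started', { product: 'webinar', idField: 'meeting_uuid' }], // not webinar_uuid
  ['session.rtms_started', { product: 'videosdk', idField: 'session_id' }],
]);

// Returns { product, id, streamId } for a start event, or null for other events.
// Throws if a start event is missing its expected ID field.
function describeStartEvent(event, payload) {
  const spec = START_EVENTS.get(event);
  if (!spec) return null;
  const id = payload[spec.idField];
  if (!id) throw new Error(`${event} payload is missing ${spec.idField}`);
  return { product: spec.product, id, streamId: payload.rtms_stream_id };
}
```

Reading the ID through an explicit table avoids the `meeting_uuid || session_id` fallback silently picking the wrong field.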

Key Documents


1. Connection Architecture (CRITICAL)


concepts/connection-architecture.md
RTMS uses two separate WebSocket connections:
  • Signaling WebSocket: Authentication, control, heartbeats
  • Media WebSocket: Actual audio/video/transcript data

2. SDK vs Manual (DECISION POINT)

examples/sdk-quickstart.md vs examples/manual-websocket.md

| SDK | Manual |
| --- | --- |
| Handles WebSocket complexity | Full protocol control |
| Automatic reconnection | DIY reconnection |
| Less code | More code |
| Best for most use cases | Best for custom requirements |
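For the manual path, "DIY reconnection" usually means some form of retry with backoff. The sketch below shows capped exponential backoff as one possible approach; it is a generic pattern rather than anything the RTMS protocol prescribes, and every name and default (`backoffDelayMs`, `connectWithRetry`, 500 ms base, 30 s cap, 5 attempts) is an illustrative assumption.

```javascript
// Promise-based sleep used between attempts.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, { baseMs = 500, maxMs = 30000 } = {}) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Call an async `connect` function until it resolves or attempts run out.
async function connectWithRetry(connect, { maxAttempts = 5, sleep = defaultSleep, ...backoff } = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final failure.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt, backoff));
    }
  }
  throw lastError;
}
```

When choosing the base delay and cap, keep the keep-alive tolerances mentioned earlier in mind so that retries happen well inside them.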

3. Critical Gotchas (MOST COMMON ISSUES)


troubleshooting/common-issues.md
  1. Respond 200 immediately - Delayed webhook responses cause duplicates
  2. Only 1 connection per stream - New connections kick out existing
  3. Heartbeat required - Must respond to keep-alive or connection dies
  4. Track active sessions - Prevent duplicate join attempts
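Items 1, 2, and 4 above can be combined into one handler shape: acknowledge first, then start a connection only if the stream is not already tracked. This is a sketch with hypothetical names (`ActiveStreams`, `handleWebhook`); it assumes both start and stop payloads carry `rtms_stream_id`, which this document only shows for start events.

```javascript
// Tracks which RTMS streams already have a connection.
class ActiveStreams {
  constructor() {
    this.ids = new Set();
  }

  // Returns true only for the first start request of a given stream.
  tryStart(streamId) {
    if (this.ids.has(streamId)) return false;
    this.ids.add(streamId);
    return true;
  }

  // Forget a stream so a later start event can connect again.
  stop(streamId) {
    return this.ids.delete(streamId);
  }
}

// `ack` sends the HTTP 200; `connect` opens the RTMS connection for a payload.
function handleWebhook(body, ack, connect, active) {
  ack(); // Respond before doing any work
  const { event, payload } = body || {};
  if (typeof event !== 'string' || !payload) return;

  if (event.endsWith('.rtms_started')) {
    if (active.tryStart(payload.rtms_stream_id)) connect(payload);
  } else if (event.endsWith('.rtms_stopped')) {
    active.stop(payload.rtms_stream_id);
  }
}
```

In an Express route, `ack` would be `() => res.status(200).send()`. A retried `rtms_started` webhook for the same stream is then acknowledged but does not open a second connection.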


Key Learnings


Critical Discoveries:


  1. Two-Phase WebSocket Design
    • Signaling: Control plane (handshake, heartbeat, start/stop)
    • Media: Data plane (audio, video, transcript, chat, share)
    • See: Connection Architecture
  2. Webhook Response Timing
    • MUST respond 200 BEFORE any processing
    • Delayed response -> Zoom retries -> duplicate connections
    • See: Common Issues
  3. Heartbeat is Mandatory
    • Signaling: Receive msg_type 12, respond with msg_type 13
    • Media: Same pattern
    • Failure to respond = connection closed
    • See: Connection
  4. Signature Generation
    • Format: `HMAC-SHA256(clientSecret, "clientId,meetingUuid,streamId")`
    • For Video SDK, use `session_id` in place of `meetingUuid`
    • Webinars still use `meeting_uuid` (not `webinar_uuid`)
    • Required for both signaling and media handshakes
    • See: Manual WebSocket
  5. Media Types are Bitmasks
    • Audio=1, Video=2, Share=4, Transcript=8, Chat=16, All=32
    • Combine with OR: Audio+Transcript = 1|8 = 9
    • See: Media Types
  6. Screen Share is SEPARATE from Video
    • Different msg_type (16 vs 15)
    • Different media flag (4 vs 2)
    • Must subscribe separately
    • See: Media Types
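Point 6 above (video and screen share arrive under different msg_type values) suggests routing media messages by type rather than treating them as one stream. The sketch below uses only the two values stated in this document (15 for video, 16 for screen share); other media msg_type values are deliberately left out, and the names `createMediaRouter`, `onVideo`, and `onShare` are hypothetical.

```javascript
// Only the two msg_type values stated in this document are handled here.
const MSG_VIDEO = 15;
const MSG_SCREEN_SHARE = 16;

// Build a function that dispatches a parsed media message to the right handler.
// Returns a label describing how the message was routed.
function createMediaRouter(handlers) {
  return function route(msg) {
    switch (msg.msg_type) {
      case MSG_VIDEO:
        handlers.onVideo?.(msg);
        return 'video';
      case MSG_SCREEN_SHARE:
        handlers.onShare?.(msg);
        return 'share';
      default:
        return 'unhandled';
    }
  };
}
```

Remember that receiving both also requires requesting both flags (2 and 4) in the media type bitmask.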


Quick Reference

"Connection fails"

-> Common Issues

"Duplicate connections"

-> Webhook timing

"No audio/video data"

-> Media Types - Check configuration

"How do I implement manually?"

-> Manual WebSocket

"What message types exist?"

-> Data Types

"How do I integrate AI?"

-> AI Integration

Document Version


Based on Zoom RTMS SDK v1.x and official documentation as of 2026.

Happy coding!
Remember: Start with SDK Quickstart for the fastest path, or Manual WebSocket if you need full control.