zoom-rtms

Zoom Realtime Media Streams (RTMS)


Background reference for live Zoom media pipelines. Prefer `build-zoom-bot` first, then use this skill for stream types, capabilities, and RTMS-specific implementation constraints.


Zoom Realtime Media Streams (RTMS)


Expert guidance for accessing live audio, video, transcript, chat, and screen share data from Zoom meetings, webinars, Video SDK sessions, and Zoom Contact Center Voice in real-time. RTMS uses a WebSocket-based protocol with open standards and does not require a meeting bot to capture the media plane.


Read This First (Critical)


RTMS is primarily a backend media ingestion service.

Your backend receives and processes live media: audio, video, screen share, chat, transcript.
RTMS is not a frontend UI SDK by itself.
Processing is event-triggered: backend waits for RTMS start webhook events before stream handling begins.

Optional architecture (common):

Add a Zoom App SDK frontend for in-client UI/controls.
Stream backend RTMS outputs to frontend via WebSocket (or SSE, gRPC, queue workers, etc.).

Use RTMS for media/data plane, and use frontend frameworks/Zoom Apps for presentation + user interactions.

Official Documentation: https://developers.zoom.us/docs/rtms/
SDK Reference (JS): https://zoom.github.io/rtms/js/
SDK Reference (Python): https://zoom.github.io/rtms/py/
Sample Repository: https://github.com/zoom/rtms-samples


Quick Links


New to RTMS? Follow this path:

Connection Architecture - Two-phase WebSocket design
SDK Quickstart - Fastest way to receive media (recommended)
Manual WebSocket - Full protocol control without SDK
Media Types - Audio, video, transcript, chat, screen share

Complete Implementation:

RTMS Bot - End-to-end bot implementation guide

Reference:

Lifecycle Flow - Complete webhook-to-streaming flow
Data Types - All enums and constants
Webhooks - Event subscription details
Environment Variables - credential modes and runtime knobs
Quickstart Notes - Secondary quickstart guide
Integrated Index - see the section below in this file

Having issues?

Connection fails -> Common Issues
Duplicate connections -> Webhook Gotchas
No audio/video -> Media Configuration
Start with preflight checks -> 5-Minute Runbook


Supported Products


Product	Webhook Event	Payload ID	App Type
Meetings	`meeting.rtms_started` / `meeting.rtms_stopped`	`meeting_uuid`	General App
Webinars	`webinar.rtms_started` / `webinar.rtms_stopped`	`meeting_uuid` (same!)	General App
Video SDK	`session.rtms_started` / `session.rtms_stopped`	`session_id`	Video SDK App
Zoom Contact Center Voice	Product-specific RTMS/ZCC Voice events	Product-specific stream/session identifiers	Contact Center / approved RTMS integration

Once connected, the core signaling/media socket model is shared across products. Meetings, webinars, and Video SDK sessions use the familiar start/stop webhooks. Zoom Contact Center Voice adds its own RTMS/ZCC Voice event family and should be treated as the same transport model with product-specific event payloads.
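The event-to-identifier mapping in the table can be captured in a small lookup. This is a minimal sketch rather than part of the SDK: the function name `describeRtmsStart` and its return shape are illustrative assumptions, and Zoom Contact Center Voice is left out because its event names are product-specific.

```javascript
// Maps each RTMS start event to the payload field that identifies the session.
// Event names and ID fields come from the table above; everything else is illustrative.
const RTMS_START_EVENTS = new Map([
  ['meeting.rtms_started', { product: 'meeting', idField: 'meeting_uuid' }],
  ['webinar.rtms_started', { product: 'webinar', idField: 'meeting_uuid' }], // not webinar_uuid
  ['session.rtms_started', { product: 'video_sdk', idField: 'session_id' }],
]);

// Returns { product, id, streamId } for a known start event, or null otherwise.
function describeRtmsStart(event, payload) {
  const entry = RTMS_START_EVENTS.get(event);
  if (!entry) return null; // not an RTMS start event this sketch knows about
  const id = payload[entry.idField];
  if (!id) {
    throw new Error(`${event} payload is missing ${entry.idField}`);
  }
  return { product: entry.product, id, streamId: payload.rtms_stream_id };
}
```

Failing loudly on a missing identifier is a design choice here: a webinar payload handled with Video SDK assumptions (or the reverse) would otherwise produce a bad signature much later in the handshake.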


RTMS Overview


RTMS is a data pipeline that gives your app access to live media from Zoom meetings, webinars, and Video SDK sessions without participant bots. Instead of having automated clients join meetings, use RTMS to collect media data directly from Zoom's infrastructure.


What RTMS Provides


Media Type	Format	Use Cases
Audio	PCM (L16), G.711, G.722, Opus	Transcription, voice analysis, recording
Video	H.264, JPG, PNG	Recording, AI vision, thumbnails, active participant selection
Screen Share	H.264, JPG, PNG	Content capture, slide extraction
Transcript	JSON text	Meeting notes, search, compliance
Chat	JSON text	Archive, sentiment analysis


March 2026 Protocol Changes


Zoom Contact Center Voice support: RTMS now covers Contact Center Voice audio and transcript scenarios.
Transcript Language Identification (LID) control: transcript media handshakes now support `src_language` and `enable_lid`. LID is enabled by default. Set `enable_lid: false` to force a fixed language.
Single individual video stream subscription: RTMS can now stream one participant's camera feed at a time when `data_opt` is set to `VIDEO_SINGLE_INDIVIDUAL_STREAM`.
Graceful client-initiated shutdown: backends can send `STREAM_CLOSE_REQ` over the signaling socket and wait for `STREAM_CLOSE_RESP`.
Media keep-alive tolerance increased: the media socket keep-alive timeout is now 65 seconds, up from 35.


Two Approaches


Approach	Best For	Complexity
SDK ( `@zoom/rtms` )	Most use cases	Low - handles WebSocket complexity
Manual WebSocket	Custom protocols, other languages	High - full protocol implementation


Prerequisites


Node.js 20.3.0+ (24 LTS recommended) for JavaScript SDK
Python 3.10+ for Python SDK
Zoom General App (for meetings/webinars) or Video SDK App (for Video SDK) with RTMS feature enabled
Webhook endpoint for RTMS events
Server to receive WebSocket streams

Need RTMS access? Post in Zoom Developer Forum requesting RTMS access with your use case.


Quick Start (SDK - Recommended)


javascript

import rtms from "@zoom/rtms";

// RTMS start events across products (stop events are not handled in this example)
const RTMS_EVENTS = ["meeting.rtms_started", "webinar.rtms_started", "session.rtms_started"];

// Handle webhook events
rtms.onWebhookEvent(({ event, payload }) => {
  if (!RTMS_EVENTS.includes(event)) return;

  const client = new rtms.Client();

  client.onAudioData((data, timestamp, metadata) => {
    console.log(`Audio from ${metadata.userName}: ${data.length} bytes`);
  });

  client.onTranscriptData((data, timestamp, metadata) => {
    const text = data.toString('utf8');
    console.log(`${metadata.userName}: ${text}`);
  });

  client.onJoinConfirm((reason) => {
    console.log(`Joined session: ${reason}`);
  });

  // SDK handles all WebSocket connections automatically
  // Accepts both meeting_uuid and session_id transparently
  client.join(payload);
});


Quick Start (Manual WebSocket)


For full control or non-SDK languages, implement the two-phase WebSocket protocol:

javascript

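const express = require('express');
const WebSocket = require('ws');
const crypto = require('crypto');

// Credentials from the "Manual Implementation Variables" section
const CLIENT_ID = process.env.ZOOM_CLIENT_ID;
const CLIENT_SECRET = process.env.ZOOM_CLIENT_SECRET;

// Express app that receives the webhook (parses JSON bodies into req.body)
const app = express();
app.use(express.json());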

const RTMS_EVENTS = ['meeting.rtms_started', 'webinar.rtms_started', 'session.rtms_started'];

// 1. Generate signature
// For meetings/webinars: uses meeting_uuid. For Video SDK: uses session_id.
function generateSignature(clientId, idValue, streamId, clientSecret) {
  const message = `${clientId},${idValue},${streamId}`;
  return crypto.createHmac('sha256', clientSecret).update(message).digest('hex');
}

// 2. Handle webhook
app.post('/webhook', (req, res) => {
  res.status(200).send();  // CRITICAL: Respond immediately!
  
  const { event, payload } = req.body;
  if (RTMS_EVENTS.includes(event)) {
    connectToRTMS(payload);
  }
});

// 3. Connect to signaling WebSocket
function connectToRTMS(payload) {
  const { server_urls, rtms_stream_id } = payload;
  // meeting_uuid for meetings/webinars, session_id for Video SDK
  const idValue = payload.meeting_uuid || payload.session_id;
  const signature = generateSignature(CLIENT_ID, idValue, rtms_stream_id, CLIENT_SECRET);
  
  const signalingWs = new WebSocket(server_urls);
  
  signalingWs.on('open', () => {
    signalingWs.send(JSON.stringify({
      msg_type: 1,  // Handshake request
      protocol_version: 1,
      meeting_uuid: idValue,
      rtms_stream_id,
      signature,
      media_type: 9  // AUDIO(1) | TRANSCRIPT(8)
    }));
  });
  
  // ... handle responses, connect to media WebSocket
}

See: Manual WebSocket Guide for complete implementation.


Media Type Bitmask


Combine types with bitwise OR:

Type	Value	Description
Audio	1	PCM audio samples
Video	2	H.264/JPG video frames
Screen Share	4	Separate from video!
Transcript	8	Real-time speech-to-text
Chat	16	In-meeting chat messages
All	32	All media types

Example: Audio + Transcript = `1 | 8` = 9
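The bitmask arithmetic can be wrapped in two small helpers. This is a minimal sketch: the numeric values come from the table above, while the constant names (for example `SCREEN_SHARE`) and the helper names are illustrative and may differ from the enum names in the official data types reference.

```javascript
// Flag values from the Media Type Bitmask table; names are illustrative.
const MEDIA_TYPE = Object.freeze({
  AUDIO: 1,
  VIDEO: 2,
  SCREEN_SHARE: 4,
  TRANSCRIPT: 8,
  CHAT: 16,
  ALL: 32,
});

// Combine individual flags into one media_type value with bitwise OR.
function combineMediaTypes(...flags) {
  return flags.reduce((mask, flag) => mask | flag, 0);
}

// List the names of the flags that are set in a mask.
function describeMediaTypes(mask) {
  return Object.entries(MEDIA_TYPE)
    .filter(([, flag]) => (mask & flag) === flag)
    .map(([name]) => name);
}

const mask = combineMediaTypes(MEDIA_TYPE.AUDIO, MEDIA_TYPE.TRANSCRIPT);
console.log(mask, describeMediaTypes(mask)); // 9 [ 'AUDIO', 'TRANSCRIPT' ]
```

Note that `ALL` is its own bit (32) in this table rather than the OR of the other five flags (which would be 31), so `describeMediaTypes` reports it only when that bit is set.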


Critical Gotchas


Issue	Solution
Only 1 connection allowed	New connections kick out existing ones. Track active sessions!
Respond 200 immediately	If the webhook response is delayed, Zoom retries the event, which creates duplicate connections
Heartbeat mandatory	Respond to msg_type 12 with msg_type 13, or connection dies
Reconnection is YOUR job	RTMS doesn't auto-reconnect. Media keep-alive tolerance is now about 65s; signaling remains around 60s
Transcript language drift	Use `src_language` plus `enable_lid: false` when you want fixed-language transcription instead of automatic language switching
Single participant video only	`VIDEO_SINGLE_INDIVIDUAL_STREAM` supports one participant at a time. A new `VIDEO_SUBSCRIPTION_REQ` overrides the previous selection
Graceful close is explicit now	Use `STREAM_CLOSE_REQ` / `STREAM_CLOSE_RESP` when your backend wants to terminate the stream cleanly
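The first two rows suggest keeping a registry of active streams so that a retried webhook does not open a second connection and kick out the first. A minimal sketch, assuming `rtms_stream_id` is a suitable key and that the connection handle exposes a `close()` method; the class name `ActiveStreams` is hypothetical.

```javascript
// Tracks one active RTMS connection per stream so webhook retries are ignored.
class ActiveStreams {
  constructor() {
    this.streams = new Map(); // rtms_stream_id -> connection handle
  }

  // Runs connect() and records its handle unless the stream is already tracked.
  // Returns true if a new connection was started, false for a duplicate.
  tryStart(streamId, connect) {
    if (this.streams.has(streamId)) return false;
    this.streams.set(streamId, connect());
    return true;
  }

  // Call on a *.rtms_stopped event or when the sockets close.
  // Returns false if the stream was not being tracked.
  stop(streamId) {
    const handle = this.streams.get(streamId);
    if (!handle) return false;
    this.streams.delete(streamId);
    if (typeof handle.close === 'function') handle.close();
    return true;
  }
}
```

In a webhook handler this would look like `streams.tryStart(payload.rtms_stream_id, () => connectToRTMS(payload))`, called after the 200 response has been sent. An in-memory `Map` only protects a single process; a deployment with several workers would need shared state instead.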


Environment Variables

SDK Environment Variables

bash

# Required - Authentication
ZM_RTMS_CLIENT=your_client_id        # Zoom OAuth Client ID
ZM_RTMS_SECRET=your_client_secret    # Zoom OAuth Client Secret

# Optional - Webhook server
ZM_RTMS_PORT=8080                    # Default: 8080
ZM_RTMS_PATH=/webhook                # Default: /

# Optional - Logging
ZM_RTMS_LOG_LEVEL=info               # error, warn, info, debug, trace
ZM_RTMS_LOG_FORMAT=progressive       # progressive or json
ZM_RTMS_LOG_ENABLED=true

Manual Implementation Variables


bash

ZOOM_CLIENT_ID=your_client_id
ZOOM_CLIENT_SECRET=your_client_secret
ZOOM_SECRET_TOKEN=your_webhook_token   # For webhook validation
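`ZOOM_SECRET_TOKEN` is used to check that a webhook really came from Zoom. The sketch below follows Zoom's general webhook verification scheme as I understand it (an `x-zm-signature` header of the form `v0=<hex HMAC>` computed over `v0:<timestamp>:<raw body>`, plus an `endpoint.url_validation` challenge); treat the header names and message format as assumptions to confirm against the current webhook documentation. The function names are hypothetical.

```javascript
const crypto = require('node:crypto');

// Recomputes the signature Zoom is assumed to send in x-zm-signature.
function computeZoomSignature(secretToken, timestamp, rawBody) {
  const message = `v0:${timestamp}:${rawBody}`;
  const hash = crypto.createHmac('sha256', secretToken).update(message).digest('hex');
  return `v0=${hash}`;
}

// Compares the received signature with the expected one in constant time.
// rawBody must be the exact request body string, not a re-serialized object.
function isValidZoomWebhook(secretToken, headers, rawBody) {
  const received = headers['x-zm-signature'];
  const timestamp = headers['x-zm-request-timestamp'];
  if (!received || !timestamp) return false;
  const expected = Buffer.from(computeZoomSignature(secretToken, timestamp, rawBody));
  const actual = Buffer.from(received);
  return expected.length === actual.length && crypto.timingSafeEqual(expected, actual);
}

// Builds the JSON body for the endpoint.url_validation challenge.
function buildUrlValidationResponse(secretToken, plainToken) {
  const encryptedToken = crypto.createHmac('sha256', secretToken).update(plainToken).digest('hex');
  return { plainToken, encryptedToken };
}
```

Verification needs the raw request body, so a JSON body parser that discards the original bytes has to be configured to keep them. The length check before `timingSafeEqual` is required because that call throws when the buffers differ in length.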


Zoom App Setup


For Meetings and Webinars (General App)


Go to marketplace.zoom.us -> Develop -> Build App
Choose General App -> User-Managed
Features -> Access -> Enable Event Subscription

Add Events -> Search "rtms" -> Select:
- `meeting.rtms_started`
- `meeting.rtms_stopped`
- `webinar.rtms_started` (if using webinars)
- `webinar.rtms_stopped` (if using webinars)

Scopes -> Add Scopes -> Search "rtms" -> Add:
- `meeting:read:meeting_audio`
- `meeting:read:meeting_video`
- `meeting:read:meeting_transcript`
- `meeting:read:meeting_chat`
- `webinar:read:webinar_audio` (if using webinars)
- `webinar:read:webinar_video` (if using webinars)
- `webinar:read:webinar_transcript` (if using webinars)
- `webinar:read:webinar_chat` (if using webinars)


For Video SDK (Video SDK App)


Go to marketplace.zoom.us -> Develop -> Build App
Choose Video SDK App
Use your SDK Key and SDK Secret (not OAuth Client ID/Secret)

Add Events:
- `session.rtms_started`
- `session.rtms_stopped`


Sample Repositories


Official Samples


Repository	Description
rtms-samples	RTMSManager, boilerplates, AI samples
rtms-quickstart-js	JavaScript SDK quickstart
rtms-quickstart-py	Python SDK quickstart
rtms-sdk-cpp	C++ SDK
zoom-rtms	Main SDK repository


AI Integration Samples


Sample	Description
rtms-meeting-assistant-starter-kit	AI meeting assistant with summaries
arlo-meeting-assistant	Production meeting assistant with DB
videosdk-rtms-transcribe-audio	Whisper transcription


Complete Documentation


Concepts


Connection Architecture - Two-phase WebSocket design
Lifecycle Flow - Webhook to streaming flow


Examples


SDK Quickstart - Using @zoom/rtms SDK
Manual WebSocket - Raw protocol implementation
RTMS Bot - Complete bot implementation guide
AI Integration - Transcription and analysis patterns


References


Media Types - Audio, video, transcript, chat, screen share
Data Types - All enums and constants
Connection - WebSocket protocol details
Webhooks - Event subscription


Troubleshooting


Common Issues - FAQ and solutions


Resources


Official docs: https://developers.zoom.us/docs/rtms/
Data types: https://developers.zoom.us/docs/rtms/data-types/
Media params: https://developers.zoom.us/docs/rtms/media-parameter-definition/
Developer forum: https://devforum.zoom.us/

Need help? Start with the Integrated Index section below for complete navigation.


Integrated Index


This section was migrated from `SKILL.md`.

RTMS provides real-time access to live audio, video, transcript, chat, and screen share from Zoom meetings, webinars, and Video SDK sessions.


Critical Positioning


Treat RTMS as a backend service for receiving and processing media streams.

Backend role: ingest audio/video/share/chat/transcript, run AI/analytics, persist/forward data.
Optional frontend role: Zoom App SDK or web dashboard that consumes processed stream data from backend transport (WebSocket/SSE/other).
Kickoff model: backend waits for RTMS start webhook events, then starts stream processing.

Do not model RTMS as a frontend-only SDK.
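The optional frontend role can be sketched as a small fan-out hub: the backend ingests RTMS data and pushes processed results to browsers over Server-Sent Events. This is one possible transport among those listed (WebSocket, SSE, other), not a prescribed design; `SseHub` and `formatSseFrame` are hypothetical names, and the `res` objects are assumed to be Node `http.ServerResponse` instances.

```javascript
// Formats one Server-Sent Events frame. `event` names the kind of data.
function formatSseFrame(event, data) {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Fans processed RTMS output out to connected frontend clients.
class SseHub {
  constructor() {
    this.clients = new Set(); // open HTTP responses, one per browser
  }

  // Attach an HTTP response as an SSE client and drop it when it closes.
  add(res) {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      Connection: 'keep-alive',
    });
    this.clients.add(res);
    res.on('close', () => this.clients.delete(res));
  }

  // Push one item (for example a transcript line) to every connected client.
  publish(event, data) {
    const frame = formatSseFrame(event, data);
    for (const res of this.clients) res.write(frame);
  }
}
```

A backend would call `hub.add(res)` in its `GET /events` route and `hub.publish('transcript', { user, text })` from inside a callback such as `client.onTranscriptData`. SSE is one-directional, which fits display-only dashboards; controls that send commands back would need a separate request path or a WebSocket.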


Quick Start Path


If you're new to RTMS, follow this order:

Run preflight checks first -> RUNBOOK.md
Understand the architecture -> concepts/connection-architecture.md
- Two-phase WebSocket: Signaling + Media
- Why RTMS doesn't use bots
Choose your approach -> SDK or Manual
- SDK (recommended): examples/sdk-quickstart.md
- Manual WebSocket: examples/manual-websocket.md
Understand the lifecycle -> concepts/lifecycle-flow.md
- Webhook -> Signaling -> Media -> Streaming
Configure media types -> references/media-types.md
- Audio, video, transcript, chat, screen share
Troubleshoot issues -> troubleshooting/common-issues.md
- Connection problems, duplicate webhooks, missing data


Documentation Structure


rtms/
├── SKILL.md                           # Main skill overview and navigation guide (this file)
│
├── concepts/                          # Core architectural patterns
│   ├── connection-architecture.md     # Two-phase WebSocket design
│   └── lifecycle-flow.md              # Webhook to streaming flow
│
├── examples/                          # Complete working code
│   ├── sdk-quickstart.md              # Using @zoom/rtms SDK
│   ├── manual-websocket.md            # Raw protocol implementation
│   ├── rtms-bot.md                    # Complete RTMS bot implementation
│   └── ai-integration.md              # Transcription and analysis
│
├── references/                        # Reference documentation
│   ├── media-types.md                 # Audio, video, transcript, chat, share
│   ├── data-types.md                  # All enums and constants
│   ├── connection.md                  # WebSocket protocol details
│   └── webhooks.md                    # Event subscription
│
└── troubleshooting/                   # Problem solving guides
    └── common-issues.md               # FAQ and solutions


By Use Case


I want to get meeting transcripts


SDK Quickstart - Fastest approach
Media Types - Transcript configuration
AI Integration - Whisper, Deepgram, AssemblyAI


I want to record meetings


Media Types - Audio + Video configuration
SDK Quickstart - Receiving media
AI Integration - Gap-filled recording


I want to build an AI meeting assistant


AI Integration - Complete patterns
SDK Quickstart - Media ingestion
Lifecycle Flow - Event handling


I want to build a complete RTMS bot


RTMS Bot - Complete implementation guide
Lifecycle Flow - Webhook to streaming flow
Connection Architecture - Two-phase design


I need full protocol control


Manual WebSocket - START HERE
Connection Architecture - Two-phase design
Data Types - All message types and enums
Connection - Protocol details


I'm getting connection errors


Common Issues - Diagnostic checklist
Connection Architecture - Verify flow
Webhooks - Validation and timing


I want to understand the architecture


Connection Architecture - Two-phase WebSocket
Lifecycle Flow - Complete flow diagram
Data Types - Protocol constants


By Product


I'm building for Zoom Meetings


Standard RTMS setup. Webhook event: `meeting.rtms_started`. Uses General App with OAuth.
Start with SDK Quickstart or Manual WebSocket.


I'm building for Zoom Webinars


Same as meetings, but the webhook event is `webinar.rtms_started`. The payload still uses `meeting_uuid` (NOT `webinar_uuid`).
Add webinar scopes and event subscriptions. See Webhooks.
Only panelist streams are confirmed available. Attendee streams may not be available individually.


I'm building for Zoom Video SDK


Webhook event: `session.rtms_started`. Payload uses `session_id` (NOT `meeting_uuid`).
Requires a Video SDK App with SDK Key/Secret (not OAuth Client ID/Secret).
Once connected, the protocol is identical to meetings.
See Webhooks for payload details.


Key Documents


1. Connection Architecture (CRITICAL)


concepts/connection-architecture.md

RTMS uses two separate WebSocket connections:

Signaling WebSocket: Authentication, control, heartbeats
Media WebSocket: Actual audio/video/transcript data


2. SDK vs Manual (DECISION POINT)


examples/sdk-quickstart.md vs examples/manual-websocket.md

SDK	Manual
Handles WebSocket complexity	Full protocol control
Automatic reconnection	DIY reconnection
Less code	More code
Best for most use cases	Best for custom requirements
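For the manual path, "DIY reconnection" usually means a retry loop with backoff. The sketch below is one common pattern (exponential backoff with a cap), not something the RTMS protocol prescribes; `connect` is a caller-supplied async function that opens the signaling and media sockets, and all names and default values are illustrative assumptions.

```javascript
// Exponential backoff with a cap: 1s, 2s, 4s, ... up to maxMs.
function backoffDelay(attempt, { baseMs = 1000, maxMs = 30000 } = {}) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Default sleep; injectable so the loop can be tested without real waiting.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls connect(attempt) until it resolves or maxAttempts is reached.
// Rethrows the last error when every attempt fails.
async function connectWithRetry(connect, { maxAttempts = 5, sleep = defaultSleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    try {
      return await connect(attempt);
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt));
    }
  }
  throw lastError;
}
```

Retrying only makes sense while the stream is still active, so a real implementation would stop the loop once the matching stop webhook arrives, and it would keep the total retry time in mind relative to the keep-alive timeouts.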


3. Critical Gotchas (MOST COMMON ISSUES)


troubleshooting/common-issues.md

Respond 200 immediately - Delayed webhook responses cause duplicates
Only 1 connection per stream - New connections kick out existing
Heartbeat required - Must respond to keep-alive or connection dies
Track active sessions - Prevent duplicate join attempts


Key Learnings


Critical Discoveries:


Two-Phase WebSocket Design
- Signaling: Control plane (handshake, heartbeat, start/stop)
- Media: Data plane (audio, video, transcript, chat, share)
- See: Connection Architecture
Webhook Response Timing
- MUST respond 200 BEFORE any processing
- Delayed response -> Zoom retries -> duplicate connections
- See: Common Issues
Heartbeat is Mandatory
- Signaling: Receive msg_type 12, respond with msg_type 13
- Media: Same pattern
- Failure to respond = connection closed
- See: Connection
Signature Generation
- Format: `HMAC-SHA256(clientSecret, "clientId,meetingUuid,streamId")`
- For Video SDK, use `session_id` in place of `meetingUuid`
- Webinars still use `meeting_uuid` (not `webinar_uuid`)
- Required for both signaling and media handshakes
- See: Manual WebSocket
Media Types are Bitmasks
- Audio=1, Video=2, Share=4, Transcript=8, Chat=16, All=32
- Combine with OR: Audio+Transcript = 1|8 = 9
- See: Media Types
Screen Share is SEPARATE from Video
- Different msg_type (16 vs 15)
- Different media flag (4 vs 2)
- Must subscribe separately
- See: Media Types
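The heartbeat rule above (receive msg_type 12, respond with msg_type 13, on both sockets) can be sketched as a single handler. The message codes come from the list; echoing the request's `timestamp` field back in the response is an assumption about the response body, and `answerKeepAlive` is a hypothetical name.

```javascript
const KEEP_ALIVE_REQ = 12;
const KEEP_ALIVE_RESP = 13;

// Answers a keep-alive request on either socket. Returns true when the
// message was a keep-alive, so callers can skip further handling.
// ASSUMPTION: the response echoes the request's timestamp field.
function answerKeepAlive(ws, raw) {
  let msg;
  try {
    msg = JSON.parse(raw.toString());
  } catch {
    return false; // not a JSON control message
  }
  if (!msg || msg.msg_type !== KEEP_ALIVE_REQ) return false;
  ws.send(JSON.stringify({ msg_type: KEEP_ALIVE_RESP, timestamp: msg.timestamp }));
  return true;
}
```

Wired into both sockets, this looks like `ws.on('message', (raw) => { if (answerKeepAlive(ws, raw)) return; /* handle other messages */ })`. Answering inside the message handler, before any slow processing, keeps the reply well within the keep-alive timeouts mentioned earlier.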


Quick Reference

"Connection fails" -> Common Issues
"Duplicate connections" -> Webhook timing
"No audio/video data" -> Media Types - Check configuration
"How do I implement manually?" -> Manual WebSocket
"What message types exist?" -> Data Types
"How do I integrate AI?" -> AI Integration

Document Version


Based on Zoom RTMS SDK v1.x and official documentation as of 2026.

Happy coding!

Remember: Start with SDK Quickstart for the fastest path, or Manual WebSocket if you need full control.
