vss-deploy-video-embedding

VSS Video Embedding (RT-Embed)

Use this skill when you need to:

Deploy the VSS Video Embedding microservice from a Docker Compose file.
Generate text or video embeddings against the Cosmos-Embed1-448p model.
Embed an uploaded file, an HTTP/S3/file/data URL, or a live RTSP stream.
Wire the service into a VSS deployment alongside Redis, Kafka, and OpenTelemetry.
Triage readiness, model-download, GPU, or stream-reconnection failures.

Trigger phrases:

vss-deploy-video-embedding

RT-Embed

rtvi-embed

video embedding service

Cosmos-Embed1

embed live stream

embed video file

generate video embeddings

text embedding for video search

Service Snapshot

VSS 3.2 GA skill: `vss-deploy-video-embedding`.
Legacy 3.1 name: RT-Embed.
Compose service: `rtvi-embed`.
Container name: `vss-rtvi-embed`.
Image: `nvcr.io/nvstaging/vss-core/vss-rt-embed` (override with `RTVI_EMBED_IMAGE`).
Default tag: `3.2.0-26.05.4` (override with `RTVI_EMBED_TAG`).
Profile: `bp_developer_search_2d`.
Container port: `8000` (host-side `${RTVI_EMBED_PORT}`).
Default model: `cosmos-embed1-448p` from `nvidia/Cosmos-Embed1-448p`.
Health endpoint: `GET /v1/ready`.
Healthcheck startup grace: `1200s` (20 minutes) on first boot.

Prerequisites

Before bringing the service up:

NVIDIA driver and NVIDIA Container Toolkit installed; default runtime set to `nvidia`.
Docker Engine and Docker Compose plugin recent enough to support `${VAR:+value}` conditional volume substitution.
`docker login nvcr.io` completed with `$oauthtoken` as the username and a valid NGC API key as the password.
Host environment provides at minimum `RTVI_EMBED_PORT`, `VSS_DATA_DIR`, and `NGC_API_KEY`, and optionally `HF_TOKEN` to avoid Hugging Face 429 rate-limit errors during the Cosmos-Embed1 weights download.
Free disk space for persistent caches: `rtvi-hf-cache`, `rtvi-ngc-model-cache`, `rtvi-triton-model-repo` (multi-GB).

See `references/deploy-vss-deploy-video-embedding.md` for the full prerequisite list and `references/environment.md` for the variable matrix.
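The environment requirements above can be checked before startup. The following is a minimal sketch, not part of the shipped tooling: `missing_env` is a hypothetical helper that prints each listed variable that is unset or empty.

```shell
# Hypothetical preflight helper (not shipped with the service).
# Prints the name of every listed variable that is unset or empty.
missing_env() {
  for name in "$@"; do
    # Indirect lookup in POSIX sh: read the value of the variable named by $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Example: list the required variables from this section that still need a value.
# HF_TOKEN is optional, so it is deliberately not checked.
# missing_env RTVI_EMBED_PORT VSS_DATA_DIR NGC_API_KEY
```

An empty result means the three required variables are present; it says nothing about whether the NGC key itself is valid.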

Deploy

For standalone RT-Embed, work from the service directory:

bash

cd "{{repo_root}}/deploy/docker/services/rtvi/rtvi-embed"

Do not use `/vss-deploy-profile` or `scripts/dev-profile.sh` for this standalone deployment.

Set a minimal standalone environment before `docker compose up`:

bash

export RTVI_EMBED_PORT=8017
export VSS_DATA_DIR="${VSS_DATA_DIR:-$(pwd)/.standalone-data}"
export NGC_API_KEY="<your-ngc-api-key>"
export HOST_IP="$(hostname -I | awk '{print $1}')"
export HF_TOKEN="${HF_TOKEN:-}"  # optional, but recommended to avoid HF 429s
mkdir -p "${VSS_DATA_DIR}/data_log/vst/clip_storage"
export RTVI_EMBED_KAFKA_ENABLED=false
export ENABLE_REDIS_ERROR_MESSAGES=false

This avoids mounting `/data_log/vst/clip_storage` from the filesystem root when `VSS_DATA_DIR` is unset, and prevents startup stalls from missing Kafka/Redis peers in standalone mode.
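To see why the default matters, compare how the bind-mount source path expands with and without a data directory. This sketch uses plain shell parameter expansion to illustrate the point; `clip_storage_source` is a hypothetical helper, and it assumes the Compose file builds the path as `${VSS_DATA_DIR}/data_log/vst/clip_storage`.

```shell
# Hypothetical helper: shows the path that results from a given data directory value.
clip_storage_source() {
  printf '%s\n' "${1}/data_log/vst/clip_storage"
}

# An empty data directory collapses to a path under the filesystem root.
clip_storage_source ""                      # prints /data_log/vst/clip_storage

# With the same :- fallback used above, the path stays under the working directory.
demo_dir=""
demo_dir="${demo_dir:-$(pwd)/.standalone-data}"
clip_storage_source "$demo_dir"
```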

bash

# Bring up the service under the required Compose profile.
docker compose -f rtvi-embed-docker-compose.yml \
  --profile bp_developer_search_2d up -d rtvi-embed

# Watch logs while the model downloads and Triton repo builds.
docker compose -f rtvi-embed-docker-compose.yml logs -f rtvi-embed

First-boot startup can take around 20 minutes while Cosmos-Embed1 downloads and the Triton model repository builds. Do not shorten the `start_period: 1200s` healthcheck during the first boot, or the container will be marked unhealthy while it is still warming up.
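Rather than watching logs by hand, a script can poll `/v1/ready` until it succeeds. A minimal sketch: `wait_for_ready` is a hypothetical helper, and its 1200-second default simply mirrors the healthcheck grace period above.

```shell
# Hypothetical helper: poll a readiness URL until it answers with a 2xx status.
# Usage: wait_for_ready URL [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for_ready() {
  url=$1
  timeout=${2:-1200}
  interval=${3:-10}
  elapsed=0
  while :; do
    # curl -f exits non-zero on HTTP errors such as 503.
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${elapsed}s waiting for $url" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Example:
# wait_for_ready "http://localhost:${RTVI_EMBED_PORT}/v1/ready" 1200 10
```

The elapsed counter tracks sleep time only, so the real wall-clock wait is slightly longer than the timeout.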

Verify

bash

BASE_URL="http://localhost:${RTVI_EMBED_PORT}"

curl -fsS "$BASE_URL/v1/ready"               # 200 when warm.
curl -fsS "$BASE_URL/v1/ready?detailed=true" # Component-level status.
curl -fsS "$BASE_URL/v1/version"
MODELS_JSON=$(curl -fsS "$BASE_URL/v1/models")
echo "$MODELS_JSON"                          # Confirms cosmos-embed1-448p is loaded.

MODEL_ID="$(echo "$MODELS_JSON" | jq -r '.data[0].id // empty')"
test -n "$MODEL_ID" || { echo "ERROR: /v1/models has no model id — wait until /v1/ready is 200" >&2; exit 1; }

The sections below that call the API reuse `$BASE_URL` and `$MODEL_ID` from this block.

Common Operations

Generate video embeddings from an uploaded file

bash

FILE_ID=$(curl -fsS -X POST "$BASE_URL/v1/files" \
  -F purpose=vision \
  -F media_type=video \
  -F file=@/path/to/clip.mp4 | jq -r .id)

curl -fsS -X POST "$BASE_URL/v1/generate_video_embeddings" \
  -H "Content-Type: application/json" \
  -d "{
    \"id\": \"$FILE_ID\",
    \"model\": \"$MODEL_ID\",
    \"chunk_duration\": 60,
    \"chunk_overlap_duration\": 10
  }"
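How `chunk_duration` and `chunk_overlap_duration` interact is not spelled out here. Assuming the conventional sliding-window reading, where each chunk starts `chunk_duration - chunk_overlap_duration` seconds after the previous one, the following hypothetical helper lists the windows for a clip of a given length. Treat it as an illustration of that assumption, not as the service's documented behavior.

```shell
# Hypothetical illustration: sliding windows with overlap.
# Usage: chunk_windows TOTAL_SECONDS CHUNK_DURATION CHUNK_OVERLAP
chunk_windows() {
  total=$1
  duration=$2
  overlap=$3
  stride=$((duration - overlap))
  # A non-positive stride would never advance through the clip.
  if [ "$stride" -le 0 ]; then
    echo "overlap must be smaller than chunk duration" >&2
    return 1
  fi
  start=0
  while [ "$start" -lt "$total" ]; do
    end=$((start + duration))
    if [ "$end" -gt "$total" ]; then end=$total; fi
    printf '%s-%s\n' "$start" "$end"
    if [ "$end" -ge "$total" ]; then break; fi
    start=$((start + stride))
  done
}

# A 150-second clip with the values used above (60 s chunks, 10 s overlap):
chunk_windows 150 60 10
# 0-60
# 50-110
# 100-150
```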

Generate text embeddings (for text-to-video search)

bash

curl -fsS -X POST "$BASE_URL/v1/generate_text_embeddings" \
  -H "Content-Type: application/json" \
  -d "{\"text_input\":\"a forklift moving pallets\",\"model\":\"${MODEL_ID}\"}"
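The inline JSON above works for simple queries but breaks if the text contains a double quote or a backslash. A minimal sketch of one way to escape the query first: `json_escape` and `text_embed_payload` are hypothetical helpers, and they handle only quotes and backslashes, not newlines or other control characters.

```shell
# Hypothetical helper: escape backslashes and double quotes for use inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Hypothetical helper: build the request body used by /v1/generate_text_embeddings above.
text_embed_payload() {
  printf '{"text_input":"%s","model":"%s"}\n' "$(json_escape "$1")" "$(json_escape "$2")"
}

# Example (requires $BASE_URL and $MODEL_ID from the Verify block):
# curl -fsS -X POST "$BASE_URL/v1/generate_text_embeddings" \
#   -H "Content-Type: application/json" \
#   -d "$(text_embed_payload 'a "red" forklift' "$MODEL_ID")"
```

If `jq` is already installed, as the Verify block assumes, building the body with it is the more robust choice for arbitrary input.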

Embed a live RTSP stream

Live streams require `stream: true` and `chunk_duration > 0`. A synchronous call returns `400 BadParameters: "Only streaming output is supported for live-streams"`, and the `chunk_duration: 0` returned by `streams/add` is a placeholder: it must be overridden on the embed request or you get `400 BadParameter: "chunk_duration must be greater than 0"`.

`POST /v1/streams/add` does not deduplicate by `liveStreamUrl`: submitting the same URL twice mints two distinct `stream_id`s. Before adding, call `GET /v1/streams/get-stream-info` and reuse any existing registration for that URL to avoid orphaned entries.

bash

STREAM_ID=$(curl -fsS -X POST "$BASE_URL/v1/streams/add" \
  -H "Content-Type: application/json" \
  -d '{"streams":[{"liveStreamUrl":"rtsp://host:port/live/video","description":"camera-001"}]}' \
  | jq -r '.results[0].id')

curl -N -X POST "$BASE_URL/v1/generate_video_embeddings" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d "{
    \"id\": \"$STREAM_ID\",
    \"model\": \"$MODEL_ID\",
    \"stream\": true,
    \"chunk_duration\": 10,
    \"chunk_overlap_duration\": 2
  }"
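The two live-stream rules (reuse an existing registration, and always send a positive `chunk_duration` with `stream: true`) can be enforced in a small wrapper. A minimal sketch with hypothetical helpers: `stream_url_registered` does a plain substring search for the URL in the raw `get-stream-info` response, because the response schema is not shown here, and `live_embed_payload` refuses to build a request body with a non-positive chunk duration.

```shell
# Returns 0 if the URL appears anywhere in the raw response text.
# Assumption: the response echoes liveStreamUrl verbatim (no escaped slashes).
# Substring matching also matches longer URLs that share the same prefix.
stream_url_registered() {
  printf '%s' "$1" | grep -F -q -- "$2"
}

# Builds the streaming request body; fails if chunk_duration is not positive.
live_embed_payload() {
  stream_id=$1
  model_id=$2
  chunk=$3
  overlap=$4
  if [ "$chunk" -le 0 ]; then
    echo "chunk_duration must be greater than 0" >&2
    return 1
  fi
  printf '{"id":"%s","model":"%s","stream":true,"chunk_duration":%d,"chunk_overlap_duration":%d}\n' \
    "$stream_id" "$model_id" "$chunk" "$overlap"
}

# Example (requires $BASE_URL, $MODEL_ID, and $STREAM_ID from the blocks above):
# STREAMS=$(curl -fsS "$BASE_URL/v1/streams/get-stream-info")
# if stream_url_registered "$STREAMS" "rtsp://host:port/live/video"; then
#   echo "already registered; reuse the existing stream_id" >&2
# fi
# curl -N -X POST "$BASE_URL/v1/generate_video_embeddings" \
#   -H "Content-Type: application/json" -H "Accept: text/event-stream" \
#   -d "$(live_embed_payload "$STREAM_ID" "$MODEL_ID" 10 2)"
```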

bash

# List registered live streams (use this to recover stream_ids across sessions).
curl -fsS "$BASE_URL/v1/streams/get-stream-info"

# Stop embedding for the stream when done (terminates SSE with data: [DONE]).
curl -fsS -X DELETE "$BASE_URL/v1/generate_video_embeddings/$STREAM_ID"

See `references/rest-api.md` for the full endpoint catalog, SSE streaming, and single-stream control-plane patterns.
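On the client side, the SSE response arrives as `data:` lines and ends with `data: [DONE]`. The following minimal sketch filters such a stream down to its payloads; `sse_data_lines` is a hypothetical helper, and it makes no assumption about the shape of each payload.

```shell
# Hypothetical helper: read an SSE stream on stdin, print each data payload,
# and stop at the terminating "data: [DONE]" event.
sse_data_lines() {
  cr=$(printf '\r')
  while IFS= read -r line; do
    line=${line%"$cr"}                 # tolerate CRLF line endings
    case $line in
      'data: [DONE]') return 0 ;;
      'data: '*) printf '%s\n' "${line#data: }" ;;
    esac
  done
}

# Example (requires $BASE_URL, $MODEL_ID, and $STREAM_ID from the blocks above):
# curl -N -X POST "$BASE_URL/v1/generate_video_embeddings" \
#   -H "Content-Type: application/json" -H "Accept: text/event-stream" \
#   -d "{\"id\":\"$STREAM_ID\",\"model\":\"$MODEL_ID\",\"stream\":true,\"chunk_duration\":10}" \
#   | sse_data_lines
```

Blank separator lines and any non-`data:` fields are skipped, so each output line is one payload.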

Logs, Metrics, And Status

bash

docker compose -f rtvi-embed-docker-compose.yml ps
docker compose -f rtvi-embed-docker-compose.yml logs -f rtvi-embed
docker stats vss-rtvi-embed

curl -fsS "$BASE_URL/v1/metrics"          # Prometheus.
curl -fsS "$BASE_URL/v1/assets/stats"     # Asset storage counts and TTL.

If `RTVI_EMBED_LOG_DIR` is bound to a host directory, log files are also available at `/opt/nvidia/rtvi/log/rtvi/` on the host.
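Since `/v1/metrics` is a Prometheus endpoint, its text output can be filtered for a single series with standard tools. A minimal sketch: `metric_samples` is a hypothetical helper, and `requests_total` below is a placeholder name, not a metric this service is known to export.

```shell
# Hypothetical helper: print the sample lines for one metric name from
# Prometheus text-format input on stdin, skipping # HELP / # TYPE comments.
metric_samples() {
  awk -v name="$1" '
    /^#/ { next }
    {
      metric = $1
      sub(/[{].*/, "", metric)   # drop the {label="..."} part, if any
      if (metric == name) print $0
    }'
}

# Example (placeholder metric name):
# curl -fsS "$BASE_URL/v1/metrics" | metric_samples requests_total
```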

Integration Surface

Inputs: REST API on `:${RTVI_EMBED_PORT}` (`POST /v1/files`, `POST /v1/generate_text_embeddings`, `POST /v1/generate_video_embeddings`, live-stream control endpoints).

Outputs: Synchronous REST responses, optional SSE for chunked video embeddings, and optional Kafka messages on the topics named by `RTVI_EMBED_KAFKA_TOPIC` (container `KAFKA_TOPIC`) and `RTVI_EMBED_ERROR_MESSAGE_TOPIC` (container `ERROR_MESSAGE_TOPIC`) when Kafka is enabled (host: `RTVI_EMBED_KAFKA_ENABLED=true`, which Compose maps to container `KAFKA_ENABLED`).

Optional peers: Redis (`ENABLE_REDIS_ERROR_MESSAGES=true`), Kafka (host: `RTVI_EMBED_KAFKA_ENABLED=true` → container `KAFKA_ENABLED`), OpenTelemetry collector (host: `RTVI_EMBED_ENABLE_OTEL_MONITORING=true` → container `ENABLE_OTEL_MONITORING`).

`references/integrate-vss-deploy-video-embedding.md` documents the full integration contract.
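The host-to-container renames can be captured in a lookup, which is handy when checking values inside the running container. A minimal sketch: `container_var_name` is a hypothetical helper covering only the four renames named in this section; `references/environment.md` remains the full matrix.

```shell
# Hypothetical helper: map a host-side variable name to its container-side name.
container_var_name() {
  case $1 in
    RTVI_EMBED_KAFKA_ENABLED)          echo KAFKA_ENABLED ;;
    RTVI_EMBED_KAFKA_TOPIC)            echo KAFKA_TOPIC ;;
    RTVI_EMBED_ERROR_MESSAGE_TOPIC)    echo ERROR_MESSAGE_TOPIC ;;
    RTVI_EMBED_ENABLE_OTEL_MONITORING) echo ENABLE_OTEL_MONITORING ;;
    *) echo "no rename known for $1" >&2; return 1 ;;
  esac
}

# Example: confirm what the container actually received.
# docker exec vss-rtvi-embed printenv "$(container_var_name RTVI_EMBED_KAFKA_ENABLED)"
```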

Troubleshooting

For common failure patterns and resolutions, see `references/troubleshooting.md`. Frequent issues:

`/v1/ready` stuck at 503 → check for a missing `NGC_API_KEY`, Hugging Face 429 rate-limit failures during the first-boot model download (set `HF_TOKEN` to avoid), or unreachable Redis/Kafka peers when those flags are enabled.
Healthcheck flipping unhealthy in the first 20 minutes → restore `start_period: 1200s`.
Permission errors on bind-mounted cache directories → `chown -R 1001:1001` on the host paths.

Upgrade And Rollback

Update `RTVI_EMBED_IMAGE` and `RTVI_EMBED_TAG` to the target build.
Run `docker compose -f rtvi-embed-docker-compose.yml pull rtvi-embed`.
Run `docker compose -f rtvi-embed-docker-compose.yml --profile bp_developer_search_2d up -d rtvi-embed`.
Watch `/v1/ready` until it returns 200.
To roll back, re-pin `RTVI_EMBED_TAG` to the previous build and repeat. Named volumes persist across the swap.
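The upgrade and rollback steps can be wrapped so the previous tag is remembered automatically. A minimal sketch under stated assumptions: the three functions and the `PREVIOUS_RTVI_EMBED_TAG` variable are hypothetical, the previous tag is kept only in the current shell session, and `RTVI_EMBED_IMAGE` is left unchanged.

```shell
# Hypothetical helper: pin a tag, then pull and restart the service with the
# same Compose commands used in this document.
deploy_rtvi_embed_tag() {
  RTVI_EMBED_TAG=$1
  export RTVI_EMBED_TAG
  docker compose -f rtvi-embed-docker-compose.yml pull rtvi-embed &&
    docker compose -f rtvi-embed-docker-compose.yml \
      --profile bp_developer_search_2d up -d rtvi-embed
}

upgrade_rtvi_embed() {
  # Remember the tag in use so rollback has something to re-pin.
  PREVIOUS_RTVI_EMBED_TAG=${RTVI_EMBED_TAG:-}
  deploy_rtvi_embed_tag "$1"
}

rollback_rtvi_embed() {
  if [ -z "${PREVIOUS_RTVI_EMBED_TAG:-}" ]; then
    echo "no previous tag recorded in this shell" >&2
    return 1
  fi
  deploy_rtvi_embed_tag "$PREVIOUS_RTVI_EMBED_TAG"
}

# Example (the target tag is a placeholder):
# upgrade_rtvi_embed "<target-tag>"
# rollback_rtvi_embed
```

After either call, watch `/v1/ready` until it returns 200 before sending traffic.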

Tear Down

bash

# Preserve caches (named volumes survive).
docker compose -f rtvi-embed-docker-compose.yml down

# WARNING: removes rtvi-hf-cache, rtvi-ngc-model-cache, rtvi-triton-model-repo.
# Next start will re-download the model and rebuild the Triton repo (20+ min).
docker compose -f rtvi-embed-docker-compose.yml down -v

References

File	When to read
references/README.md	Table of contents for all reference files.
references/deploy-vss-deploy-video-embedding.md	Build Vision Agent deployment reference: image, GPU, storage, startup, prerequisites, known issues.
references/integrate-vss-deploy-video-embedding.md	Build Vision Agent integration reference: peers, inputs/outputs, env vars, network, example Compose snippet.
references/rest-api.md	Full REST endpoint catalog with worked `curl` examples for file uploads, video/text embeddings, live streams, and health/metrics.
references/environment.md	Complete environment-variable matrix, including host-to-container renames and secret-sensitive variables.
references/troubleshooting.md	Operational diagnostics for startup, model/cache, runtime, and observability issues.
