vss-deploy-video-embedding

VSS Video Embedding (RT-Embed)

Use this skill when you need to:
  • Deploy the VSS Video Embedding microservice from a Docker Compose file.
  • Generate text or video embeddings against the Cosmos-Embed1-448p model.
  • Embed an uploaded file, an HTTP/S3/file/data URL, or a live RTSP stream.
  • Wire the service into a VSS deployment alongside Redis, Kafka, and OpenTelemetry.
  • Triage readiness, model-download, GPU, or stream-reconnection failures.
Trigger phrases: `vss-deploy-video-embedding`, `RT-Embed`, `rtvi-embed`, `video embedding service`, `Cosmos-Embed1`, `embed live stream`, `embed video file`, `generate video embeddings`, `text embedding for video search`.

Service Snapshot

  • VSS 3.2 GA skill: `vss-deploy-video-embedding`.
  • Legacy 3.1 name: RT-Embed.
  • Compose service: `rtvi-embed`.
  • Container name: `vss-rtvi-embed`.
  • Image: `nvcr.io/nvstaging/vss-core/vss-rt-embed` (override with `RTVI_EMBED_IMAGE`).
  • Default tag: `3.2.0-26.05.4` (override with `RTVI_EMBED_TAG`).
  • Profile: `bp_developer_search_2d`.
  • Container port: `8000` (host-side `${RTVI_EMBED_PORT}`).
  • Default model: `cosmos-embed1-448p` from `nvidia/Cosmos-Embed1-448p`.
  • Health endpoint: `GET /v1/ready`.
  • Healthcheck startup grace: `1200s` (20 minutes) on first boot.

Prerequisites

Before bringing the service up:
  1. NVIDIA driver + NVIDIA Container Toolkit installed; default runtime set to `nvidia`.
  2. Docker Engine and Docker Compose plugin recent enough to support `${VAR:+value}` conditional volume substitution.
  3. `docker login nvcr.io` completed with the username `$oauthtoken` and a valid NGC API key as the password.
  4. Host environment provides at minimum: `RTVI_EMBED_PORT`, `VSS_DATA_DIR`, `NGC_API_KEY`, and optionally `HF_TOKEN` to avoid Hugging Face 429 rate-limit errors during the Cosmos-Embed1 weights download.
  5. Free disk space for persistent caches: `rtvi-hf-cache`, `rtvi-ngc-model-cache`, `rtvi-triton-model-repo` (multi-GB).
See `references/deploy-vss-deploy-video-embedding.md` for the full prerequisite list and `references/environment.md` for the variable matrix.
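The environment requirement in item 4 can be checked before any container is started. The following is a minimal sketch, not part of the skill itself: `missing_env` is a hypothetical helper name, and the only facts taken from the prerequisites are the three required variable names and the optional `HF_TOKEN`.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every listed variable that is unset
# or empty. Returns 0 when all are present, 1 otherwise.
missing_env() {
  status=0
  for name in "$@"; do
    # Indirect lookup that works in POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}

# The three variables the prerequisites list as the minimum.
if missing="$(missing_env RTVI_EMBED_PORT VSS_DATA_DIR NGC_API_KEY)"; then
  echo "required variables are set"
else
  echo "missing: $(echo "$missing" | tr '\n' ' ')" >&2
fi

# HF_TOKEN is optional, so only warn.
[ -n "${HF_TOKEN:-}" ] || echo "note: HF_TOKEN is unset; the weights download may hit 429s" >&2
```

Running this in the same shell that will run `docker compose up` catches a missing export early, rather than after a long first-boot wait.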

Deploy

For standalone RT-Embed, work from the service directory:
bash
cd "{{repo_root}}/deploy/docker/services/rtvi/rtvi-embed"
Do not use `/vss-deploy-profile` or `scripts/dev-profile.sh` for this standalone deployment.
Set a minimal standalone environment before `docker compose up`:
bash
export RTVI_EMBED_PORT=8017
export VSS_DATA_DIR="${VSS_DATA_DIR:-$(pwd)/.standalone-data}"
export NGC_API_KEY="<your-ngc-api-key>"
export HOST_IP="$(hostname -I | awk '{print $1}')"
export HF_TOKEN="${HF_TOKEN:-}"  # optional, but recommended to avoid HF 429s
mkdir -p "${VSS_DATA_DIR}/data_log/vst/clip_storage"
export RTVI_EMBED_KAFKA_ENABLED=false
export ENABLE_REDIS_ERROR_MESSAGES=false
This avoids mounting `/data_log/vst/clip_storage` from the filesystem root when `VSS_DATA_DIR` is unset, and prevents startup stalls from missing Kafka/Redis peers in standalone mode.
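The root-mount problem comes down to plain variable expansion. The sketch below assumes the Compose file builds the bind-mount source as `${VSS_DATA_DIR}/data_log/vst/clip_storage`; the text implies this but does not show the volume line, and `clip_storage_source` is a hypothetical helper used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: expand the clip-storage bind-mount source for a given
# VSS_DATA_DIR value, the way an unquoted prefix expansion would.
clip_storage_source() {
  data_dir="$1"
  echo "${data_dir}/data_log/vst/clip_storage"
}

# Unset or empty VSS_DATA_DIR: the path collapses to the filesystem root.
clip_storage_source ""            # prints /data_log/vst/clip_storage

# With the standalone default applied first, the path stays under the
# current directory instead.
VSS_DATA_DIR_DEMO=""
clip_storage_source "${VSS_DATA_DIR_DEMO:-$(pwd)/.standalone-data}"
```

This is why the export uses `${VSS_DATA_DIR:-...}`: an existing value is respected, and an empty one never reaches Compose.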
bash
# Bring up the service under the required Compose profile.
docker compose -f rtvi-embed-docker-compose.yml \
  --profile bp_developer_search_2d up -d rtvi-embed

# Watch logs while the model downloads and Triton repo builds.
docker compose -f rtvi-embed-docker-compose.yml logs -f rtvi-embed
First-boot startup may take 20 minutes for the Cosmos-Embed1 download and Triton model repository build. Do not shorten the `start_period: 1200s` healthcheck during the first boot, or the container will be marked unhealthy while still warming up.
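Instead of watching logs by hand during the first boot, the readiness endpoint can be polled with a time budget that matches the healthcheck grace period. This is a sketch under stated assumptions: `wait_until` and `ready_probe` are hypothetical helper names, and the 15-second interval is an arbitrary choice; only `/v1/ready`, `RTVI_EMBED_PORT`, and the 1200s figure come from the text.

```shell
#!/bin/sh
# Hypothetical helper: retry a probe command until it succeeds or the time
# budget runs out. Elapsed time counts sleeps only, so it is approximate.
# usage: wait_until <budget_seconds> <interval_seconds> <command> [args...]
wait_until() {
  budget="$1"; interval="$2"; shift 2
  elapsed=0
  while ! "$@"; do
    if [ "$elapsed" -ge "$budget" ]; then
      echo "gave up after ${elapsed}s" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "ready after ~${elapsed}s"
}

# Probe for the readiness endpoint; -f makes curl fail on non-2xx responses.
ready_probe() {
  curl -fsS -o /dev/null "http://localhost:${RTVI_EMBED_PORT}/v1/ready"
}

# Example (not run here): match the 1200s healthcheck grace period.
# wait_until 1200 15 ready_probe
```

Passing the probe as a command keeps the loop reusable, for example after an upgrade, and lets it be exercised with a stub instead of a live service.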

Verify

bash
BASE_URL="http://localhost:${RTVI_EMBED_PORT}"

curl -fsS "$BASE_URL/v1/ready"               # 200 when warm.
curl -fsS "$BASE_URL/v1/ready?detailed=true" # Component-level status.
curl -fsS "$BASE_URL/v1/version"
MODELS_JSON=$(curl -fsS "$BASE_URL/v1/models")
echo "$MODELS_JSON"                          # Confirms cosmos-embed1-448p is loaded.

MODEL_ID="$(echo "$MODELS_JSON" | jq -r '.data[0].id // empty')"
test -n "$MODEL_ID" || { echo "ERROR: /v1/models has no model id — wait until /v1/ready is 200" >&2; exit 1; }
The sections below that call the API reuse `$BASE_URL` and `$MODEL_ID` from this block.

Common Operations

Generate video embeddings from an uploaded file

bash
FILE_ID=$(curl -fsS -X POST "$BASE_URL/v1/files" \
  -F purpose=vision \
  -F media_type=video \
  -F file=@/path/to/clip.mp4 | jq -r .id)

curl -fsS -X POST "$BASE_URL/v1/generate_video_embeddings" \
  -H "Content-Type: application/json" \
  -d "{
    \"id\": \"$FILE_ID\",
    \"model\": \"$MODEL_ID\",
    \"chunk_duration\": 60,
    \"chunk_overlap_duration\": 10
  }"

Generate text embeddings (for text-to-video search)

bash
curl -fsS -X POST "$BASE_URL/v1/generate_text_embeddings" \
  -H "Content-Type: application/json" \
  -d "{\"text_input\":\"a forklift moving pallets\",\"model\":\"${MODEL_ID}\"}"

Embed a live RTSP stream

Live streams require `stream: true` and `chunk_duration > 0`. A synchronous call returns `400 BadParameters: "Only streaming output is supported for live-streams"`, and the `chunk_duration: 0` returned by `streams/add` is a placeholder: it must be overridden on the embed request or you get `400 BadParameter: "chunk_duration must be greater than 0"`.
`POST /v1/streams/add` does not deduplicate by `liveStreamUrl`: submitting the same URL twice mints two distinct `stream_id`s. Before adding, call `GET /v1/streams/get-stream-info` and reuse any existing registration for that URL to avoid orphaned entries.
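The two request constraints described here can be enforced client-side before the call is made. The following is a rough sketch: `live_embed_body` is a hypothetical helper, and it assumes the stream and model IDs contain no characters that need JSON escaping. The field names are the ones used in the requests in this section.

```shell
#!/bin/sh
# Hypothetical helper: build the JSON body for a live-stream embed request,
# refusing values the service is documented to reject with HTTP 400.
# usage: live_embed_body <stream_id> <model_id> <chunk_duration> <chunk_overlap_duration>
live_embed_body() {
  stream_id="$1"; model_id="$2"; chunk="$3"; overlap="$4"
  for n in "$chunk" "$overlap"; do
    case "$n" in
      ''|*[!0-9]*) echo "durations must be non-negative integers" >&2; return 1 ;;
    esac
  done
  if [ "$chunk" -le 0 ]; then
    echo "chunk_duration must be greater than 0 (streams/add returns a placeholder 0)" >&2
    return 1
  fi
  # "stream": true is hard-coded: live streams only support streaming output.
  # Assumption: IDs need no JSON escaping, so they are interpolated as-is.
  printf '{"id":"%s","model":"%s","stream":true,"chunk_duration":%d,"chunk_overlap_duration":%d}\n' \
    "$stream_id" "$model_id" "$chunk" "$overlap"
}

# Example (not run here):
# curl -N -X POST "$BASE_URL/v1/generate_video_embeddings" \
#   -H "Content-Type: application/json" -H "Accept: text/event-stream" \
#   -d "$(live_embed_body "$STREAM_ID" "$MODEL_ID" 10 2)"
```

Failing locally gives a clearer message than the 400 response, which matters most when `chunk_duration` was copied from the `streams/add` reply.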
bash
STREAM_ID=$(curl -fsS -X POST "$BASE_URL/v1/streams/add" \
  -H "Content-Type: application/json" \
  -d '{"streams":[{"liveStreamUrl":"rtsp://host:port/live/video","description":"camera-001"}]}' \
  | jq -r '.results[0].id')

curl -N -X POST "$BASE_URL/v1/generate_video_embeddings" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d "{
    \"id\": \"$STREAM_ID\",
    \"model\": \"$MODEL_ID\",
    \"stream\": true,
    \"chunk_duration\": 10,
    \"chunk_overlap_duration\": 2
  }"

# List registered live streams (use this to recover stream_ids across sessions).
curl -fsS "$BASE_URL/v1/streams/get-stream-info"

# Stop embedding for the stream when done (terminates SSE with data: [DONE]).
curl -fsS -X DELETE "$BASE_URL/v1/generate_video_embeddings/$STREAM_ID"
See `references/rest-api.md` for the full endpoint catalog, SSE streaming, and single-stream control-plane patterns.
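On the consuming side, the SSE output of the streaming call can be reduced to one payload per line and cut off at the `data: [DONE]` sentinel. This is only a sketch: `read_sse_data` is a hypothetical helper, and it treats each payload as an opaque string because the event schema is not described here (see `references/rest-api.md`).

```shell
#!/bin/sh
# Hypothetical helper: read an SSE stream on stdin, print each "data:" payload
# on its own line, and stop at the "data: [DONE]" sentinel.
read_sse_data() {
  while IFS= read -r line; do
    # Tolerate CRLF line endings, which SSE permits.
    line="$(printf '%s' "$line" | tr -d '\r')"
    case "$line" in
      "data: [DONE]") return 0 ;;
      "data: "*)      printf '%s\n' "${line#data: }" ;;
      *)              : ;;   # blank separators, comments, other fields
    esac
  done
  echo "stream ended without [DONE]" >&2
  return 1
}

# Example (not run here): pipe the streaming request into the reader.
# curl -N -X POST "$BASE_URL/v1/generate_video_embeddings" ... | read_sse_data
```

A non-zero return distinguishes a dropped connection from a stream that was stopped deliberately with the `DELETE` call.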

Logs, Metrics, And Status

bash
docker compose -f rtvi-embed-docker-compose.yml ps
docker compose -f rtvi-embed-docker-compose.yml logs -f rtvi-embed
docker stats vss-rtvi-embed

curl -fsS "$BASE_URL/v1/metrics"          # Prometheus.
curl -fsS "$BASE_URL/v1/assets/stats"     # Asset storage counts and TTL.
If `RTVI_EMBED_LOG_DIR` is bound to a host directory, log files are also available at `/opt/nvidia/rtvi/log/rtvi/` on the host.

Integration Surface

  • Inputs: REST API on `:${RTVI_EMBED_PORT}` (`POST /v1/files`, `POST /v1/generate_text_embeddings`, `POST /v1/generate_video_embeddings`, live-stream control endpoints).
  • Outputs: Synchronous REST responses, optional SSE for chunked video embeddings, optional Kafka messages on the topics named by `RTVI_EMBED_KAFKA_TOPIC` (container `KAFKA_TOPIC`) and `RTVI_EMBED_ERROR_MESSAGE_TOPIC` (container `ERROR_MESSAGE_TOPIC`) when Kafka is enabled (host: `RTVI_EMBED_KAFKA_ENABLED=true`, which Compose maps to container `KAFKA_ENABLED`).
  • Optional peers: Redis (`ENABLE_REDIS_ERROR_MESSAGES=true`), Kafka (host: `RTVI_EMBED_KAFKA_ENABLED=true` → container `KAFKA_ENABLED`), OpenTelemetry collector (host: `RTVI_EMBED_ENABLE_OTEL_MONITORING=true` → container `ENABLE_OTEL_MONITORING`).
`references/integrate-vss-deploy-video-embedding.md` documents the full integration contract.
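Because several flags are renamed between host and container, it is easy to look for the wrong name when inspecting a running container. The lookup below is a small illustrative sketch: `container_var_name` is a hypothetical helper that covers only the four renames listed above, and the pass-through default for other names is an assumption. `references/environment.md` remains the authoritative matrix.

```shell
#!/bin/sh
# Hypothetical helper: map a host-side variable name to the name the
# container sees. Only the renames listed in this section are included.
container_var_name() {
  case "$1" in
    RTVI_EMBED_KAFKA_ENABLED)          echo KAFKA_ENABLED ;;
    RTVI_EMBED_KAFKA_TOPIC)            echo KAFKA_TOPIC ;;
    RTVI_EMBED_ERROR_MESSAGE_TOPIC)    echo ERROR_MESSAGE_TOPIC ;;
    RTVI_EMBED_ENABLE_OTEL_MONITORING) echo ENABLE_OTEL_MONITORING ;;
    *)                                 echo "$1" ;;  # assumption: no rename
  esac
}

# Example (not run here): confirm what the running container actually received.
# docker exec vss-rtvi-embed env | grep "^$(container_var_name RTVI_EMBED_KAFKA_ENABLED)="
```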

Troubleshooting

For common failure patterns and resolutions, see `references/troubleshooting.md`. Frequent issues:
  • `/v1/ready` stuck at 503 → check for missing `NGC_API_KEY`, Hugging Face 429 rate-limit failures during the first-boot model download (set `HF_TOKEN` to avoid), or unreachable Redis/Kafka peers when those flags are enabled.
  • Healthcheck flipping unhealthy in the first 20 minutes → restore `start_period: 1200s`.
  • Permission errors on bind-mounted cache directories → `chown -R 1001:1001` on the host paths.

Upgrade And Rollback

  1. Update `RTVI_EMBED_IMAGE` and `RTVI_EMBED_TAG` to the target build.
  2. Run `docker compose -f rtvi-embed-docker-compose.yml pull rtvi-embed`.
  3. Run `docker compose -f rtvi-embed-docker-compose.yml --profile bp_developer_search_2d up -d rtvi-embed`.
  4. Watch `/v1/ready` until it returns 200.
  5. To roll back, re-pin `RTVI_EMBED_TAG` to the previous build and repeat steps 2 to 4. Named volumes persist across the swap.
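The upgrade and rollback steps can be chained so that a failed upgrade re-pins the previous tag automatically. This is a sketch under stated assumptions, not the skill's own tooling: `upgrade_or_rollback` is a hypothetical helper, and the deploy and readiness commands are supplied by the caller so the control flow can be tried with stubs first.

```shell
#!/bin/sh
# Hypothetical helper: try a new tag; if the deploy or readiness step fails,
# restore the previous RTVI_EMBED_TAG and redeploy.
# usage: upgrade_or_rollback <new_tag> <deploy_cmd> <ready_cmd>
upgrade_or_rollback() {
  new_tag="$1"; deploy_cmd="$2"; ready_cmd="$3"
  previous_tag="${RTVI_EMBED_TAG:-}"

  export RTVI_EMBED_TAG="$new_tag"
  if "$deploy_cmd" && "$ready_cmd"; then
    echo "now running $RTVI_EMBED_TAG"
    return 0
  fi

  echo "upgrade to $new_tag failed; rolling back" >&2
  if [ -n "$previous_tag" ]; then
    export RTVI_EMBED_TAG="$previous_tag"
  else
    unset RTVI_EMBED_TAG   # fall back to the Compose default tag
  fi
  "$deploy_cmd" && "$ready_cmd"
}

# Example wiring (not run here), using the commands from the steps above:
# deploy() {
#   docker compose -f rtvi-embed-docker-compose.yml pull rtvi-embed &&
#   docker compose -f rtvi-embed-docker-compose.yml --profile bp_developer_search_2d up -d rtvi-embed
# }
# ready() { curl -fsS -o /dev/null "http://localhost:${RTVI_EMBED_PORT}/v1/ready"; }
# upgrade_or_rollback "<target-tag>" deploy ready
```

In practice the readiness command should poll rather than probe once, since a new image may need time to warm up even when the named volumes already hold the model.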

Tear Down

bash
# Preserve caches (named volumes survive).
docker compose -f rtvi-embed-docker-compose.yml down

# WARNING: removes rtvi-hf-cache, rtvi-ngc-model-cache, rtvi-triton-model-repo.
# Next start will re-download the model and rebuild the Triton repo (20+ min).
docker compose -f rtvi-embed-docker-compose.yml down -v

References

| File | When to read |
| --- | --- |
| `references/README.md` | Table of contents for all reference files. |
| `references/deploy-vss-deploy-video-embedding.md` | Build Vision Agent deployment reference: image, GPU, storage, startup, prerequisites, known issues. |
| `references/integrate-vss-deploy-video-embedding.md` | Build Vision Agent integration reference: peers, inputs/outputs, env vars, network, example Compose snippet. |
| `references/rest-api.md` | Full REST endpoint catalog with worked `curl` examples for file uploads, video/text embeddings, live streams, and health/metrics. |
| `references/environment.md` | Complete environment-variable matrix, including host-to-container renames and secret-sensitive variables. |
| `references/troubleshooting.md` | Operational diagnostics for startup, model/cache, runtime, and observability issues. |