vss-deploy-profile
VSS Deploy
Purpose
Deploy any VSS profile (`base`, `search`, `lvs`, `warehouse`, `alerts`, `edge`) using a compose-centric workflow: build env overrides, generate resolved compose (dry-run), review, then deploy. This SKILL.md covers the cross-profile concerns (profile routing, prerequisites, NGC, GPU setup, and the deploy/teardown flow). Profile-specific service lists, sizing, env recipes, endpoints, and debugging live in per-profile reference docs; load the one that matches the user's intent.
Helper script: `run_script("scripts/normalize_resolved_yml.py", "<resolved.yml>")` normalizes a `docker compose config` dry-run dump for diff-friendly review during Step 3c. All other deployment work goes through `compose` / `dev-profile.sh`.
Available Scripts
| Script | Purpose | Arguments |
|---|---|---|
| `scripts/normalize_resolved_yml.py` | Strip optional `depends_on` | Path to `resolved.yml` |
Profile Routing
Match the user's request to a profile, then load that profile's reference for sizing, services, env recipes, and debugging.
| User says | Profile | Reference |
|---|---|---|
| "deploy vss" / "deploy base" | `base` | |
| "deploy alerts" / "alert verification" / "real-time alerts" / "deploy for incident report" | `alerts` | |
| "deploy lvs" / "video summarization" | `lvs` | |
| "deploy search" / "video search" | `search` | `references/search.md` |
| "deploy warehouse" / "warehouse blueprint" / "vss warehouse" | `warehouse` | |
| "debug warehouse" / "warehouse not working" / "warehouse FPS low" / "warehouse BEV out of sync" | `warehouse` | |
Edge hardware routing (DGX Spark, AGX/IGX Thor): see `references/edge.md`. DGX Spark uses the Spark Nano 9B standalone local LLM on port `30081`; AGX/IGX Thor uses the Edge 4B standalone vLLM fallback.
Each profile's reference owns its sizing table. Don't pick a deployment shape from this file; open the profile reference and check minimum GPU count for the host's hardware against the (mode × platform) matrix there.
Instructions
The deployment flow is always: copy `.env` to `generated.env`, apply overrides, dry-run compose into `resolved.yml`, review, normalize, deploy, then wait for readiness.
```
1. cp dev-profile-<profile>/.env dev-profile-<profile>/generated.env   (clean copy)
2. Apply env overrides to generated.env                                 (source .env stays untouched)
3. docker compose --env-file generated.env config > resolved.yml        (dry-run)
4. Review resolved.yml
5. docker compose --env-file generated.env -f resolved.yml up -d
```
The source `.env` is treated as **read-only defaults** committed to the repo. The skill's per-deploy working copy is `generated.env`, the same pattern `dev-profile.sh` uses internally. This keeps the checked-in `.env` clean across iterations.
Prerequisites
- Repo path: find `video-search-and-summarization/` on disk. Check `TOOLS.md` if available.
- NGC CLI & API key: see `references/ngc.md`. Confirm `$NGC_CLI_API_KEY` is set.
- System prerequisites (GPU driver, Docker, NVIDIA Container Toolkit, kernel sysctls): full checks in `references/prerequisites.md`. Canonical hardware/driver matrix is the VSS prerequisites page.
Pre-flight check
Run before every deploy. The full system checklist and remediation steps live in `references/prerequisites.md`. For DGX Spark / IGX Thor / AGX Thor, also run the cache-cleaner check in `references/edge.md`.
Detect sudo mode first. Several pre-flight remediations and the edge cache-cleaner installer call `sudo`. If the host requires a sudo password, those steps will silently no-op under `sudo -n` and leave the deploy in a half-prepared state.
```bash
if sudo -n true 2>/dev/null; then
  echo "passwordless sudo — pre-flight will auto-install missing pieces"
else
  echo "sudo requires password — pre-flight will NOT auto-install; hand commands to the user"
fi
```
When sudo needs a password, the skill must not run privileged installers itself. Surface the copy-pasteable command block from `references/prerequisites.md` to the user with a "run this once and confirm" handoff, then resume after the user replies.
Minimum smoke test (must succeed):
```bash
nvidia-smi --query-gpu=index,name --format=csv,noheader
docker info 2>/dev/null | grep -qi runtimes \
  && docker run --rm --gpus all ubuntu:22.04 nvidia-smi >/dev/null 2>&1 \
  && echo "nvidia runtime OK"
```
If the smoke test fails, do not proceed; open `references/prerequisites.md` for the remediation tree.
Model Selection
- `$LLM_REMOTE_URL` / `$VLM_REMOTE_URL` if the user asks for remote
- `$NGC_CLI_API_KEY` (local NIMs) or `$NVIDIA_API_KEY` (remote)
If no combination on this host satisfies the profile's sizing requirements, stop and report the blocker — don't silently pick another shape.
Edge shared mode is platform-specific. On DGX Spark, run `nvcr.io/nim/nvidia/nvidia-nemotron-nano-9b-v2-dgx-spark:1.0.0-variant` as a standalone local NIM on port `30081` and point the agent at it with `LLM_MODE=remote`. On AGX/IGX Thor, keep using the Edge 4B standalone vLLM fallback with `HF_TOKEN`. Full recipes are in `references/edge.md`.
Deployment Flow
Always follow this sequence. Never skip the dry-run.
Step 0 — Tear down any existing deployment + clear data volumes
If a deployment already exists, tear it down AND clear stale data volumes before redeploying.
Full procedure lives in `references/teardown.md`.
Step 0a — Credentials gate (run before any env mutation)
Validate every credential the chosen profile needs before Step 1c copies `.env` to `generated.env`. A 401 here is a 30-second failure; the same 401 inside a NIM cold-start is a 10–20 min failure. Run the discovery and probe flow in `references/credentials.md`, then map the result against the chosen mode: missing or invalid required credentials are blockers, optional credentials are not.
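As a rough sketch of how the gate's outcome could be turned into a go/no-go decision, the helpers below check only that a required key is present. The names `required_for_mode` and `missing_credentials` are hypothetical, the mode-to-key mapping is assumed from the Model Selection section, and the real discovery and probe flow remains the one in `references/credentials.md`.

```bash
# Assumed mapping (from Model Selection): local NIMs need the NGC key,
# remote endpoints need the NVIDIA API key.
required_for_mode() {
  case "$1" in
    local)  echo "NGC_CLI_API_KEY" ;;
    remote) echo "NVIDIA_API_KEY" ;;
    *)      echo "unknown mode: $1" >&2; return 2 ;;
  esac
}

# Hypothetical helper: print the name of every listed variable that is unset
# or empty in the exported environment. Exit status is 1 if any is missing.
# The body runs in a subshell "( ... )" so its variables do not leak.
missing_credentials() (
  missing=0
  for name in "$@"; do
    # printenv only sees exported variables and prints nothing when unset.
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
      missing=1
    fi
  done
  exit "$missing"
)
```

A possible call before Step 1c would be `missing_credentials $(required_for_mode remote) || echo "blocker: credential missing"`. Presence is not validity: a key that is set but rejected with a 401 still has to be caught by the probe flow.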
Step 1 — Gather context
Before building env overrides, confirm:
| Value | How to determine |
|---|---|
| Profile | Match user intent to the routing table above. Default: |
| Repo path | Find |
| Hardware | |
| LLM/VLM placement | Cross-reference available GPUs against the chosen profile's Minimum GPU count table |
| API keys | |
| |
| Browser-reachable host/IP. On Brev, use the secure-link domain (see |
| Browser-facing ingress port. Default |
Before `docker compose up`, verify `EXTERNAL_IP`, `HAPROXY_PORT`, `VSS_PUBLIC_HOST`, and `VSS_PUBLIC_PORT` are populated with browser-reachable values. Otherwise the stack may appear healthy while UI/API/VST links 404 or loop through Cloudflare Access.
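The "populated" half of that check can be automated; a minimal sketch follows, assuming the values live as plain `KEY=value` lines in the env file. The helper name `require_env_keys` is hypothetical, and it cannot tell whether a value is actually browser-reachable.

```bash
# Hypothetical helper: fail if any listed KEY is absent or empty in an env file.
# Usage: require_env_keys <env-file> KEY [KEY...]
# The body runs in a subshell "( ... )" so its variables do not leak.
require_env_keys() (
  file=$1
  shift
  status=0
  for key in "$@"; do
    # Take the last KEY=... line, drop the "KEY=" prefix, strip surrounding
    # double quotes. "|| true" keeps a no-match grep from aborting the check.
    value=$(grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2- \
      | sed 's/^"\(.*\)"$/\1/') || true
    if [ -z "$value" ]; then
      echo "MISSING: $key is not set in $file" >&2
      status=1
    fi
  done
  exit "$status"
)
```

For example, `require_env_keys "$ENV_GEN" EXTERNAL_IP HAPROXY_PORT VSS_PUBLIC_HOST VSS_PUBLIC_PORT` could run right before Step 5, with a non-zero status treated as a blocker.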
Step 1b — Prepare the data directory
Layout (asset paths, ownership, mount points, profile-specific subdirs) is documented in `references/data-directory.md`. Read that file before deploying for the first time on a host or when changing profiles.
FORBIDDEN: `chown -R ubuntu:ubuntu $VSS_DATA_DIR` (or any recursive chown). This is "good housekeeping" to a shell-admin instinct but is the deploy-breaking command in this stack. You will observe a "healthy" deploy (containers Up, endpoints 200) while the video pipeline is silently broken. Use `chmod -R 777` on the specific subdirs documented in `data-directory.md`, nothing else.
Step 1c — Initialize generated.env
The skill's per-deploy working copy. Always start from a fresh copy of the source `.env`; never mutate the source.
```bash
# $REPO is the repo path found in Step 1
PROFILE=base
ENV_SRC=$REPO/deploy/docker/developer-profiles/dev-profile-$PROFILE/.env
ENV_GEN=$REPO/deploy/docker/developer-profiles/dev-profile-$PROFILE/generated.env
cp "$ENV_SRC" "$ENV_GEN"
```
All subsequent writes (Brev `EXTERNAL_IP`, the env_overrides dict from Step 2) go to `$ENV_GEN`. `$ENV_SRC` is read-only from here on.
Step 1d — If deploying on Brev, set EXTERNAL_IP to the secure-link domain
Read `BREV_ENV_ID` from `/etc/environment` and write `EXTERNAL_IP` into `generated.env` (NOT `.env`). Full secure-link behavior and troubleshooting are in `references/brev.md`.
```bash
# Extract BREV_ENV_ID (quotes stripped), then point EXTERNAL_IP at the secure-link domain
brev_env_id=$(awk -F= '/^BREV_ENV_ID=/ {gsub(/"/, "", $2); print $2; exit}' /etc/environment)
sed -i "s|^EXTERNAL_IP=.*|EXTERNAL_IP=7777-${brev_env_id}.brevlab.com|" "$ENV_GEN"
```
Step 2 — Build env_overrides
Produce an `env_overrides` dict from the user request and the gathered context: choose remote/local LLM/VLM, set credentials, point at endpoints, set platform-specific flags. The full mapping (every override key, when it applies, defaults, profile-specific differences) lives in `references/env-overrides.md`. Each profile reference has worked examples for that profile's common scenarios.
Step 3 — Apply overrides + dry-run
Working env file: `<repo>/deploy/docker/developer-profiles/dev-profile-<profile>/generated.env` (created in Step 1c).
Two env files, distinct roles.
- `.env`: read-only defaults, checked in. Don't mutate it from the skill.
- `generated.env`: the skill's per-deploy working copy. All overrides (the dict from Step 2, plus the Brev `EXTERNAL_IP` from Step 1d) land here. `--env-file` always points at this file. Post-deploy verifiers should also read from `generated.env` for the actually-deployed values; see Debugging a Deployment.
`generated.env` matches the convention `dev-profile.sh` uses internally: it's a per-invocation scratchpad regenerated by `cp .env generated.env` each run.
```bash
# (Step 1c already ran: cp $ENV_SRC $ENV_GEN)
# Apply the env_overrides dict from Step 2 to generated.env
# (read lines, update matching keys, append new keys, write)
# Example:
sed -i "s|^LLM_MODE=.*|LLM_MODE=remote|" "$ENV_GEN"
sed -i "s|^LLM_BASE_URL=.*|LLM_BASE_URL=http://localhost:30081|" "$ENV_GEN"
# Resolve compose
cd $REPO/deploy/docker
docker compose --env-file $ENV_GEN config > resolved.yml
```
The resolved YAML is saved to `<repo>/deploy/docker/resolved.yml`.
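The `sed` examples in Step 3 only rewrite keys that already exist in `generated.env`; a key that is absent is silently left out. A minimal sketch of the "update matching keys, append new keys" behavior follows. The helper name `set_env_var` is hypothetical and not part of the repo, and it assumes plain `KEY=value` lines.

```bash
# Hypothetical helper: set KEY=VALUE in an env file, replacing every existing
# assignment of KEY in place, or appending a new line if KEY is not present.
# Usage: set_env_var <env-file> KEY VALUE
# The body runs in a subshell "( ... )" so its variables do not leak.
set_env_var() (
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || exit 1
  # Key and value are passed through the environment so awk does not
  # interpret backslashes or other special characters in them.
  K="$key" V="$value" awk '
    index($0, ENVIRON["K"] "=") == 1 { print ENVIRON["K"] "=" ENVIRON["V"]; found = 1; next }
    { print }
    END { if (!found) print ENVIRON["K"] "=" ENVIRON["V"] }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  exit "$status"
)
```

With this in place, each entry of the override dict becomes one call, for example `set_env_var "$ENV_GEN" LLM_MODE remote`, and it has to run before the `docker compose ... config` dry-run so the resolved file reflects it. Writing back with `cat` rather than `mv` keeps the original file's owner and mode.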
Step 3b — Verify resolved.yml has no unexpanded ${...} tokens
Unexpanded `${VAR}` tokens in `resolved.yml` mean compose did not see those env values. Diagnostic procedure and common culprits live in `references/troubleshooting.md`.
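One way to run this check is a plain `grep` over the resolved file. The sketch below uses a hypothetical helper name, `check_unexpanded`, and assumes that the dump writes an intentionally literal dollar sign as `$$`, so `$${...}` sequences are skipped rather than reported.

```bash
# Hypothetical helper: print (with line numbers) every line of a resolved
# compose file that still contains a literal ${...} token.
# Returns 0 when the file is clean, 1 when at least one token is found.
# Usage: check_unexpanded <resolved.yml>
check_unexpanded() {
  # "(^|[^$])" requires that the "$" is not preceded by another "$",
  # so escaped "$${VAR}" sequences do not count as unexpanded.
  if grep -n -E '(^|[^$])\$\{[^}]*\}' "$1"; then
    echo "FAIL: unexpanded tokens found in $1" >&2
    return 1
  fi
  return 0
}
```

A call such as `check_unexpanded "$REPO/deploy/docker/resolved.yml"` would sit between the dry-run and Step 3c; on failure, the printed lines show which variables to trace back to `generated.env`.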
Step 3c — Strip dangling optional depends_on from resolved.yml
MUST run after Step 3, before Step 5. Skipping this aborts the deploy:
Normalize - drop optional dependencies for services filtered out from resolved.yml
```bash
# From the repo root
uv run skills/vss-deploy-profile/scripts/normalize_resolved_yml.py "$REPO/deploy/docker/resolved.yml"
```
If `uv` isn't on the host, install it once with `curl -LsSf https://astral.sh/uv/install.sh | sh` (no root needed).
**Re-validate** before `up -d`:
```bash
docker compose -f "$REPO/deploy/docker/resolved.yml" config --quiet && echo "resolved.yml OK"
```
If validation still fails after the normalizer runs, capture the error and inspect: that's a different bug (a dependency that's not optional, or another schema violation), not the dangling-depends_on case.
Step 4 — Review
Show the user a summary of what will be deployed:
- Profile name and hardware
- LLM/VLM models and mode (local/remote/local_shared)
- Services that will start
- GPU device assignment
- Key endpoints (UI port, agent port)
Ask: "Looks good — deploy now?" and wait for confirmation before Step 5.
Exception — autonomous mode. If the user's request already asks you to run autonomously (e.g. "deploy X autonomously", "run without confirmation", "non-interactive"), skip the confirmation prompt and proceed straight to Step 5. This path exists so automated eval / CI invocations don't hang waiting for a human reply they'll never get. In all other cases, a human must approve.
Step 5 — Deploy
```bash
cd $REPO/deploy/docker
docker compose --env-file $ENV_GEN -f resolved.yml up -d
```
`--env-file` is mandatory. Without the same `generated.env` used in Step 3, `COMPOSE_PROFILES` may be unset and `up -d` can exit 0 with zero selected services.
Do NOT use `--force-recreate` on retries. It destroys already-warm NIM containers, forcing another 3–5 min torch.compile + CUDA-graph capture per NIM. If the previous `up -d` partially failed, fix the root cause (usually perms or an env typo) and just re-run `up -d`; Docker will re-create only the containers whose config changed or that are down.
Step 5b — Wait until the stack is actually healthy
Gate 0: container count must be > 0. Refuse to proceed past `up -d` until compose started the expected services:
```bash
expected=$(docker compose --env-file $ENV_GEN -f resolved.yml config --services | wc -l)
actual=$(docker compose -f resolved.yml ps -q | wc -l)
[ "$actual" -gt 0 ] && [ "$actual" -ge "$expected" ] \
  || { echo "FAIL: expected $expected services, got $actual — re-check Step 5 --env-file"; exit 1; }
```
Cold deploys can take 10–20 min. The full readiness procedure lives in `references/readiness.md`, and each profile reference lists the required endpoints. Never declare deploy done after `up -d`; only after every documented endpoint succeeds.
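Waiting on an endpoint during a cold deploy usually comes down to a retry loop with a deadline. The sketch below is illustrative only: `wait_until` is a hypothetical helper, and the authoritative endpoint list and timeouts are the ones in `references/readiness.md` and the profile reference.

```bash
# Hypothetical helper: retry a command until it succeeds or a deadline passes.
# Usage: wait_until <timeout-seconds> <interval-seconds> <command> [args...]
# The body runs in a subshell "( ... )" so its variables do not leak.
wait_until() (
  timeout=$1
  interval=$2
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  # Re-run the command, discarding its output, until it exits 0.
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "TIMEOUT after ${timeout}s: $*" >&2
      exit 1
    fi
    sleep "$interval"
  done
)
```

For instance, `wait_until 1200 10 curl -sf http://localhost:8000/docs` would poll the agent endpoint used in the quick checks for up to 20 minutes, matching the upper end of the cold-deploy estimate. One call per documented endpoint keeps a timeout attributable to a specific service.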
Tear Down
```bash
cd $REPO/deploy/docker
docker compose -f resolved.yml down
```
For switching profiles or recovering from a partial deploy, follow the full procedure in `references/teardown.md`.
Debugging a Deployment
Use this workflow when the user asks to "debug the deploy", "verify it's working", "why is the agent not responding", or similar. The goal is to confirm the full video-ingestion-to-agent-answer path, not just that containers are "Up".
Each profile reference has a Debugging section listing the exact commands and failure-mode table for that profile.
Quick checks (all profiles)
```bash
# 1. All expected containers Up
docker ps --format 'table {{.Names}}\t{{.Status}}'
# 2. Agent API + UI responding
curl -sf http://localhost:8000/docs >/dev/null && echo "agent OK"
curl -sf http://localhost:3000/ >/dev/null && echo "ui OK"
# 3. VLM NIM responding (base/lvs profiles)
curl -sf http://localhost:30082/v1/models | python3 -m json.tool
# 4. LLM NIM responding
curl -sf http://localhost:30081/v1/models | python3 -m json.tool
```
End-to-end video sanity check
After the quick checks above pass, drive a real query through the agent: for example, ask it over the REST API or UI to describe a video you've uploaded to VST. If the agent returns a non-empty answer, the upload → ingest → inference → reply path is healthy. If it fails, `docker logs vss-agent` shows which stage tripped.
Examples
- Base profile, remote models: route to `base`, copy `dev-profile-base/.env` to `generated.env`, set `LLM_MODE=remote` / `VLM_MODE=remote`, dry-run, normalize, deploy, then verify `/docs` and UI.
- Search profile on RTX: route to `search`, follow `references/search.md` for sizing and endpoints, seed videos, then run the search-profile readiness checks.
- Edge target: route through `references/edge.md`, then use the same `generated.env` → dry-run → normalize → deploy flow.
Limitations
- This skill deploys compose-based VSS profiles only; standalone microservice deployment belongs to the matching `vss-deploy-*` skill.
- Hardware sizing, model placement, and profile-specific readiness are owned by profile references; do not infer them from memory.
- Privileged host remediation requires user approval when passwordless sudo is unavailable.
Troubleshooting
Start with `references/agent-failure-modes.md` for cross-profile failures such as NIM cold-start timeouts, OOM, remote endpoint 5xx responses, missing `NGC_CLI_API_KEY` / `HF_TOKEN`, unexpanded values in `resolved.yml`, etc.