nemotron-voice-agent-deploy

Nemotron Voice Agent Deployment

Real-time conversational AI voice agent using NVIDIA NIMs (ASR, TTS, LLM) with WebRTC (default) or WebSocket transport.

Deployment Flow

Always verify hardware first, even if the user mentions a specific platform.

STEP 1: Hardware Detection

```bash
nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null
```

| Result | Action |
|---|---|
| Command fails / No output | Cloud NIMs |
| GPU detected | STEP 2: Platform Detection |
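
The decision in the table above can be scripted. The following is a minimal sketch, not part of the repository: `choose_deploy_path` and its output labels (`cloud-nims`, `platform-detection`) are hypothetical names introduced here for illustration. It treats a non-zero exit status or blank output from the `nvidia-smi` query as "no GPU".

```bash
# Hypothetical helper (not from the repository): map the result of the
# nvidia-smi query to the next deployment step.
#   $1: exit status of the nvidia-smi command
#   $2: text the command printed on stdout
choose_deploy_path() {
    # Strip whitespace so that blank output counts as "no output".
    trimmed="$(printf '%s' "$2" | tr -d '[:space:]')"
    if [ "$1" -ne 0 ] || [ -z "$trimmed" ]; then
        echo "cloud-nims"
    else
        echo "platform-detection"
    fi
}

# Run the same query as above, capturing both output and exit status.
status=0
gpu_info="$(nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null)" || status=$?
echo "Next step: $(choose_deploy_path "$status" "$gpu_info")"
```

Checking the exit status as well as the output guards against a driver error message being mistaken for a detected GPU.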

Cloud NIMs (No GPU)

```bash
cd nemotron-voice-agent
git submodule update --init
cp config/env.example .env
```

Export your NVIDIA API key:

```bash
export NVIDIA_API_KEY=your-api-key  # Get from https://build.nvidia.com
```

Then edit `.env`:

```bash
NVIDIA_LLM_MODEL=nvidia/nemotron-3-nano-30b-a3b  # Cloud model name
```

If the user requests WebSocket transport, also add to `.env`:

```bash
TRANSPORT=WEBSOCKET
```

Then start the application and UI containers:

```bash
docker compose up --build --no-deps -d python-app ui-app
```
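
Before starting the containers, it may help to confirm that the two settings from the steps above are in place. This is a hedged sketch under stated assumptions: `env_file_has_key` and `preflight` are hypothetical helpers, not part of the repository, and the check only looks for an exported `NVIDIA_API_KEY` and an uncommented `NVIDIA_LLM_MODEL=` line in the given env file.

```bash
# Hypothetical helpers (not from the repository) for a pre-flight check.

# Succeeds if the env file ($1) contains an uncommented line "NAME=..." for NAME=$2.
env_file_has_key() {
    grep -q "^$2=" "$1" 2>/dev/null
}

# Prints "ok" and succeeds only if the API key is exported and the
# model name is set in the env file ($1).
preflight() {
    if [ -z "${NVIDIA_API_KEY:-}" ]; then
        echo "NVIDIA_API_KEY is not exported"
        return 1
    fi
    if ! env_file_has_key "$1" NVIDIA_LLM_MODEL; then
        echo "NVIDIA_LLM_MODEL missing from $1"
        return 1
    fi
    echo "ok"
}

# Example usage (left commented out so that sourcing this file has no side effects):
# preflight .env && docker compose up --build --no-deps -d python-app ui-app
```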

> **Note:** Deployment may take 30-60 minutes on first run.

**If the user requests multilingual mode**, also add to `.env`:
```bash
ENABLE_MULTILINGUAL=true
ASR_CLOUD_FUNCTION_ID=71203149-d3b7-4460-8231-1be2543a1fca
ASR_MODEL_NAME=parakeet-rnnt-1.1b-unified-ml-cs-universal-multi-asr-streaming
```

**Remote Access:** forward the UI port over SSH:

```bash
ssh -L 9000:localhost:9000 user@host
```

or open `http://<HOST_IP>:9000` directly.


STEP 2: Platform Detection (if GPU detected)

```bash
uname -m  # x86_64 → Workstation, aarch64 → Jetson
cat /etc/nv_tegra_release 2>/dev/null && echo "Jetson"
```

| Platform | Reference | Requirements |
|---|---|---|
| Workstation (x86_64) | workstation-deployment.md | 2x GPU (24GB+ VRAM), NIM containers |
| Jetson Thor (aarch64) | jetson-deployment.md | JetPack 7.0, Nemotron Speech ASR and TTS, vLLM |

> **Note:** Multilingual mode is available only on Workstation with WebRTC transport.
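
The mapping from architecture to deployment guide can be sketched as a small function. This is an illustrative sketch only: `reference_for_arch` and the `unsupported` label are hypothetical names not found in the repository, and the file names are the ones listed in the table above.

```bash
# Hypothetical helper (not from the repository): map the output of
# `uname -m` ($1) to the matching deployment reference.
reference_for_arch() {
    case "$1" in
        x86_64)  echo "workstation-deployment.md" ;;
        aarch64) echo "jetson-deployment.md" ;;
        *)       echo "unsupported" ;;
    esac
}

arch="$(uname -m)"
echo "Architecture: $arch -> $(reference_for_arch "$arch")"

# The Tegra release file is an additional hint that the host is a Jetson.
if [ -r /etc/nv_tegra_release ]; then
    echo "Jetson release file found"
fi
```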