deepstream-import-vision-model
DeepStream Import Vision Model
When this skill is active, read the relevant reference document before starting each phase. Do not rely on memory — reference documents contain exact script paths, bash variable conventions, log filename contracts, and critical parsing rules.
Current scope: Object detection models only. Fail fast on classification, segmentation, or other architectures detected in `config.json`.
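The fail-fast scope check can be sketched as a small shell function. This is a minimal sketch, not the skill's actual check: the function name `is_detection_config` is hypothetical, and it assumes the model follows the Hugging Face transformers convention of listing a class name ending in `ForObjectDetection` in `config.json`. Models that describe their task differently would need a different test.

```bash
# Hypothetical helper: decide whether config.json describes a detection model.
# Assumption: detection checkpoints name a class ending in "ForObjectDetection".
# Returns 0 = detection model, 1 = other architecture, 2 = config.json missing.
is_detection_config() {
  local cfg="$1"
  [ -f "$cfg" ] || return 2
  if grep -Eq '"[A-Za-z0-9_]*ForObjectDetection"' "$cfg"; then
    return 0
  fi
  return 1
}
```

A caller would typically stop the pipeline on any non-zero return, before downloading weights or building an engine.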
Pipeline Overview
| Step | Phase | Reference | What it does |
|---|---|---|---|
| 1–3 | Model Acquire | references/model-acquire.md | Browse HF/NGC, detect format, download ONNX or export SafeTensors |
| 4–5 | Engine Build | references/engine-build.md | Build dynamic TRT engine, run trtexec BS=1 and BS=MAX_BS |
| 6–7 | DS Pipeline | references/pipeline-run.md | Custom bbox parser, nvinfer config, single-stream + multi-stream benchmarks |
| 8 | Report | references/report-generation.md | 5 charts, HTML, PDF benchmark report |
Run the full pipeline autonomously without pausing for confirmation at each step.
Pre-flight Checks
Run before starting:
```bash
# 1. GPU and drivers
nvidia-smi

# 2. TensorRT version match (must match between builder and DS runtime)
trtexec 2>&1 | head -3
dpkg -l | grep libnvinfer-bin

# 3. Shared Python venv: create once, reuse across all models
mkdir -p build
VENV=build/.venv_optimum
if [ ! -x "$VENV/bin/python3" ]; then
  python3 -m venv "$VENV"
  "$VENV/bin/pip" install --upgrade pip -q
  "$VENV/bin/pip" install "optimum[exporters]>=1.20,<2.0" "torch<2.12" \
    transformers onnxruntime matplotlib numpy markdown -q
fi

# 4. System tools
which wkhtmltopdf || apt-get install -y wkhtmltopdf
which mediainfo || apt-get install -y mediainfo
which deepstream-app  # required for KITTI dump (Step 6g) and benchmark perf-measurement (Step 7c); shipped with DeepStream SDK

# 5. Sample video: only check default path when user has not provided a custom DS_VIDEO
if [ -z "$DS_VIDEO" ]; then
  [ -f /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.mp4 ] || \
    echo "WARNING: sample_720p.mp4 not found. Install DeepStream samples or set DS_VIDEO=/path/to/your.mp4"
fi
```
Mandatory Output Structure
Create once `MODEL_NAME` is known (Step 1). Never dump files flat.
```
models/{model_name}/
  model/            <- ONNX file(s)
  parser/           <- .cpp, Makefile, .so
  config/           <- nvinfer config, ds-app config, labels.txt
  scripts/          <- run helper scripts
  benchmarks/
    engines/        <- _dynamic_b{MAX_BS}.engine, timing.cache, build logs
    b1/             <- trtexec BS=1 log
    b{MAX_BS}/      <- trtexec BS=MAX_BS log
    ds/             <- DS benchmark logs
  reports/          <- benchmark_report.md, .html, .pdf, benchmark_data.json
    charts/         <- chart_*.png (5 charts)
  samples/          <- output .mp4 or .ogv (theoraenc fallback), test frames
    kitti_output/   <- KITTI detection .txt files
```
```bash
mkdir -p models/$MODEL_NAME/{model,parser,config,scripts,benchmarks/engines,benchmarks/ds,reports/charts,samples/kitti_output}
```
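The `samples/kitti_output/` directory feeds the KITTI validation gate (no Step 7 if the frame count is zero or the detection rate is below 90%). The sketch below shows one way such a gate could be computed. The function name `kitti_gate` is hypothetical. It also assumes one `.txt` file per frame and that "detection rate" means the share of frame files containing at least one detection line; the reference documents may define it differently.

```bash
# Hypothetical gate: pass only if frame files exist and enough are non-empty.
# Assumption: one .txt per frame; an empty file means no detections.
kitti_gate() {
  local dir="$1" min_pct="${2:-90}"
  local total=0 hit=0 f pct
  for f in "$dir"/*.txt; do
    [ -e "$f" ] || continue            # unmatched glob: directory has no .txt
    total=$((total + 1))
    if [ -s "$f" ]; then               # non-empty file: at least one detection
      hit=$((hit + 1))
    fi
  done
  if [ "$total" -eq 0 ]; then
    echo "KITTI gate: no frame files in $dir" >&2
    return 1
  fi
  pct=$((hit * 100 / total))           # integer floor is safe for a >= test
  echo "KITTI gate: $hit/$total frames with detections (${pct}%)"
  [ "$pct" -ge "$min_pct" ]
}
```

Calling `kitti_gate "models/$MODEL_NAME/samples/kitti_output" || exit 1` before Step 7 would enforce the rule.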
Critical Rules
- Engine naming: always `{model}_dynamic_b{MAX_BS}.engine`. Never bare `model_dynamic.engine`.
- batch_size == num_streams: in DS runs, `batch-size` and stream count are always equal.
- Log filenames are fixed: `trtexec_b1.log`, `trtexec_b${MAX_BS}.log`, `ds_s${N}_run1.log`, `ds_s${N}_run2.log`. No timestamps. Report generation reads exact paths.
- Parser zero-init: always `NvDsInferObjectDetectionInfo obj = {};`. Required for DS 9.0 OBB support; bare `obj;` leaves `rotation_angle` uninitialized, causing tilted bounding boxes.
- KITTI validation gate: do NOT proceed to Step 7 if KITTI frame count is zero or detection rate < 90%.
- Shared venv: `build/.venv_optimum` is reused across all models. Never create per-model venvs.
- trtexec `--noDataTransfers`: GPU-only compute matches DeepStream's GPU-to-GPU data flow.
- Report HTML+PDF: always use `skills/deepstream-import-vision-model/scripts/report/md-to-html-pdf.py`. Never write a custom HTML generator or call `wkhtmltopdf` directly.
- Object detection only: reject non-detection architectures from `config.json` before building anything.
- Encoder fallback (MANDATORY): `x264enc` and `openh264enc` are prohibited. On NVENC-unavailable systems, use `theoraenc + oggmux` (LGPL; ships in gst-plugins-base; output is `.ogv`). If `theoraenc`/`oggmux` are absent, skip video creation (`DS_SINGLE_STREAM_MODE=skipped`). Report which mode was used: `nvv4l2h264enc` / `theoraenc-fallback` / `skipped`.
- Video source (MANDATORY): default is always `sample_720p.mp4` (1280×720). Never autonomously substitute `sample_1080p_h264.mp4` or any other file. Only use a different video when the user explicitly provides a path (via `DS_VIDEO` env var or script argument).
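Because report generation reads exact filenames, it can help to build them in one place instead of retyping them per step. The helpers below are a sketch of the naming rules above. The function names are hypothetical, and only the filename patterns come from this document.

```bash
# Hypothetical helpers that encode the fixed naming contract.
engine_name()      { printf '%s_dynamic_b%s.engine\n' "$1" "$2"; }  # model, MAX_BS
trtexec_log_name() { printf 'trtexec_b%s.log\n' "$1"; }             # batch size
ds_log_name()      { printf 'ds_s%s_run%s.log\n' "$1" "$2"; }       # streams, run number

# Example with an illustrative model name and MAX_BS=16:
engine_name rtdetr 16        # rtdetr_dynamic_b16.engine
trtexec_log_name 1           # trtexec_b1.log
ds_log_name 16 2             # ds_s16_run2.log
```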
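The encoder fallback rule amounts to a three-way decision. The sketch below illustrates it under assumptions: the function names and the `GST_PROBE` override are hypothetical, and the probe only checks that a GStreamer element is registered (`gst-inspect-1.0 --exists`), which does not prove that NVENC hardware is usable. The skill's own script may probe differently.

```bash
# Probe for a registered GStreamer element.
gst_has() { gst-inspect-1.0 --exists "$1" >/dev/null 2>&1; }

# Hypothetical selector: prints the value to record as DS_SINGLE_STREAM_MODE.
# GST_PROBE lets a caller substitute a different availability check.
select_single_stream_mode() {
  local probe="${GST_PROBE:-gst_has}"
  if "$probe" nvv4l2h264enc; then
    echo "nvv4l2h264enc"               # NVENC path, .mp4 output
  elif "$probe" theoraenc && "$probe" oggmux; then
    echo "theoraenc-fallback"          # software path, .ogv output
  else
    echo "skipped"                     # no permitted encoder: skip video creation
  fi
}
```

`x264enc` and `openh264enc` never appear in the decision, so a system that has only those encoders still resolves to `skipped`.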
Pipeline Timing
Wrap every step:
```bash
STEP_START=$(date +%s.%N)
# ... step commands ...
STEP_END=$(date +%s.%N)
STEP_DURATION=$(echo "$STEP_END - $STEP_START" | bc)
echo "[Step N] completed in ${STEP_DURATION}s"
```
Track `PIPELINE_START` (before Step 1) and `PIPELINE_END` (after Step 8). Report all durations in the benchmark report.
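To avoid repeating the timing lines around each step, the same pattern can be folded into a wrapper. This is a sketch only: `time_step` is a hypothetical name, it needs GNU `date` for `%N`, and it does the subtraction with `awk` rather than `bc` purely to drop one dependency.

```bash
# Hypothetical wrapper: run a command, report its wall-clock time,
# and hand back the command's own exit status.
time_step() {
  local label="$1"; shift
  local start end rc=0
  start=$(date +%s.%N)
  "$@" || rc=$?                        # capture failure without aborting the wrapper
  end=$(date +%s.%N)
  STEP_DURATION=$(awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f", e - s }')
  echo "[$label] completed in ${STEP_DURATION}s (exit $rc)"
  return "$rc"
}
```

Usage would look like `time_step "Step 4" ./build_engine.sh`, where the script name is illustrative. `STEP_DURATION` stays set afterwards so it can be written into the report.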
Report Output (MANDATORY: all 3 formats)
- `benchmark_report.md`: markdown source (12 mandatory sections)
- `benchmark_report.html`: styled HTML (charts base64-inlined, no local file access)
- `benchmark_report_{model_name}.pdf`: via `md-to-html-pdf.py`; verify charts are embedded by counting `data:image/png` occurrences in the HTML output: `grep -o 'data:image/png' benchmark_report.html | wc -l` should equal 5
Run charts and report scripts with the shared venv active: `source build/.venv_optimum/bin/activate`.
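The chart-embedding check can be turned into a pass/fail step. The function below is a sketch with a hypothetical name; it uses the same `grep -o ... | wc -l` count described above and compares it with the expected five charts.

```bash
# Hypothetical check: succeed only if the HTML embeds the expected number of PNGs.
verify_embedded_charts() {
  local html="$1" expected="${2:-5}" found
  [ -f "$html" ] || { echo "missing: $html" >&2; return 2; }
  # grep exits 1 when nothing matches; "|| true" keeps the count at 0 instead
  found=$( { grep -o 'data:image/png' "$html" || true; } | wc -l )
  found=$((found))                     # strip padding some wc builds emit
  echo "embedded charts: $found (expected $expected)"
  [ "$found" -eq "$expected" ]
}
```

Counting with `grep -o` rather than `grep -c` matters here because several base64 images can sit on a single line of HTML.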
Reference Documents
IMPORTANT: Read the relevant reference before starting each phase. Do NOT generate code from memory.
| Document | Use When |
|---|---|
| references/model-acquire.md | Steps 1–3: HF/NGC URL parsing, format detection, ONNX download, SafeTensors export, label extraction |
| references/engine-build.md | Steps 4–5: trtexec engine build, benchmarks, PEAK_GPU_STREAMS derivation, iterative scaling |
| references/pipeline-run.md | Steps 6–7: custom bbox parser, nvinfer config, single-stream validation, KITTI dump, multi-stream benchmark |
| references/report-generation.md | Step 8: benchmark_data.json, 5 charts, 12-section markdown report, HTML + PDF |
Scripts
Located in `scripts/`.
| Script | Phase | Purpose |
|---|---|---|
| | 1–3 | List HuggingFace repo files |
| | 1–3 | Download config.json from HF |
| | 1–3 | List NGC model files |
| | 1–3 | Download NGC model archive |
| | 1–3 | Export SafeTensors → ONNX via optimum-cli |
| | 1–5 | Inspect ONNX input/output shapes |
| | 4–5 | Bake batch dim into ONNX |
| | Any | Remove staging dirs, preserve shared venv |
| | 4–5 | Run trtexec with standard flags |
| | 6–7 | Single-stream visual validation (NVENC primary; theoraenc+oggmux fallback; skip if neither) |
| | 6–7 | 2-phase batch size sweep |
| | 6–7 | Fixed-stream DS benchmark |
| | 6–7 | KITTI detection dump via deepstream-app |
| | 7 | Step 7c two-run benchmark; wraps |
| | 6–7 | Extract sample frames from output video ( |
| | 8 | Generate 5 benchmark PNG charts |
| | 8 | Markdown → styled HTML → PDF (canonical benchmark report path) |
| | Any | Markdown → PDF via pandoc/pdflatex; for design docs and references only, NOT for benchmark reports (use md-to-html-pdf.py for those) |
| | 8 | CSS for HTML report |
| | 8 | Mermaid diagram → PNG |
| | 8 | Vetted Puppeteer config for Mermaid (sandboxed; non-root) |
| | 8 | Vetted Puppeteer config for Mermaid (used when running as root) |
Quick Error Reference
| Error | Fix |
|---|---|
| Tilted/diagonal bounding boxes | Parser struct not zero-initialized; use |
| Zero KITTI files | |
| Engine rebuilds every DS run | |
| | Add |
| | Use |
| ForeignNode build failure (DETR) | Use dynamo export path or run |
| Zero detections | Wrong |
| | Install into venv: |