senior-computer-vision


Senior Computer Vision Engineer

Production computer vision engineering skill for object detection, image segmentation, and visual AI system deployment.


Quick Start

```bash
# Generate training configuration for YOLO or Faster R-CNN
python scripts/vision_model_trainer.py models/ --task detection --arch yolov8

# Analyze model for optimization opportunities (quantization, pruning)
python scripts/inference_optimizer.py model.pt --target onnx --benchmark

# Build dataset pipeline with augmentations
python scripts/dataset_pipeline_builder.py images/ --format coco --augment
```

Core Expertise

核心能力

This skill provides guidance on:
  • Object Detection: YOLO family (v5-v11), Faster R-CNN, DETR, RT-DETR
  • Instance Segmentation: Mask R-CNN, YOLACT, SOLOv2
  • Semantic Segmentation: DeepLabV3+, SegFormer, SAM (Segment Anything)
  • Image Classification: ResNet, EfficientNet, Vision Transformers (ViT, DeiT)
  • Video Analysis: Object tracking (ByteTrack, SORT), action recognition
  • 3D Vision: Depth estimation, point cloud processing, NeRF
  • Production Deployment: ONNX, TensorRT, OpenVINO, CoreML

Tech Stack

| Category | Technologies |
|----------|--------------|
| Frameworks | PyTorch, torchvision, timm |
| Detection | Ultralytics (YOLO), Detectron2, MMDetection |
| Segmentation | segment-anything, mmsegmentation |
| Optimization | ONNX, TensorRT, OpenVINO, torch.compile |
| Image Processing | OpenCV, Pillow, albumentations |
| Annotation | CVAT, Label Studio, Roboflow |
| Experiment Tracking | MLflow, Weights & Biases |
| Serving | Triton Inference Server, TorchServe |

Workflow 1: Object Detection Pipeline

Use this workflow when building an object detection system from scratch.

Step 1: Define Detection Requirements

Analyze the detection task requirements:

Detection Requirements Analysis:
- Target objects: [list specific classes to detect]
- Real-time requirement: [yes/no, target FPS]
- Accuracy priority: [speed vs accuracy trade-off]
- Deployment target: [cloud GPU, edge device, mobile]
- Dataset size: [number of images, annotations per class]

Step 2: Select Detection Architecture

Choose an architecture based on requirements:

| Requirement | Recommended Architecture | Why |
|-------------|--------------------------|-----|
| Real-time (>30 FPS) | YOLOv8/v11, RT-DETR | Single-stage, optimized for speed |
| High accuracy | Faster R-CNN, DINO | Two-stage, better localization |
| Small objects | YOLO + SAHI, Faster R-CNN + FPN | Multi-scale detection |
| Edge deployment | YOLOv8n, MobileNetV3-SSD | Lightweight architectures |
| Transformer-based | DETR, DINO, RT-DETR | End-to-end, no NMS required |
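Single-stage detectors such as YOLO emit overlapping candidate boxes that are pruned by NMS as a post-processing step, which is why the end-to-end DETR family can skip it. As an illustrative sketch (not the implementation any of these frameworks actually use), greedy NMS over `(x1, y1, x2, y2)` boxes looks like:

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

# nms([[0,0,10,10], [1,1,11,11], [50,50,60,60]], [0.9, 0.8, 0.7]) → [0, 2]
```

The second box (IoU ≈ 0.68 with the first) is suppressed; the distant third box survives.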

Step 3: Prepare Dataset

Convert annotations to the required format:

```bash
# COCO format (recommended)
python scripts/dataset_pipeline_builder.py data/images/ \
    --annotations data/labels/ \
    --format coco \
    --split 0.8 0.1 0.1 \
    --output data/coco/

# Verify dataset
python -c "from pycocotools.coco import COCO; coco = COCO('data/coco/train.json'); print(f'Images: {len(coco.imgs)}, Categories: {len(coco.cats)}')"
```

Step 4: Configure Training

Generate training configuration:

```bash
# For Ultralytics YOLO
python scripts/vision_model_trainer.py data/coco/ \
    --task detection \
    --arch yolov8m \
    --epochs 100 \
    --batch 16 \
    --imgsz 640 \
    --output configs/

# For Detectron2
python scripts/vision_model_trainer.py data/coco/ \
    --task detection \
    --arch faster_rcnn_R_50_FPN \
    --framework detectron2 \
    --output configs/
```

Step 5: Train and Validate

```bash
# Ultralytics training
yolo detect train data=data.yaml model=yolov8m.pt epochs=100 imgsz=640

# Detectron2 training
python train_net.py --config-file configs/faster_rcnn.yaml --num-gpus 1

# Validate on test set
yolo detect val model=runs/detect/train/weights/best.pt data=data.yaml
```

Step 6: Evaluate Results

Key metrics to analyze:

| Metric | Target | Description |
|--------|--------|-------------|
| mAP@50 | >0.7 | Mean Average Precision at IoU 0.5 |
| mAP@50:95 | >0.5 | COCO primary metric |
| Precision | >0.8 | Low false positives |
| Recall | >0.8 | Low missed detections |
| Inference time | <33ms | For 30 FPS real-time |
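mAP averages per-class AP, the area under the precision-recall curve built from score-ranked detections. A minimal sketch of the AP computation, assuming detections have already been matched to ground truth at IoU 0.5 (the `tp_flags` input is hypothetical; this uses all-point interpolation as in COCO and Pascal VOC 2010+):

```python
def average_precision(tp_flags, num_gt):
    """AP from score-ranked detections (True = TP, False = FP) against
    num_gt ground-truth boxes, with all-point interpolation."""
    tp = fp = 0
    precisions, recalls = [], []
    for is_tp in tp_flags:
        tp += is_tp
        fp += not is_tp
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Make precision monotonically non-increasing, right to left
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the stepwise PR curve
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap

# average_precision([True, True, False, True], num_gt=4) → 0.6875
```

mAP@50 is this value averaged over classes; mAP@50:95 additionally averages over IoU thresholds 0.5 to 0.95 in steps of 0.05.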

Workflow 2: Model Optimization and Deployment

Use this workflow when preparing a trained model for production deployment.

Step 1: Benchmark Baseline Performance

```bash
# Measure current model performance
python scripts/inference_optimizer.py model.pt \
    --benchmark \
    --input-size 640 640 \
    --batch-sizes 1 4 8 16 \
    --warmup 10 \
    --iterations 100
```

Expected output:

Baseline Performance (PyTorch FP32):
  • Batch 1: 45.2ms (22.1 FPS)
  • Batch 4: 89.4ms (44.7 FPS)
  • Batch 8: 165.3ms (48.4 FPS)
  • Memory: 2.1 GB
  • Parameters: 25.9M
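The warmup/iteration pattern behind these numbers can be sketched with stdlib timing. The lambda below is a stand-in workload, not a real model forward pass, and the script's actual reporting may differ:

```python
import time
import statistics

def benchmark(fn, warmup=10, iterations=100):
    """Time fn() after warmup runs, returning mean/p99 latency in ms and FPS."""
    for _ in range(warmup):  # warmup: caches, JIT, clocks settle
        fn()
    samples = []
    for _ in range(iterations):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    samples.sort()
    mean = statistics.mean(samples)
    return {
        "mean_ms": mean,
        "p99_ms": samples[int(0.99 * len(samples)) - 1],
        "fps": 1000.0 / mean,
    }

# Stand-in workload instead of model inference
stats = benchmark(lambda: sum(i * i for i in range(10_000)))
```

Warmup matters most on GPU backends, where the first calls pay kernel-compilation and memory-allocation costs that would skew the mean.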

Step 2: Select Optimization Strategy

| Deployment Target | Optimization Path |
|-------------------|-------------------|
| NVIDIA GPU (cloud) | PyTorch → ONNX → TensorRT FP16 |
| NVIDIA GPU (edge) | PyTorch → TensorRT INT8 |
| Intel CPU | PyTorch → ONNX → OpenVINO |
| Apple Silicon | PyTorch → CoreML |
| Generic CPU | PyTorch → ONNX Runtime |
| Mobile | PyTorch → TFLite or ONNX Mobile |

Step 3: Export to ONNX

```bash
# Export with dynamic batch size
python scripts/inference_optimizer.py model.pt \
    --export onnx \
    --input-size 640 640 \
    --dynamic-batch \
    --simplify \
    --output model.onnx

# Verify ONNX model
python -c "import onnx; model = onnx.load('model.onnx'); onnx.checker.check_model(model); print('ONNX model valid')"
```

Step 4: Apply Quantization (Optional)

For INT8 quantization with calibration:

```bash
# Quantize with a calibration dataset
python scripts/inference_optimizer.py model.onnx \
    --quantize int8 \
    --calibration-data data/calibration/ \
    --calibration-samples 500 \
    --output model_int8.onnx
```

Quantization impact analysis:

| Precision | Size | Speed | Accuracy Drop |
|-----------|------|-------|---------------|
| FP32 | 100% | 1x | 0% |
| FP16 | 50% | 1.5-2x | <0.5% |
| INT8 | 25% | 2-4x | 1-3% |
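The Size column follows directly from bytes per weight. A quick sanity check against the 25.9M-parameter baseline from Step 1 (weight storage only, ignoring activations and runtime overhead):

```python
def model_size_mb(num_params, bits_per_weight):
    """Approximate weight-storage size in MB: params × bits / 8 bits-per-byte."""
    return num_params * bits_per_weight / 8 / 1e6

params = 25.9e6
fp32 = model_size_mb(params, 32)  # → 103.6 MB
fp16 = model_size_mb(params, 16)  # → 51.8 MB (50% of FP32)
int8 = model_size_mb(params, 8)   # → 25.9 MB (25% of FP32)
```

The speedup column does not follow as mechanically: it depends on the hardware having native FP16/INT8 units (e.g. Tensor Cores) and on how memory-bound the model is.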

Step 5: Convert to Target Runtime

```bash
# TensorRT (NVIDIA GPU)
trtexec --onnx=model.onnx --saveEngine=model.engine --fp16

# OpenVINO (Intel)
mo --input_model model.onnx --output_dir openvino/

# CoreML (Apple)
python -c "import coremltools as ct; model = ct.convert('model.onnx'); model.save('model.mlpackage')"
```

Step 6: Benchmark Optimized Model

```bash
python scripts/inference_optimizer.py model.engine \
    --benchmark \
    --runtime tensorrt \
    --compare model.pt
```

Expected speedup:

Optimization Results:
- Original (PyTorch FP32): 45.2ms
- Optimized (TensorRT FP16): 12.8ms
- Speedup: 3.5x
- Accuracy change: -0.3% mAP

Workflow 3: Custom Dataset Preparation

Use this workflow when preparing a computer vision dataset for training.

Step 1: Audit Raw Data

```bash
# Analyze image dataset
python scripts/dataset_pipeline_builder.py data/raw/ \
    --analyze \
    --output analysis/
```

The analysis report includes:

Dataset Analysis:
  • Total images: 5,234
  • Image sizes: 640x480 to 4096x3072 (variable)
  • Formats: JPEG (4,891), PNG (343)
  • Corrupted: 12 files
  • Duplicates: 45 pairs

Annotation Analysis:
  • Format detected: Pascal VOC XML
  • Total annotations: 28,456
  • Classes: 5 (car, person, bicycle, dog, cat)
  • Distribution: car (12,340), person (8,234), bicycle (3,456), dog (2,890), cat (1,536)
  • Empty images: 234
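Exact-duplicate detection like the "45 pairs" above can be sketched with content hashing. This is an illustrative approach, not necessarily what `dataset_pipeline_builder.py` does; near-duplicates (re-encoded or resized copies) would need a perceptual hash such as pHash instead:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_exact_duplicates(image_dir):
    """Group files by SHA-256 of their raw bytes; return groups of size > 1."""
    groups = defaultdict(list)
    for path in sorted(Path(image_dir).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Keep one file per group and drop the rest before splitting, so identical images cannot end up in both train and test.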

Step 2: Clean and Validate

```bash
# Remove corrupted and duplicate images
python scripts/dataset_pipeline_builder.py data/raw/ \
    --clean \
    --remove-corrupted \
    --remove-duplicates \
    --output data/cleaned/
```

Step 3: Convert Annotation Format

```bash
# Convert VOC to COCO format
python scripts/dataset_pipeline_builder.py data/cleaned/ \
    --annotations data/annotations/ \
    --input-format voc \
    --output-format coco \
    --output data/coco/
```

Supported format conversions:

| From | To |
|------|-----|
| Pascal VOC XML | COCO JSON |
| YOLO TXT | COCO JSON |
| COCO JSON | YOLO TXT |
| LabelMe JSON | COCO JSON |
| CVAT XML | COCO JSON |
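Most of these conversions reduce to box-coordinate arithmetic: Pascal VOC stores absolute (xmin, ymin, xmax, ymax), COCO stores absolute (x, y, width, height), and YOLO stores center coordinates normalized by image size. A sketch of the two core transforms:

```python
def voc_to_coco(box):
    """(xmin, ymin, xmax, ymax) absolute -> (x, y, w, h) absolute."""
    xmin, ymin, xmax, ymax = box
    return (xmin, ymin, xmax - xmin, ymax - ymin)

def coco_to_yolo(box, img_w, img_h):
    """(x, y, w, h) absolute -> (cx, cy, w, h) normalized to [0, 1]."""
    x, y, w, h = box
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)

# voc_to_coco((100, 50, 300, 250)) → (100, 50, 200, 200)
# coco_to_yolo((100, 50, 200, 200), 640, 480) → (0.3125, 0.3125, 0.3125, 0.41666...)
```

The remaining work in a real converter is I/O: parsing the XML/JSON/TXT container and remapping class names to integer ids.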

Step 4: Apply Augmentations

```bash
# Generate augmentation config
python scripts/dataset_pipeline_builder.py data/coco/ \
    --augment \
    --aug-config configs/augmentation.yaml \
    --output data/augmented/
```

Recommended augmentations for detection:

```yaml
# configs/augmentation.yaml
augmentations:
  geometric:
    - horizontal_flip: { p: 0.5 }
    - vertical_flip: { p: 0.1 }   # Only if orientation invariant
    - rotate: { limit: 15, p: 0.3 }
    - scale: { scale_limit: 0.2, p: 0.5 }
  color:
    - brightness_contrast: { brightness_limit: 0.2, contrast_limit: 0.2, p: 0.5 }
    - hue_saturation: { hue_shift_limit: 20, sat_shift_limit: 30, p: 0.3 }
    - blur: { blur_limit: 3, p: 0.1 }
  advanced:
    - mosaic: { p: 0.5 }          # YOLO-style mosaic
    - mixup: { p: 0.1 }           # Image mixing
    - cutout: { num_holes: 8, max_h_size: 32, max_w_size: 32, p: 0.3 }
```
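For detection, geometric augmentations must transform the boxes along with the pixels; libraries such as albumentations handle this when configured with bbox parameters. As a minimal illustration of the principle, a horizontal flip of a normalized (cx, cy, w, h) box only mirrors the x-center:

```python
def hflip_bbox(box):
    """Horizontally flip a normalized (cx, cy, w, h) box: only cx mirrors."""
    cx, cy, w, h = box
    return (1.0 - cx, cy, w, h)

# hflip_bbox((0.25, 0.5, 0.2, 0.4)) → (0.75, 0.5, 0.2, 0.4)
```

Rotation and scaling are less forgiving: axis-aligned boxes of rotated objects grow, and boxes pushed off-image must be clipped or dropped, which is why a tested augmentation library beats hand-rolled transforms here.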

Step 5: Create Train/Val/Test Splits

步骤5:创建训练/验证/测试集划分

bash
python scripts/dataset_pipeline_builder.py data/augmented/ \
    --split 0.8 0.1 0.1 \
    --stratify \
    --seed 42 \
    --output data/final/
Split strategy guidelines:
Dataset SizeTrainValTest
<1,000 images70%15%15%
1,000-10,00080%10%10%
>10,00090%5%5%
bash
python scripts/dataset_pipeline_builder.py data/augmented/ \
    --split 0.8 0.1 0.1 \
    --stratify \
    --seed 42 \
    --output data/final/
划分策略指南:
数据集规模训练集验证集测试集
<1,000张图像70%15%15%
1,000-10,000张图像80%10%10%
>10,000张图像90%5%5%
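A seeded split like the command above can be sketched with the stdlib. This plain shuffle omits the stratification that `--stratify` adds (splitting within each class so class balance is preserved across all three sets):

```python
import random

def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=42):
    """Deterministically shuffle items and cut into train/val/test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed -> reproducible split
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train, val, test = split_dataset(range(1000))
# len(train), len(val), len(test) → 800, 100, 100
```

The fixed seed matters: re-running the pipeline must not leak yesterday's test images into today's training set.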

Step 6: Generate Dataset Configuration

```bash
# For Ultralytics YOLO
python scripts/dataset_pipeline_builder.py data/final/ \
    --generate-config yolo \
    --output data.yaml

# For Detectron2
python scripts/dataset_pipeline_builder.py data/final/ \
    --generate-config detectron2 \
    --output detectron2_config.py
```
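For YOLO, the generated config is a small `data.yaml`. A sketch of the fields Ultralytics expects (the paths and class names here are placeholders, and the script's actual output may differ):

```python
def yolo_data_yaml(root, class_names):
    """Render a minimal Ultralytics data.yaml for a detection dataset."""
    names = "\n".join(f"  {i}: {name}" for i, name in enumerate(class_names))
    return (
        f"path: {root}\n"
        "train: images/train\n"
        "val: images/val\n"
        "test: images/test\n"
        f"names:\n{names}\n"
    )

config = yolo_data_yaml("data/final", ["car", "person", "bicycle"])
```

`train`/`val`/`test` are resolved relative to `path`, and the `names` mapping ties the integer class ids in the label files back to human-readable names.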

Architecture Selection Guide

Object Detection Architectures

| Architecture | Speed | Accuracy | Best For |
|--------------|-------|----------|----------|
| YOLOv8n | 1.2ms | 37.3 mAP | Edge, mobile, real-time |
| YOLOv8s | 2.1ms | 44.9 mAP | Balanced speed/accuracy |
| YOLOv8m | 4.2ms | 50.2 mAP | General purpose |
| YOLOv8l | 6.8ms | 52.9 mAP | High accuracy |
| YOLOv8x | 10.1ms | 53.9 mAP | Maximum accuracy |
| RT-DETR-L | 5.3ms | 53.0 mAP | Transformer, no NMS |
| Faster R-CNN R50 | 46ms | 40.2 mAP | Two-stage, high quality |
| DINO-4scale | 85ms | 49.0 mAP | SOTA transformer |

Segmentation Architectures

| Architecture | Type | Speed | Best For |
|--------------|------|-------|----------|
| YOLOv8-seg | Instance | 4.5ms | Real-time instance seg |
| Mask R-CNN | Instance | 67ms | High-quality masks |
| SAM | Promptable | 50ms | Zero-shot segmentation |
| DeepLabV3+ | Semantic | 25ms | Scene parsing |
| SegFormer | Semantic | 15ms | Efficient semantic seg |

CNN vs Vision Transformer Trade-offs

| Aspect | CNN (YOLO, R-CNN) | ViT (DETR, DINO) |
|--------|-------------------|------------------|
| Training data needed | 1K-10K images | 10K-100K+ images |
| Training time | Fast | Slow (needs more epochs) |
| Inference speed | Faster | Slower |
| Small objects | Good with FPN | Needs multi-scale |
| Global context | Limited | Excellent |
| Positional encoding | Implicit | Explicit |
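The "needs multi-scale" entry for ViTs stems from the fixed patch grid: token count, and with it the quadratic attention cost, grows with input resolution. A quick computation of sequence length for a plain ViT:

```python
def vit_sequence_length(img_h, img_w, patch=16, cls_token=True):
    """Number of tokens a plain ViT processes for one image."""
    assert img_h % patch == 0 and img_w % patch == 0, "input must tile evenly"
    return (img_h // patch) * (img_w // patch) + (1 if cls_token else 0)

# vit_sequence_length(224, 224) → 197  (14x14 patches + [CLS])
# vit_sequence_length(640, 640) → 1601 (attention cost scales with tokens²)
```

Going from 224² to 640² multiplies the patch count by ~8x and the attention FLOPs by ~66x, which is why hierarchical variants (Swin) or multi-scale necks are used for detection-sized inputs.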

Reference Documentation

1. Computer Vision Architectures

See references/computer_vision_architectures.md for:
  • CNN backbone architectures (ResNet, EfficientNet, ConvNeXt)
  • Vision Transformer variants (ViT, DeiT, Swin)
  • Detection heads (anchor-based vs anchor-free)
  • Feature Pyramid Networks (FPN, BiFPN, PANet)
  • Neck architectures for multi-scale detection

2. Object Detection Optimization

See references/object_detection_optimization.md for:
  • Non-Maximum Suppression variants (NMS, Soft-NMS, DIoU-NMS)
  • Anchor optimization and anchor-free alternatives
  • Loss function design (focal loss, GIoU, CIoU, DIoU)
  • Training strategies (warmup, cosine annealing, EMA)
  • Data augmentation for detection (mosaic, mixup, copy-paste)

3. Production Vision Systems

See references/production_vision_systems.md for:
  • ONNX export and optimization
  • TensorRT deployment pipeline
  • Batch inference optimization
  • Edge device deployment (Jetson, Intel NCS)
  • Model serving with Triton
  • Video processing pipelines

Common Commands

Ultralytics YOLO

```bash
# Training
yolo detect train data=coco.yaml model=yolov8m.pt epochs=100 imgsz=640

# Validation
yolo detect val model=best.pt data=coco.yaml

# Inference
yolo detect predict model=best.pt source=images/ save=True

# Export
yolo export model=best.pt format=onnx simplify=True dynamic=True
```

Detectron2

```bash
# Training
python train_net.py --config-file configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml \
    --num-gpus 1 OUTPUT_DIR ./output

# Evaluation
python train_net.py --config-file configs/faster_rcnn.yaml --eval-only \
    MODEL.WEIGHTS output/model_final.pth

# Inference
python demo.py --config-file configs/faster_rcnn.yaml \
    --input images/*.jpg --output results/ \
    --opts MODEL.WEIGHTS output/model_final.pth
```

MMDetection

```bash
# Training
python tools/train.py configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py

# Testing
python tools/test.py configs/faster_rcnn.py checkpoints/latest.pth --eval bbox

# Inference
python demo/image_demo.py demo.jpg configs/faster_rcnn.py checkpoints/latest.pth
```

Model Optimization

```bash
# ONNX export, then simplify
python -c "import torch; model = torch.load('model.pt'); torch.onnx.export(model, torch.randn(1,3,640,640), 'model.onnx', opset_version=17)"
python -m onnxsim model.onnx model_sim.onnx

# TensorRT conversion
trtexec --onnx=model.onnx --saveEngine=model.engine --fp16 --workspace=4096

# Benchmark
trtexec --loadEngine=model.engine --batch=1 --iterations=1000 --avgRuns=100
```

Performance Targets

性能目标

MetricReal-timeHigh AccuracyEdge
FPS>30>10>15
mAP@50>0.6>0.8>0.5
Latency P99<50ms<150ms<100ms
GPU Memory<4GB<8GB<2GB
Model Size<50MB<200MB<20MB
指标实时场景高精度场景边缘场景
FPS>30>10>15
mAP@50>0.6>0.8>0.5
P99延迟<50ms<150ms<100ms
GPU内存<4GB<8GB<2GB
模型大小<50MB<200MB<20MB

Resources

  • Architecture Guide: references/computer_vision_architectures.md
  • Optimization Guide: references/object_detection_optimization.md
  • Deployment Guide: references/production_vision_systems.md
  • Scripts: scripts/ directory for automation tools