# fiftyone-dataset-inference

Run Model Inference on FiftyOne Datasets

## Key Directives

ALWAYS follow these rules:

### 1. Check if dataset exists first

```python
list_datasets()
```

If the dataset doesn't exist, use the fiftyone-dataset-import skill to load it first.
### 2. Set context before operations

```python
set_context(dataset_name="my-dataset")
```

### 3. Launch App for inference

The App must be running to execute inference operators:

```python
launch_app(dataset_name="my-dataset")
```

### 4. Ask user for field names
Always confirm with the user:
- Which model to use
- Label field name for predictions (e.g., `predictions`, `detections`, `embeddings`)

### 5. Close app when done

```python
close_app()
```

## Workflow
### Step 1: Verify Dataset Exists

```python
list_datasets()
```

If the dataset is not in the list:
- Ask the user for the data location
- Use the fiftyone-dataset-import skill to import the data first
- Return to this workflow after import completes
### Step 2: Load Dataset and Review

```python
set_context(dataset_name="my-dataset")
dataset_summary(name="my-dataset")
```

Review:
- Sample count
- Media type
- Existing label fields
### Step 3: Launch App

```python
launch_app(dataset_name="my-dataset")
```

### Step 4: Apply Model Inference

Ask user for:
- Model name (see Available Zoo Models below)
- Label field for predictions
```python
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "yolov8n-coco-torch",
        "label_field": "predictions"
    }
)
```

### Step 5: View Results

```python
set_view(exists=["predictions"])
```

### Step 6: Clean Up

```python
close_app()
```

## Available Zoo Models
Some models require additional packages. If a model fails with a dependency error, the response includes `install_command`. Offer to run it for the user.

### Detection Models

| Model | Extra Deps |
|---|---|
| Faster R-CNN | None |
| RetinaNet | None |
| YOLOv8 nano (fast) | ultralytics |
| YOLOv8 small | ultralytics |
| YOLOv8 medium | ultralytics |
| YOLOv8 large | ultralytics |
| YOLOv8 extra-large | ultralytics |

### Classification Models

| Model | Extra Deps |
|---|---|
| ResNet-50 | None |
| MobileNet v2 | None |
| Vision Transformer | None |

### Segmentation Models

| Model | Extra Deps |
|---|---|
| Segment Anything (base) | segment-anything |
| Segment Anything (large) | segment-anything |
| Segment Anything (huge) | segment-anything |
| DeepLabV3 | None |

### Embedding Models

| Model | Extra Deps |
|---|---|
| CLIP embeddings | open-clip-torch |
| DINOv2 small | None |
| DINOv2 base | None |
| DINOv2 large | None |
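The extra-dependency column above can be checked before a model is launched. The sketch below is a hypothetical helper (not part of the MCP server) that maps each pip package name to its importable module name, which differs for `segment-anything` and `open-clip-torch`, and reports which packages are missing:

```python
import importlib.util

# Pip package name -> importable module name (these differ for some packages)
DEP_MODULES = {
    "ultralytics": "ultralytics",
    "segment-anything": "segment_anything",
    "open-clip-torch": "open_clip",
}

def missing_deps(required):
    """Return the pip packages in `required` whose modules cannot be found."""
    return [
        pkg for pkg in required
        if importlib.util.find_spec(DEP_MODULES.get(pkg, pkg)) is None
    ]

def install_command(packages):
    """Build the pip command to suggest to the user."""
    return "pip install " + " ".join(packages)
```

In practice the MCP server already returns `install_command` on a dependency failure; this is only a local pre-flight check.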
## Common Use Cases

### Use Case 1: Run Object Detection

```python
# Verify dataset exists
list_datasets()

# Set context and launch
set_context(dataset_name="my-dataset")
launch_app(dataset_name="my-dataset")

# Apply detection model
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "faster-rcnn-resnet50-fpn-coco-torch",
        "label_field": "predictions"
    }
)

# View results
set_view(exists=["predictions"])
```
### Use Case 2: Run Classification

```python
set_context(dataset_name="my-dataset")
launch_app(dataset_name="my-dataset")
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "resnet50-imagenet-torch",
        "label_field": "classification"
    }
)
set_view(exists=["classification"])
```

### Use Case 3: Generate Embeddings
```python
set_context(dataset_name="my-dataset")
launch_app(dataset_name="my-dataset")
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "clip-vit-base32-torch",
        "label_field": "clip_embeddings"
    }
)
```

### Use Case 4: Compare Ground Truth with Predictions
If dataset has existing labels:

```python
set_context(dataset_name="my-dataset")
dataset_summary(name="my-dataset")  # Check existing fields
launch_app(dataset_name="my-dataset")

# Run inference with a different field name
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "yolov8m-coco-torch",
        "label_field": "predictions"  # Different from ground_truth
    }
)

# View both fields to compare
set_view(exists=["ground_truth", "predictions"])
```

### Use Case 5: Run Multiple Models
```python
set_context(dataset_name="my-dataset")
launch_app(dataset_name="my-dataset")

# Run detection
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "yolov8n-coco-torch",
        "label_field": "detections"
    }
)

# Run classification
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "resnet50-imagenet-torch",
        "label_field": "classification"
    }
)

# Run embeddings
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "clip-vit-base32-torch",
        "label_field": "embeddings"
    }
)
```
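When several models run back to back like this, the operator calls differ only in `model` and `label_field`. A small driver (the names `RUNS` and `zoo_model_params` are my own, not MCP tools) can build the payloads in a loop:

```python
# (model name, label field) pairs, taken from the calls above
RUNS = [
    ("yolov8n-coco-torch", "detections"),
    ("resnet50-imagenet-torch", "classification"),
    ("clip-vit-base32-torch", "embeddings"),
]

def zoo_model_params(model, label_field):
    """Build the params dict expected by @voxel51/zoo/apply_zoo_model."""
    return {"tab": "BUILTIN", "model": model, "label_field": label_field}

# Each payload would then be passed to execute_operator(...) in turn
payloads = [zoo_model_params(m, f) for m, f in RUNS]
```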
## Troubleshooting

### Error: "Dataset not found"
- Use `list_datasets()` to see available datasets
- Use the fiftyone-dataset-import skill to import data first

### Error: "Model not found"
- Check the model name spelling
- Use `get_operator_schema("@voxel51/zoo/apply_zoo_model")` to see available models

### Error: "Missing dependency" (e.g., ultralytics, segment-anything)
- The MCP server detects missing dependencies
- The response includes `missing_package` and `install_command`
- Install the required package: `pip install <package>`
- Restart the MCP server after installing

### Inference is slow
- Use a smaller model variant (e.g., `yolov8n` instead of `yolov8x`)
- Use delegated execution for large datasets
- Consider filtering to a view first

### Out of memory
- Reduce the batch size
- Use a smaller model variant
- Process the dataset in chunks using views
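Processing in chunks amounts to walking the dataset with skip/limit views. A minimal sketch of the arithmetic, assuming the caller applies each (skip, limit) pair to a view itself:

```python
def chunk_ranges(total, chunk_size):
    """Yield (skip, limit) pairs that cover `total` samples in order."""
    for start in range(0, total, chunk_size):
        yield start, min(chunk_size, total - start)

# e.g. a 1000-sample dataset in chunks of 300:
# (0, 300), (300, 300), (600, 300), (900, 100)
```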
## Best Practices

- **Use descriptive field names** - `predictions`, `yolo_detections`, `clip_embeddings`
- **Don't overwrite ground truth** - use different field names for predictions
- **Start with fast models** - use nano/small variants first, upgrade if needed
- **Check existing fields** - run `dataset_summary()` before running inference
- **Filter first for testing** - test on a small view before processing the full dataset
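To enforce the "don't overwrite ground truth" rule, a hypothetical helper (not an MCP tool) can pick a non-colliding field name from the field names reported by `dataset_summary()`:

```python
def safe_label_field(existing_fields, proposed):
    """Return `proposed` if unused, else the first free `proposed_<n>`."""
    if proposed not in existing_fields:
        return proposed
    n = 2
    while f"{proposed}_{n}" in existing_fields:
        n += 1
    return f"{proposed}_{n}"
```

For example, if the dataset already has `ground_truth` and `predictions` fields, the helper would suggest `predictions_2` instead of silently overwriting.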