fiftyone-dataset-inference

Run Model Inference on FiftyOne Datasets

Key Directives

ALWAYS follow these rules:

1. Check if dataset exists first

python
list_datasets()
If the dataset doesn't exist, use the fiftyone-dataset-import skill to load it first.

2. Set context before operations

python
set_context(dataset_name="my-dataset")

3. Launch App for inference

The App must be running to execute inference operators:
python
launch_app(dataset_name="my-dataset")

4. Ask user for field names

Always confirm with the user:
  • Which model to use
  • Label field name for predictions (e.g., predictions, detections, embeddings)

5. Close app when done

python
close_app()

Workflow

Step 1: Verify Dataset Exists

python
list_datasets()
If the dataset is not in the list:
  • Ask the user for the data location
  • Use the fiftyone-dataset-import skill to import the data first
  • Return to this workflow after import completes

Step 2: Load Dataset and Review

python
set_context(dataset_name="my-dataset")
dataset_summary(name="my-dataset")
Review:
  • Sample count
  • Media type
  • Existing label fields

Step 3: Launch App

python
launch_app(dataset_name="my-dataset")

Step 4: Apply Model Inference

Ask user for:
  • Model name (see Available Zoo Models below)
  • Label field for predictions
python
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "yolov8n-coco-torch",
        "label_field": "predictions"
    }
)
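Every inference call in this workflow passes the same three keys to `@voxel51/zoo/apply_zoo_model`. As a minimal sketch, the params can be built by a small helper so the model and label field confirmed with the user are never left empty. The helper name `zoo_model_params` is hypothetical and not part of the skill or of FiftyOne; only the dictionary keys come from the call above.

```python
def zoo_model_params(model: str, label_field: str) -> dict:
    """Build the params dict passed to @voxel51/zoo/apply_zoo_model.

    Hypothetical helper: only the keys ("tab", "model", "label_field")
    are taken from the workflow above.
    """
    if not model or not label_field:
        raise ValueError("model and label_field must both be confirmed with the user")
    return {"tab": "BUILTIN", "model": model, "label_field": label_field}


# Same values as the Step 4 example
params = zoo_model_params("yolov8n-coco-torch", "predictions")
```

The resulting `params` value can then be passed as the `params` argument of `execute_operator`.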

Step 5: View Results

python
set_view(exists=["predictions"])

Step 6: Clean Up

python
close_app()

Available Zoo Models

Some models require additional packages. If a model fails with a dependency error, the response includes install_command. Offer to run it for the user.
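The dependency-error handling described above could be sketched as follows. The field names `install_command` and `missing_package` are the ones this document mentions; treating the response as a plain dictionary, and the function name `dependency_prompt`, are assumptions made only for illustration.

```python
from typing import Optional


def dependency_prompt(response: dict) -> Optional[str]:
    """Return a question to put to the user, or None if nothing needs installing.

    Assumes the operator response is a dict; the real response shape
    is not specified in this document.
    """
    install_command = response.get("install_command")
    if not install_command:
        return None
    package = response.get("missing_package", "a required package")
    return f"The model needs {package}. Shall I run `{install_command}` for you?"


# Illustrative response, not captured from a real run
example = {"missing_package": "ultralytics", "install_command": "pip install ultralytics"}
prompt = dependency_prompt(example)
```

Returning a prompt rather than running the command keeps the user in control of what gets installed.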

Detection Models

| Model | Description | Extra Deps |
| --- | --- | --- |
| faster-rcnn-resnet50-fpn-coco-torch | Faster R-CNN | None |
| retinanet-resnet50-fpn-coco-torch | RetinaNet | None |
| yolov8n-coco-torch | YOLOv8 nano (fast) | ultralytics |
| yolov8s-coco-torch | YOLOv8 small | ultralytics |
| yolov8m-coco-torch | YOLOv8 medium | ultralytics |
| yolov8l-coco-torch | YOLOv8 large | ultralytics |
| yolov8x-coco-torch | YOLOv8 extra-large | ultralytics |

Classification Models

| Model | Description | Extra Deps |
| --- | --- | --- |
| resnet50-imagenet-torch | ResNet-50 | None |
| mobilenet-v2-imagenet-torch | MobileNet v2 | None |
| vit-base-patch16-224-imagenet-torch | Vision Transformer | None |

Segmentation Models

| Model | Description | Extra Deps |
| --- | --- | --- |
| sam-vit-base-torch | Segment Anything (base) | segment-anything |
| sam-vit-large-torch | Segment Anything (large) | segment-anything |
| sam-vit-huge-torch | Segment Anything (huge) | segment-anything |
| deeplabv3-resnet101-coco-torch | DeepLabV3 | None |

Embedding Models

| Model | Description | Extra Deps |
| --- | --- | --- |
| clip-vit-base32-torch | CLIP embeddings | open-clip-torch |
| dinov2-vits14-torch | DINOv2 small | None |
| dinov2-vitb14-torch | DINOv2 base | None |
| dinov2-vitl14-torch | DINOv2 large | None |
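The dependency columns of the four tables can be collected into one lookup, which makes it possible to warn the user before a model fails. This is a sketch only: the mapping copies the tables in this document and may not match the current model zoo, and the names `MODEL_EXTRA_DEPS` and `extra_deps` are hypothetical.

```python
from typing import Optional

# Model name -> extra package, copied from the tables in this document.
# None means the tables list no extra dependency.
MODEL_EXTRA_DEPS = {
    "faster-rcnn-resnet50-fpn-coco-torch": None,
    "retinanet-resnet50-fpn-coco-torch": None,
    "yolov8n-coco-torch": "ultralytics",
    "yolov8s-coco-torch": "ultralytics",
    "yolov8m-coco-torch": "ultralytics",
    "yolov8l-coco-torch": "ultralytics",
    "yolov8x-coco-torch": "ultralytics",
    "resnet50-imagenet-torch": None,
    "mobilenet-v2-imagenet-torch": None,
    "vit-base-patch16-224-imagenet-torch": None,
    "sam-vit-base-torch": "segment-anything",
    "sam-vit-large-torch": "segment-anything",
    "sam-vit-huge-torch": "segment-anything",
    "deeplabv3-resnet101-coco-torch": None,
    "clip-vit-base32-torch": "open-clip-torch",
    "dinov2-vits14-torch": None,
    "dinov2-vitb14-torch": None,
    "dinov2-vitl14-torch": None,
}


def extra_deps(model: str) -> Optional[str]:
    """Return the extra package a listed model needs, or None if it needs none."""
    if model not in MODEL_EXTRA_DEPS:
        raise KeyError(f"{model!r} is not in the tables above; check the spelling")
    return MODEL_EXTRA_DEPS[model]
```

A lookup that raises on unknown names also catches misspelled model names before the operator is called.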

Common Use Cases

Use Case 1: Run Object Detection

python
# Verify dataset exists
list_datasets()

# Set context and launch
set_context(dataset_name="my-dataset")
launch_app(dataset_name="my-dataset")

# Apply detection model
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "faster-rcnn-resnet50-fpn-coco-torch",
        "label_field": "predictions"
    }
)

# View results
set_view(exists=["predictions"])

Use Case 2: Run Classification

python
set_context(dataset_name="my-dataset")
launch_app(dataset_name="my-dataset")

execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "resnet50-imagenet-torch",
        "label_field": "classification"
    }
)

set_view(exists=["classification"])

Use Case 3: Generate Embeddings

python
set_context(dataset_name="my-dataset")
launch_app(dataset_name="my-dataset")

execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "clip-vit-base32-torch",
        "label_field": "clip_embeddings"
    }
)

Use Case 4: Compare Ground Truth with Predictions

If the dataset already has labels:
python
set_context(dataset_name="my-dataset")
dataset_summary(name="my-dataset")  # Check existing fields

launch_app(dataset_name="my-dataset")

# Run inference with a different field name
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "yolov8m-coco-torch",
        "label_field": "predictions"  # Different from ground_truth
    }
)

# View both fields to compare
set_view(exists=["ground_truth", "predictions"])

Use Case 5: Run Multiple Models

python
set_context(dataset_name="my-dataset")
launch_app(dataset_name="my-dataset")

# Run detection
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "yolov8n-coco-torch",
        "label_field": "detections"
    }
)

# Run classification
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "resnet50-imagenet-torch",
        "label_field": "classification"
    }
)

# Run embeddings
execute_operator(
    operator_uri="@voxel51/zoo/apply_zoo_model",
    params={
        "tab": "BUILTIN",
        "model": "clip-vit-base32-torch",
        "label_field": "embeddings"
    }
)
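When several models run against one dataset, each should write to its own field. The sketch below, under the assumption that the calls are prepared as plain dictionaries before being issued, builds the three calls from this use case and rejects a plan in which two models share a label field. `build_plan` is a hypothetical name, not an MCP tool.

```python
OPERATOR_URI = "@voxel51/zoo/apply_zoo_model"


def build_plan(runs):
    """Turn (model, label_field) pairs into execute_operator keyword arguments.

    Raises ValueError if two runs would write to the same label field.
    """
    seen = set()
    plan = []
    for model, label_field in runs:
        if label_field in seen:
            raise ValueError(f"label field {label_field!r} is used by more than one model")
        seen.add(label_field)
        plan.append({
            "operator_uri": OPERATOR_URI,
            "params": {"tab": "BUILTIN", "model": model, "label_field": label_field},
        })
    return plan


# The three runs from this use case
plan = build_plan([
    ("yolov8n-coco-torch", "detections"),
    ("resnet50-imagenet-torch", "classification"),
    ("clip-vit-base32-torch", "embeddings"),
])
```

Each entry of `plan` holds the keyword arguments for one `execute_operator` call, in the order the models should run.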

Troubleshooting

Error: "Dataset not found"
  • Use list_datasets() to see available datasets
  • Use the fiftyone-dataset-import skill to import data first
Error: "Model not found"
  • Check model name spelling
  • Use get_operator_schema("@voxel51/zoo/apply_zoo_model") to see available models
Error: "Missing dependency" (e.g., ultralytics, segment-anything)
  • The MCP server detects missing dependencies
  • Response includes missing_package and install_command
  • Install the required package: pip install <package>
  • Restart MCP server after installing
Inference is slow
  • Use smaller model variant (e.g., yolov8n instead of yolov8x)
  • Use delegated execution for large datasets
  • Consider filtering to a view first
Out of memory
  • Reduce batch size
  • Use smaller model variant
  • Process dataset in chunks using views
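Processing in chunks amounts to walking the dataset in fixed-size windows. A minimal sketch of the window arithmetic follows; `chunk_ranges` is a hypothetical helper. In the FiftyOne Python SDK each pair would correspond to a view such as `dataset.skip(skip).limit(limit)`; how the MCP tools in this skill expose an equivalent view is not stated here, so that part is left as an assumption.

```python
from typing import Iterator, Tuple


def chunk_ranges(total: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (skip, limit) pairs that cover `total` samples in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for skip in range(0, total, chunk_size):
        # The final chunk may be shorter than chunk_size
        yield skip, min(chunk_size, total - skip)


# 250 samples in chunks of 100
ranges = list(chunk_ranges(total=250, chunk_size=100))
# ranges == [(0, 100), (100, 100), (200, 50)]
```

A smaller `chunk_size` lowers peak memory at the cost of more operator calls.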

Best Practices

  1. Use descriptive field names - predictions, yolo_detections, clip_embeddings
  2. Don't overwrite ground truth - Use different field names for predictions
  3. Start with fast models - Use nano/small variants first, upgrade if needed
  4. Check existing fields - Use dataset_summary() before running inference
  5. Filter first for testing - Test on a small view before processing full dataset
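Practices 2 and 4 combine into one check: compare the proposed label field with the fields the dataset already has, and pick another name if it would collide. A minimal sketch, assuming the existing field names have already been read from the `dataset_summary()` output into a set; `safe_label_field` is a hypothetical helper.

```python
def safe_label_field(desired: str, existing_fields: set) -> str:
    """Return `desired` if it is unused, otherwise the first free numbered variant."""
    if desired not in existing_fields:
        return desired
    suffix = 2
    while f"{desired}_{suffix}" in existing_fields:
        suffix += 1
    return f"{desired}_{suffix}"


# The dataset already has ground truth and one earlier prediction run
field = safe_label_field("predictions", {"ground_truth", "predictions"})
# field == "predictions_2"
```

The suggested name is still worth confirming with the user, since a descriptive name such as yolo_detections may serve better than a numbered one.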

Resources
