Experiment Tracking

Track ML experiments, metrics, and models.

Comparison

Platform	Best For	Self-hosted	Visualization
MLflow	Open-source, model registry	Yes	Basic
W&B	Collaboration, sweeps	Limited	Excellent
Neptune	Team collaboration	No	Good
ClearML	Full MLOps	Yes	Good

MLflow

Open-source platform from Databricks.

Core components:

Tracking: Log parameters, metrics, artifacts
Projects: Reproducible runs (MLproject file)
Models: Package and deploy models
Registry: Model versioning and staging

Strengths: Self-hosted, open-source, model registry, framework integrations
Limitations: Basic visualization, fewer collaboration features

Key concept: Autologging. For major frameworks, a single call such as mlflow.autolog() captures metrics automatically.
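
The Tracking component and the autologging note above can be sketched roughly as follows. This is a minimal illustration, not a recommended setup: the experiment name "tracking-demo" and the helper names simulated_losses and log_run_with_mlflow are hypothetical, and the training loop is a stand-in that only produces numbers to log.

```python
import math


def simulated_losses(learning_rate: float, epochs: int) -> list[float]:
    """Stand-in for a real training loop: returns one loss value per epoch."""
    return [math.exp(-learning_rate * 10 * epoch) for epoch in range(1, epochs + 1)]


def log_run_with_mlflow(params: dict, losses: list[float]) -> str:
    """Log one run manually (parameters plus per-epoch loss) and return its run ID."""
    import mlflow  # imported lazily so simulated_losses works without MLflow installed

    mlflow.set_experiment("tracking-demo")  # hypothetical experiment name
    with mlflow.start_run() as run:
        mlflow.log_params(params)
        for epoch, loss in enumerate(losses, start=1):
            mlflow.log_metric("loss", loss, step=epoch)
        return run.info.run_id


# Usage (requires `pip install mlflow`; logs to a local store unless a
# tracking URI is configured):
#   params = {"learning_rate": 0.05, "epochs": 5}
#   run_id = log_run_with_mlflow(params, simulated_losses(**params))
#
# With autologging, one call before training a supported framework's model
# stands in for the manual log_* calls above:
#   mlflow.autolog()
```

Manual logging is shown because it makes explicit what autologging would otherwise record on its own.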

Weights & Biases (W&B)

Cloud-first experiment tracking with excellent visualization.

Core features:

Experiment tracking: Metrics, hyperparameters, system stats
Sweeps: Hyperparameter search (grid, random, Bayesian)
Artifacts: Dataset and model versioning
Reports: Shareable documentation
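
To make the Sweeps entry concrete, here is a small standard-library sketch of the grid and random strategies. It does not use the W&B API and it omits Bayesian search; the toy objective function and the search space values are assumptions chosen only so the result is easy to check by hand.

```python
import itertools
import random


def objective(learning_rate: float, batch_size: int) -> float:
    """Toy validation loss, smallest at learning_rate=0.01 and batch_size=64."""
    return (learning_rate - 0.01) ** 2 * 1e4 + abs(batch_size - 64) / 64


def grid_search(space: dict) -> list[dict]:
    """Grid strategy: every combination of the listed values."""
    keys = list(space)
    return [dict(zip(keys, combo)) for combo in itertools.product(*space.values())]


def random_search(space: dict, trials: int, seed: int = 0) -> list[dict]:
    """Random strategy: sample each hyperparameter independently per trial."""
    rng = random.Random(seed)
    return [{key: rng.choice(values) for key, values in space.items()}
            for _ in range(trials)]


def best(configs: list[dict]) -> dict:
    """Return the configuration with the lowest objective value."""
    return min(configs, key=lambda config: objective(**config))


space = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}
grid_best = best(grid_search(space))                # evaluates all 9 combinations
random_best = best(random_search(space, trials=4))  # evaluates only 4 samples
print("grid:", grid_best)
print("random:", random_best)
```

Grid search is exhaustive but grows multiplicatively with each added hyperparameter, while random search trades completeness for a fixed trial budget.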

Strengths: Beautiful visualizations, team collaboration, hyperparameter sweeps
Limitations: Cloud-dependent, limited self-hosting

Key concept: Two calls cover the basic workflow: wandb.init() starts a run and wandb.log() records metrics to it.
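
A minimal sketch of the wandb.init() and wandb.log() pattern is shown here. The project name "tracking-demo" and the helpers metrics_per_epoch and log_run_with_wandb are hypothetical; offline mode is used as an assumption so that the example does not need an account or network access.

```python
def metrics_per_epoch(losses: list[float]) -> list[dict]:
    """Build the dictionaries that would be passed to wandb.log, one per epoch."""
    return [{"epoch": epoch, "loss": loss}
            for epoch, loss in enumerate(losses, start=1)]


def log_run_with_wandb(config: dict, losses: list[float]) -> None:
    """Start a run, log one metrics dictionary per epoch, then close the run."""
    import wandb  # imported lazily so metrics_per_epoch works without wandb installed

    run = wandb.init(
        project="tracking-demo",  # hypothetical project name
        config=config,            # hyperparameters stored with the run
        mode="offline",           # assumption: keep data local, no login needed
    )
    for metrics in metrics_per_epoch(losses):
        wandb.log(metrics)
    run.finish()


# Usage (requires `pip install wandb`):
#   log_run_with_wandb({"learning_rate": 0.01, "batch_size": 32}, [0.9, 0.6, 0.4])
```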

What to Track

Category	Examples
Hyperparameters	Learning rate, batch size, architecture
Metrics	Loss, accuracy, F1, per-epoch values
Artifacts	Model checkpoints, configs, datasets
System	GPU usage, memory, runtime
Code	Git commit, diff, requirements
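
One way to picture the categories in the table above is as fields of a single run record. The sketch below uses only the standard library; the RunRecord class, its field names, and the checkpoint path are hypothetical. It does not collect GPU usage or memory, which a real tracker would sample during training.

```python
import json
import platform
import subprocess
import sys
from dataclasses import asdict, dataclass, field


def git_commit() -> str:
    """Return the current Git commit hash, or "unknown" outside a repository."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


@dataclass
class RunRecord:
    hyperparameters: dict
    metrics: dict = field(default_factory=dict)    # name -> list of per-epoch values
    artifacts: list = field(default_factory=list)  # paths to checkpoints, configs, data
    system: dict = field(default_factory=lambda: {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    })
    code: dict = field(default_factory=lambda: {"git_commit": git_commit()})

    def log_metric(self, name: str, value: float) -> None:
        """Append one value to the named metric's per-epoch history."""
        self.metrics.setdefault(name, []).append(value)


record = RunRecord(hyperparameters={"learning_rate": 0.01, "batch_size": 32})
for loss in (0.9, 0.6, 0.4):
    record.log_metric("loss", loss)
record.artifacts.append("checkpoints/epoch_3.pt")  # hypothetical path
print(json.dumps(asdict(record), indent=2))
```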

Model Registry Concepts

Stage	Purpose
None	Just logged, not registered
Staging	Testing, validation
Production	Serving live traffic
Archived	Deprecated, kept for reference
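
The stage lifecycle in the table above can be sketched as a tiny in-memory registry. This is an illustration of the concept only, not any platform's API: the ModelRegistry class and its methods are hypothetical, and the rule that promoting a version to Production archives the previous Production version is an assumption about one common policy.

```python
from enum import Enum


class Stage(Enum):
    NONE = "None"
    STAGING = "Staging"
    PRODUCTION = "Production"
    ARCHIVED = "Archived"


class ModelRegistry:
    """In-memory stand-in that tracks which stage each model version is in."""

    def __init__(self) -> None:
        self.versions: dict[int, Stage] = {}

    def add_version(self) -> int:
        """Record a new model version; it starts in the None stage."""
        version = len(self.versions) + 1
        self.versions[version] = Stage.NONE
        return version

    def transition(self, version: int, stage: Stage, archive_existing: bool = True) -> None:
        """Move a version to a stage, optionally archiving the current Production one."""
        if version not in self.versions:
            raise KeyError(f"unknown model version: {version}")
        if archive_existing and stage is Stage.PRODUCTION:
            for other, current in self.versions.items():
                if current is Stage.PRODUCTION and other != version:
                    self.versions[other] = Stage.ARCHIVED
        self.versions[version] = stage


registry = ModelRegistry()
v1 = registry.add_version()
v2 = registry.add_version()
registry.transition(v1, Stage.STAGING)     # v1 under test
registry.transition(v1, Stage.PRODUCTION)  # v1 serves live traffic
registry.transition(v2, Stage.STAGING)     # v2 under test
registry.transition(v2, Stage.PRODUCTION)  # v2 replaces v1, which is archived
print({version: stage.value for version, stage in registry.versions.items()})
```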

Decision Guide

Scenario	Recommendation
Self-hosted requirement	MLflow
Team collaboration	W&B
Model registry focus	MLflow
Hyperparameter sweeps	W&B
Beautiful dashboards	W&B
Full MLOps pipeline	MLflow + deployment tools

Resources

MLflow: https://mlflow.org/docs/latest/
W&B: https://docs.wandb.ai/
