accessing-mlflow

Accessing MLflow

MCP Server

mlflow-mcp gives agents direct access to MLflow — query runs, compare metrics, browse artifacts, all through natural language.

ID Convention

When the user provides a hex ID (e.g. 71f3f3199ea5e1f0) without specifying what it is, assume it is an invocation_id (not an MLflow run_id). An invocation_id identifies a launcher invocation and is stored as both a tag and a param on MLflow runs. One invocation can produce multiple MLflow runs (one per task). If you don't know which experiment the runs belong to, you may need to search across multiple experiments.
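Outside the MCP tools, the same lookup can be sketched with the MLflow Python client. This is a minimal sketch, assuming the `mlflow` package is installed and `MLFLOW_TRACKING_URI` points at your server. The helper names `invocation_filter` and `find_runs_for_invocation` are hypothetical; the tag key `invocation_id` comes from the convention above.

```python
import string


def invocation_filter(invocation_id: str) -> str:
    """Build an MLflow filter string matching runs tagged with this invocation_id."""
    candidate = invocation_id.strip()
    # Reject anything that is not plain hex so the value is safe to embed in the filter.
    if not candidate or any(ch not in string.hexdigits for ch in candidate):
        raise ValueError(f"not a hex ID: {invocation_id!r}")
    return f"tags.invocation_id = '{candidate}'"


def find_runs_for_invocation(invocation_id: str):
    """Search every experiment for runs produced by one launcher invocation."""
    import mlflow  # imported lazily so the filter helper works without mlflow installed

    # search_all_experiments avoids having to know the experiment in advance.
    return mlflow.search_runs(
        search_all_experiments=True,
        filter_string=invocation_filter(invocation_id),
    )


print(invocation_filter("71f3f3199ea5e1f0"))
```

Calling `find_runs_for_invocation("71f3f3199ea5e1f0")` would return one row per exported run, which matches the one-run-per-task behavior described above.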

Querying Runs

Find runs by invocation_id

MLflow:search_runs_by_tags(experiment_id, {"invocation_id": "<invocation_id>"})

Query for example model/task runs

MLflow:query_runs(experiment_id, "tags.model LIKE '%<model>%'")
MLflow:query_runs(experiment_id, "tags.task_name LIKE '%<task_name>%'")

Get a config from run's artifacts

MLflow:get_artifact_content(run_id, "config.yml")

Get nested stats from run's artifacts

MLflow:get_artifact_content(run_id, "artifacts/eval_factory_metrics.json")

NOTE: You WILL NOT find PENDING, RUNNING, KILLED, or FAILED runs in MLflow! Only SUCCESSFUL runs are exported to MLflow.
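Because only successful runs are exported and one invocation yields one run per task, a task that is absent from the search results most likely did not finish successfully, or has not finished yet. A minimal sketch of that check follows. It assumes the search results have been converted to a list of dicts with a `tags` mapping containing `task_name`; that shape and the helper name `missing_tasks` are assumptions, not the MCP server's documented output, and the task names are placeholders.

```python
def missing_tasks(expected_tasks, runs):
    """Return expected tasks that have no exported run, preserving input order."""
    exported = {run.get("tags", {}).get("task_name") for run in runs}
    return [task for task in expected_tasks if task not in exported]


# Placeholder data: two tasks were launched, only one run reached MLflow.
runs = [{"run_id": "<run_id>", "tags": {"task_name": "task_a"}}]
print(missing_tasks(["task_a", "task_b"], runs))  # prints ['task_b']
```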

Workflow Tips

When comparing metrics across runs, fetch the data via MCP, then run the computation in Python for exact results rather than doing math in-context:
bash
uv run --with pandas python3 << 'EOF'
import pandas as pd
# ... compute deltas, averages, etc.
EOF
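The elided computation could look like the sketch below. It assumes the metrics for two runs have already been fetched via MCP and copied into plain dicts. The metric names and values are placeholders rather than real results, and the helper `metric_deltas` is hypothetical. The sketch uses only the standard library so it runs without extra packages; the same logic maps directly onto a pandas DataFrame.

```python
from statistics import mean


def metric_deltas(baseline, candidate):
    """Return candidate - baseline for every metric present in both runs."""
    shared = sorted(baseline.keys() & candidate.keys())
    return {name: candidate[name] - baseline[name] for name in shared}


# Placeholder metrics; replace with values fetched from MLflow.
baseline = {"accuracy": 0.5, "f1": 0.25}
candidate = {"accuracy": 0.75, "f1": 0.5, "latency_s": 2.0}

deltas = metric_deltas(baseline, candidate)
print(deltas)                 # {'accuracy': 0.25, 'f1': 0.25}
print(mean(deltas.values()))  # 0.25
```

Metrics that appear in only one run (here `latency_s`) are skipped rather than treated as zero, so a missing value cannot masquerade as a regression.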

Artifacts Structure

<harness>.<task>/
├── artifacts/
│   ├── config.yml                # Fully resolved config used during the evaluation
│   ├── launcher_unresolved_config.yaml # Unresolved config passed to the launcher
│   ├── results.yml               # All results in YAML format
│   ├── eval_factory_metrics.json # Runtime stats (latency, tokens count, memory)
│   ├── report.html               # Request-Response Pairs samples in HTML format (if enabled)
│   └── report.json               # Request-Response Pairs samples in JSON format (if enabled)
└── logs/
    ├── client-*.log              # Evaluation client
    ├── server-*-N.log            # Deployment per node
    ├── slurm-*.log               # Slurm job
    └── proxy-*.log               # Request proxy
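When browsing a run's logs, the naming patterns in the tree can be used to tell which component wrote a given file. The sketch below is one way to do that with `fnmatch`. The helper `describe_log` and the sample file names are hypothetical, and treating the `N` in `server-*-N.log` as a wildcard for the node index is an assumption.

```python
from fnmatch import fnmatch

# Patterns and descriptions copied from the logs/ directory in the tree above.
# "N" in server-*-N.log is treated as a wildcard for the node index.
LOG_PATTERNS = [
    ("client-*.log", "Evaluation client"),
    ("server-*-*.log", "Deployment per node"),
    ("slurm-*.log", "Slurm job"),
    ("proxy-*.log", "Request proxy"),
]


def describe_log(filename):
    """Return the component a log file belongs to, or None if no pattern matches."""
    for pattern, component in LOG_PATTERNS:
        if fnmatch(filename, pattern):
            return component
    return None


# Hypothetical file names, for illustration only.
print(describe_log("server-12345-0.log"))  # prints Deployment per node
print(describe_log("notes.txt"))           # prints None
```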

Troubleshooting

If the MLflow MCP server fails to load or its tools are unavailable:
  1. uvx not found. Install uv:
    bash
    curl -LsSf https://astral.sh/uv/install.sh | sh
  2. MCP server not configured. Add the config and restart the agent:
    For Claude Code, add to .claude/settings.json (project or user level), under "mcpServers":
    json
    "MLflow": {
      "command": "uvx",
      "args": ["mlflow-mcp"],
      "env": {
        "MLFLOW_TRACKING_URI": "https://<your-mlflow-server>/"
      }
    }
    For Cursor, edit ~/.cursor/mcp.json (Settings > Tools & MCP > New MCP Server):
    json
    {
      "mcpServers": {
        "MLflow": {
          "command": "uvx",
          "args": ["mlflow-mcp"],
          "env": {
            "MLFLOW_TRACKING_URI": "https://<your-mlflow-server>/"
          }
        }
      }
    }
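Both failure modes above can be checked programmatically before restarting the agent. The following is a minimal sketch, assuming the config has the `mcpServers` shape shown in the Cursor example. The helper `check_mcp_entry` is hypothetical, and the tracking URI is the same placeholder used above.

```python
import json
import shutil


def check_mcp_entry(config, server_name="MLflow"):
    """Return a list of problems found in an mcpServers-style config dict."""
    problems = []
    entry = config.get("mcpServers", {}).get(server_name)
    if entry is None:
        return [f'no "{server_name}" entry under "mcpServers"']
    command = entry.get("command")
    if not command:
        problems.append('missing "command"')
    elif shutil.which(command) is None:
        # Corresponds to step 1: uvx is not installed or not on PATH.
        problems.append(f"{command} not found on PATH")
    if not entry.get("env", {}).get("MLFLOW_TRACKING_URI"):
        problems.append("MLFLOW_TRACKING_URI is not set")
    return problems


# Same shape as the Cursor example above; the URI is a placeholder.
config = json.loads("""
{
  "mcpServers": {
    "MLflow": {
      "command": "uvx",
      "args": ["mlflow-mcp"],
      "env": {"MLFLOW_TRACKING_URI": "https://<your-mlflow-server>/"}
    }
  }
}
""")
print(check_mcp_entry(config) or "config looks complete")
```

To check a real file, load it with `json.load` and pass the resulting dict to `check_mcp_entry`; an empty list means neither problem was detected.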