migrate-from-model-serving


Model Serving to Databricks Apps Migration Guide


This guide instructs LLM coding agents how to migrate an MLflow ResponsesAgent from Databricks Model Serving to Databricks Apps.


Overview


**Goal:** Migrate an agent deployed on Databricks Model Serving (using `ResponsesAgent` with `predict()`/`predict_stream()`) to Databricks Apps (using MLflow GenAI Server with `@invoke`/`@stream` decorators).

**Key Transformation:**
- Model Serving: Synchronous `predict()` and `predict_stream()` methods on a class
- Apps: Functions with `@invoke` and `@stream` decorators (sync or async, based on user preference)

**Deliverables:** After migration is complete, you will have:

```
<working-directory>/
├── original_mlflow_model/    # Downloaded artifacts from Model Serving
│   ├── MLmodel
│   ├── code/
│   │   └── agent.py
│   ├── input_example.json
│   └── requirements.txt
└── <app-name>/               # New Databricks App (ready to deploy)
    ├── agent_server/
    │   ├── agent.py          # Migrated agent code
    │   └── ...
    ├── app.yaml
    ├── databricks.yml        # Bundle config with resources
    ├── pyproject.toml
    ├── requirements.txt
    └── ...
```

`<app-name>` is the name the user provides at the start of the migration. It is used as both the directory name and the Databricks App name at deploy time.


Before You Begin: Gather User Inputs


**Before doing anything else, ask the user three questions.** Use the `AskUserQuestion` tool to collect all answers at once so the user is only prompted once, then Claude can execute the rest of the migration autonomously.

**Questions to ask:**

1. **Databricks profile:** Which Databricks CLI profile should be used for the workspace where the Model Serving endpoint lives? (Run `databricks auth profiles` first to list available profiles and their workspaces, then present the options to the user.)
2. **App name:** What should the new Databricks App be named? (Must be lowercase, can contain letters, numbers, and hyphens, and must be unique within the workspace.)
3. **Async migration:** Would you like to migrate your agent code to be fully async?
   - **Yes (Recommended):** Converts all I/O operations to async (`await`/`async for`), enabling higher concurrency on smaller compute — no more threads sitting idle while waiting for LLM responses or long-running tool calls.
   - **No:** Keeps your existing synchronous code with minimal changes — just extracts the logic from the `ResponsesAgent` class and wraps it with `@invoke`/`@stream` decorators. Simpler migration, but each request blocks a thread while waiting for I/O.

**Store the answers as:**

- `<profile>` — used for ALL `databricks` CLI commands throughout the migration (via `--profile <profile>`)
- `<app-name>` — used as both the directory name for the migrated app AND the app name when deploying with `databricks bundle deploy`
- `<async>` — `yes` or `no`; determines whether to convert the agent code to async or keep it synchronous

Validate Authentication


After receiving the user's answers, validate the selected profile:

```bash
databricks current-user me --profile <profile>
```

If this fails with an authentication error, prompt the user to re-authenticate:

```bash
databricks auth login --profile <profile>
```

**Important:** Remember to include `--profile <profile>` on every `databricks` CLI command throughout the migration.

Create the App Directory


Copy all scaffold files from the current working directory into a new directory named `<app-name>/`. Exclude instruction files (`AGENTS.md`, `CLAUDE.md`), hidden directories (`.claude/`, `.git/`), and any migration artifacts (e.g., `original_mlflow_model/`, `.migration-venv/`). Do NOT search for or copy scaffold files from other directories or templates — everything you need is right here.

All subsequent migration steps operate inside the `<app-name>/` directory.

**Note:** The `agent_server/agent.py` scaffold is intentionally framework-agnostic — it contains the `@invoke`/`@stream` decorator pattern with TODO placeholders. Step 3 (Migrate the Agent Code) will replace these placeholders with the actual agent logic from the original Model Serving endpoint.

Create Task List


**Create a task list to track progress.** This helps the user follow along and see what's completed, in progress, and pending.

**User tip:** Press `Ctrl+T` to toggle the task list view in your terminal. The display shows up to 10 tasks at a time with status indicators.

Create the following tasks using the `TaskCreate` tool:

| Task | Description |
|---|---|
| Authenticate to Databricks | Verify Databricks CLI authentication and validate the selected profile |
| Download original agent artifacts | Download the MLflow model artifacts from the Model Serving endpoint |
| Analyze and understand agent code | Examine the original agent code; identify tools, resources, and dependencies |
| Migrate agent code to Apps format | Transform the ResponsesAgent class to @invoke/@stream decorated functions |
| Set up and configure the app | Install dependencies, run quickstart, configure the environment |
| Test agent locally | Start the local server and verify the agent works correctly |
| Deploy to Databricks Apps | Configure databricks.yml resources and deploy with Databricks Asset Bundles |
| Test deployed app | Verify the deployed app responds correctly |

Update task status as you progress:
- Mark tasks as `in_progress` when starting each step
- Mark tasks as `completed` when finished
- This gives the user visibility into migration progress


Step 1: Download the Original Agent Code


**Task:** Mark "Authenticate to Databricks" as `completed`. Mark "Download original agent artifacts" as `in_progress`.

**Note:** The `<profile>` and `<app-name>` values were collected from the user in the "Before You Begin" section. Use them throughout.

Download the original agent code from the Model Serving endpoint. This requires setting up a virtual environment with MLflow to access the model artifacts.

1.1 Get Model Info from Endpoint


If you have a serving endpoint name, extract the model details:

```bash
# Get endpoint info (remember to include --profile if using non-default)
databricks serving-endpoints get <endpoint-name> --profile <profile> --output json
```

Look for `served_entities[0].entity_name` (model name) and `entity_version` in the response. Find the entity with 100% traffic in `traffic_config.routes`.
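As a concrete illustration of that lookup, here is a small Python sketch that picks the fully-routed entity out of the parsed response. Treat it as a sketch: the `example` dict is an illustrative, trimmed response shape, not real output, and field names should be verified against the actual `serving-endpoints get` JSON.

```python
def resolve_served_model(endpoint: dict) -> tuple:
    """Return (entity_name, entity_version) for the served entity receiving 100% traffic."""
    config = endpoint["config"]
    routes = config.get("traffic_config", {}).get("routes", [])
    # Names of served models that carry all the traffic.
    full_traffic = {r["served_model_name"] for r in routes if r.get("traffic_percentage") == 100}
    entities = config["served_entities"]
    # Prefer the fully-routed entity; fall back to the first served entity.
    chosen = next((e for e in entities if e.get("name") in full_traffic), entities[0])
    return chosen["entity_name"], chosen["entity_version"]

# Illustrative, trimmed response shape (hypothetical values).
example = {
    "config": {
        "served_entities": [
            {"name": "agent-3", "entity_name": "main.agents.my_agent", "entity_version": "3"}
        ],
        "traffic_config": {"routes": [{"served_model_name": "agent-3", "traffic_percentage": 100}]},
    }
}
```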

1.2 Download Model Artifacts


Use `uv run --with` to download artifacts without creating a separate virtual environment. The `mlflow[databricks]` extra includes `boto3` for Unity Catalog artifact access:

```bash
DATABRICKS_CONFIG_PROFILE=<profile> uv run --no-project \
  --with "mlflow[databricks]>=2.15.0" \
  --with "databricks-sdk>=0.30.0" \
  python3 << 'EOF'
import mlflow

mlflow.set_tracking_uri("databricks")

# Replace with actual values from step 1.1
MODEL_NAME = "<model-name>"
VERSION = "<version>"

print(f"Downloading model: models:/{MODEL_NAME}/{VERSION}")
mlflow.artifacts.download_artifacts(
    artifact_uri=f"models:/{MODEL_NAME}/{VERSION}",
    dst_path="./original_mlflow_model"
)
print("Download complete! Artifacts saved to ./original_mlflow_model")
EOF
```

1.3 Verify Downloaded Artifacts


Check that the key files exist and understand the full structure:

```bash
# List all downloaded files recursively
find ./original_mlflow_model -type f | head -50

# Check for MLmodel file (contains resource requirements)
cat ./original_mlflow_model/MLmodel

# Check for input example (useful for testing)
cat ./original_mlflow_model/input_example.json 2>/dev/null
```

**Examine the `/code` folder** - contains all code dependencies logged via `code_paths=["..."]`:

```bash
# List all code files
ls -la ./original_mlflow_model/code/

# The main agent is typically agent.py, but there may be additional modules
find ./original_mlflow_model/code -name "*.py" -type f
```

**Examine the `/artifacts` folder** (if present) - contains artifacts logged via `artifacts={...}`:

```bash
# Check for artifacts folder
ls -la ./original_mlflow_model/artifacts/ 2>/dev/null

# List all artifacts
find ./original_mlflow_model/artifacts -type f 2>/dev/null
```

> **Important:** Take note of ALL files in `/code` and `/artifacts`. You will need to copy these to the migrated app and ensure imports still work correctly.

Expected Output Structure


After successful download, you should have:

```
./original_mlflow_model/
├── MLmodel              # Model metadata and resource requirements
├── code/                # Code logged via code_paths=["..."]
│   ├── agent.py         # Main agent implementation
│   ├── utils.py         # (optional) Helper modules
│   ├── tools.py         # (optional) Custom tool definitions
│   └── ...              # Any other code dependencies
├── artifacts/           # (optional) Artifacts logged via artifacts={...}
│   ├── config.yaml      # (optional) Configuration files
│   ├── prompts/         # (optional) Prompt templates
│   └── ...              # Any other artifacts (data files, etc.)
├── input_example.json   # Sample request for testing
├── requirements.txt     # Original dependencies
└── ...
```

Key Files to Examine


1. `code/agent.py` - Contains the `ResponsesAgent` class with `predict()` and `predict_stream()` methods
2. `code/*.py` - Any additional Python modules the agent imports
3. `MLmodel` - Contains the `resources` section listing required Databricks resources
4. `artifacts/` - Any configuration files, prompts, or data files the agent uses
5. `input_example.json` - Use this to test the migrated agent

Troubleshooting Model Download


**"Unable to import necessary dependencies to access model version files in Unity Catalog"** — this means `boto3` is missing. Ensure you're using `mlflow[databricks]` (not just `mlflow`) in the `--with` flag — the `[databricks]` extra includes `boto3`.

**"INVALID_PARAMETER_VALUE" or authentication errors** — re-authenticate with Databricks (include the profile if non-default):

```bash
databricks auth login --profile <profile>
```

**Wrong workspace / model not found** — make sure you're using the correct profile that corresponds to the workspace where the model is deployed:

```bash
# List profiles to see which workspace each points to
databricks auth profiles

# Verify you can access the workspace
databricks current-user me --profile <profile>

# List models in that workspace
databricks registered-models list --profile <profile>
databricks model-versions list --name "<model-name>" --profile <profile>
```

---

Step 2: Understand the Key Transformations


**Task:** Mark "Download original agent artifacts" as `completed`. Mark "Analyze and understand agent code" as `in_progress`.

Entry Point Transformation


In both cases, the `ResponsesAgent` class is replaced with decorated functions. The difference is whether those functions are async or sync.

**Model Serving (OLD):**

```python
from mlflow.pyfunc import ResponsesAgent, ResponsesAgentRequest, ResponsesAgentResponse
from mlflow.types.responses import ResponsesAgentStreamEvent

class MyAgent(ResponsesAgent):
    def predict(self, request: ResponsesAgentRequest, params=None) -> ResponsesAgentResponse:
        # Synchronous implementation
        ...
        return ResponsesAgentResponse(output=outputs)

    def predict_stream(self, request: ResponsesAgentRequest, params=None):
        # Synchronous generator
        for chunk in ...:
            yield ResponsesAgentStreamEvent(...)
```

**Apps — Async (if `<async>` = yes):**

```python
from typing import AsyncGenerator

from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
)

@invoke()
async def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    # Async implementation - typically calls streaming() and collects results
    outputs = [
        event.item
        async for event in streaming(request)
        if event.type == "response.output_item.done"
    ]
    return ResponsesAgentResponse(output=outputs)

@stream()
async def streaming(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]:
    # Async generator
    async for event in ...:
        yield event
```

**Apps — Sync (if `<async>` = no):**

```python
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
)

@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    # Same sync logic from original predict(), extracted from the class
    ...
    return ResponsesAgentResponse(output=outputs)

@stream()
def streaming(request: ResponsesAgentRequest):
    # Same sync generator from original predict_stream(), extracted from the class
    for chunk in ...:
        yield ResponsesAgentStreamEvent(...)
```
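To see why the async `@invoke` function can simply collect from `streaming()`, here is a self-contained sketch of that collect pattern. Plain dicts act as hypothetical stand-ins for the `ResponsesAgent*` types; no MLflow imports, not the real API:

```python
import asyncio

async def streaming(request: dict):
    # Stand-in async generator: one done-event per word instead of real LLM events.
    for word in request["input"].split():
        yield {"type": "response.output_item.done", "item": {"text": word}}

async def non_streaming(request: dict) -> dict:
    # Same shape as the @invoke example: collect only the completed output items.
    outputs = [
        event["item"]
        async for event in streaming(request)
        if event["type"] == "response.output_item.done"
    ]
    return {"output": outputs}

result = asyncio.run(non_streaming({"input": "hello world"}))
```

This mirrors the real migration pattern: the non-streaming path is just the streaming path, drained and filtered.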

Key Differences


| Aspect | Model Serving | Apps (async) | Apps (sync) |
|---|---|---|---|
| Structure | `class MyAgent(ResponsesAgent)` | Decorated functions | Decorated functions |
| Functions | `def predict()` / `def predict_stream()` | `async def` with `await` | `def` (same as original) |
| Streaming | Sync generator (`yield`) | Async generator (`async for` / `yield`) | Sync generator (`yield`) |
| Server | MLflow Model Server | MLflow GenAI Server (FastAPI) | MLflow GenAI Server (FastAPI) |
| Deployment | `databricks_agents.deploy()` | `databricks bundle deploy` + `bundle run` | `databricks bundle deploy` + `bundle run` |

Async Patterns (only if `<async>` = yes)

Skip this section if the user chose synchronous migration. The sync path keeps all original I/O calls as-is.

All I/O operations must be converted to async:

```python
# OLD (sync)
response = client.chat(messages)

# NEW (async)
response = await client.achat(messages)

# OLD (sync iteration)
for chunk in stream:
    yield chunk

# NEW (async iteration)
async for chunk in stream:
    yield chunk
```

---
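When a client library offers no async variant of a call (no `achat()` to switch to), one stopgap consistent with the conversions above is to push the blocking call onto a worker thread with `asyncio.to_thread`. In this sketch, `slow_chat` is a hypothetical blocking call, not a real SDK method:

```python
import asyncio
import time

def slow_chat(prompt: str) -> str:
    # Hypothetical blocking call with no async equivalent.
    time.sleep(0.01)
    return prompt.upper()

async def non_blocking_chat(prompt: str) -> str:
    # The event loop stays free while the blocking call runs in a worker thread.
    return await asyncio.to_thread(slow_chat, prompt)

greeting = asyncio.run(non_blocking_chat("hi"))
```

This keeps the server responsive, but a thread is still consumed per in-flight call, so native async APIs remain preferable where they exist.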

Step 3: Migrate the Agent Code


**Task:** Mark "Analyze and understand agent code" as `completed`. Mark "Migrate agent code to Apps format" as `in_progress`.

3.1 Copy Code Dependencies and Artifacts


The original MLflow model may contain multiple code files and artifacts that need to be migrated.

**Copy all code files from `/code` to `agent_server/`:**

```bash
# Copy all Python files from original code folder
cp ./original_mlflow_model/code/*.py ./<app-name>/agent_server/

# If there are subdirectories with code, copy those too
cp -r ./original_mlflow_model/code/submodule ./<app-name>/agent_server/
```

**Copy artifacts (if present):**

```bash
# Create an artifacts directory in the migrated app if needed
mkdir -p ./<app-name>/agent_server/artifacts

# Copy all artifacts
cp -r ./original_mlflow_model/artifacts/* ./<app-name>/agent_server/artifacts/ 2>/dev/null || true
```

**Fix import paths after copying:**

When code files are moved, imports may break. Check and update imports in all copied files:

```python
# BEFORE (if files were in different locations):
from code.utils import helper_function
from artifacts.prompts import SYSTEM_PROMPT

# AFTER (files are now in agent_server/):
from agent_server.utils import helper_function
# Or, if in the same directory:
from .utils import helper_function
```

For artifacts, update file paths:

```python
# BEFORE:
with open("artifacts/config.yaml") as f:
    ...

# AFTER:
import os

config_path = os.path.join(os.path.dirname(__file__), "artifacts", "config.yaml")
with open(config_path) as f:
    ...
```

> **Important:** Review each copied file and ensure all imports resolve correctly. The most common issues are:
> - Relative imports that assumed a different directory structure
> - Hardcoded file paths to artifacts
> - Missing `__init__.py` files for package imports
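After copying, a cheap smoke check is to byte-compile every copied module so syntax-level breakage surfaces before you start editing. This sketch demos on a throwaway directory; the file names are illustrative:

```python
import pathlib
import py_compile
import tempfile

def check_syntax(root: str) -> list:
    """Return the names of .py files under `root` that fail to byte-compile."""
    failures = []
    for path in sorted(pathlib.Path(root).rglob("*.py")):
        try:
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError:
            failures.append(path.name)
    return failures

# Demo on a throwaway directory standing in for <app-name>/agent_server/.
workdir = tempfile.mkdtemp()
pathlib.Path(workdir, "ok.py").write_text("x = 1\n")
pathlib.Path(workdir, "bad.py").write_text("def broken(:\n")
failures = check_syntax(workdir)
```

Byte-compiling only catches syntax errors; broken imports still need an import-time check (e.g., importing the migrated module from the app root).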

3.2 Extract Configuration


From the original agent code, identify and preserve:
- LLM endpoint name (e.g., `databricks-claude-sonnet-4-5`)
- System prompt
- Tool definitions
- Any custom logic

3.3 Update the Agent Entry Point


The approach depends on whether the user chose async or sync migration.


Path A: Synchronous Migration (`<async>` = no)

This is the minimal-changes path. Extract the logic from the `ResponsesAgent` class, wrap it with `@invoke`/`@stream` decorators, and keep all code synchronous.

Edit `<app-name>/agent_server/agent.py`:

1. **Replace the scaffold with the original agent logic.** The core transformation is extracting the class methods into decorated functions:

```python
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
)

# Move any class __init__ or class-level setup to module level,
# e.g., client initialization, tool setup, etc.

@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    # Paste the body of the original predict() method here
    # Remove 'self.' references — replace with module-level variables
    # Remove 'params' parameter (not used in Apps)
    ...
    return ResponsesAgentResponse(output=outputs)

@stream()
def streaming(request: ResponsesAgentRequest):
    # Paste the body of the original predict_stream() method here
    # Remove 'self.' references — replace with module-level variables
    # Remove 'params' parameter (not used in Apps)
    for chunk in ...:
        yield ResponsesAgentStreamEvent(...)
```

2. **Key changes from class to functions:**
   - Remove the `class MyAgent(ResponsesAgent):` wrapper
   - Remove the `self` parameter from all methods
   - Move `__init__` logic (client creation, tool setup) to module-level code
   - Replace `self.some_attribute` with module-level variables
   - Add the `@invoke()` decorator to the non-streaming function
   - Add the `@stream()` decorator to the streaming function

3. **Keep all other code as-is** — no need to convert sync calls to async, no need to change `for` to `async for`, no need to add `await`.

---
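The class-to-functions extraction can be seen end-to-end on a toy agent. `FakeClient` is a hypothetical stand-in, and the real decorators and MLflow types are omitted so the sketch stays self-contained:

```python
# BEFORE (shape of the original): state lived on the class.
#
# class MyAgent(ResponsesAgent):
#     def __init__(self):
#         self.client = FakeClient()
#     def predict(self, request, params=None):
#         return {"output": [self.client.complete(request["input"])]}

class FakeClient:
    def complete(self, text: str) -> str:
        return f"echo: {text}"

# AFTER: __init__ state becomes module-level setup...
client = FakeClient()

def non_streaming(request: dict) -> dict:
    # ...'self.client' becomes the module-level 'client', and 'params' is dropped.
    return {"output": [client.complete(request["input"])]}

reply = non_streaming({"input": "hi"})
```

In the real migration the function would additionally carry the `@invoke()` decorator and the MLflow request/response types.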

Path B: Async Migration (`<async>` = yes)

This path converts all I/O operations to async for higher concurrency. More changes are required, but the result is a more efficient server.

Edit `<app-name>/agent_server/agent.py`:

1. **Update the LLM endpoint:**

   ```python
   LLM_ENDPOINT_NAME = "<your-endpoint-from-original>"
   ```

2. **Update the system prompt:**

   ```python
   SYSTEM_PROMPT = """<your-system-prompt-from-original>"""
   ```

3. **Add your custom tools:** If your original agent had custom tools, add them:

   ```python
   from langchain_core.tools import tool

   @tool
   async def my_custom_tool(arg: str) -> str:
       """Tool description."""
       # Your tool logic (make async if needed)
       return result
   ```

4. **Convert all I/O to async:**
   - `def predict()` → `async def non_streaming()`
   - `def predict_stream()` → `async def streaming()`
   - `client.chat()` → `await client.achat()`
   - `for chunk in stream:` → `async for chunk in stream:`
   - Sync HTTP calls → `await` async equivalents

5. **Preserve any special logic:** Migrate any custom preprocessing, postprocessing, or business logic from the original agent.

3.4 Handle Stateful Agents


**If the original agent uses a checkpointer (short-term memory):**
- Add a checkpointer with Lakebase integration (use `AsyncCheckpointSaver` if async, or the sync equivalent if sync)
- Configure `LAKEBASE_INSTANCE_NAME` in `.env`
- Extract the thread_id from `request.custom_inputs` or `request.context.conversation_id`

**If the original agent uses a store (long-term memory):**
- Add a store with Lakebase integration (use `AsyncDatabricksStore` if async, or the sync equivalent if sync)
- Configure `LAKEBASE_INSTANCE_NAME` in `.env`
- Extract the user_id from `request.custom_inputs` or `request.context.user_id`


Step 4: Set Up the App


**Task:** Mark "Migrate agent code to Apps format" as `completed`. Mark "Set up and configure the app" as `in_progress`.

4.1 Verify Build Configuration


Before installing dependencies, ensure a README file exists (hatchling requires this):

```bash
# Create a minimal README if one doesn't exist
if [ ! -f "README.md" ]; then
  echo "# Migrated Agent App" > README.md
fi
```

4.2 Install Dependencies


```bash
cd <app-name>
uv sync
```

4.3 Create requirements.txt for Databricks Apps


Databricks Apps requires a `requirements.txt` file containing `uv` so that dependencies can be installed from `pyproject.toml`:

```bash
echo "uv" > requirements.txt
```

4.4 Run Quickstart


Run the `uv run quickstart` script to quickly set up your local environment. This is the recommended way to configure the app, as it handles all necessary setup automatically.

```bash
uv run quickstart
```

This script will:
  1. Verify uv, nvm, and Databricks CLI installations
  2. Configure Databricks authentication
  3. Configure agent tracing by creating an MLflow experiment and linking it to your app
  4. Configure `.env` with the necessary environment variables
Important: The quickstart script creates the MLflow experiment that the app needs for logging traces and models. This experiment will be added as a resource when deploying the app.
If there are issues with the quickstart script, refer to the manual setup in section 4.5.

4.5 Manual Environment Configuration (Optional)


If you need to manually configure the environment or add additional variables, edit `.env`:

```bash
# Databricks authentication
DATABRICKS_CONFIG_PROFILE=<your-profile>

# MLflow experiment (created by quickstart, or create manually)
MLFLOW_EXPERIMENT_ID=<experiment-id>

# Example: Lakebase for stateful agents
LAKEBASE_INSTANCE_NAME=<your-lakebase-instance>

# Example: Custom API keys
MY_API_KEY=<value>
```

To manually create an MLflow experiment:

```bash
databricks experiments create-experiment "/Users/<your-username>/<app-name>" --profile <profile>
```

Step 5: Test Locally


Task: Mark "Set up and configure the app" as `completed`. Mark "Test agent locally" as `in_progress`.
Test your migrated agent locally before deploying to Databricks Apps. This helps catch configuration issues early and ensures the agent works correctly.

5.1 Start the Server


After the quickstart setup is complete, start the agent server and chat app locally:

```bash
cd <app-name>
uv run start-app
```

Wait for the server to start. You should see output indicating the server is running on `http://localhost:8000`.
Note: If you only need the API endpoint (without the chat UI), you can run `uv run start-server` instead.

5.2 Test with Original Input Example


The original model artifacts include an `input_example.json` file that contains a sample request. Use it to verify that your migrated agent produces the same behavior. If there is no valid sample request, construct one based on the agent's code.

```bash
# Check the original input example (from the <app-name> directory)
cat ../original_mlflow_model/input_example.json
```

Example content:

```json
{"input": [{"role": "user", "content": "What is an LLM agent?"}], "custom_inputs": {"thread_id": "example-thread-123"}}
```

Test your local server with this input:

```bash
# Test with the original input example
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d "$(cat ../original_mlflow_model/input_example.json)"
```

5.3 Test Basic Requests


```bash
# Non-streaming
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hello!"}]}'

# Streaming
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hello!"}], "stream": true}'
```

5.4 Test with Custom Inputs (for stateful agents)


```bash
# With thread_id for short-term memory
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hi"}], "custom_inputs": {"thread_id": "test-123"}}'

# With user_id for long-term memory
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hi"}], "custom_inputs": {"user_id": "user@example.com"}}'
```

5.5 Verify Before Proceeding


Before proceeding to deployment, ensure:
  • The server starts without errors
  • The original input example returns a valid response
  • Streaming responses work correctly
  • Custom inputs (thread_id, user_id) are handled properly (if applicable)
Note: Only proceed to Step 6 (Deploy) after confirming the agent works correctly locally.
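The curl checks above can also be replayed from a short Python smoke test. Below is a minimal sketch of how the request bodies are built; the `/invocations` path and field names come from this guide's examples, and the helper function is hypothetical.

```python
# Build the JSON bodies used by the curl checks above.
# Field names (input, stream, custom_inputs) follow this guide's examples.
def build_invocation_payload(content: str, *, stream: bool = False, **custom_inputs) -> dict:
    payload = {"input": [{"role": "user", "content": content}]}
    if stream:
        payload["stream"] = True
    if custom_inputs:
        payload["custom_inputs"] = custom_inputs
    return payload
```

POST the serialized payload to `http://localhost:8000/invocations` with any HTTP client to mirror the checks above.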


Step 6: Deploy to Databricks Apps


Task: Mark "Test agent locally" as `completed`. Mark "Deploy to Databricks Apps" as `in_progress`.
This step uses Databricks Asset Bundles (DAB) to deploy. The scaffold includes a `databricks.yml` that you need to update with the app name and the resources from the original model.

6.1 Extract Resources from Original Model


The original model's `MLmodel` file contains a `resources` section that lists all Databricks resources the agent needs access to. Check `../original_mlflow_model/MLmodel` (or `./original_mlflow_model/MLmodel` if you're in the parent directory) for content like:

```yaml
resources:
  api_version: '1'
  databricks:
    lakebase:
    - name: lakebase
    serving_endpoint:
    - name: databricks-claude-sonnet-4-5
```

6.2 Update `databricks.yml` with Resources


The scaffold includes a `databricks.yml` with the experiment resource pre-configured. You need to:
  1. Update the app name to `<app-name>` (the name provided by the user) in both the `resources.apps.agent_migration.name` field and the `targets.prod.resources.apps.agent_migration.name` field.
  2. Add the resources extracted from the original MLmodel file to the `resources.apps.agent_migration.resources` list.
Resource Type Mapping (MLmodel → `databricks.yml`):

| MLmodel Resource | `databricks.yml` Resource | Key Fields |
|---|---|---|
| `serving_endpoint` | `serving_endpoint` | `name`, `permission` (CAN_QUERY) |
| `lakebase` | `database` | `database_name: databricks_postgres`, `instance_name`, `permission` (CAN_CONNECT_AND_CREATE) |
| `vector_search_index` | `uc_securable` | `securable_full_name`, `securable_type: TABLE`, `permission: SELECT` |
| `function` | `uc_securable` | `securable_full_name`, `securable_type: FUNCTION`, `permission: EXECUTE` |
| `table` | `uc_securable` | `securable_full_name`, `securable_type: TABLE`, `permission: SELECT` |
| `uc_connection` | `uc_securable` | `securable_full_name`, `securable_type: CONNECTION`, `permission: USE_CONNECTION` |
| `sql_warehouse` | `sql_warehouse` | `id`, `permission` (CAN_USE) |
| `genie_space` | `genie_space` | `space_id`, `permission` (CAN_RUN) |

Note: The `experiment` resource is already configured in the scaffold `databricks.yml` and is automatically created by the bundle. You do not need to add it manually.
Example: `databricks.yml` for an agent with a serving endpoint and UC function:

```yaml
resources:
  experiments:
    agent_migration_experiment:
      name: /Users/${workspace.current_user.userName}/${bundle.name}-${bundle.target}

  apps:
    agent_migration:
      name: "<app-name>"  # Update to user's app name
      description: "Migrated agent from Model Serving to Databricks Apps"
      source_code_path: ./
      resources:
        - name: 'experiment'
          experiment:
            experiment_id: "${resources.experiments.agent_migration_experiment.id}"
            permission: 'CAN_MANAGE'
        - name: 'serving-endpoint'
          serving_endpoint:
            name: 'databricks-claude-sonnet-4-5'
            permission: 'CAN_QUERY'
        - name: 'python-exec'
          uc_securable:
            securable_full_name: 'system.ai.python_exec'
            securable_type: 'FUNCTION'
            permission: 'EXECUTE'

targets:
  prod:
    resources:
      apps:
        agent_migration:
          name: "<app-name>"  # Same name for production
```

Example: Adding Lakebase resources (for stateful agents):

```yaml
        - name: 'database'
          database:
            database_name: 'databricks_postgres'
            instance_name: 'lakebase'
            permission: 'CAN_CONNECT_AND_CREATE'
```

6.3 Deploy with Databricks Asset Bundles


From inside the `<app-name>` directory, validate, deploy, and run:

```bash
# 1. Validate bundle configuration (catches errors before deploy)
databricks bundle validate --profile <profile>

# 2. Deploy the bundle (creates/updates resources, uploads files)
databricks bundle deploy --profile <profile>

# 3. Run the app (starts/restarts with uploaded source code) - REQUIRED!
databricks bundle run agent_migration --profile <profile>
```

> **Important:** `bundle deploy` only uploads files and configures resources. `bundle run` is **required** to actually start/restart the app with the new code. If you only run `deploy`, the app will continue running old code!

6.4 Test Deployed App


Task: Mark "Deploy to Databricks Apps" as `completed`. Mark "Test deployed app" as `in_progress`.

```bash
# Get the app URL
APP_URL=$(databricks apps get <app-name> --profile <profile> --output json | jq -r '.url')

# Get OAuth token
TOKEN=$(databricks auth token --profile <profile> | jq -r .access_token)

# Query the app
curl -X POST ${APP_URL}/invocations \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hello!"}]}'
```

Once the deployed app responds successfully:

> **Task:** Mark "Test deployed app" as `completed`. Migration complete!

6.5 Deployment Troubleshooting


If you encounter issues during deployment, refer to the deploy skill for detailed guidance.
Debug commands:

```bash
# Validate bundle configuration
databricks bundle validate --profile <profile>

# View app logs
databricks apps logs <app-name> --profile <profile> --follow

# Check app status
databricks apps get <app-name> --profile <profile> --output json | jq '{app_status, compute_status}'

# Get app URL
databricks apps get <app-name> --profile <profile> --output json | jq -r '.url'
```

**"App already exists" error:**
If `databricks bundle deploy` fails because the app already exists, refer to the **deploy** skill for instructions on binding an existing app to the bundle.

---

Reference: App File Structure


```
<app-name>/
├── agent_server/
│   ├── __init__.py
│   ├── agent.py          # Main agent logic - THIS IS WHERE YOU MIGRATE TO
│   ├── start_server.py   # FastAPI server setup
│   ├── utils.py          # Helper utilities
│   └── evaluate_agent.py # Agent evaluation
├── scripts/
│   ├── __init__.py
│   ├── quickstart.py     # Setup script
│   └── start_app.py      # App startup
├── app.yaml              # Databricks Apps configuration
├── databricks.yml        # Databricks Asset Bundle configuration (resources, targets)
├── pyproject.toml        # Dependencies (for local dev with uv)
├── requirements.txt      # REQUIRED: Must contain "uv" for Databricks Apps
├── .env.example          # Environment template
└── README.md
```

IMPORTANT: The `requirements.txt` file must exist and contain `uv` so that Databricks Apps can install dependencies using the `pyproject.toml`. Without this file, the app will fail to start.


Reference: Common Migration Patterns


Pattern 1: Simple Chat Agent


Original:

```python
class ChatAgent(ResponsesAgent):
    def predict(self, request, params=None):
        messages = to_chat_completions_input(request.input)
        response = self.llm.invoke(messages)
        return ResponsesAgentResponse(output=[...])
```

Migrated (sync):

```python
llm = ...  # Move class-level init to module level

@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    messages = to_chat_completions_input(request.input)
    response = llm.invoke(messages)
    return ResponsesAgentResponse(output=[...])

@stream()
def streaming(request: ResponsesAgentRequest):
    # Original predict_stream() body, with self. removed
    ...
```

Migrated (async):

```python
@invoke()
async def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    outputs = [e.item async for e in streaming(request) if e.type == "response.output_item.done"]
    return ResponsesAgentResponse(output=outputs)

@stream()
async def streaming(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]:
    messages = {"messages": to_chat_completions_input([i.model_dump() for i in request.input])}
    agent = await init_agent()
    async for event in process_agent_astream_events(agent.astream(messages, stream_mode=["updates", "messages"])):
        yield event
```

Pattern 2: Agent with Custom Tools


Sync: Keep tools as-is from the original code.
Async: Migrate tools to async LangChain tools:

```python
from langchain_core.tools import tool

@tool
async def search_docs(query: str) -> str:
    """Search the documentation."""
    results = await vector_store.asimilarity_search(query)
    return format_results(results)
```

Pattern 3: Using LangGraph with create_agent (async only)


```python
from langchain.agents import create_agent
from databricks_langchain import ChatDatabricks

async def init_agent():
    tools = await mcp_client.get_tools()  # MCP tools are async
    model = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)
    return create_agent(model=model, tools=tools, system_prompt=SYSTEM_PROMPT)
```


Reference: Useful Resources


Troubleshooting


"Module not found" errors


```bash
uv sync  # Reinstall dependencies
```

Authentication errors


```bash
databricks auth login  # Re-authenticate
```

Lakebase permission errors


  • Ensure the Lakebase instance is added as an app resource in Databricks UI
  • Grant appropriate permissions on the Lakebase instance

Async errors (async migration only)


  • Ensure all I/O calls use async versions (e.g., `await client.achat()`, not `client.chat()`)
  • Use `async for` instead of `for` when iterating async generators
  • If you chose sync migration, these errors should not occur; double-check that you're not mixing sync and async patterns
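The async-generator pitfall above can be reproduced in isolation. This is a standalone illustration with hypothetical names; no Databricks dependencies are involved:

```python
import asyncio

# Hypothetical async generator standing in for an agent's token stream.
async def stream_chunks():
    for word in ("hello", "world"):
        yield word

async def consume():
    # Wrong: `for chunk in stream_chunks()` raises
    # "TypeError: 'async_generator' object is not iterable".
    # Right: iterate with `async for` (here via an async comprehension).
    return [chunk async for chunk in stream_chunks()]

print(asyncio.run(consume()))  # ['hello', 'world']
```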