migrate-from-model-serving


Model Serving to Databricks Apps Migration Guide


This guide instructs LLM coding agents how to migrate an MLflow ResponsesAgent from Databricks Model Serving to Databricks Apps.


Overview


**Goal:** Migrate an agent deployed on Databricks Model Serving (using `ResponsesAgent` with `predict()`/`predict_stream()`) to Databricks Apps (using MLflow GenAI Server with `@invoke`/`@stream` decorators).

**Key Transformation:**
- Model Serving: Synchronous `predict()` and `predict_stream()` methods on a class
- Apps: Functions with `@invoke` and `@stream` decorators (sync or async, based on user preference)

**Deliverables:** After migration is complete, you will have:

```
<working-directory>/
├── original_mlflow_model/    # Downloaded artifacts from Model Serving
│   ├── MLmodel
│   ├── code/
│   │   └── agent.py
│   ├── input_example.json
│   └── requirements.txt
└── <app-name>/               # New Databricks App (ready to deploy)
    ├── agent_server/
    │   ├── agent.py          # Migrated agent code
    │   └── ...
    ├── app.yaml
    ├── databricks.yml        # Bundle config with resources
    ├── pyproject.toml
    ├── requirements.txt
    └── ...
```

`<app-name>` is the name the user provides at the start of the migration. It is used as both the directory name and the Databricks App name at deploy time.


Before You Begin: Gather User Inputs


**Before doing anything else, ask the user three questions.** Use the `AskUserQuestion` tool to collect all answers at once so the user is only prompted once, then Claude can execute the rest of the migration autonomously.

**Questions to ask:**

1. **Databricks profile:** Which Databricks CLI profile should be used for the workspace where the Model Serving endpoint lives? (Run `databricks auth profiles` first to list available profiles and their workspaces, then present the options to the user.)
2. **App name:** What should the new Databricks App be named? (Must be lowercase, can contain letters, numbers, and hyphens, and must be unique within the workspace.)
3. **Async migration:** Would you like to migrate your agent code to be fully async?
   - **Yes (Recommended):** Converts all I/O operations to async (`await`/`async for`), enabling higher concurrency on smaller compute — no more threads sitting idle while waiting for LLM responses or long-running tool calls.
   - **No:** Keeps your existing synchronous code with minimal changes — just extracts the logic from the `ResponsesAgent` class and wraps it with `@invoke`/`@stream` decorators. Simpler migration, but each request blocks a thread while waiting for I/O.

**Store the answers as:**

- `<profile>` — used for ALL `databricks` CLI commands throughout the migration (via `--profile <profile>`)
- `<app-name>` — used as both the directory name for the migrated app AND the app name when deploying with `databricks bundle deploy`
- `<async>` — `yes` or `no`; determines whether to convert the agent code to async or keep it synchronous

Validate Authentication


After receiving the user's answers, validate the selected profile:

```bash
databricks current-user me --profile <profile>
```

If this fails with an authentication error, prompt the user to re-authenticate:

```bash
databricks auth login --profile <profile>
```

**Important:** Remember to include `--profile <profile>` on every `databricks` CLI command throughout the migration.

Create the App Directory


Copy all scaffold files from the current working directory into a new directory named `<app-name>/`. Exclude instruction files (`AGENTS.md`, `CLAUDE.md`), hidden directories (`.claude/`, `.git/`), and any migration artifacts (e.g., `original_mlflow_model/`, `.migration-venv/`). Do NOT search for or copy scaffold files from other directories or templates — everything you need is right here.

All subsequent migration steps operate inside the `<app-name>/` directory.

**Note:** The `agent_server/agent.py` scaffold is intentionally framework-agnostic — it contains the `@invoke`/`@stream` decorator pattern with TODO placeholders. Step 3 (Migrate the Agent Code) will replace these placeholders with the actual agent logic from the original Model Serving endpoint.

Create Task List


**Create a task list to track progress.** This helps the user follow along and see what's completed, in progress, and pending.

**User tip:** Press `Ctrl+T` to toggle the task list view in your terminal. The display shows up to 10 tasks at a time with status indicators.

Create the following tasks using the `TaskCreate` tool:

| Task | Description |
|---|---|
| Authenticate to Databricks | Verify Databricks CLI authentication and validate the selected profile |
| Download original agent artifacts | Download the MLflow model artifacts from the Model Serving endpoint |
| Analyze and understand agent code | Examine the original agent code; identify tools, resources, and dependencies |
| Migrate agent code to Apps format | Transform the ResponsesAgent class to @invoke/@stream decorated functions |
| Set up and configure the app | Install dependencies, run quickstart, configure the environment |
| Test agent locally | Start the local server and verify the agent works correctly |
| Deploy to Databricks Apps | Configure databricks.yml resources and deploy with Databricks Asset Bundles |
| Test deployed app | Verify the deployed app responds correctly |

Update task status as you progress:
- Mark tasks as `in_progress` when starting each step
- Mark tasks as `completed` when finished
- This gives the user visibility into migration progress


Step 1: Download the Original Agent Code


**Task:** Mark "Authenticate to Databricks" as `completed`. Mark "Download original agent artifacts" as `in_progress`.

**Note:** The `<profile>` and `<app-name>` values were collected from the user in the "Before You Begin" section. Use them throughout.

Download the original agent code from the Model Serving endpoint. This requires setting up a virtual environment with MLflow to access the model artifacts.

1.1 Get Model Info from Endpoint


If you have a serving endpoint name, extract the model details:

```bash
# Get endpoint info (remember to include --profile if using non-default)
databricks serving-endpoints get <endpoint-name> --profile <profile> --output json
```

Look for `served_entities[0].entity_name` (model name) and `entity_version` in the response. Find the entity with 100% traffic in `traffic_config.routes`.
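As a concrete illustration of that lookup, here is a small Python sketch that picks the fully-routed entity out of the parsed response. Treat it as a sketch: the `example` dict is an illustrative, trimmed response shape, not real output, and field names should be verified against the actual `serving-endpoints get` JSON.

```python
def resolve_served_model(endpoint: dict) -> tuple:
    """Return (entity_name, entity_version) for the served entity receiving 100% traffic."""
    config = endpoint["config"]
    routes = config.get("traffic_config", {}).get("routes", [])
    # Names of served models that carry all the traffic.
    full_traffic = {r["served_model_name"] for r in routes if r.get("traffic_percentage") == 100}
    entities = config["served_entities"]
    # Prefer the fully-routed entity; fall back to the first served entity.
    chosen = next((e for e in entities if e.get("name") in full_traffic), entities[0])
    return chosen["entity_name"], chosen["entity_version"]

# Illustrative, trimmed response shape (hypothetical values).
example = {
    "config": {
        "served_entities": [
            {"name": "agent-3", "entity_name": "main.agents.my_agent", "entity_version": "3"}
        ],
        "traffic_config": {"routes": [{"served_model_name": "agent-3", "traffic_percentage": 100}]},
    }
}
```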

1.2 Download Model Artifacts


Use `uv run --with` to download artifacts without creating a separate virtual environment. The `mlflow[databricks]` extra includes `boto3` for Unity Catalog artifact access:

```bash
DATABRICKS_CONFIG_PROFILE=<profile> uv run --no-project \
  --with "mlflow[databricks]>=2.15.0" \
  --with "databricks-sdk>=0.30.0" \
  python3 << 'EOF'
import mlflow

mlflow.set_tracking_uri("databricks")

# Replace with actual values from step 1.1
MODEL_NAME = "<model-name>"
VERSION = "<version>"

print(f"Downloading model: models:/{MODEL_NAME}/{VERSION}")
mlflow.artifacts.download_artifacts(
    artifact_uri=f"models:/{MODEL_NAME}/{VERSION}",
    dst_path="./original_mlflow_model"
)
print("Download complete! Artifacts saved to ./original_mlflow_model")
EOF
```

1.3 Verify Downloaded Artifacts


Check that the key files exist and understand the full structure:

```bash
# List all downloaded files recursively
find ./original_mlflow_model -type f | head -50

# Check for MLmodel file (contains resource requirements)
cat ./original_mlflow_model/MLmodel

# Check for input example (useful for testing)
cat ./original_mlflow_model/input_example.json 2>/dev/null
```

**Examine the `/code` folder** - contains all code dependencies logged via `code_paths=["..."]`:

```bash
# List all code files
ls -la ./original_mlflow_model/code/

# The main agent is typically agent.py, but there may be additional modules
find ./original_mlflow_model/code -name "*.py" -type f
```

**Examine the `/artifacts` folder** (if present) - contains artifacts logged via `artifacts={...}`:

```bash
# Check for artifacts folder
ls -la ./original_mlflow_model/artifacts/ 2>/dev/null

# List all artifacts
find ./original_mlflow_model/artifacts -type f 2>/dev/null
```

> **Important:** Take note of ALL files in `/code` and `/artifacts`. You will need to copy these to the migrated app and ensure imports still work correctly.

Expected Output Structure


After successful download, you should have:

```
./original_mlflow_model/
├── MLmodel              # Model metadata and resource requirements
├── code/                # Code logged via code_paths=["..."]
│   ├── agent.py         # Main agent implementation
│   ├── utils.py         # (optional) Helper modules
│   ├── tools.py         # (optional) Custom tool definitions
│   └── ...              # Any other code dependencies
├── artifacts/           # (optional) Artifacts logged via artifacts={...}
│   ├── config.yaml      # (optional) Configuration files
│   ├── prompts/         # (optional) Prompt templates
│   └── ...              # Any other artifacts (data files, etc.)
├── input_example.json   # Sample request for testing
├── requirements.txt     # Original dependencies
└── ...
```

Key Files to Examine


1. `code/agent.py` - Contains the `ResponsesAgent` class with `predict()` and `predict_stream()` methods
2. `code/*.py` - Any additional Python modules the agent imports
3. `MLmodel` - Contains the `resources` section listing required Databricks resources
4. `artifacts/` - Any configuration files, prompts, or data files the agent uses
5. `input_example.json` - Use this to test the migrated agent

Troubleshooting Model Download


**"Unable to import necessary dependencies to access model version files in Unity Catalog"** — this means `boto3` is missing. Ensure you're using `mlflow[databricks]` (not just `mlflow`) in the `--with` flag — the `[databricks]` extra includes `boto3`.

**"INVALID_PARAMETER_VALUE" or authentication errors** — re-authenticate with Databricks (include the profile if non-default):

```bash
databricks auth login --profile <profile>
```

**Wrong workspace / model not found** — make sure you're using the correct profile that corresponds to the workspace where the model is deployed:

```bash
# List profiles to see which workspace each points to
databricks auth profiles

# Verify you can access the workspace
databricks current-user me --profile <profile>

# List models in that workspace
databricks registered-models list --profile <profile>
databricks model-versions list --name "<model-name>" --profile <profile>
```

---

Step 2: Understand the Key Transformations


**Task:** Mark "Download original agent artifacts" as `completed`. Mark "Analyze and understand agent code" as `in_progress`.

Entry Point Transformation


In both cases, the `ResponsesAgent` class is replaced with decorated functions. The difference is whether those functions are async or sync.

**Model Serving (OLD):**

```python
from mlflow.pyfunc import ResponsesAgent, ResponsesAgentRequest, ResponsesAgentResponse
from mlflow.types.responses import ResponsesAgentStreamEvent

class MyAgent(ResponsesAgent):
    def predict(self, request: ResponsesAgentRequest, params=None) -> ResponsesAgentResponse:
        # Synchronous implementation
        ...
        return ResponsesAgentResponse(output=outputs)

    def predict_stream(self, request: ResponsesAgentRequest, params=None):
        # Synchronous generator
        for chunk in ...:
            yield ResponsesAgentStreamEvent(...)
```

**Apps — Async (if `<async>` = yes):**

```python
from typing import AsyncGenerator

from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
)

@invoke()
async def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    # Async implementation - typically calls streaming() and collects results
    outputs = [
        event.item
        async for event in streaming(request)
        if event.type == "response.output_item.done"
    ]
    return ResponsesAgentResponse(output=outputs)

@stream()
async def streaming(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]:
    # Async generator
    async for event in ...:
        yield event
```

**Apps — Sync (if `<async>` = no):**

```python
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
)

@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    # Same sync logic from original predict(), extracted from the class
    ...
    return ResponsesAgentResponse(output=outputs)

@stream()
def streaming(request: ResponsesAgentRequest):
    # Same sync generator from original predict_stream(), extracted from the class
    for chunk in ...:
        yield ResponsesAgentStreamEvent(...)
```
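To see why the async `@invoke` function can simply collect from `streaming()`, here is a self-contained sketch of that collect pattern. Plain dicts act as hypothetical stand-ins for the `ResponsesAgent*` types; no MLflow imports, not the real API:

```python
import asyncio

async def streaming(request: dict):
    # Stand-in async generator: one done-event per word instead of real LLM events.
    for word in request["input"].split():
        yield {"type": "response.output_item.done", "item": {"text": word}}

async def non_streaming(request: dict) -> dict:
    # Same shape as the @invoke example: collect only the completed output items.
    outputs = [
        event["item"]
        async for event in streaming(request)
        if event["type"] == "response.output_item.done"
    ]
    return {"output": outputs}

result = asyncio.run(non_streaming({"input": "hello world"}))
```

This mirrors the real migration pattern: the non-streaming path is just the streaming path, drained and filtered.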

Key Differences


| Aspect | Model Serving | Apps (async) | Apps (sync) |
|---|---|---|---|
| Structure | `class MyAgent(ResponsesAgent)` | Decorated functions | Decorated functions |
| Functions | `def predict()` / `def predict_stream()` | `async def` with `await` | `def` (same as original) |
| Streaming | Sync generator (`yield`) | Async generator (`async for` / `yield`) | Sync generator (`yield`) |
| Server | MLflow Model Server | MLflow GenAI Server (FastAPI) | MLflow GenAI Server (FastAPI) |
| Deployment | `databricks_agents.deploy()` | `databricks bundle deploy` + `bundle run` | `databricks bundle deploy` + `bundle run` |

Async Patterns (only if `<async>` = yes)

Skip this section if the user chose synchronous migration. The sync path keeps all original I/O calls as-is.

All I/O operations must be converted to async:

```python
# OLD (sync)
response = client.chat(messages)

# NEW (async)
response = await client.achat(messages)

# OLD (sync iteration)
for chunk in stream:
    yield chunk

# NEW (async iteration)
async for chunk in stream:
    yield chunk
```

---
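When a client library offers no async variant of a call (no `achat()` to switch to), one stopgap consistent with the conversions above is to push the blocking call onto a worker thread with `asyncio.to_thread`. In this sketch, `slow_chat` is a hypothetical blocking call, not a real SDK method:

```python
import asyncio
import time

def slow_chat(prompt: str) -> str:
    # Hypothetical blocking call with no async equivalent.
    time.sleep(0.01)
    return prompt.upper()

async def non_blocking_chat(prompt: str) -> str:
    # The event loop stays free while the blocking call runs in a worker thread.
    return await asyncio.to_thread(slow_chat, prompt)

greeting = asyncio.run(non_blocking_chat("hi"))
```

This keeps the server responsive, but a thread is still consumed per in-flight call, so native async APIs remain preferable where they exist.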

Step 3: Migrate the Agent Code


**Task:** Mark "Analyze and understand agent code" as `completed`. Mark "Migrate agent code to Apps format" as `in_progress`.

3.1 Copy Code Dependencies and Artifacts


The original MLflow model may contain multiple code files and artifacts that need to be migrated.

**Copy all code files from `/code` to `agent_server/`:**

```bash
# Copy all Python files from original code folder
cp ./original_mlflow_model/code/*.py ./<app-name>/agent_server/

# If there are subdirectories with code, copy those too
cp -r ./original_mlflow_model/code/submodule ./<app-name>/agent_server/
```

**Copy artifacts (if present):**

```bash
# Create an artifacts directory in the migrated app if needed
mkdir -p ./<app-name>/agent_server/artifacts

# Copy all artifacts
cp -r ./original_mlflow_model/artifacts/* ./<app-name>/agent_server/artifacts/ 2>/dev/null || true
```

**Fix import paths after copying:**

When code files are moved, imports may break. Check and update imports in all copied files:

```python
# BEFORE (if files were in different locations):
from code.utils import helper_function
from artifacts.prompts import SYSTEM_PROMPT

# AFTER (files are now in agent_server/):
from agent_server.utils import helper_function
# Or, if in the same directory:
from .utils import helper_function
```

For artifacts, update file paths:

```python
# BEFORE:
with open("artifacts/config.yaml") as f:
    ...

# AFTER:
import os

config_path = os.path.join(os.path.dirname(__file__), "artifacts", "config.yaml")
with open(config_path) as f:
    ...
```

> **Important:** Review each copied file and ensure all imports resolve correctly. The most common issues are:
> - Relative imports that assumed a different directory structure
> - Hardcoded file paths to artifacts
> - Missing `__init__.py` files for package imports
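After copying, a cheap smoke check is to byte-compile every copied module so syntax-level breakage surfaces before you start editing. This sketch demos on a throwaway directory; the file names are illustrative:

```python
import pathlib
import py_compile
import tempfile

def check_syntax(root: str) -> list:
    """Return the names of .py files under `root` that fail to byte-compile."""
    failures = []
    for path in sorted(pathlib.Path(root).rglob("*.py")):
        try:
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError:
            failures.append(path.name)
    return failures

# Demo on a throwaway directory standing in for <app-name>/agent_server/.
workdir = tempfile.mkdtemp()
pathlib.Path(workdir, "ok.py").write_text("x = 1\n")
pathlib.Path(workdir, "bad.py").write_text("def broken(:\n")
failures = check_syntax(workdir)
```

Byte-compiling only catches syntax errors; broken imports still need an import-time check (e.g., importing the migrated module from the app root).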

3.2 Extract Configuration


From the original agent code, identify and preserve:
- LLM endpoint name (e.g., `databricks-claude-sonnet-4-5`)
- System prompt
- Tool definitions
- Any custom logic

3.3 Update the Agent Entry Point


The approach depends on whether the user chose async or sync migration.


Path A: Synchronous Migration (`<async>` = no)

This is the minimal-changes path. Extract the logic from the `ResponsesAgent` class, wrap it with `@invoke`/`@stream` decorators, and keep all code synchronous.

Edit `<app-name>/agent_server/agent.py`:

1. **Replace the scaffold with the original agent logic.** The core transformation is extracting the class methods into decorated functions:

```python
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
)

# Move any class __init__ or class-level setup to module level,
# e.g., client initialization, tool setup, etc.

@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    # Paste the body of the original predict() method here
    # Remove 'self.' references — replace with module-level variables
    # Remove 'params' parameter (not used in Apps)
    ...
    return ResponsesAgentResponse(output=outputs)

@stream()
def streaming(request: ResponsesAgentRequest):
    # Paste the body of the original predict_stream() method here
    # Remove 'self.' references — replace with module-level variables
    # Remove 'params' parameter (not used in Apps)
    for chunk in ...:
        yield ResponsesAgentStreamEvent(...)
```

2. **Key changes from class to functions:**
   - Remove the `class MyAgent(ResponsesAgent):` wrapper
   - Remove the `self` parameter from all methods
   - Move `__init__` logic (client creation, tool setup) to module-level code
   - Replace `self.some_attribute` with module-level variables
   - Add the `@invoke()` decorator to the non-streaming function
   - Add the `@stream()` decorator to the streaming function

3. **Keep all other code as-is** — no need to convert sync calls to async, no need to change `for` to `async for`, no need to add `await`.

---
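The class-to-functions extraction can be seen end-to-end on a toy agent. `FakeClient` is a hypothetical stand-in, and the real decorators and MLflow types are omitted so the sketch stays self-contained:

```python
# BEFORE (shape of the original): state lived on the class.
#
# class MyAgent(ResponsesAgent):
#     def __init__(self):
#         self.client = FakeClient()
#     def predict(self, request, params=None):
#         return {"output": [self.client.complete(request["input"])]}

class FakeClient:
    def complete(self, text: str) -> str:
        return f"echo: {text}"

# AFTER: __init__ state becomes module-level setup...
client = FakeClient()

def non_streaming(request: dict) -> dict:
    # ...'self.client' becomes the module-level 'client', and 'params' is dropped.
    return {"output": [client.complete(request["input"])]}

reply = non_streaming({"input": "hi"})
```

In the real migration the function would additionally carry the `@invoke()` decorator and the MLflow request/response types.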

Path B: Async Migration (`<async>` = yes)

This path converts all I/O operations to async for higher concurrency. More changes are required, but the result is a more efficient server.

Edit `<app-name>/agent_server/agent.py`:

1. **Update the LLM endpoint:**

   ```python
   LLM_ENDPOINT_NAME = "<your-endpoint-from-original>"
   ```

2. **Update the system prompt:**

   ```python
   SYSTEM_PROMPT = """<your-system-prompt-from-original>"""
   ```

3. **Add your custom tools:** If your original agent had custom tools, add them:

   ```python
   from langchain_core.tools import tool

   @tool
   async def my_custom_tool(arg: str) -> str:
       """Tool description."""
       # Your tool logic (make async if needed)
       return result
   ```

4. **Convert all I/O to async:**
   - `def predict()` → `async def non_streaming()`
   - `def predict_stream()` → `async def streaming()`
   - `client.chat()` → `await client.achat()`
   - `for chunk in stream:` → `async for chunk in stream:`
   - Sync HTTP calls → `await` async equivalents

5. **Preserve any special logic:** Migrate any custom preprocessing, postprocessing, or business logic from the original agent.

3.4 Handle Stateful Agents


**If the original agent uses a checkpointer (short-term memory):**
- Add a checkpointer with Lakebase integration (use `AsyncCheckpointSaver` if async, or the sync equivalent if sync)
- Configure `LAKEBASE_INSTANCE_NAME` in `.env`
- Extract the thread_id from `request.custom_inputs` or `request.context.conversation_id`

**If the original agent uses a store (long-term memory):**
- Add a store with Lakebase integration (use `AsyncDatabricksStore` if async, or the sync equivalent if sync)
- Configure `LAKEBASE_INSTANCE_NAME` in `.env`
- Extract the user_id from `request.custom_inputs` or `request.context.user_id`


Step 4: Set Up the App


**Task:** Mark "Migrate agent code to Apps format" as `completed`. Mark "Set up and configure the app" as `in_progress`.

4.1 Verify Build Configuration


Before installing dependencies, ensure a README file exists (hatchling requires this):

```bash
# Create a minimal README if one doesn't exist
if [ ! -f "README.md" ]; then
  echo "# Migrated Agent App" > README.md
fi
```

4.2 Install Dependencies


```bash
cd <app-name>
uv sync
```

4.3 Create requirements.txt for Databricks Apps


Databricks Apps requires a `requirements.txt` file containing `uv` so that dependencies can be installed from `pyproject.toml`:

```bash
echo "uv" > requirements.txt
```

4.4 Run Quickstart


Run the `uv run quickstart` script to quickly set up your local environment. This is the recommended way to configure the app, as it handles all necessary setup automatically.

```bash
uv run quickstart
```

This script will:
  1. Verify uv, nvm, and Databricks CLI installations
  2. Configure Databricks authentication
  3. Configure agent tracing by creating an MLflow experiment and linking it to your app
  4. Configure `.env` with the necessary environment variables
Important: The quickstart script creates the MLflow experiment that the app needs for logging traces and models. This experiment will be added as a resource when deploying the app.
If there are issues with the quickstart script, refer to the manual setup in section 4.5.

4.5 Manual Environment Configuration (Optional)


If you need to manually configure the environment or add additional variables, edit `.env`:

```bash
# Databricks authentication
DATABRICKS_CONFIG_PROFILE=<your-profile>

# MLflow experiment (created by quickstart, or create manually)
MLFLOW_EXPERIMENT_ID=<experiment-id>

# Example: Lakebase for stateful agents
LAKEBASE_INSTANCE_NAME=<your-lakebase-instance>

# Example: Custom API keys
MY_API_KEY=<value>
```

To manually create an MLflow experiment:

```bash
databricks experiments create-experiment "/Users/<your-username>/<app-name>" --profile <profile>
```

Step 5: Test Locally


Task: Mark "Set up and configure the app" as `completed`. Mark "Test agent locally" as `in_progress`.
Test your migrated agent locally before deploying to Databricks Apps. This helps catch configuration issues early and ensures the agent works correctly.

5.1 Start the Server


After the quickstart setup is complete, start the agent server and chat app locally:

```bash
cd <app-name>
uv run start-app
```

Wait for the server to start. You should see output indicating the server is running on `http://localhost:8000`.
Note: If you only need the API endpoint (without the chat UI), you can run `uv run start-server` instead.

5.2 Test with Original Input Example


The original model artifacts include an `input_example.json` file that contains a sample request. Use it to verify that your migrated agent produces the same behavior. If there is no valid sample request, construct one based on the agent's code.

```bash
# Check the original input example (from the <app-name> directory)
cat ../original_mlflow_model/input_example.json
```

Example content:

```json
{"input": [{"role": "user", "content": "What is an LLM agent?"}], "custom_inputs": {"thread_id": "example-thread-123"}}
```

Test your local server with this input:

```bash
# Test with the original input example
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d "$(cat ../original_mlflow_model/input_example.json)"
```

5.3 Test Basic Requests


```bash
# Non-streaming
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hello!"}]}'

# Streaming
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hello!"}], "stream": true}'
```

5.4 Test with Custom Inputs (for stateful agents)


```bash
# With thread_id for short-term memory
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hi"}], "custom_inputs": {"thread_id": "test-123"}}'

# With user_id for long-term memory
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hi"}], "custom_inputs": {"user_id": "user@example.com"}}'
```

5.5 Verify Before Proceeding


Before proceeding to deployment, ensure:
  • The server starts without errors
  • The original input example returns a valid response
  • Streaming responses work correctly
  • Custom inputs (thread_id, user_id) are handled properly (if applicable)
Note: Only proceed to Step 6 (Deploy) after confirming the agent works correctly locally.
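The curl checks above can also be replayed from a short Python smoke test. Below is a minimal sketch of how the request bodies are built; the `/invocations` path and field names come from this guide's examples, and the helper function is hypothetical.

```python
# Build the JSON bodies used by the curl checks above.
# Field names (input, stream, custom_inputs) follow this guide's examples.
def build_invocation_payload(content: str, *, stream: bool = False, **custom_inputs) -> dict:
    payload = {"input": [{"role": "user", "content": content}]}
    if stream:
        payload["stream"] = True
    if custom_inputs:
        payload["custom_inputs"] = custom_inputs
    return payload
```

POST the serialized payload to `http://localhost:8000/invocations` with any HTTP client to mirror the checks above.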


Step 6: Deploy to Databricks Apps


Task: Mark "Test agent locally" as `completed`. Mark "Deploy to Databricks Apps" as `in_progress`.
This step uses Databricks Asset Bundles (DAB) to deploy. The scaffold includes a `databricks.yml` that you need to update with the app name and the resources from the original model.

6.1 Extract Resources from Original Model


The original model's `MLmodel` file contains a `resources` section that lists all Databricks resources the agent needs access to. Check `../original_mlflow_model/MLmodel` (or `./original_mlflow_model/MLmodel` if you're in the parent directory) for content like:

```yaml
resources:
  api_version: '1'
  databricks:
    lakebase:
    - name: lakebase
    serving_endpoint:
    - name: databricks-claude-sonnet-4-5
```

6.2 Update `databricks.yml` with Resources


The scaffold includes a `databricks.yml` with the experiment resource pre-configured. You need to:
  1. Update the app name to `<app-name>` (the name provided by the user) in both the `resources.apps.agent_migration.name` field and the `targets.prod.resources.apps.agent_migration.name` field.
  2. Add the resources extracted from the original MLmodel file to the `resources.apps.agent_migration.resources` list.
Resource Type Mapping (MLmodel → `databricks.yml`):

| MLmodel Resource | `databricks.yml` Resource | Key Fields |
|---|---|---|
| `serving_endpoint` | `serving_endpoint` | `name`, `permission` (CAN_QUERY) |
| `lakebase` | `database` | `database_name: databricks_postgres`, `instance_name`, `permission` (CAN_CONNECT_AND_CREATE) |
| `vector_search_index` | `uc_securable` | `securable_full_name`, `securable_type: TABLE`, `permission: SELECT` |
| `function` | `uc_securable` | `securable_full_name`, `securable_type: FUNCTION`, `permission: EXECUTE` |
| `table` | `uc_securable` | `securable_full_name`, `securable_type: TABLE`, `permission: SELECT` |
| `uc_connection` | `uc_securable` | `securable_full_name`, `securable_type: CONNECTION`, `permission: USE_CONNECTION` |
| `sql_warehouse` | `sql_warehouse` | `id`, `permission` (CAN_USE) |
| `genie_space` | `genie_space` | `space_id`, `permission` (CAN_RUN) |

Note: The `experiment` resource is already configured in the scaffold `databricks.yml` and is automatically created by the bundle. You do not need to add it manually.
Example: `databricks.yml` for an agent with a serving endpoint and UC function:

```yaml
resources:
  experiments:
    agent_migration_experiment:
      name: /Users/${workspace.current_user.userName}/${bundle.name}-${bundle.target}

  apps:
    agent_migration:
      name: "<app-name>"  # Update to user's app name
      description: "Migrated agent from Model Serving to Databricks Apps"
      source_code_path: ./
      resources:
        - name: 'experiment'
          experiment:
            experiment_id: "${resources.experiments.agent_migration_experiment.id}"
            permission: 'CAN_MANAGE'
        - name: 'serving-endpoint'
          serving_endpoint:
            name: 'databricks-claude-sonnet-4-5'
            permission: 'CAN_QUERY'
        - name: 'python-exec'
          uc_securable:
            securable_full_name: 'system.ai.python_exec'
            securable_type: 'FUNCTION'
            permission: 'EXECUTE'

targets:
  prod:
    resources:
      apps:
        agent_migration:
          name: "<app-name>"  # Same name for production
```

Example: Adding Lakebase resources (for stateful agents):

```yaml
        - name: 'database'
          database:
            database_name: 'databricks_postgres'
            instance_name: 'lakebase'
            permission: 'CAN_CONNECT_AND_CREATE'
```

6.3 Deploy with Databricks Asset Bundles


From inside the `<app-name>` directory, validate, deploy, and run:

```bash
# 1. Validate bundle configuration (catches errors before deploy)
databricks bundle validate --profile <profile>

# 2. Deploy the bundle (creates/updates resources, uploads files)
databricks bundle deploy --profile <profile>

# 3. Run the app (starts/restarts with uploaded source code) - REQUIRED!
databricks bundle run agent_migration --profile <profile>
```

> **Important:** `bundle deploy` only uploads files and configures resources. `bundle run` is **required** to actually start/restart the app with the new code. If you only run `deploy`, the app will continue running old code!

6.4 Test Deployed App


Task: Mark "Deploy to Databricks Apps" as `completed`. Mark "Test deployed app" as `in_progress`.

```bash
# Get the app URL
APP_URL=$(databricks apps get <app-name> --profile <profile> --output json | jq -r '.url')

# Get OAuth token
TOKEN=$(databricks auth token --profile <profile> | jq -r .access_token)

# Query the app
curl -X POST ${APP_URL}/invocations \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hello!"}]}'
```

Once the deployed app responds successfully:

> **Task:** Mark "Test deployed app" as `completed`. Migration complete!

6.5 Deployment Troubleshooting


If you encounter issues during deployment, refer to the deploy skill for detailed guidance.
Debug commands:

```bash
# Validate bundle configuration
databricks bundle validate --profile <profile>

# View app logs
databricks apps logs <app-name> --profile <profile> --follow

# Check app status
databricks apps get <app-name> --profile <profile> --output json | jq '{app_status, compute_status}'

# Get app URL
databricks apps get <app-name> --profile <profile> --output json | jq -r '.url'
```

**"App already exists" error:**
If `databricks bundle deploy` fails because the app already exists, refer to the **deploy** skill for instructions on binding an existing app to the bundle.

---

Reference: App File Structure


```
<app-name>/
├── agent_server/
│   ├── __init__.py
│   ├── agent.py          # Main agent logic - THIS IS WHERE YOU MIGRATE TO
│   ├── start_server.py   # FastAPI server setup
│   ├── utils.py          # Helper utilities
│   └── evaluate_agent.py # Agent evaluation
├── scripts/
│   ├── __init__.py
│   ├── quickstart.py     # Setup script
│   └── start_app.py      # App startup
├── app.yaml              # Databricks Apps configuration
├── databricks.yml        # Databricks Asset Bundle configuration (resources, targets)
├── pyproject.toml        # Dependencies (for local dev with uv)
├── requirements.txt      # REQUIRED: Must contain "uv" for Databricks Apps
├── .env.example          # Environment template
└── README.md
```

IMPORTANT: The `requirements.txt` file must exist and contain `uv` so that Databricks Apps can install dependencies using the `pyproject.toml`. Without this file, the app will fail to start.


Reference: Common Migration Patterns


Pattern 1: Simple Chat Agent


Original:

```python
class ChatAgent(ResponsesAgent):
    def predict(self, request, params=None):
        messages = to_chat_completions_input(request.input)
        response = self.llm.invoke(messages)
        return ResponsesAgentResponse(output=[...])
```

Migrated (sync):

```python
llm = ...  # Move class-level init to module level

@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    messages = to_chat_completions_input(request.input)
    response = llm.invoke(messages)
    return ResponsesAgentResponse(output=[...])

@stream()
def streaming(request: ResponsesAgentRequest):
    # Original predict_stream() body, with self. removed
    ...
```

Migrated (async):

```python
@invoke()
async def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    outputs = [e.item async for e in streaming(request) if e.type == "response.output_item.done"]
    return ResponsesAgentResponse(output=outputs)

@stream()
async def streaming(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]:
    messages = {"messages": to_chat_completions_input([i.model_dump() for i in request.input])}
    agent = await init_agent()
    async for event in process_agent_astream_events(agent.astream(messages, stream_mode=["updates", "messages"])):
        yield event
```

Pattern 2: Agent with Custom Tools


Sync: Keep tools as-is from the original code.
Async: Migrate tools to async LangChain tools:

```python
from langchain_core.tools import tool

@tool
async def search_docs(query: str) -> str:
    """Search the documentation."""
    results = await vector_store.asimilarity_search(query)
    return format_results(results)
```

Pattern 3: Using LangGraph with create_agent (async only)


```python
from langchain.agents import create_agent
from databricks_langchain import ChatDatabricks

async def init_agent():
    tools = await mcp_client.get_tools()  # MCP tools are async
    model = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)
    return create_agent(model=model, tools=tools, system_prompt=SYSTEM_PROMPT)
```


Reference: Useful Resources


Troubleshooting


"Module not found" errors


```bash
uv sync  # Reinstall dependencies
```

Authentication errors


```bash
databricks auth login  # Re-authenticate
```

Lakebase permission errors


  • Ensure the Lakebase instance is added as an app resource in Databricks UI
  • Grant appropriate permissions on the Lakebase instance

Async errors (async migration only)


  • Ensure all I/O calls use async versions (e.g., `await client.achat()`, not `client.chat()`)
  • Use `async for` instead of `for` when iterating async generators
  • If you chose sync migration, these errors should not occur; double-check that you're not mixing sync and async patterns
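The async-generator pitfall above can be reproduced in isolation. This is a standalone illustration with hypothetical names; no Databricks dependencies are involved:

```python
import asyncio

# Hypothetical async generator standing in for an agent's token stream.
async def stream_chunks():
    for word in ("hello", "world"):
        yield word

async def consume():
    # Wrong: `for chunk in stream_chunks()` raises
    # "TypeError: 'async_generator' object is not iterable".
    # Right: iterate with `async for` (here via an async comprehension).
    return [chunk async for chunk in stream_chunks()]

print(asyncio.run(consume()))  # ['hello', 'world']
```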