migrate-from-model-serving
Model Serving to Databricks Apps Migration Guide
This guide instructs LLM coding agents how to migrate an MLflow ResponsesAgent from Databricks Model Serving to Databricks Apps.
Overview
Goal: Migrate an agent deployed on Databricks Model Serving (a `ResponsesAgent` with `predict()`/`predict_stream()`) to Databricks Apps (using the MLflow GenAI Server with `@invoke`/`@stream` decorators).

Key Transformation:
- Model Serving: synchronous `predict()` and `predict_stream()` methods on a `ResponsesAgent` class
- Apps: functions with `@invoke` and `@stream` decorators (sync or async, based on user preference)
Deliverables: After migration is complete, you will have:
```
<working-directory>/
├── original_mlflow_model/    # Downloaded artifacts from Model Serving
│   ├── MLmodel
│   ├── code/
│   │   └── agent.py
│   ├── input_example.json
│   └── requirements.txt
│
└── <app-name>/               # New Databricks App (ready to deploy)
    ├── agent_server/
    │   ├── agent.py          # Migrated agent code
    │   └── ...
    ├── app.yaml
    ├── databricks.yml        # Bundle config with resources
    ├── pyproject.toml
    ├── requirements.txt
    └── ...
```

`<app-name>` is the name the user provides at the start of the migration. It is used as both the directory name and the Databricks App name at deploy time.
Before You Begin: Gather User Inputs
Before doing anything else, ask the user three questions. Use the `AskUserQuestion` tool to collect all answers at once so the user is only prompted once; Claude can then execute the rest of the migration autonomously.

Questions to ask:
- Databricks profile: Which Databricks CLI profile should be used for the workspace where the Model Serving endpoint lives? (Run `databricks auth profiles` first to list available profiles and their workspaces, then present the options to the user.)
- App name: What should the new Databricks App be named? (Must be lowercase, can contain letters, numbers, and hyphens, and must be unique within the workspace.)
- Async migration: Would you like to migrate your agent code to be fully async?
  - Yes (Recommended): Converts all I/O operations to async (`await`/`async for`), enabling higher concurrency on smaller compute: no more threads sitting idle while waiting for LLM responses or long-running tool calls.
  - No: Keeps your existing synchronous code with minimal changes: it just extracts the logic from the `ResponsesAgent` class and wraps it with `@invoke`/`@stream` decorators. Simpler migration, but each request blocks a thread while waiting for I/O.
Store the answers as:
- `<profile>`: used for ALL `databricks` CLI commands throughout the migration (via `--profile <profile>`)
- `<app-name>`: used as both the directory name for the migrated app AND the app name when deploying with `databricks bundle deploy`
- `<async>`: `yes` or `no`; determines whether to convert the agent code to async or keep it synchronous
Validate Authentication
After receiving the user's answers, validate the selected profile:
```bash
databricks current-user me --profile <profile>
```

If this fails with an authentication error, prompt the user to re-authenticate:

```bash
databricks auth login --profile <profile>
```

Important: Remember to include `--profile <profile>` on every `databricks` CLI command throughout the migration.
Create the App Directory
Copy all scaffold files from the current working directory into a new directory named `<app-name>/`. Exclude instruction files (`AGENTS.md`, `CLAUDE.md`), hidden directories (`.claude/`, `.git/`), and any migration artifacts (e.g., `original_mlflow_model/`, `.migration-venv/`). Do NOT search for or copy scaffold files from other directories or templates: everything you need is right here.

All subsequent migration steps operate inside the `<app-name>/` directory.

Note: The `agent_server/agent.py` scaffold is intentionally framework-agnostic: it contains the `@invoke`/`@stream` decorator pattern with TODO placeholders. Step 3 (Migrate the Agent Code) will replace these placeholders with the actual agent logic from the original Model Serving endpoint.
Create Task List
Create a task list to track progress. This helps the user follow along and see what's completed, in progress, and pending.
User tip: Press `Ctrl+T` to toggle the task list view in your terminal. The display shows up to 10 tasks at a time with status indicators.
Create the following tasks using the `TaskCreate` tool:

| Task | Description |
|---|---|
| Authenticate to Databricks | Verify Databricks CLI authentication and validate the selected profile |
| Download original agent artifacts | Download the MLflow model artifacts from Model Serving endpoint |
| Analyze and understand agent code | Examine the original agent code, identify tools, resources, and dependencies |
| Migrate agent code to Apps format | Transform ResponsesAgent class to @invoke/@stream decorated functions |
| Set up and configure the app | Install dependencies, run quickstart, configure environment |
| Test agent locally | Start local server and verify the agent works correctly |
| Deploy to Databricks Apps | Configure databricks.yml resources and deploy with Databricks Asset Bundles |
| Test deployed app | Verify the deployed app responds correctly |
Update task status as you progress:
- Mark tasks as `in_progress` when starting each step
- Mark tasks as `completed` when finished
- This gives the user visibility into migration progress
Step 1: Download the Original Agent Code
Task: Mark "Authenticate to Databricks" as `completed`. Mark "Download original agent artifacts" as `in_progress`.

Note: The `<profile>` and `<app-name>` values were collected from the user in the "Before You Begin" section. Use them throughout.

Download the original agent code from the Model Serving endpoint. This requires setting up a virtual environment with MLflow to access the model artifacts.
1.1 Get Model Info from Endpoint
If you have a serving endpoint name, extract the model details:

```bash
# Get endpoint info (remember to include --profile if using non-default)
databricks serving-endpoints get <endpoint-name> --profile <profile> --output json
```

Look for `served_entities[0].entity_name` (model name) and `entity_version` in the response. Find the entity with 100% traffic in `traffic_config.routes`.
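Picking the served model out of that response can be sketched as follows; the field names (`served_entities`, `traffic_config.routes`, `traffic_percentage`) are assumptions based on a typical response shape, so verify them against your actual output:

```python
import json

# Trimmed sample of a `databricks serving-endpoints get ... --output json`
# response (assumed shape; values are illustrative).
endpoint_json = """
{
  "config": {
    "served_entities": [
      {"name": "agent-1", "entity_name": "main.agents.my_agent", "entity_version": "3"}
    ],
    "traffic_config": {
      "routes": [
        {"served_model_name": "agent-1", "traffic_percentage": 100}
      ]
    }
  }
}
"""

config = json.loads(endpoint_json)["config"]

# Pick the route carrying 100% of the traffic, then resolve it to its entity.
route = next(r for r in config["traffic_config"]["routes"] if r["traffic_percentage"] == 100)
entity = next(e for e in config["served_entities"] if e["name"] == route["served_model_name"])

MODEL_NAME = entity["entity_name"]
VERSION = entity["entity_version"]
print(f"models:/{MODEL_NAME}/{VERSION}")  # models:/main.agents.my_agent/3
```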
1.2 Download Model Artifacts
Use `uv run --with` to download artifacts without creating a separate virtual environment. The `mlflow[databricks]` extra includes `boto3` for Unity Catalog artifact access:

```bash
DATABRICKS_CONFIG_PROFILE=<profile> uv run --no-project \
  --with "mlflow[databricks]>=2.15.0" \
  --with "databricks-sdk>=0.30.0" \
  python3 << 'EOF'
import mlflow
mlflow.set_tracking_uri("databricks")

# Replace with actual values from step 1.1
MODEL_NAME = "<model-name>"
VERSION = "<version>"

print(f"Downloading model: models:/{MODEL_NAME}/{VERSION}")
mlflow.artifacts.download_artifacts(
    artifact_uri=f"models:/{MODEL_NAME}/{VERSION}",
    dst_path="./original_mlflow_model"
)
print("Download complete! Artifacts saved to ./original_mlflow_model")
EOF
```

1.3 Verify Downloaded Artifacts
Check that the key files exist and understand the full structure:

```bash
# List all downloaded files recursively
find ./original_mlflow_model -type f | head -50

# Check the MLmodel file (contains resource requirements)
cat ./original_mlflow_model/MLmodel

# Check for the input example (useful for testing)
cat ./original_mlflow_model/input_example.json 2>/dev/null
```

**Examine the `/code` folder** - contains all code dependencies logged via `code_paths=["..."]`:

```bash
# List all code files
ls -la ./original_mlflow_model/code/

# The main agent is typically agent.py, but there may be additional modules
find ./original_mlflow_model/code -name "*.py" -type f
```

**Examine the `/artifacts` folder** (if present) - contains artifacts logged via `artifacts={...}`:

```bash
# Check for an artifacts folder
ls -la ./original_mlflow_model/artifacts/ 2>/dev/null

# List all artifacts
find ./original_mlflow_model/artifacts -type f 2>/dev/null
```

> **Important:** Take note of ALL files in `/code` and `/artifacts`. You will need to copy these to the migrated app and ensure imports still work correctly.

Expected Output Structure
After successful download, you should have:
```
./original_mlflow_model/
├── MLmodel                  # Model metadata and resource requirements
├── code/                    # Code logged via code_paths=["..."]
│   ├── agent.py             # Main agent implementation
│   ├── utils.py             # (optional) Helper modules
│   ├── tools.py             # (optional) Custom tool definitions
│   └── ...                  # Any other code dependencies
├── artifacts/               # (optional) Artifacts logged via artifacts={...}
│   ├── config.yaml          # (optional) Configuration files
│   ├── prompts/             # (optional) Prompt templates
│   └── ...                  # Any other artifacts (data files, etc.)
├── input_example.json       # Sample request for testing
├── requirements.txt         # Original dependencies
└── ...
```
Key Files to Examine
- `code/agent.py`: contains the `ResponsesAgent` class with the `predict()` and `predict_stream()` methods
- `code/*.py`: any additional Python modules the agent imports
- `MLmodel`: contains the `resources` section listing required Databricks resources
- `artifacts/`: any configuration files, prompts, or data files the agent uses
- `input_example.json`: use this to test the migrated agent
Troubleshooting Model Download
"Unable to import necessary dependencies to access model version files in Unity Catalog"

This means `boto3` is missing. Ensure you're using `mlflow[databricks]` (not just `mlflow`) in the `--with` flag; the `[databricks]` extra includes `boto3`.

"INVALID_PARAMETER_VALUE" or authentication errors

Re-authenticate with Databricks (include the profile if non-default):

```bash
databricks auth login --profile <profile>
```

Wrong workspace / Model not found

Make sure you're using the correct profile that corresponds to the workspace where the model is deployed:
undefined"无法导入必要的依赖项以访问Unity Catalog中的模型版本文件"
这意味着缺少。确保在标志中使用(而不仅仅是)——扩展包含。
boto3--withmlflow[databricks]mlflow[databricks]boto3"INVALID_PARAMETER_VALUE"或身份验证错误
重新进行Databricks身份验证(如果是非默认配置文件,请包含配置文件):
bash
databricks auth login --profile <profile>错误的工作区/未找到模型
确保你使用的配置文件对应于模型部署所在的工作区:
bash
undefinedList profiles to see which workspace each points to
列出配置文件以查看每个配置文件指向哪个工作区
databricks auth profiles
databricks auth profiles
Verify you can access the workspace
验证你是否可以访问该工作区
databricks current-user me --profile <profile>
databricks current-user me --profile <profile>
List models in that workspace
列出该工作区中的模型
databricks registered-models list --profile <profile>
databricks model-versions list --name "<model-name>" --profile <profile>
---databricks registered-models list --profile <profile>
databricks model-versions list --name "<model-name>" --profile <profile>
---Step 2: Understand the Key Transformations
Task: Mark "Download original agent artifacts" as `completed`. Mark "Analyze and understand agent code" as `in_progress`.
Entry Point Transformation
In both cases, the `ResponsesAgent` class is replaced with decorated functions. The difference is whether those functions are async or sync.

Model Serving (OLD):

```python
from mlflow.pyfunc import ResponsesAgent, ResponsesAgentRequest, ResponsesAgentResponse

class MyAgent(ResponsesAgent):
    def predict(self, request: ResponsesAgentRequest, params=None) -> ResponsesAgentResponse:
        # Synchronous implementation
        ...
        return ResponsesAgentResponse(output=outputs)

    def predict_stream(self, request: ResponsesAgentRequest, params=None):
        # Synchronous generator
        for chunk in ...:
            yield ResponsesAgentStreamEvent(...)
```

Apps — Async (if `<async>` = yes):

```python
from collections.abc import AsyncGenerator

from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
)

@invoke()
async def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    # Async implementation - typically calls streaming() and collects results
    outputs = [
        event.item
        async for event in streaming(request)
        if event.type == "response.output_item.done"
    ]
    return ResponsesAgentResponse(output=outputs)

@stream()
async def streaming(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]:
    # Async generator
    async for event in ...:
        yield event
```

Apps — Sync (if `<async>` = no):

```python
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
)

@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    # Same sync logic from the original predict(), extracted from the class
    ...
    return ResponsesAgentResponse(output=outputs)

@stream()
def streaming(request: ResponsesAgentRequest):
    # Same sync generator from the original predict_stream(), extracted from the class
    for chunk in ...:
        yield ResponsesAgentStreamEvent(...)
```
yield ResponsesAgentStreamEvent(...)在两种情况下,类都被替换为带有装饰器的函数。区别在于这些函数是异步还是同步。
ResponsesAgentModel Serving(旧版):
python
from mlflow.pyfunc import ResponsesAgent, ResponsesAgentRequest, ResponsesAgentResponse
class MyAgent(ResponsesAgent):
def predict(self, request: ResponsesAgentRequest, params=None) -> ResponsesAgentResponse:
# 同步实现
...
return ResponsesAgentResponse(output=outputs)
def predict_stream(self, request: ResponsesAgentRequest, params=None):
# 同步生成器
for chunk in ...:
yield ResponsesAgentStreamEvent(...)Apps — 异步(如果 = yes):
<async>python
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
ResponsesAgentRequest,
ResponsesAgentResponse,
ResponsesAgentStreamEvent,
)
@invoke()
async def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
# 异步实现 - 通常调用streaming()并收集结果
outputs = [
event.item
async for event in streaming(request)
if event.type == "response.output_item.done"
]
return ResponsesAgentResponse(output=outputs)
@stream()
async def streaming(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]:
# 异步生成器
async for event in ...:
yield eventApps — 同步(如果 = no):
<async>python
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
ResponsesAgentRequest,
ResponsesAgentResponse,
ResponsesAgentStreamEvent,
)
@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
# 与原始predict()相同的同步逻辑,从类中提取
...
return ResponsesAgentResponse(output=outputs)
@stream()
def streaming(request: ResponsesAgentRequest):
# 与原始predict_stream()相同的同步生成器,从类中提取
for chunk in ...:
yield ResponsesAgentStreamEvent(...)Key Differences
| Aspect | Model Serving | Apps (async) | Apps (sync) |
|---|---|---|---|
| Structure | `ResponsesAgent` class | Decorated functions | Decorated functions |
| Functions | `predict()` / `predict_stream()` | `async def non_streaming()` / `async def streaming()` | `def non_streaming()` / `def streaming()` |
| Streaming | Sync generator (`yield`) | Async generator (`async for` + `yield`) | Sync generator (`yield`) |
| Server | MLflow Model Server | MLflow GenAI Server (FastAPI) | MLflow GenAI Server (FastAPI) |
| Deployment | Model Serving endpoint | `databricks bundle deploy` | `databricks bundle deploy` |
Async Patterns (only if `<async>` = yes)

Skip this section if the user chose synchronous migration. The sync path keeps all original I/O calls as-is.
All I/O operations must be converted to async:

```python
# OLD (sync)
response = client.chat(messages)

# NEW (async)
response = await client.achat(messages)

# OLD (sync iteration)
for chunk in stream:
    yield chunk

# NEW (async iteration)
async for chunk in stream:
    yield chunk
```
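As a stdlib-only sketch of this conversion (no MLflow imports; `sync_stream` below is an illustrative stand-in for the original blocking client):

```python
import asyncio

# Stand-in for the original synchronous streaming client (assumption:
# the real client yields chunks from a blocking iterator).
def sync_stream():
    for chunk in ["Hello", ", ", "world"]:
        yield chunk

# OLD (sync): each yield blocks the serving thread.
def predict_stream_sync():
    for chunk in sync_stream():
        yield chunk

# NEW (async): an async generator frees the event loop between chunks.
# If the client has no native async API, asyncio.to_thread() can wrap the
# blocking calls; the source here is trivial, so we yield directly.
async def streaming_async():
    for chunk in sync_stream():
        await asyncio.sleep(0)  # hand control back to the event loop
        yield chunk

async def collect():
    # Mirrors how non_streaming() collects the stream's events
    return [c async for c in streaming_async()]

chunks = asyncio.run(collect())
print("".join(chunks))  # Hello, world
```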
---

Step 3: Migrate the Agent Code
Task: Mark "Analyze and understand agent code" as `completed`. Mark "Migrate agent code to Apps format" as `in_progress`.
3.1 Copy Code Dependencies and Artifacts
The original MLflow model may contain multiple code files and artifacts that need to be migrated.

Copy all code files from `/code` to `agent_server/`:

```bash
# Copy all Python files from the original code folder
cp ./original_mlflow_model/code/*.py ./<app-name>/agent_server/

# If there are subdirectories with code, copy those too
cp -r ./original_mlflow_model/code/submodule ./<app-name>/agent_server/
```

**Copy artifacts (if present):**

```bash
# Create an artifacts directory in the migrated app if needed
mkdir -p ./<app-name>/agent_server/artifacts

# Copy all artifacts
cp -r ./original_mlflow_model/artifacts/* ./<app-name>/agent_server/artifacts/ 2>/dev/null || true
```

**Fix import paths after copying:**

When code files are moved, imports may break. Check and update imports in all copied files:

```python
# BEFORE (if files were in different locations):
from code.utils import helper_function
from artifacts.prompts import SYSTEM_PROMPT

# AFTER (files are now in agent_server/):
from agent_server.utils import helper_function
# Or, if in the same directory:
from .utils import helper_function
```

For artifacts, update file paths:

```python
# BEFORE:
with open("artifacts/config.yaml") as f:
    ...

# AFTER: resolve paths relative to this module's location
import os

config_path = os.path.join(os.path.dirname(__file__), "artifacts", "config.yaml")
with open(config_path) as f:
    ...
```

> **Important:** Review each copied file and ensure all imports resolve correctly. The most common issues are:
> - Relative imports that assumed a different directory structure
> - Hardcoded file paths to artifacts
> - Missing `__init__.py` files for package imports

3.2 Extract Configuration
From the original agent code, identify and preserve:
- LLM endpoint name (e.g., `databricks-claude-sonnet-4-5`)
- System prompt
- Tool definitions
- Any custom logic
3.3 Update the Agent Entry Point
The approach depends on whether the user chose async or sync migration.
Path A: Synchronous Migration (`<async>` = no)

This is the minimal-changes path. Extract the logic from the `ResponsesAgent` class, wrap it with `@invoke`/`@stream` decorators, and keep all code synchronous.

Edit `<app-name>/agent_server/agent.py`:

1. **Replace the scaffold with the original agent logic.** The core transformation is extracting the class methods into decorated functions:

```python
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
)

# Move any class __init__ or class-level setup to module level
# e.g., client initialization, tool setup, etc.

@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    # Paste the body of the original predict() method here
    # Remove 'self.' references and replace them with module-level variables
    # Remove the 'params' parameter (not used in Apps)
    ...
    return ResponsesAgentResponse(output=outputs)

@stream()
def streaming(request: ResponsesAgentRequest):
    # Paste the body of the original predict_stream() method here
    # Remove 'self.' references and replace them with module-level variables
    # Remove the 'params' parameter (not used in Apps)
    for chunk in ...:
        yield ResponsesAgentStreamEvent(...)
```

2. **Key changes from class to functions:**
   - Remove the `class MyAgent(ResponsesAgent):` wrapper
   - Remove the `self` parameter from all methods
   - Move `__init__` logic (client creation, tool setup) to module-level code
   - Replace `self.some_attribute` with module-level variables
   - Add the `@invoke()` decorator to the non-streaming function
   - Add the `@stream()` decorator to the streaming function

3. **Keep all other code as-is**: no need to convert sync calls to async, no need to change `for` to `async for`, no need to add `await`.

---
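The class-to-functions extraction above can be sketched framework-free; `MyAgent`, the request shape, and `SYSTEM_PROMPT` below are illustrative stand-ins, and the `@invoke()`/`@stream()` decorators are omitted so the snippet runs anywhere:

```python
# BEFORE: state and logic live on a class (Model Serving style).
class MyAgent:
    def __init__(self):
        self.system_prompt = "You are helpful."

    def predict(self, request, params=None):
        return {"output": [f"{self.system_prompt} | {request['input']}"]}

# AFTER: __init__ state moves to module level; 'self.' references and the
# unused 'params' parameter go away. In the real migration this function
# would carry the @invoke() decorator.
SYSTEM_PROMPT = "You are helpful."

def non_streaming(request):
    # Body of the original predict(), with self.system_prompt -> SYSTEM_PROMPT
    return {"output": [f"{SYSTEM_PROMPT} | {request['input']}"]}

# Both paths produce identical responses for the same request.
assert MyAgent().predict({"input": "hi"}) == non_streaming({"input": "hi"})
```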
Path B: Async Migration (`<async>` = yes)

This path converts all I/O operations to async for higher concurrency. More changes are required, but the result is a more efficient server.

Edit `<app-name>/agent_server/agent.py`:

1. **Update the LLM endpoint:**

   ```python
   LLM_ENDPOINT_NAME = "<your-endpoint-from-original>"
   ```

2. **Update the system prompt:**

   ```python
   SYSTEM_PROMPT = """<your-system-prompt-from-original>"""
   ```

3. **Add your custom tools.** If your original agent had custom tools, add them:

   ```python
   from langchain_core.tools import tool

   @tool
   async def my_custom_tool(arg: str) -> str:
       """Tool description."""
       # Your tool logic (make async if needed)
       return result
   ```

4. **Convert all I/O to async:**
   - `def predict()` → `async def non_streaming()`
   - `def predict_stream()` → `async def streaming()`
   - `client.chat()` → `await client.achat()`
   - `for chunk in stream:` → `async for chunk in stream:`
   - Sync HTTP calls → `await` async equivalents

5. **Preserve any special logic.** Migrate any custom preprocessing, postprocessing, or business logic from the original agent.
3.4 Handle Stateful Agents
If the original uses a checkpointer (short-term memory):
- Add a checkpointer with Lakebase integration (use `AsyncCheckpointSaver` if async, or the sync equivalent if sync)
- Configure `LAKEBASE_INSTANCE_NAME` in `.env`
- Extract the thread_id from `request.custom_inputs` or `request.context.conversation_id`

If the original uses a store (long-term memory):
- Add a store with Lakebase integration (use `AsyncDatabricksStore` if async, or the sync equivalent if sync)
- Configure `LAKEBASE_INSTANCE_NAME` in `.env`
- Extract the user_id from `request.custom_inputs` or `request.context.user_id`
Step 4: Set Up the App
Task: Mark "Migrate agent code to Apps format" as. Mark "Set up and configure the app" ascompleted.in_progress
任务: 将“将代理代码迁移到Apps格式”标记为。将“设置并配置App”标记为completed。in_progress
4.1 Verify Build Configuration
Before installing dependencies, ensure a README file exists (hatchling requires this):

```bash
# Create a minimal README if one doesn't exist
if [ ! -f "README.md" ]; then
  echo "# Migrated Agent App" > README.md
fi
```
4.2 Install Dependencies
```bash
cd <app-name>
uv sync
```

4.3 Create requirements.txt for Databricks Apps
Databricks Apps requires a `requirements.txt` file containing `uv` to install dependencies from `pyproject.toml`:

```bash
echo "uv" > requirements.txt
```

4.4 Run Quickstart
Run the `uv run quickstart` script to quickly set up your local environment. This is the recommended way to configure the app, as it handles all necessary setup automatically.

```bash
uv run quickstart
```

This script will:
- Verify the uv, nvm, and Databricks CLI installations
- Configure Databricks authentication
- Configure agent tracing by creating an MLflow experiment and linking it to your app
- Configure `.env` with the necessary environment variables

Important: The quickstart script creates the MLflow experiment that the app needs for logging traces and models. This experiment will be added as a resource when deploying the app.

If there are issues with the quickstart script, refer to the manual setup in section 4.5.
4.5 Manual Environment Configuration (Optional)
If you need to manually configure the environment or add additional variables, edit `.env`:

```bash
# Databricks authentication
DATABRICKS_CONFIG_PROFILE=<your-profile>

# MLflow experiment (created by quickstart, or create manually)
MLFLOW_EXPERIMENT_ID=<experiment-id>

# Example: Lakebase for stateful agents
LAKEBASE_INSTANCE_NAME=<your-lakebase-instance>

# Example: Custom API keys
MY_API_KEY=<value>
```

To manually create an MLflow experiment:

```bash
databricks experiments create-experiment "/Users/<your-username>/<app-name>" --profile <profile>
```

Step 5: Test Locally
> **Task:** Mark "Set up and configure the app" as `completed`. Mark "Test agent locally" as `in_progress`.

Test your migrated agent locally before deploying to Databricks Apps. This helps catch configuration issues early and ensures the agent works correctly.
### 5.1 Start the Server

After the quickstart setup is complete, start the agent server and chat app locally:

```bash
cd <app-name>
uv run start-app
```

Wait for the server to start. You should see output indicating the server is running on `http://localhost:8000`.

> **Note:** If you only need the API endpoint (without the chat UI), you can run `uv run start-server` instead.
### 5.2 Test with Original Input Example

The original model artifacts include an `input_example.json` file that contains a sample request. Use it to verify that the migrated agent produces the same behavior. If there is no valid sample request, construct one by inspecting the agent's code.

```bash
# Check the original input example (from the <app-name> directory)
cat ../original_mlflow_model/input_example.json
```

Example content:

```json
{"input": [{"role": "user", "content": "What is an LLM agent?"}], "custom_inputs": {"thread_id": "example-thread-123"}}
```

Test your local server with this input:

```bash
# Test with the original input example
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d "$(cat ../original_mlflow_model/input_example.json)"
```
### 5.3 Test Basic Requests

```bash
# Non-streaming
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hello!"}]}'

# Streaming
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hello!"}], "stream": true}'
```
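When a streaming request is served as server-sent events, each event typically arrives as a `data:` line of JSON. The following is an illustrative parser sketch only: the `data:` framing, the `[DONE]` sentinel, and the event field names are assumptions for the example, not the server's documented wire format.

```python
import json

def iter_sse_events(lines):
    """Yield JSON payloads from captured SSE lines.

    Assumes each event is a 'data: {...}' line and that an optional
    'data: [DONE]' sentinel terminates the stream.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)

# Example: reassemble text from two captured delta events (illustrative shapes)
captured = [
    'data: {"type": "response.output_text.delta", "delta": "Hel"}',
    "",
    'data: {"type": "response.output_text.delta", "delta": "lo"}',
    "",
    "data: [DONE]",
]
events = list(iter_sse_events(captured))
text = "".join(e["delta"] for e in events)  # "Hello"
```

Inspect a real streaming response from your server first and adjust the field names accordingly.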
### 5.4 Test with Custom Inputs (for stateful agents)

```bash
# With thread_id for short-term memory
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hi"}], "custom_inputs": {"thread_id": "test-123"}}'

# With user_id for long-term memory
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hi"}], "custom_inputs": {"user_id": "user@example.com"}}'
```
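If you script these checks instead of hand-writing curl payloads, a small helper can build the request bodies. The helper name is hypothetical; the field names mirror the request examples in this guide.

```python
import json

def build_invocations_body(user_text, thread_id=None, user_id=None, stream=False):
    """Build an /invocations request body matching the curl examples above."""
    body = {"input": [{"role": "user", "content": user_text}]}
    custom_inputs = {}
    if thread_id is not None:
        custom_inputs["thread_id"] = thread_id
    if user_id is not None:
        custom_inputs["user_id"] = user_id
    if custom_inputs:
        body["custom_inputs"] = custom_inputs
    if stream:
        body["stream"] = True
    return body

# Serialize for use as a POST body
payload = json.dumps(build_invocations_body("Hi", thread_id="test-123"))
```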
### 5.5 Verify Before Proceeding

Before proceeding to deployment, ensure:
- The server starts without errors
- The original input example returns a valid response
- Streaming responses work correctly
- Custom inputs (thread_id, user_id) are handled properly (if applicable)

> **Note:** Only proceed to Step 6 (Deploy) after confirming the agent works correctly locally.
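The "returns a valid response" check can be automated with a minimal shape test. This is a heuristic sketch: the top-level `output` list comes from the response examples in this guide, while the nested item shape in the sample body is an assumed illustration.

```python
def looks_like_valid_response(resp):
    """Heuristic check that a non-streaming /invocations reply has a
    non-empty 'output' list, per the response examples above."""
    output = resp.get("output")
    return isinstance(output, list) and len(output) > 0

# Illustration against a hand-written response body (assumed item shape)
sample = {"output": [{"type": "message", "content": [{"type": "output_text", "text": "Hi"}]}]}
ok = looks_like_valid_response(sample)
empty = looks_like_valid_response({})
```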
## Step 6: Deploy to Databricks Apps

> **Task:** Mark "Test agent locally" as `completed`. Mark "Deploy to Databricks Apps" as `in_progress`.

This step uses Databricks Asset Bundles (DAB) to deploy. The scaffold includes a `databricks.yml` that you need to update with the app name and the resources from the original model.

### 6.1 Extract Resources from Original Model
The original model's `MLmodel` file contains a `resources` section that lists all Databricks resources the agent needs access to. Check `../original_mlflow_model/MLmodel` (or `./original_mlflow_model/MLmodel` if you're in the parent directory) for content like:

```yaml
resources:
  api_version: '1'
  databricks:
    lakebase:
    - name: lakebase
    serving_endpoint:
    - name: databricks-claude-sonnet-4-5
```

### 6.2 Update databricks.yml with Resources

The scaffold includes a `databricks.yml` with the experiment resource pre-configured. You need to:

- Update the app name to `<app-name>` (the name provided by the user) in both the `resources.apps.agent_migration.name` field and the `targets.prod.resources.apps.agent_migration.name` field.
- Add the resources extracted from the original MLmodel file to the `resources.apps.agent_migration.resources` list.
**Resource Type Mapping (MLmodel → `databricks.yml`):**

| MLmodel Resource | `databricks.yml` | Key Fields |
|---|---|---|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
> **Note:** The `experiment` resource is already configured in the scaffold `databricks.yml` and is automatically created by the bundle. You do not need to add it manually.

**Example:** `databricks.yml` for an agent with a serving endpoint and UC function:

```yaml
resources:
  experiments:
    agent_migration_experiment:
      name: /Users/${workspace.current_user.userName}/${bundle.name}-${bundle.target}
  apps:
    agent_migration:
      name: "<app-name>"  # Update to user's app name
      description: "Migrated agent from Model Serving to Databricks Apps"
      source_code_path: ./
      resources:
        - name: 'experiment'
          experiment:
            experiment_id: "${resources.experiments.agent_migration_experiment.id}"
            permission: 'CAN_MANAGE'
        - name: 'serving-endpoint'
          serving_endpoint:
            name: 'databricks-claude-sonnet-4-5'
            permission: 'CAN_QUERY'
        - name: 'python-exec'
          uc_securable:
            securable_full_name: 'system.ai.python_exec'
            securable_type: 'FUNCTION'
            permission: 'EXECUTE'

targets:
  prod:
    resources:
      apps:
        agent_migration:
          name: "<app-name>"  # Same name for production
```

**Example:** Adding Lakebase resources (for stateful agents):

```yaml
        - name: 'database'
          database:
            database_name: 'databricks_postgres'
            instance_name: 'lakebase'
            permission: 'CAN_CONNECT_AND_CREATE'
```

### 6.3 Deploy with Databricks Asset Bundles
From inside the `<app-name>` directory, validate, deploy, and run:

```bash
# 1. Validate bundle configuration (catches errors before deploy)
databricks bundle validate --profile <profile>

# 2. Deploy the bundle (creates/updates resources, uploads files)
databricks bundle deploy --profile <profile>

# 3. Run the app (starts/restarts with uploaded source code) - REQUIRED!
databricks bundle run agent_migration --profile <profile>
```

> **Important:** `bundle deploy` only uploads files and configures resources. `bundle run` is **required** to actually start/restart the app with the new code. If you only run `deploy`, the app will continue running old code!

### 6.4 Test Deployed App
> **Task:** Mark "Deploy to Databricks Apps" as `completed`. Mark "Test deployed app" as `in_progress`.

```bash
# Get the app URL
APP_URL=$(databricks apps get <app-name> --profile <profile> --output json | jq -r '.url')

# Get OAuth token
TOKEN=$(databricks auth token --profile <profile> | jq -r .access_token)

# Query the app
curl -X POST ${APP_URL}/invocations \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "Hello!"}]}'
```

Once the deployed app responds successfully:

> **Task:** Mark "Test deployed app" as `completed`. Migration complete!

### 6.5 Deployment Troubleshooting
If you encounter issues during deployment, refer to the deploy skill for detailed guidance.

**Debug commands:**

```bash
# Validate bundle configuration
databricks bundle validate --profile <profile>

# View app logs
databricks apps logs <app-name> --profile <profile> --follow

# Check app status
databricks apps get <app-name> --profile <profile> --output json | jq '{app_status, compute_status}'

# Get app URL
databricks apps get <app-name> --profile <profile> --output json | jq -r '.url'
```

**"App already exists" error:**

If `databricks bundle deploy` fails because the app already exists, refer to the **deploy** skill for instructions on binding an existing app to the bundle.

---

## Reference: App File Structure
```
<app-name>/
├── agent_server/
│   ├── __init__.py
│   ├── agent.py            # Main agent logic - THIS IS WHERE YOU MIGRATE TO
│   ├── start_server.py     # FastAPI server setup
│   ├── utils.py            # Helper utilities
│   └── evaluate_agent.py   # Agent evaluation
├── scripts/
│   ├── __init__.py
│   ├── quickstart.py       # Setup script
│   └── start_app.py        # App startup
├── app.yaml                # Databricks Apps configuration
├── databricks.yml          # Databricks Asset Bundle configuration (resources, targets)
├── pyproject.toml          # Dependencies (for local dev with uv)
├── requirements.txt        # REQUIRED: Must contain "uv" for Databricks Apps
├── .env.example            # Environment template
└── README.md
```

**IMPORTANT:** The `requirements.txt` file must exist and contain `uv` so that Databricks Apps can install dependencies from `pyproject.toml` using `uv`. Without this file, the app will fail to start.
## Reference: Common Migration Patterns
### Pattern 1: Simple Chat Agent

**Original:**

```python
class ChatAgent(ResponsesAgent):
    def predict(self, request, params=None):
        messages = to_chat_completions_input(request.input)
        response = self.llm.invoke(messages)
        return ResponsesAgentResponse(output=[...])
```

**Migrated (sync):**

```python
llm = ...  # Move class-level init to module level

@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    messages = to_chat_completions_input(request.input)
    response = llm.invoke(messages)
    return ResponsesAgentResponse(output=[...])

@stream()
def streaming(request: ResponsesAgentRequest):
    # Original predict_stream() body, with self. removed
    ...
```

**Migrated (async):**

```python
@invoke()
async def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    outputs = [e.item async for e in streaming(request) if e.type == "response.output_item.done"]
    return ResponsesAgentResponse(output=outputs)

@stream()
async def streaming(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]:
    messages = {"messages": to_chat_completions_input([i.model_dump() for i in request.input])}
    agent = await init_agent()
    async for event in process_agent_astream_events(agent.astream(messages, stream_mode=["updates", "messages"])):
        yield event
```

### Pattern 2: Agent with Custom Tools
**Sync:** Keep tools as-is from the original code.

**Async:** Migrate tools to async LangChain tools:

```python
from langchain_core.tools import tool

@tool
async def search_docs(query: str) -> str:
    """Search the documentation."""
    results = await vector_store.asimilarity_search(query)
    return format_results(results)
```

### Pattern 3: Using LangGraph with create_agent (async only)
```python
from langchain.agents import create_agent
from databricks_langchain import ChatDatabricks

async def init_agent():
    tools = await mcp_client.get_tools()  # MCP tools are async
    model = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)
    return create_agent(model=model, tools=tools, system_prompt=SYSTEM_PROMPT)
```

## Reference: Useful Resources
- Responses API Docs: https://mlflow.org/docs/latest/genai/serving/responses-agent/
- Agent Framework: https://docs.databricks.com/aws/en/generative-ai/agent-framework/
- Agent Tools: https://docs.databricks.com/aws/en/generative-ai/agent-framework/agent-tool
- databricks-langchain SDK: https://github.com/databricks/databricks-ai-bridge/tree/main/integrations/langchain
## Troubleshooting

### "Module not found" errors

```bash
uv sync  # Reinstall dependencies
```

### Authentication errors

```bash
databricks auth login  # Re-authenticate
```

### Lakebase permission errors
- Ensure the Lakebase instance is added as an app resource in the Databricks UI
- Grant appropriate permissions on the Lakebase instance
### Async errors (async migration only)

- Ensure all I/O calls use async versions (e.g., `await client.achat()`, not `client.chat()`)
- Use `async for` instead of `for` when iterating async generators
- If you chose sync migration, these errors should not occur — double-check that you're not mixing sync and async patterns