# Databricks Asset Bundle (DABs) Writer

## Overview

Create Databricks Asset Bundles (DABs) for multi-environment deployment (dev/staging/prod).

## Reference Files

- `SDP_guidance.md` - Spark Declarative Pipeline configurations
- `alerts_guidance.md` - SQL Alert schemas (critical - the API differs from other resources)

## Bundle Structure

```
project/
├── databricks.yml           # Main config + targets
├── resources/*.yml          # Resource definitions
└── src/                     # Code/dashboard files
```

## Main Configuration (databricks.yml)

```yaml
bundle:
  name: project-name

include:
  - resources/*.yml

variables:
  catalog:
    default: "default_catalog"
  schema:
    default: "default_schema"
  warehouse_id:
    lookup:
      warehouse: "Shared SQL Warehouse"

targets:
  dev:
    default: true
    mode: development
    workspace:
      profile: dev-profile
    variables:
      catalog: "dev_catalog"
      schema: "dev_schema"

  prod:
    mode: production
    workspace:
      profile: prod-profile
    variables:
      catalog: "prod_catalog"
      schema: "prod_schema"
```

## Dashboard Resources

Support for the `dataset_catalog` and `dataset_schema` parameters was added in Databricks CLI 0.281.0 (January 2026).

```yaml
resources:
  dashboards:
    dashboard_name:
      display_name: "[${bundle.target}] Dashboard Title"
      file_path: ../src/dashboards/dashboard.lvdash.json  # Relative to resources/
      warehouse_id: ${var.warehouse_id}
      dataset_catalog: ${var.catalog}  # Default catalog for all datasets in the dashboard unless overridden in the query
      dataset_schema: ${var.schema}    # Default schema for all datasets in the dashboard unless overridden in the query
      permissions:
        - level: CAN_RUN
          group_name: "users"
```

Permission levels: `CAN_READ`, `CAN_RUN`, `CAN_EDIT`, `CAN_MANAGE`

## Pipelines

See `SDP_guidance.md` for pipeline configuration.
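
For orientation, a minimal pipeline resource might look like the sketch below. The resource key, pipeline name, and notebook path are placeholders, and depending on CLI version the target schema field may be `target` rather than `schema` - treat `SDP_guidance.md` as the authoritative reference.

```yaml
# Hypothetical minimal pipeline resource (see SDP_guidance.md for real configurations)
resources:
  pipelines:
    my_pipeline:
      name: "[${bundle.target}] My Pipeline"
      catalog: ${var.catalog}      # Unity Catalog destination
      schema: ${var.schema}
      libraries:
        - notebook:
            path: ../src/pipelines/transform.py  # Relative to resources/
```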

## SQL Alerts

See `alerts_guidance.md` - the alert schema differs significantly from other resources.

## Jobs Resources

```yaml
resources:
  jobs:
    job_name:
      name: "[${bundle.target}] Job Name"
      tasks:
        - task_key: "main_task"
          notebook_task:
            notebook_path: ../src/notebooks/main.py  # Relative to resources/
          new_cluster:
            spark_version: "13.3.x-scala2.12"
            node_type_id: "i3.xlarge"
            num_workers: 2
      schedule:
        quartz_cron_expression: "0 0 9 * * ?"
        timezone_id: "America/Los_Angeles"
      permissions:
        - level: CAN_VIEW
          group_name: "users"
```

Permission levels: `CAN_VIEW`, `CAN_MANAGE_RUN`, `CAN_MANAGE`

⚠️ Cannot modify "admins" group permissions on jobs - verify custom groups exist before use.

## Path Resolution

⚠️ Critical: Paths depend on file location:

| File Location | Path Format | Example |
|---|---|---|
| resources/*.yml | ../src/... | ../src/dashboards/file.json |
| databricks.yml targets | ./src/... | ./src/dashboards/file.json |

Why: `resources/` files are one level deep, so use `../` to reach the bundle root. `databricks.yml` is at the root, so use `./`.

## Volume Resources

```yaml
resources:
  volumes:
    my_volume:
      catalog_name: ${var.catalog}
      schema_name: ${var.schema}
      name: "volume_name"
      volume_type: "MANAGED"
```

⚠️ Volumes use `grants`, not `permissions` - a different format from other resources.
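
As a sketch of what that `grants` shape looks like, assuming the standard Unity Catalog privilege names (`READ_VOLUME`, `WRITE_VOLUME`) - verify the exact fields with `databricks bundle schema`:

```yaml
resources:
  volumes:
    my_volume:
      catalog_name: ${var.catalog}
      schema_name: ${var.schema}
      name: "volume_name"
      volume_type: "MANAGED"
      grants:
        - principal: "users"      # group, user, or service principal name
          privileges:
            - READ_VOLUME         # add WRITE_VOLUME for read-write access
```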

## Apps Resources

Apps resource support was added in Databricks CLI 0.239.0 (January 2025).

Apps in DABs have a minimal configuration - environment variables are defined in `app.yaml` in the source directory, NOT in databricks.yml.

### Generate from Existing App (Recommended)

```bash
# Generate bundle config from existing CLI-deployed app
databricks bundle generate app --existing-app-name my-app --key my_app --profile DEFAULT
```

This creates:

- `resources/my_app.app.yml` (minimal resource definition)
- `src/app/` (downloaded source files including app.yaml)

### Manual Configuration

`resources/my_app.app.yml`:

```yaml
resources:
  apps:
    my_app:
      name: my-app-${bundle.target}        # Environment-specific naming
      description: "My application"
      source_code_path: ../src/app         # Relative to resources/ dir
```

`src/app/app.yaml` (environment variables go here):

```yaml
command:
  - "python"
  - "dash_app.py"

env:
  - name: USE_MOCK_BACKEND
    value: "false"
  - name: DATABRICKS_WAREHOUSE_ID
    value: "your-warehouse-id"
  - name: DATABRICKS_CATALOG
    value: "main"
  - name: DATABRICKS_SCHEMA
    value: "my_schema"
```

`databricks.yml`:

```yaml
bundle:
  name: my-bundle

include:
  - resources/*.yml

variables:
  warehouse_id:
    default: "default-warehouse-id"

targets:
  dev:
    default: true
    mode: development
    workspace:
      profile: dev-profile
    variables:
      warehouse_id: "dev-warehouse-id"
```

### Key Differences from Other Resources

| Aspect | Apps | Other Resources |
|---|---|---|
| Environment vars | In `app.yaml` (source dir) | In databricks.yml or resource file |
| Configuration | Minimal (name, description, path) | Extensive (tasks, clusters, etc.) |
| Source path | Points to app directory | Points to specific files |

⚠️ Important: When source code is in the project root (not src/app), use `source_code_path: ..` in the resource file.

## Other Resources

DABs supports schemas, models, experiments, clusters, warehouses, etc. Use `databricks bundle schema` to inspect schemas.
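
For example, a Unity Catalog schema can be declared as a bundle resource. This minimal sketch assumes the `schemas` resource type follows the same catalog/name pattern as volumes - confirm the exact fields via `databricks bundle schema`:

```yaml
resources:
  schemas:
    my_schema:
      catalog_name: ${var.catalog}
      name: "my_schema"
```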

## Common Commands

### Validation

```bash
databricks bundle validate                    # Validate default target
databricks bundle validate -t prod           # Validate specific target
```

### Deployment

```bash
databricks bundle deploy                      # Deploy to default target
databricks bundle deploy -t prod             # Deploy to specific target
databricks bundle deploy --auto-approve      # Skip confirmation prompts
databricks bundle deploy --force             # Force overwrite remote changes
```

### Running Resources

```bash
databricks bundle run resource_name          # Run a pipeline or job
databricks bundle run pipeline_name -t prod  # Run in specific environment
```

Apps require `bundle run` to start after deployment:

```bash
databricks bundle run app_resource_key -t dev # Start/deploy the app
```

### Monitoring & Logs

View application logs (for Apps resources):

```bash
# View logs for deployed apps
databricks apps logs <app-name> --profile <profile-name>
```

Examples:

```bash
databricks apps logs my-dash-app-dev -p DEFAULT
databricks apps logs my-streamlit-app-prod -p DEFAULT
```

**What logs show:**
- `[SYSTEM]` - Deployment progress, file updates, dependency installation
- `[APP]` - Application output (print statements, errors)
- Backend connection status
- Deployment IDs and timestamps
- Stack traces for errors

**Key log patterns to look for:**
- ✅ `Deployment successful` - Confirms deployment completed
- ✅ `App started successfully` - App is running
- ✅ `Initialized real backend` - Backend connected to Unity Catalog
- ❌ `Error:` - Look for error messages and stack traces
- 📝 `Requirements installed` - Dependencies loaded correctly

### Cleanup

```bash
databricks bundle destroy -t dev
databricks bundle destroy -t prod --auto-approve
```

## Common Issues

| Issue | Solution |
|---|---|
| App deployment fails | Check logs: `databricks apps logs <app-name>` for error details |
| App not connecting to Unity Catalog | Check logs for backend connection errors; verify warehouse ID and permissions |
| Wrong permission level | Dashboards: CAN_READ/RUN/EDIT/MANAGE; Jobs: CAN_VIEW/MANAGE_RUN/MANAGE |
| Path resolution fails | Use `../src/` in resources/*.yml, `./src/` in databricks.yml |
| Catalog doesn't exist | Create the catalog first or update the variable |
| "admins" group error on jobs | Cannot modify admins permissions on jobs |
| Volume permissions | Use `grants`, not `permissions`, for volumes |
| Hardcoded catalog in dashboard | Use the dataset_catalog parameter (CLI v0.281.0+), create environment-specific files, or parameterize the JSON |
| App not starting after deploy | Apps require `databricks bundle run <resource_key>` to start |
| App env vars not working | Environment variables go in `app.yaml` (source dir), not databricks.yml |
| Wrong app source path | Use `../` from the resources/ dir if source is in the project root |
| Debugging any app issue | First step: `databricks apps logs <app-name>` to see what went wrong |

## Key Principles

1. Path resolution: `../src/` in resources/*.yml, `./src/` in databricks.yml
2. Variables: Parameterize catalog, schema, warehouse
3. Mode: `development` for dev/staging, `production` for prod
4. Groups: Use `"users"` for all workspace users
5. Job permissions: Verify custom groups exist; can't modify "admins"

## Resources