Lakebase Autoscaling
Patterns and best practices for using Lakebase Autoscaling, the next-generation managed PostgreSQL on Databricks with autoscaling compute, branching, scale-to-zero, and instant restore.
When to Use
适用场景
Use this skill when:
- Building applications that need a PostgreSQL database with autoscaling compute
- Working with database branching for dev/test/staging workflows
- Adding persistent state to applications with scale-to-zero cost savings
- Implementing reverse ETL from Delta Lake to an operational database via synced tables
- Managing Lakebase Autoscaling projects, branches, computes, or credentials
Overview
Lakebase Autoscaling is Databricks' next-generation managed PostgreSQL service for OLTP workloads. It provides autoscaling compute, Git-like branching, scale-to-zero, and instant point-in-time restore.
| Feature | Description |
|---|---|
| Autoscaling Compute | 0.5-112 CU with 2 GB RAM per CU; scales dynamically based on load |
| Scale-to-Zero | Compute suspends after configurable inactivity timeout |
| Branching | Create isolated database environments (like Git branches) for dev/test |
| Instant Restore | Point-in-time restore from any moment within the configured window (up to 35 days) |
| OAuth Authentication | Token-based auth via Databricks SDK (1-hour expiry) |
| Reverse ETL | Sync data from Delta tables to PostgreSQL via synced tables |
Available Regions (AWS): us-east-1, us-east-2, eu-central-1, eu-west-1, eu-west-2, ap-south-1, ap-southeast-1, ap-southeast-2
Available Regions (Azure Beta): eastus2, westeurope, westus
Project Hierarchy
Understanding the hierarchy is essential for working with Lakebase Autoscaling:
```
Project (top-level container)
└── Branch(es) (isolated database environments)
    ├── Compute (primary R/W endpoint)
    ├── Read Replica(s) (optional, read-only)
    ├── Role(s) (Postgres roles)
    └── Database(s) (Postgres databases)
        └── Schema(s)
```

| Object | Description |
|---|---|
| Project | Top-level container. Created via `create_project`. |
| Branch | Isolated database environment with copy-on-write storage. Default branch is `production`. |
| Compute | Postgres server powering a branch. Configurable CU sizing and autoscaling. |
| Database | Standard Postgres database within a branch. Default is `databricks_postgres`. |
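The hierarchy above also dictates how resources are addressed: every object is identified by a slash-delimited path. A trivial sanity-check helper (the function is our own, not part of the SDK):

```python
def endpoint_path(project_id: str, branch_id: str, endpoint_id: str) -> str:
    """Compose the fully qualified resource name for a compute endpoint."""
    return f"projects/{project_id}/branches/{branch_id}/endpoints/{endpoint_id}"

# Matches the names used throughout the examples below
print(endpoint_path("my-app", "production", "ep-primary"))
# projects/my-app/branches/production/endpoints/ep-primary
```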
Quick Start
Create a project and connect:
```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.postgres import Project, ProjectSpec

w = WorkspaceClient()

# Create a project (long-running operation)
operation = w.postgres.create_project(
    project=Project(
        spec=ProjectSpec(
            display_name="My Application",
            pg_version="17"
        )
    ),
    project_id="my-app"
)
result = operation.wait()
print(f"Created project: {result.name}")
```
Common Patterns
Generate OAuth Token
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Generate database credential for connecting (optionally scoped to an endpoint)
cred = w.postgres.generate_database_credential(
    endpoint="projects/my-app/branches/production/endpoints/ep-primary"
)
token = cred.token  # Use as password in connection string; expires after 1 hour
```
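For long-running processes, it is convenient to wrap credential generation in a small cache that re-mints the token shortly before the 1-hour expiry. A sketch (the class name and the 5-minute refresh margin are our own choices, not SDK features):

```python
import time

class TokenCache:
    """Cache the OAuth database credential and re-mint it before the 1-hour expiry."""

    def __init__(self, w, endpoint, margin_seconds=300):
        self.w = w                    # authenticated WorkspaceClient
        self.endpoint = endpoint      # full endpoint resource name
        self.margin = margin_seconds  # refresh this long before expiry
        self._token = None
        self._fetched_at = 0.0

    def get(self):
        # Re-mint once we are within `margin` of the 3600 s token lifetime
        if self._token is None or time.time() - self._fetched_at > 3600 - self.margin:
            cred = self.w.postgres.generate_database_credential(endpoint=self.endpoint)
            self._token = cred.token
            self._fetched_at = time.time()
        return self._token
```

Call `get()` whenever a new connection is opened; it only hits the credential API when the cached token is close to expiring.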
Connect from Notebook
```python
import psycopg
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Get endpoint details
endpoint = w.postgres.get_endpoint(
    name="projects/my-app/branches/production/endpoints/ep-primary"
)
host = endpoint.status.hosts.host

# Generate token (scoped to endpoint)
cred = w.postgres.generate_database_credential(
    endpoint="projects/my-app/branches/production/endpoints/ep-primary"
)

# Connect using psycopg3
conn_string = (
    f"host={host} "
    f"dbname=databricks_postgres "
    f"user={w.current_user.me().user_name} "
    f"password={cred.token} "
    f"sslmode=require"
)
with psycopg.connect(conn_string) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT version()")
        print(cur.fetchone())
```
Create a Branch for Development
```python
from databricks.sdk.service.postgres import Branch, BranchSpec, Duration

# Create a dev branch with 7-day expiration
branch = w.postgres.create_branch(
    parent="projects/my-app",
    branch=Branch(
        spec=BranchSpec(
            source_branch="projects/my-app/branches/production",
            ttl=Duration(seconds=604800)  # 7 days
        )
    ),
    branch_id="development"
).wait()
print(f"Branch created: {branch.name}")
```
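The 604800 above is simply 7 days expressed in seconds; deriving TTLs with `timedelta` keeps the number readable:

```python
from datetime import timedelta

# 7 days in seconds, as expected by Duration(seconds=...)
seven_days = int(timedelta(days=7).total_seconds())
print(seven_days)  # 604800
```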
Resize Compute (Autoscaling)
```python
from databricks.sdk.service.postgres import Endpoint, EndpointSpec, FieldMask

# Update compute to autoscale between 2-8 CU
w.postgres.update_endpoint(
    name="projects/my-app/branches/production/endpoints/ep-primary",
    endpoint=Endpoint(
        name="projects/my-app/branches/production/endpoints/ep-primary",
        spec=EndpointSpec(
            autoscaling_limit_min_cu=2.0,
            autoscaling_limit_max_cu=8.0
        )
    ),
    update_mask=FieldMask(field_mask=[
        "spec.autoscaling_limit_min_cu",
        "spec.autoscaling_limit_max_cu"
    ])
).wait()
```
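Since the service rejects autoscaling ranges wider than 8 CU (see Common Issues below), a pre-flight check before calling `update_endpoint` can save a failed round trip. The validator is our own helper, not an SDK API:

```python
def validate_cu_range(min_cu: float, max_cu: float) -> None:
    """Raise if an autoscaling range would be rejected by Lakebase Autoscaling."""
    if not (0.5 <= min_cu <= max_cu <= 112):
        raise ValueError("CU values must satisfy 0.5 <= min <= max <= 112")
    if max_cu - min_cu > 8:
        raise ValueError(
            f"Range spread is {max_cu - min_cu} CU; the maximum allowed spread is 8"
        )

validate_cu_range(2.0, 8.0)   # OK: spread of 6 CU
validate_cu_range(8.0, 16.0)  # OK: spread of 8 CU
```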
MCP Tools
The following MCP tools are available for managing Lakebase infrastructure. Use `type="autoscale"` for Lakebase Autoscaling.

Database (Project) Management
| Tool | Description |
|---|---|
| Create or update a database. Finds by name, creates if new, updates if existing. Use |
| Get database details (including branches and endpoints) or list all. Pass |
| Delete a project and all its branches, computes, and data. Use |
Branch Management
| Tool | Description |
|---|---|
| Create or update a branch with its compute endpoint. Params: |
| Delete a branch and its compute endpoints. |
Credentials
| Tool | Description |
|---|---|
| Generate OAuth token for PostgreSQL connections (1-hour expiry). Pass `type="autoscale"` for the Autoscale version. |
Reference Files
- projects.md - Project management patterns and settings
- branches.md - Branching workflows, protection, and expiration
- computes.md - Compute sizing, autoscaling, and scale-to-zero
- connection-patterns.md - Connection patterns for different use cases
- reverse-etl.md - Synced tables from Delta Lake to Lakebase
CLI Quick Reference
```bash
# Create a project
databricks postgres create-project \
  --project-id my-app \
  --json '{"spec": {"display_name": "My App", "pg_version": "17"}}'

# List projects
databricks postgres list-projects

# Get project details
databricks postgres get-project projects/my-app

# Create a branch
databricks postgres create-branch projects/my-app development \
  --json '{"spec": {"source_branch": "projects/my-app/branches/production", "no_expiry": true}}'

# List branches
databricks postgres list-branches projects/my-app

# Get endpoint details
databricks postgres get-endpoint projects/my-app/branches/production/endpoints/ep-primary

# Delete a project
databricks postgres delete-project projects/my-app
```
Key Differences from Lakebase Provisioned
| Aspect | Provisioned | Autoscaling |
|---|---|---|
| SDK module | `w.database` | `w.postgres` |
| Top-level resource | Instance | Project |
| Capacity | CU_1, CU_2, CU_4, CU_8 (16 GB/CU) | 0.5-112 CU (2 GB/CU) |
| Branching | Not supported | Full branching support |
| Scale-to-zero | Not supported | Configurable timeout |
| Operations | Synchronous | Long-running operations (LRO) |
| Read replicas | Readable secondaries | Dedicated read-only endpoints |
Common Issues
| Issue | Solution |
|---|---|
| Token expired during long query | Implement token refresh loop; tokens expire after 1 hour |
| Connection refused after scale-to-zero | Compute wakes automatically on connection; reactivation takes a few hundred ms; implement retry logic |
| DNS resolution fails on macOS | Use the psycopg `hostaddr` parameter to bypass hostname resolution |
| Branch deletion blocked | Delete child branches first; cannot delete branches with children |
| Autoscaling range too wide | Max - min cannot exceed 8 CU (e.g., 8-16 CU is valid, 0.5-32 CU is not) |
| SSL required error | Always use `sslmode=require` in the connection string |
| Update mask required | All update operations require an `update_mask` (`FieldMask`) listing the fields to change |
| Connection closed after 24h idle | All connections have a 24-hour idle timeout and 3-day max lifetime; implement retry logic |
Current Limitations
These features are NOT yet supported in Lakebase Autoscaling:
- High availability with readable secondaries (use read replicas instead)
- Databricks Apps UI integration (Apps can connect manually via credentials)
- Feature Store integration
- Stateful AI agents (LangChain memory)
- Postgres-to-Delta sync (only Delta-to-Postgres reverse ETL)
- Custom billing tags and serverless budget policies
- Direct migration from Lakebase Provisioned (use pg_dump/pg_restore or reverse ETL)
SDK Version Requirements
- Databricks SDK for Python: >= 0.81.0 (for the `w.postgres` module)
- psycopg: 3.x (supports the `hostaddr` parameter for the DNS workaround)
- SQLAlchemy: 2.x with the `postgresql+psycopg` driver

```python
%pip install -U "databricks-sdk>=0.81.0" "psycopg[binary]>=3.0" sqlalchemy
```

Notes
- Compute Units in Autoscaling provide ~2 GB RAM each (vs 16 GB in Provisioned).
- Resource naming follows hierarchical paths: `projects/{id}/branches/{id}/endpoints/{id}`.
- All create/update/delete operations are long-running -- use `.wait()` in the SDK.
- Tokens are short-lived (1 hour) -- production apps MUST implement token refresh.
- Postgres versions 16 and 17 are supported.