dbt

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

dbt Expert Engineer Skill

dbt资深工程师技能

This skill provides a comprehensive guide for dbt development.

本技能提供dbt开发的全面指南。

1. dbt Command Basics

1. dbt命令基础

1.1. Required Options

1.1. 必备选项

Always specify these options with dbt commands:

--profiles-dir ~/.dbt --no-use-colors

执行dbt命令时请始终指定以下选项：

--profiles-dir ~/.dbt --no-use-colors

1.2. Connection Verification

1.2. 连接验证

Always verify connection at work start:

dbt debug --profiles-dir ~/.dbt --no-use-colors

每日工作开始时请务必验证连接：

dbt debug --profiles-dir ~/.dbt --no-use-colors

1.3. Ad-hoc Query Execution

1.3. 临时查询执行

Use

dbt show

command for ad-hoc queries in dbt:

undefined

使用

dbt show

命令在dbt中执行临时查询：

undefined

Basic query execution

基础查询执行

dbt show --inline "select 1 as test, current_timestamp() as now" --profiles-dir ~/.dbt --no-use-colors

Specify row limit (default is 5)

指定行限制（默认值为5）

dbt show --inline "select * from table_name" --limit 10 --profiles-dir ~/.dbt --no-use-colors

Reference dbt models

引用dbt模型

dbt show --inline "select * from {{ ref('model_name') }}" --profiles-dir ~/.dbt --no-use-colors

Direct reference using catalog.schema.table format

使用catalog.schema.table格式直接引用

dbt show --inline "select * from catalog_name.schema_name.table_name" --limit 3 --profiles-dir ~/.dbt --no-use-colors


Notes:

- Explicit LIMIT in query conflicts with `--limit` option and causes error
- DDL commands (SHOW statements, etc.) cause syntax error due to auto-LIMIT

dbt show --inline "select * from catalog_name.schema_name.table_name" --limit 3 --profiles-dir ~/.dbt --no-use-colors


注意事项：

- 查询中显式的LIMIT与`--limit`选项冲突，会导致错误
- DDL命令（如SHOW语句等）会因自动添加LIMIT而引发语法错误

2. Verification Procedures

2. 验证流程

2.1. Verification When dbt run is Prohibited

2.1. 禁止执行dbt run时的验证

Verification steps when

dbt run

cannot be executed to avoid production impact:

Edit model

Generate SQL with

dbt compile --profiles-dir ~/.dbt --no-use-colors

Get generated SQL from
```
target/compiled/
```
Verify with
```
bq query
```
or
```
databricks
```
command (recommend using LIMIT)

当无法执行

dbt run

以避免影响生产环境时的验证步骤：

编辑模型

使用

dbt compile --profiles-dir ~/.dbt --no-use-colors

生成SQL

从
```
target/compiled/
```
目录获取生成的SQL
使用
```
bq query
```
或
```
databricks
```
命令验证（建议使用LIMIT）

2.2. Verification When dbt run is Allowed

2.2. 允许执行dbt run时的验证

Verification steps when

dbt run

is allowed in development/sandbox environments:

Edit model

Execute

dbt run --select +model_name --profiles-dir ~/.dbt --no-use-colors

Execute

dbt test --select +model_name --profiles-dir ~/.dbt --no-use-colors

Query generated table directly if needed

Notes:

Use
```
--select
```
option to limit scope
For model AND tag conditions, use
```
--select "staging.target,tag:tag_name"
```

在开发/沙箱环境中允许执行

dbt run

时的验证步骤：

编辑模型

执行

dbt run --select +model_name --profiles-dir ~/.dbt --no-use-colors

执行

dbt test --select +model_name --profiles-dir ~/.dbt --no-use-colors

如有需要，直接查询生成的表

注意事项：

使用
```
--select
```
选项限制范围
如需同时满足模型和标签条件，使用
```
--select "staging.target,tag:tag_name"
```

3. Issue Work Target Setup

3. 问题专属Target配置

Always set up Issue-specific target before

dbt run

during Issue work.

处理问题时，执行

dbt run

前请务必配置问题专属的target。

3.1. Setup Procedure

3.1. 配置步骤

Read
```
~/.dbt/profiles.yml
```
and check existing settings
Add Issue-specific target if not exists, based on existing
```
dev
```
target

yaml

my_databricks_dbt:
  outputs:
    dev:
      # Existing settings...
    issue_123: # Name based on issue number
      catalog: dbt_dev_{username} # Same as dev
      host: dbc-xxxxx.cloud.databricks.com # Same as dev
      http_path: /sql/1.0/warehouses/xxxxx # Same as dev
      schema: dwh_issue_123 # Include issue number in schema name
      threads: 1
      token: dapixxxxx # Same as dev
      type: databricks
  target: dev

Then switch with

--target

option when executing dbt commands

undefined

读取
```
~/.dbt/profiles.yml
```
并检查现有配置
基于现有的
```
dev
```
target添加问题专属的target（如不存在）

yaml

my_databricks_dbt:
  outputs:
    dev:
      # 现有配置...
    issue_123: # 基于问题编号命名
      catalog: dbt_dev_{username} # 与dev一致
      host: dbc-xxxxx.cloud.databricks.com # 与dev一致
      http_path: /sql/1.0/warehouses/xxxxx # 与dev一致
      schema: dwh_issue_123 # 架构名称中包含问题编号
      threads: 1
      token: dapixxxxx # 与dev一致
      type: databricks
  target: dev

然后在执行dbt命令时使用

--target

选项切换

undefined

Execute with issue_123 target

使用issue_123 target执行

dbt run --select +model_name --target issue_123 --profiles-dir ~/.dbt --no-use-colors

Verify connection

验证连接

dbt debug --target issue_123 --profiles-dir ~/.dbt --no-use-colors

undefined

dbt debug --target issue_123 --profiles-dir ~/.dbt --no-use-colors

undefined

3.2. Notes

3.2. 注意事项

Keep target name and schema name consistent with issue number
Manually delete unused schemas after work completion
Intermediate layer auto-generates as
```
{schema}_dbt_intermediates
```

target名称与架构名称需与问题编号保持一致
工作完成后手动删除未使用的架构
中间层会自动生成为
```
{schema}_dbt_intermediates
```

4. Databricks SQL Dialect

4. Databricks SQL方言

Full-width column names require backticks
Column names and catalog names with hyphens require backticks

sql

-- Reference catalog name with hyphen
select * from `catalog-name`.schema_name.table_name;

-- Reference full-width column name
select `full-width column` from table_name;

全角列名需要使用反引号包裹
包含连字符的列名和catalog名称需要使用反引号包裹

sql

-- 引用包含连字符的catalog名称
select * from `catalog-name`.schema_name.table_name;

-- 引用全角列名
select `full-width column` from table_name;

5. Response Format

5. 响应格式

text

[Documentation-based response]

Source: [source_url]
Fetched: [fetched_at]

text

[基于文档的响应]

来源：[source_url]
获取时间：[fetched_at]