dbt
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
Chinesedbt Expert Engineer Skill
dbt资深工程师技能
This skill provides a comprehensive guide for dbt development.
本技能提供dbt开发的全面指南。
1. dbt Command Basics
1. dbt命令基础
1.1. Required Options
1.1. 必备选项
Always specify these options with dbt commands:
sh
--profiles-dir ~/.dbt --no-use-colors执行dbt命令时请始终指定以下选项:
sh
--profiles-dir ~/.dbt --no-use-colors1.2. Connection Verification
1.2. 连接验证
Always verify connection at work start:
sh
dbt debug --profiles-dir ~/.dbt --no-use-colors每日工作开始时请务必验证连接:
sh
dbt debug --profiles-dir ~/.dbt --no-use-colors1.3. Ad-hoc Query Execution
1.3. 临时查询执行
Use command for ad-hoc queries in dbt:
dbt showsh
undefined使用命令在dbt中执行临时查询:
dbt showsh
undefinedBasic query execution
基础查询执行
dbt show --inline "select 1 as test, current_timestamp() as now" --profiles-dir ~/.dbt --no-use-colors
dbt show --inline "select 1 as test, current_timestamp() as now" --profiles-dir ~/.dbt --no-use-colors
Specify row limit (default is 5)
指定行限制(默认值为5)
dbt show --inline "select * from table_name" --limit 10 --profiles-dir ~/.dbt --no-use-colors
dbt show --inline "select * from table_name" --limit 10 --profiles-dir ~/.dbt --no-use-colors
Reference dbt models
引用dbt模型
dbt show --inline "select * from {{ ref('model_name') }}" --profiles-dir ~/.dbt --no-use-colors
dbt show --inline "select * from {{ ref('model_name') }}" --profiles-dir ~/.dbt --no-use-colors
Direct reference using catalog.schema.table format
使用catalog.schema.table格式直接引用
dbt show --inline "select * from catalog_name.schema_name.table_name" --limit 3 --profiles-dir ~/.dbt --no-use-colors
Notes:
- Explicit LIMIT in query conflicts with `--limit` option and causes error
- DDL commands (SHOW statements, etc.) cause syntax error due to auto-LIMITdbt show --inline "select * from catalog_name.schema_name.table_name" --limit 3 --profiles-dir ~/.dbt --no-use-colors
注意事项:
- 查询中显式的LIMIT与`--limit`选项冲突,会导致错误
- DDL命令(如SHOW语句等)会因自动添加LIMIT而引发语法错误2. Verification Procedures
2. 验证流程
2.1. Verification When dbt run is Prohibited
2.1. 禁止执行dbt run时的验证
Verification steps when cannot be executed to avoid production impact:
dbt run- Edit model
- Generate SQL with
dbt compile --profiles-dir ~/.dbt --no-use-colors - Get generated SQL from
target/compiled/ - Verify with or
bq querycommand (recommend using LIMIT)databricks
当无法执行以避免影响生产环境时的验证步骤:
dbt run- 编辑模型
- 使用生成SQL
dbt compile --profiles-dir ~/.dbt --no-use-colors - 从目录获取生成的SQL
target/compiled/ - 使用或
bq query命令验证(建议使用LIMIT)databricks
2.2. Verification When dbt run is Allowed
2.2. 允许执行dbt run时的验证
Verification steps when is allowed
in development/sandbox environments:
dbt run- Edit model
- Execute
dbt run --select +model_name --profiles-dir ~/.dbt --no-use-colors - Execute
dbt test --select +model_name --profiles-dir ~/.dbt --no-use-colors - Query generated table directly if needed
Notes:
- Use option to limit scope
--select - For model AND tag conditions, use
--select "staging.target,tag:tag_name"
在开发/沙箱环境中允许执行时的验证步骤:
dbt run- 编辑模型
- 执行
dbt run --select +model_name --profiles-dir ~/.dbt --no-use-colors - 执行
dbt test --select +model_name --profiles-dir ~/.dbt --no-use-colors - 如有需要,直接查询生成的表
注意事项:
- 使用选项限制范围
--select - 如需同时满足模型和标签条件,使用
--select "staging.target,tag:tag_name"
3. Issue Work Target Setup
3. 问题专属Target配置
Always set up Issue-specific target before during Issue work.
dbt run处理问题时,执行前请务必配置问题专属的target。
dbt run3.1. Setup Procedure
3.1. 配置步骤
- Read and check existing settings
~/.dbt/profiles.yml - Add Issue-specific target if not exists, based on existing target
dev
yaml
my_databricks_dbt:
outputs:
dev:
# Existing settings...
issue_123: # Name based on issue number
catalog: dbt_dev_{username} # Same as dev
host: dbc-xxxxx.cloud.databricks.com # Same as dev
http_path: /sql/1.0/warehouses/xxxxx # Same as dev
schema: dwh_issue_123 # Include issue number in schema name
threads: 1
token: dapixxxxx # Same as dev
type: databricks
target: devThen switch with option when executing dbt commands
--targetsh
undefined- 读取并检查现有配置
~/.dbt/profiles.yml - 基于现有的target添加问题专属的target(如不存在)
dev
yaml
my_databricks_dbt:
outputs:
dev:
# 现有配置...
issue_123: # 基于问题编号命名
catalog: dbt_dev_{username} # 与dev一致
host: dbc-xxxxx.cloud.databricks.com # 与dev一致
http_path: /sql/1.0/warehouses/xxxxx # 与dev一致
schema: dwh_issue_123 # 架构名称中包含问题编号
threads: 1
token: dapixxxxx # 与dev一致
type: databricks
target: dev然后在执行dbt命令时使用选项切换
--targetsh
undefinedExecute with issue_123 target
使用issue_123 target执行
dbt run --select +model_name --target issue_123 --profiles-dir ~/.dbt --no-use-colors
dbt run --select +model_name --target issue_123 --profiles-dir ~/.dbt --no-use-colors
Verify connection
验证连接
dbt debug --target issue_123 --profiles-dir ~/.dbt --no-use-colors
undefineddbt debug --target issue_123 --profiles-dir ~/.dbt --no-use-colors
undefined3.2. Notes
3.2. 注意事项
- Keep target name and schema name consistent with issue number
- Manually delete unused schemas after work completion
- Intermediate layer auto-generates as
{schema}_dbt_intermediates
- target名称与架构名称需与问题编号保持一致
- 工作完成后手动删除未使用的架构
- 中间层会自动生成为
{schema}_dbt_intermediates
4. Databricks SQL Dialect
4. Databricks SQL方言
- Full-width column names require backticks
- Column names and catalog names with hyphens require backticks
sql
-- Reference catalog name with hyphen
select * from `catalog-name`.schema_name.table_name;
-- Reference full-width column name
select `full-width column` from table_name;- 全角列名需要使用反引号包裹
- 包含连字符的列名和catalog名称需要使用反引号包裹
sql
-- 引用包含连字符的catalog名称
select * from `catalog-name`.schema_name.table_name;
-- 引用全角列名
select `full-width column` from table_name;5. Response Format
5. 响应格式
text
[Documentation-based response]
Source: [source_url]
Fetched: [fetched_at]text
[基于文档的响应]
来源:[source_url]
获取时间:[fetched_at]