creating-dbt-models
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
Chinesedbt Model Development
dbt 模型开发
Read before you write. Build after you write. Verify your output.
先阅读,再编写。编写后构建,验证输出。
Critical Rules
关键规则
- ALWAYS run after creating/modifying models - compile is NOT enough
dbt build - ALWAYS verify output after build using - don't assume success
dbt show - If build fails 3+ times, stop and reassess your entire approach
- 创建/修改模型后必须运行- 仅compile是不够的
dbt build - 构建完成后必须验证输出 - 不要默认成功
- 如果构建失败3次以上,停止操作并重新评估整体方案
Workflow
工作流程
1. Understand the Task Requirements
1. 理解任务要求
- What columns are needed? List them explicitly.
- What is the grain of the table (one row per what)?
- What calculations or aggregations are required?
- 需要哪些列?明确列出。
- 表的粒度是什么(每行代表什么)?
- 需要哪些计算或聚合操作?
2. Discover Project Conventions
2. 识别项目规范
bash
cat dbt_project.yml
find models/ -name "*.sql" | head -20Read 2-3 existing models to learn naming, config, and SQL patterns.
bash
cat dbt_project.yml
find models/ -name "*.sql" | head -20阅读2-3个现有模型,学习命名、配置和SQL模式。
3. Find Similar Models
3. 查找相似模型
bash
undefinedbash
undefinedFind models with similar purpose
查找用途相似的模型
find models/ -name "agg.sql" -o -name "fct_.sql" | head -5
Learn from existing models: join types, aggregation patterns, NULL handling.find models/ -name "agg.sql" -o -name "fct_.sql" | head -5
从现有模型中学习:关联类型、聚合模式、NULL值处理方式。4. Check Upstream Data
4. 检查上游数据
bash
undefinedbash
undefinedPreview upstream data if needed
如有需要,预览上游数据
dbt show --select <upstream_model> --limit 10
undefineddbt show --select <upstream_model> --limit 10
undefined5. Write the Model
5. 编写模型
Follow discovered conventions. Match the required columns exactly.
遵循识别到的规范,严格匹配所需的列。
6. Compile (Syntax Check)
6. 编译(语法检查)
bash
dbt compile --select <model_name>bash
dbt compile --select <model_name>7. BUILD - MANDATORY
7. 构建 - 强制性步骤
This step is REQUIRED. Do NOT skip it.
bash
dbt build --select <model_name>If build fails:
- Read the error carefully
- Fix the specific issue
- Run build again
- If fails 3+ times, step back and reassess approach
此步骤为必填项,请勿跳过。
bash
dbt build --select <model_name>如果构建失败:
- 仔细阅读错误信息
- 修复具体问题
- 重新运行构建
- 如果失败3次以上,请暂停并重新评估方案
8. Verify Output (CRITICAL)
8. 验证输出(关键步骤)
Build success does NOT mean correct output.
bash
undefined构建成功并不代表输出正确。
bash
undefinedCheck the table was created and preview data
检查表是否已创建并预览数据
dbt show --select <model_name> --limit 10
Verify:
- Column names match requirements exactly
- Row count is reasonable
- Data values look correct
- No unexpected NULLsdbt show --select <model_name> --limit 10
验证内容:
- 列名是否与要求完全匹配
- 行数是否合理
- 数据值是否看起来正确
- 有无意外的NULL值9. Verify Calculations Against Sample Data
9. 对照样本数据验证计算结果
For models with calculations, verify correctness manually:
bash
undefined对于包含计算逻辑的模型,手动验证正确性:
bash
undefinedPick a specific row and verify calculation by hand
选择特定行并手动验证计算结果
dbt show --inline "
select *
from {{ ref('model_name') }}
where <primary_key> = '<known_value>'
" --limit 1
dbt show --inline "
select *
from {{ ref('model_name') }}
where <primary_key> = '<known_value>'
" --limit 1
Cross-check aggregations
交叉检查聚合结果
dbt show --inline "
select count(*), sum(<column>)
from {{ ref('model_name') }}
"
For example, if calculating `total_revenue = quantity * price`:
1. Pick one row from output
2. Look up the source quantity and price
3. Manually calculate: does it match?dbt show --inline "
select count(*), sum(<column>)
from {{ ref('model_name') }}
"
例如,若计算`total_revenue = quantity * price`:
1. 从输出中选取一行
2. 查找源数据中的quantity和price值
3. 手动计算:结果是否匹配?10. Re-review Against Requirements
10. 对照要求重新检查
Before declaring done, re-read the original request:
- Did you implement what was asked, not what you assumed?
- Are column names exactly as specified?
- Is the calculation logic correct per the requirements?
- Does the grain (one row per what?) match what was requested?
在完成前,重新阅读原始需求:
- 你实现的是需求要求的内容,而非自己假设的内容吗?
- 列名是否与指定的完全一致?
- 计算逻辑是否符合要求?
- 粒度(每行代表什么)是否与要求匹配?
Anti-Patterns
反模式
- Declaring done after compile without running build
- Not verifying output data after build
- Getting stuck in compile/build error loops
- Assuming table exists just because model file exists
- Writing SQL without checking existing model patterns first
- 仅compile后就宣布完成,未运行build
- 构建完成后未验证输出数据
- 陷入编译/构建错误循环
- 仅因为模型文件存在就假设表已创建
- 未先检查现有模型模式就编写SQL