data-analytics-engineering

Data Analytics Engineering

Scope

  • Define metrics, grains, and dimensional models.
  • Build transformation layers and semantic models.
  • Implement data quality tests and observability.
  • Document datasets, lineage, and ownership.
  • Align analytics outputs with BI and product needs.

Ask For Inputs

  • Business metrics and decision use cases.
  • Source systems, data freshness, and latency needs.
  • Existing warehouse, tooling, and orchestration.
  • Expected data volumes and change cadence.
  • Governance requirements and access controls.

Workflow

  1. Define metric dictionary and grains.
  2. Design staging, intermediate, and mart layers.
  3. Model dimensions and facts with clear keys.
  4. Build semantic layer and metric definitions.
  5. Add tests for freshness, nulls, ranges, and duplicates.
  6. Document lineage, owners, and SLAs.
  7. Plan rollout, backfills, and validation checks.
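
Step 5 can be sketched in plain Python. This is a minimal illustration, not the API of dbt or any other tool: the function names, the `orders` rows, and the thresholds are all hypothetical, and in practice these checks would usually be declared in the transformation tool's own test framework.

```python
from datetime import datetime, timedelta, timezone


def check_freshness(rows, ts_col, max_age, now):
    """Pass if the newest timestamp is no older than max_age."""
    newest = max(r[ts_col] for r in rows)
    return now - newest <= max_age


def check_not_null(rows, col):
    """Pass if no row has a missing value in col."""
    return all(r.get(col) is not None for r in rows)


def check_range(rows, col, lo, hi):
    """Pass if every non-null value lies in [lo, hi]."""
    return all(lo <= r[col] <= hi for r in rows if r.get(col) is not None)


def check_unique(rows, key_cols):
    """Pass if no two rows share the same key."""
    seen = set()
    for r in rows:
        key = tuple(r[c] for c in key_cols)
        if key in seen:
            return False
        seen.add(key)
    return True


# Hypothetical sample data: one duplicate order_id and one null amount.
now = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
orders = [
    {"order_id": 1, "amount": 20.0, "loaded_at": now - timedelta(hours=1)},
    {"order_id": 2, "amount": 35.5, "loaded_at": now - timedelta(hours=2)},
    {"order_id": 2, "amount": None, "loaded_at": now - timedelta(hours=3)},
]

results = {
    "freshness": check_freshness(orders, "loaded_at", timedelta(hours=6), now),
    "not_null_amount": check_not_null(orders, "amount"),
    "range_amount": check_range(orders, "amount", 0, 10_000),
    "unique_order_id": check_unique(orders, ["order_id"]),
}
print(results)
```

Here the null and duplicate checks fail while freshness and range pass, which is the kind of result an alerting plan would route to the table owner.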

Outputs

  • Metric dictionary and semantic model.
  • Data model with schema and grain definitions.
  • Transformation plan and dbt or SQLMesh structure.
  • Data quality test suite and alerting plan.
  • Documentation and ownership map.
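
As an illustration of what a metric dictionary entry might hold, the sketch below models one as a small Python structure. The field names, the `net_revenue` metric, and the owner value are assumptions made for the example; the actual layout is whatever assets/metric-dictionary.md prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    description: str
    grain: tuple        # columns identifying one row of the underlying fact
    expression: str     # aggregate kept as plain text, not executed here
    owner: str
    version: int = 1
    deprecated: bool = False


class MetricDictionary:
    """Keeps the latest version of each metric and rejects silent overwrites."""

    def __init__(self):
        self._metrics = {}

    def register(self, metric):
        current = self._metrics.get(metric.name)
        if current is not None and metric.version <= current.version:
            raise ValueError(
                f"{metric.name}: version must increase past {current.version}"
            )
        self._metrics[metric.name] = metric

    def get(self, name):
        return self._metrics[name]


metrics = MetricDictionary()
metrics.register(MetricDefinition(
    name="net_revenue",
    description="Order revenue after refunds.",
    grain=("order_id",),
    expression="SUM(amount - refund_amount)",
    owner="finance-analytics",
))
# A changed definition must arrive as a new version, never as an in-place edit.
metrics.register(MetricDefinition(
    name="net_revenue",
    description="Order revenue after refunds and discounts.",
    grain=("order_id",),
    expression="SUM(amount - refund_amount - discount_amount)",
    owner="finance-analytics",
    version=2,
))
print(metrics.get("net_revenue").version)
```

Requiring a version bump for every change is one simple way to keep a definition's history explicit.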

Quality Checks

  • Keep metric definitions stable and versioned.
  • Treat metrics as APIs: document changes, deprecate safely, and backfill deliberately.
  • Define data contracts for core tables (schema, freshness, keys) to control downstream breakage.
  • Avoid mixed grains in a single model.
  • Ensure tests cover critical joins and aggregates.
  • Validate against source of truth and historical baselines.
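
A data contract for a core table can be sketched as a declared schema plus a key, checked against incoming rows. Everything named here (`DataContract`, `contract_violations`, the `fct_orders` table and its columns) is hypothetical, and freshness is left out to keep the sketch short. A duplicate key is also a common symptom of mixed grains in one model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    table: str
    columns: dict       # column name -> expected Python type
    primary_key: tuple  # columns that define the table's grain


def contract_violations(contract, rows):
    """Return human-readable violations; an empty list means the rows conform."""
    violations = []
    seen_keys = set()
    for i, row in enumerate(rows):
        missing = set(contract.columns) - set(row)
        if missing:
            violations.append(f"row {i}: missing columns {sorted(missing)}")
            continue
        for col, expected in contract.columns.items():
            if row[col] is not None and not isinstance(row[col], expected):
                violations.append(f"row {i}: {col} is not {expected.__name__}")
        key = tuple(row[c] for c in contract.primary_key)
        if key in seen_keys:
            violations.append(f"row {i}: duplicate key {key}")
        seen_keys.add(key)
    return violations


orders_contract = DataContract(
    table="fct_orders",
    columns={"order_id": int, "customer_id": int, "amount": float},
    primary_key=("order_id",),
)

rows = [
    {"order_id": 1, "customer_id": 10, "amount": 20.0},
    {"order_id": 1, "customer_id": 10, "amount": 20.0},    # repeats the key
    {"order_id": 2, "customer_id": "11", "amount": 5.0},   # wrong type
    {"order_id": 3, "amount": 7.5},                        # missing column
]

for violation in contract_violations(orders_contract, rows):
    print(violation)
```

Running a check like this in CI before a model is published is one way to catch a breaking change before downstream consumers see it.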

Templates

  • assets/metric-dictionary.md for metric definitions and owners.
  • assets/semantic-layer-spec.md for entities, measures, and dimensions.
  • assets/data-quality-test-plan.md for test coverage planning.

Resources

  • references/modeling-patterns.md for modeling guidance and data quality patterns.
  • references/tool-comparison-2026.md for dbt vs SQLMesh vs Coalesce decision matrix.
  • data/sources.json for curated vendor docs and trend-tracking sources (use as a WebSearch seed list).

Related Skills

  • Use data-lake-platform for platform architecture.
  • Use data-sql-optimization for query tuning.
  • Use ai-ml-data-science for modeling and experiments.

Trend Awareness Protocol

IMPORTANT: When users ask recommendation questions about analytics engineering, data modeling, or BI, you MUST use WebSearch to check current trends before answering. If WebSearch is unavailable, use data/sources.json plus web browsing, and state what you verified versus what you assumed.

Trigger Conditions

  • "What's the best tool for [analytics engineering/data modeling/BI]?"
  • "What should I use for [transformation/semantic layer/metrics]?"
  • "What's the latest in analytics engineering?"
  • "Current best practices for [dbt/metrics layers/data quality]?"
  • "Is [tool/approach] still relevant in 2026?"
  • "[dbt] vs [SQLMesh] vs [other]?"
  • "Best BI tool for [use case]?"
  • "SQLMesh acquisition" or "Fivetran transformation"
  • "Agentic analytics" or "AI data workflows"
  • "Metric debt" or "metric governance"

Required Searches

  1. Search: "analytics engineering best practices 2026"
  2. Search: "[dbt/SQLMesh/semantic layer] vs alternatives 2026"
  3. Search: "analytics engineering trends January 2026"
  4. Search: "[specific tool] new releases 2026"
  5. Search: "agentic analytics AI data 2026" (for AI-related queries)
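
The search list above can be treated as a set of templates filled in per question. The sketch below is one way to do that. The function name `build_required_searches` and its parameters are hypothetical, and parameterizing the year and month is an assumption beyond the fixed 2026 strings listed here.

```python
def build_required_searches(tool, year, month=None, ai_related=False):
    """Fill the required search templates for one tool or topic."""
    period = f"{month} {year}" if month else str(year)
    queries = [
        f"analytics engineering best practices {year}",
        f"{tool} vs alternatives {year}",
        f"analytics engineering trends {period}",
        f"{tool} new releases {year}",
    ]
    # The fifth search applies only to AI-related questions.
    if ai_related:
        queries.append(f"agentic analytics AI data {year}")
    return queries


for query in build_required_searches("dbt", 2026, month="January", ai_related=True):
    print(query)
```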

What to Report

After searching, provide:
  • Current landscape: What analytics tools/patterns are popular NOW
  • Emerging trends: New tools, patterns, or standards gaining traction
  • Deprecated/declining: Tools/approaches losing relevance or support
  • Recommendation: Based on fresh data, not just static knowledge

Example Topics (verify with fresh search)

  • Transformation tools (dbt, SQLMesh, Coalesce)
  • Semantic layers (dbt Semantic Layer, Cube, AtScale, warehouse-native)
  • Metrics stores and headless BI
  • Data quality tools (dbt tests, Elementary, dbt-expectations/Metaplane)
  • BI platforms (Metabase, Superset, Lightdash, Hex)
  • Data modeling patterns (dimensional, wide tables, activity schema)
  • Analytics engineering workflows and CI/CD
  • Agentic AI workflows for analytics
  • Data mesh and domain-owned data products