Loading...
Loading...
Found 41 Skills
Use this skill when building dbt models, designing semantic layers, defining metrics, creating self-serve analytics, or structuring a data warehouse for analyst consumption. Triggers on dbt project setup, model layering (staging, intermediate, marts), ref() and source() usage, YAML schema definitions, metrics definitions, semantic layer configuration, dimensional modeling, slowly changing dimensions, data testing, and any task requiring analytics engineering best practices.
dbt Expert Engineer Skill - Comprehensive guide for dbt development best practices, command execution, and environment configuration Use when: - Running dbt commands (debug, compile, run, test, show) - Setting up Issue-specific targets in profiles.yml - Working with Databricks SQL dialect in dbt
Data pipeline expert for ETL, Apache Spark, Airflow, dbt, and data quality
Use this skill when building data pipelines, ETL/ELT workflows, or data transformation layers. Triggers on Airflow DAG design, dbt model creation, Spark job optimization, streaming vs batch architecture decisions, data ingestion, data quality checks, pipeline orchestration, incremental loads, CDC (change data capture), schema evolution, and data warehouse modeling. Acts as a senior data engineer advisor for building reliable, scalable data infrastructure.
Expert guidance for Dagster data orchestration including assets, resources, schedules, sensors, partitions, testing, and ETL patterns. Use when building or extending Dagster projects, writing assets, configuring automation, or integrating with dbt/dlt/Sling.
This skill helps the agent generate or update orchestration pipeline definitions for Google Cloud Composer to initialize orchestration pipeline or update the orchestration definition for orchestration of various data pipelines, like dbt pipelines, notebooks, Spark jobs, Dataform, Python scripts or inline BigQuery SQL queries. This skill also helps deploy and trigger orchestration pipelines.
Build scalable data pipelines, modern data warehouses, and real-time streaming architectures. Implements Apache Spark, dbt, Airflow, and cloud-native data platforms. Use PROACTIVELY for data pipeline design, analytics infrastructure, or modern data stack implementation.
Data pipelines, feature stores, and embedding generation for AI/ML systems. Use when building RAG pipelines, ML feature serving, or data transformations. Covers feature stores (Feast, Tecton), embedding pipelines, chunking strategies, orchestration (Dagster, Prefect, Airflow), dbt transformations, data versioning (LakeFS), and experiment tracking (MLflow, W&B).
Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, implementing data governance, or troubleshooting data issues.
Expert data engineer for ETL/ELT pipelines, streaming, data warehousing. Activate on: data pipeline, ETL, ELT, data warehouse, Spark, Kafka, Airflow, dbt, data modeling, star schema, streaming data, batch processing, data quality. NOT for: API design (use api-architect), ML training (use ML skills), dashboards (use design skills).
Optimize BigQuery compute costs by assigning data models (Dataform, dbt, Airflow) to slot reservations or on-demand compute based on Masthead recommendations.
Initialize warehouse schema discovery. Generates .astro/warehouse.md with all table metadata for instant lookups. Run once per project, refresh when schema changes. Use when user says "/data:init" or asks to set up data discovery.