Found 96 Skills
Apache Airflow workflow orchestration. Use for data pipelines.
Develop Lakeflow Spark Declarative Pipelines (formerly Delta Live Tables) on Databricks. Use when building batch or streaming data pipelines with Python or SQL. Invoke BEFORE starting implementation.
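As a rough illustration of the kind of code this skill targets, here is a minimal Python sketch of a declarative pipeline using the `dlt` module. Table names and the source path are hypothetical, and the code only runs inside a Databricks pipeline (where `spark` is provided implicitly), not as a standalone script:

```python
# Minimal sketch of a Lakeflow / Delta Live Tables pipeline (assumptions: runs inside
# a Databricks pipeline; the source path and table names are hypothetical).
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw events ingested with Auto Loader.")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/default/raw_events")  # hypothetical source location
    )

@dlt.table(comment="Cleaned events with a basic data-quality expectation.")
@dlt.expect_or_drop("valid_id", "event_id IS NOT NULL")
def silver_events():
    return dlt.read_stream("bronze_events").withColumn("ingested_at", F.current_timestamp())
```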
Expert guidance for creating, modifying, and optimizing dbt pipelines for BigQuery. Use this skill whenever the user asks to generate or modify a dbt model or project. Activate this skill when the user:
- Creates, modifies, or troubleshoots **dbt models or pipelines**
- Needs to **optimize SQL** within a dbt project
- Is **setting up a new dbt project** or configuring an existing one
Automates declarative resource creation and provisioning for data pipelines, supporting BigQuery, Dataform, Dataproc, BigQuery Data Transfer Service (DTS), and other resources. It manages environment-specific configurations (dev, staging, prod) through a deployment.yaml file. Use when:
- Modifying or creating deployment.yaml for deployment settings.
- Resolving environment-specific variables (e.g., project IDs, regions) for deployment.
- Provisioning supported infrastructure such as BigQuery datasets/tables, Dataform resources, or DTS resources via deployment.yaml.
Do not use when:
- Resources already exist.
- Managing resources not supported by `gcloud beta orchestration-pipelines resource-types list`.
- Managing general cloud infrastructure (VMs, networks, Kubernetes, IAM policies), which is better suited to Terraform.
- Infrastructure spans multiple cloud providers (AWS, Azure, etc.).
- The project already uses Terraform for the target resources.
This skill helps the agent generate or update orchestration pipeline definitions for Google Cloud Composer, either initializing a new orchestration pipeline or updating an existing definition to orchestrate various data pipelines such as dbt pipelines, notebooks, Spark jobs, Dataform, Python scripts, or inline BigQuery SQL queries. It also helps deploy and trigger orchestration pipelines.
Workflow and best practices for writing Apache Airflow DAGs. Use when the user wants to create a new DAG, write pipeline code, or asks about DAG patterns and conventions. For testing and debugging DAGs, see the testing-dags skill.
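For context, a minimal Airflow DAG using the TaskFlow API might look like the sketch below. The DAG id, schedule, and task logic are illustrative placeholders, not part of the skill itself:

```python
# Minimal sketch of an Airflow DAG using the TaskFlow API (Airflow 2.x).
# DAG id, schedule, and task bodies are illustrative only.
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["example"])
def example_pipeline():
    @task
    def extract() -> list[int]:
        # Stand-in for a real extraction step.
        return [1, 2, 3]

    @task
    def load(rows: list[int]) -> None:
        # Stand-in for a real load step.
        print(f"loaded {len(rows)} rows")

    load(extract())

example_pipeline()
```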
Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, implementing data governance, or troubleshooting data issues.
Build scalable data pipelines, modern data warehouses, and real-time streaming architectures. Uses Apache Spark, dbt, Airflow, and cloud-native data platforms. Use PROACTIVELY for data pipeline design, analytics infrastructure, or modern data stack implementation.
Automatically discover data pipeline and ETL skills when working with ETL, data pipelines, streaming, batch processing, data validation, or pipeline orchestration. Activates for data development tasks.
Create data analytics and data pipeline diagrams using PlantUML syntax with analytics/database stencil icons. Best for ETL pipelines, data lakes, real-time streaming, data warehousing, and BI dashboards. NOT for simple flowcharts (use mermaid) or general cloud infra (use cloud skill).
Provides guidance for writing, packaging and executing Apache Beam pipelines on GCP using Cloud Dataflow. Use when: - Creating an Apache Beam Dataflow pipeline. - Creating a Google Flex Template.
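A minimal Beam pipeline sketch, under the assumption that runner, project, and region are supplied as command-line options (it runs locally with the DirectRunner by default, or on Dataflow when `--runner=DataflowRunner` and GCP options are passed):

```python
# Minimal sketch of an Apache Beam pipeline. No project, bucket, or region is
# hard-coded; those would be assumptions and are left to command-line options.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def run():
    options = PipelineOptions()  # e.g. --runner=DataflowRunner --project=... --region=...
    with beam.Pipeline(options=options) as p:
        (
            p
            | "Create" >> beam.Create(["alpha", "beta", "gamma"])
            | "Upper" >> beam.Map(str.upper)
            | "Print" >> beam.Map(print)
        )

if __name__ == "__main__":
    run()
```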
Use when the user wants to create a dataset, generate synthetic data, or build a data generation pipeline.
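As one possible shape for such a task, here is a small synthetic-data sketch using Faker and pandas; the column names, row count, and output file are arbitrary illustrative choices:

```python
# Minimal sketch of synthetic tabular data generation with Faker and pandas.
# Schema, row count, and output path are arbitrary choices for illustration.
import pandas as pd
from faker import Faker

fake = Faker()
rows = [
    {"user_id": i, "name": fake.name(), "email": fake.email(), "signup": fake.date_this_year()}
    for i in range(100)
]
df = pd.DataFrame(rows)
df.to_csv("synthetic_users.csv", index=False)
```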