Loading...
Loading...
Found 44 Skills
Guides understanding and working with Apache Beam runners (Direct, Dataflow, Flink, Spark, etc.). Use when configuring pipelines for different execution environments or debugging runner-specific issues.
Schema Validator - Auto-activating skill for Data Pipelines. Triggers on: schema validator, schema validator Part of the Data Pipelines skill category.
Master data engineering, ETL/ELT, data warehousing, SQL optimization, and analytics. Use when building data pipelines, designing data systems, or working with large datasets.
Use this skill when building real-time or near-real-time data pipelines. Covers Kafka, Flink, Spark Streaming, Snowpipe, BigQuery streaming, materialized views, and batch-vs-streaming decisions. Common phrases: "real-time pipeline", "Kafka consumer", "streaming vs batch", "low latency ingestion". Do NOT use for batch integration patterns (use integration-patterns-skill) or pipeline orchestration (use data-orchestration-skill).
Data Catalog Updater - Auto-activating skill for Data Pipelines. Triggers on: data catalog updater, data catalog updater Part of the Data Pipelines skill category.
Airflow Operator Creator - Auto-activating skill for Data Pipelines. Triggers on: airflow operator creator, airflow operator creator Part of the Data Pipelines skill category.
Master Node.js streams for memory-efficient processing of large datasets, real-time data handling, and building data pipelines
Python DAG workflow orchestration using Apache Airflow for data pipelines, ETL processes, and scheduled task automation