Loading...
Loading...
Found 67 Skills
Guides understanding and working with Apache Beam runners (Direct, Dataflow, Flink, Spark, etc.). Use when configuring pipelines for different execution environments or debugging runner-specific issues.
Excel to CSV conversion skill. Convert specific bounding tables or entire worksheets within `.xlsx` or `.xls` binary formats into flat `.csv` tabular data. Use this when you find an Excel file and need its data mapped into an accessible format for text analysis, filtering, or programmatic pipelining.
Logstash integration. Manage data, records, and automate workflows. Use when the user wants to interact with Logstash data.
Binary streaming between workers via channels. Use when building data pipelines, file transfers, streaming responses, or any pattern requiring binary data transfer between functions.
Workflow and best practices for writing Apache Airflow DAGs. Use when the user wants to create a new DAG, write pipeline code, or asks about DAG patterns and conventions. For testing and debugging DAGs, see the testing-dags skill.
Expert guidance for working with Dagster and the dg CLI. ALWAYS use before doing any task that requires knowledge specific to Dagster, or that references assets, materialization, or data pipelines. Common tasks may include creating a new project, adding new definitions, understanding the current project structure, answering general questions about the codebase (finding asset, schedule, sensor, component or job definitions), debugging issues, or providing deep information about a specific Dagster concept.
Schema Validator - Auto-activating skill for Data Pipelines. Triggers on: schema validator, schema validator Part of the Data Pipelines skill category.
Pipeline state management for Goldsky Turbo — pause, resume, restart, and delete commands with their rules and safety behavior. Use this skill when the user asks: will deleting my pipeline lose the data already in my postgres/clickhouse table, how do I pause a pipeline while doing database maintenance, how do I restart from block zero to reprocess all historical data, can I update a running streaming pipeline in place or do I have to delete and redeploy, will resuming a paused pipeline pick up from where it left off (checkpoint), how do I re-run a completed job pipeline from the beginning, can I pause or restart a job-mode pipeline. Also covers what happens to checkpoint state on delete, and job auto-deletion 1 hour after termination. For actively diagnosing why a pipeline is broken or erroring, use /turbo-doctor instead.
Use when writing SQL queries, building analytics dashboards, tracking metrics, designing data pipelines, or analyzing user behavior and product usage
RudderStack HTTP integration. Manage data, records, and automate workflows. Use when the user wants to interact with RudderStack HTTP data.
Expert guidance for creating, modifying, and optimizing dbt pipelines for BigQuery. Use this skill whenever user asks for generating or modifying a dbt model or project. Activate this skill when the user - Creates, modifies, or troubleshoots **dbt models or pipelines** - Needs to **optimize SQL** within a dbt project - Is **setting up a new dbt project** or configuring existing one
Provides guidance for writing, packaging and executing Apache Beam pipelines on GCP using Cloud Dataflow. Use when: - Creating an Apache Beam Dataflow pipeline. - Creating a Google Flex Template.