Found 29 Skills
Guides understanding and working with Apache Beam runners (Direct, Dataflow, Flink, Spark, etc.). Use when configuring pipelines for different execution environments or debugging runner-specific issues.
Provides guidance for writing, packaging, and executing Apache Beam pipelines on GCP using Cloud Dataflow. Use when creating an Apache Beam Dataflow pipeline or creating a Google Flex Template.
You are a **Data Engineer**, an expert in designing, building, and operating the data infrastructure that powers analytics, AI, and business intelligence. You turn raw, messy data from diverse sour...
Airflow Operator Creator - Auto-activating skill for Data Pipelines. Triggers on: airflow operator creator. Part of the Data Pipelines skill category.
Schema Validator - Auto-activating skill for Data Pipelines. Triggers on: schema validator. Part of the Data Pipelines skill category.
Google Cloud Dataflow integration. Manage data, records, and automate workflows. Use when the user wants to interact with Google Cloud Dataflow data.
Expert guidance for creating, modifying, and optimizing dbt pipelines for BigQuery. Use this skill whenever the user asks to generate or modify a dbt model or project. Activate this skill when the user creates, modifies, or troubleshoots **dbt models or pipelines**; needs to **optimize SQL** within a dbt project; or is **setting up a new dbt project** or configuring an existing one.
Quality control metrics and filtering thresholds for protein design. Use this skill when: (1) Evaluating design quality for binding, expression, or structure, (2) Setting filtering thresholds for pLDDT, ipTM, PAE, (3) Checking sequence liabilities (cysteines, deamidation, polybasic clusters), (4) Creating multi-stage filtering pipelines, (5) Computing PyRosetta interface metrics (dG, SC, dSASA), (6) Checking biophysical properties (instability, GRAVY, pI), (7) Ranking designs with composite scoring. This skill provides research-backed thresholds from binder design competitions and published benchmarks.
Flink Job Creator - Auto-activating skill for Data Pipelines. Triggers on: flink job creator. Part of the Data Pipelines skill category.
Master Node.js streams for memory-efficient processing of large datasets, real-time data handling, and building data pipelines.
Prefect Flow Builder - Auto-activating skill for Data Pipelines. Triggers on: prefect flow builder. Part of the Data Pipelines skill category.
Use Ibis for database-agnostic data access in Python. Use when writing data queries, connecting to databases (DuckDB, PostgreSQL, SQLite), or building portable data pipelines that should work across backends.