Loading...
Loading...
Found 32 Skills
Data Quality Checker - Auto-activating skill for Data Pipelines. Triggers on: data quality checker, data quality checker Part of the Data Pipelines skill category.
Interactive tutorial that teaches Snowflake Dynamic Tables hands-on. The agent guides users step-by-step through building data pipelines with automatic refresh, incremental processing, and CDC patterns. Use when the user wants to learn dynamic tables, build a DT pipeline, or understand DT vs streams/tasks/materialized views.
Guides understanding and working with Apache Beam runners (Direct, Dataflow, Flink, Spark, etc.). Use when configuring pipelines for different execution environments or debugging runner-specific issues.
Schema Validator - Auto-activating skill for Data Pipelines. Triggers on: schema validator, schema validator Part of the Data Pipelines skill category.
Design ETL workflows with data validation using tools like Pandas, Dask, or PySpark. Use when building robust data processing systems in Python.
Patterns for efficient ML data pipelines using Polars, Arrow, and ClickHouse. TRIGGERS - data pipeline, polars vs pandas, arrow format, clickhouse ml, efficient loading, zero-copy, memory optimization.
Develops data processing pipelines, integrations, and machine learning scenarios in SAP Data Intelligence Cloud. Use when building graphs/pipelines with operators, integrating ABAP/S4HANA systems, creating replication flows, developing ML scenarios with JupyterLab, or using Data Transformation Language functions. Covers Gen1/Gen2 operators, subengines (Python, Node.js, C++), structured data operators, and repository objects.
Pyspark Transformer - Auto-activating skill for Data Pipelines. Triggers on: pyspark transformer, pyspark transformer Part of the Data Pipelines skill category.
Quality control metrics and filtering thresholds for protein design. Use this skill when: (1) Evaluating design quality for binding, expression, or structure, (2) Setting filtering thresholds for pLDDT, ipTM, PAE, (3) Checking sequence liabilities (cysteines, deamidation, polybasic clusters), (4) Creating multi-stage filtering pipelines, (5) Computing PyRosetta interface metrics (dG, SC, dSASA), (6) Checking biophysical properties (instability, GRAVY, pI), (7) Ranking designs with composite scoring. This skill provides research-backed thresholds from binder design competitions and published benchmarks.
Flink Job Creator - Auto-activating skill for Data Pipelines. Triggers on: flink job creator, flink job creator Part of the Data Pipelines skill category.
Apache Airflow workflow orchestration. Use for data pipelines.
Exactly-once processing semantics with distributed coordination for file-based data pipelines. Atomic file claiming, status tracking, and automatic retry with in-memory fallback.