Use this for SQL queries, database schema design, ETL pipelines, data transformations (pandas/Spark), and data validation.
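To make the pandas side of that scope concrete, here is a minimal sketch of a transform step with two validation rules; the file name and column names (orders.csv, order_id, customer_id, amount, order_date) are assumptions for illustration, not part of the skill.

```python
import pandas as pd

# Minimal transform-and-validate sketch. File and column names are
# illustrative assumptions.
def transform_orders(path: str) -> pd.DataFrame:
    df = pd.read_csv(path, parse_dates=["order_date"])
    df = df.dropna(subset=["order_id", "customer_id"])  # keys must exist
    df = df[df["amount"] >= 0]                          # reject negatives
    # Derive a month bucket for downstream aggregation.
    df["order_month"] = df["order_date"].dt.to_period("M").astype(str)
    return df
```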
Data engineering patterns for ETL pipelines, data warehousing, Apache Spark, and data quality validation.
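On the data quality side, a validation gate often boils down to collecting rule violations rather than silently passing bad data downstream. The sketch below assumes hypothetical order_id/customer_id columns and a 1% null threshold.

```python
import pandas as pd

# Hypothetical data quality gate: return a list of rule violations.
# Columns and the 1% threshold are assumptions for illustration.
def validate(df: pd.DataFrame) -> list[str]:
    failures = []
    if df.empty:
        failures.append("dataset is empty")
    if df["order_id"].duplicated().any():
        failures.append("duplicate order_id values")
    null_rate = df["customer_id"].isna().mean()
    if null_rate > 0.01:
        failures.append(f"customer_id null rate {null_rate:.2%} exceeds 1%")
    return failures
```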
Scalable data processing for ML workloads. Streaming execution across CPU/GPU; supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, and TensorFlow. Scales from a single machine to hundreds of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.
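A rough sketch of a Ray Data batch pipeline is below; the bucket paths and the "value" column are placeholders, and score() stands in for real model inference.

```python
import ray

# Ray Data batch pipeline sketch: lazy, streamed, parallel execution.
# Paths and column names are placeholders.
ray.init()
ds = ray.data.read_parquet("s3://bucket/events/")

def score(batch):
    # With the default batch format, `batch` is a dict of NumPy arrays;
    # a real pipeline would run a model here instead of arithmetic.
    batch["score"] = batch["value"] * 0.5
    return batch

ds = ds.map_batches(score)                 # executes lazily, in parallel
ds.write_parquet("s3://bucket/scored/")    # triggers execution
```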
Guides technology selection and implementation of AI and ML features in .NET 8+ applications using ML.NET, Microsoft.Extensions.AI (MEAI), Microsoft Agent Framework (MAF), GitHub Copilot SDK, ONNX Runtime, and OllamaSharp. Covers the full spectrum from classic ML through modern LLM orchestration to local inference. Use when adding classification, regression, clustering, anomaly detection, recommendation, LLM integration (text generation, summarization, reasoning), RAG pipelines with vector search, agentic workflows with tool calling, Copilot extensions, or custom model inference via ONNX Runtime to a .NET project. DO NOT USE when the project targets .NET Framework (this skill requires .NET 8+), when the task is pure data engineering or ETL with no ML/AI component, or when the project needs a custom deep learning training loop (use Python with PyTorch/TensorFlow, then export to ONNX for .NET inference).
Use this skill when designing data warehouses, building star or snowflake schemas, implementing slowly changing dimensions (SCDs), writing analytical SQL for Snowflake or BigQuery, creating fact and dimension tables, or planning ETL/ELT pipelines for analytics. Triggers on dimensional modeling, surrogate keys, conformed dimensions, warehouse architecture, data vault, partitioning strategies, materialized views, and any task requiring OLAP schema design or warehouse query optimization.
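As a hedged illustration of the SCD Type 2 mechanics this skill covers, the sketch below expires changed dimension rows and appends new versions. In a real warehouse this would typically be a MERGE statement; all column names here are assumptions, and brand-new customers are omitted for brevity.

```python
import pandas as pd

# Type 2 SCD update sketch: close the current version of changed rows,
# then append open-ended new versions. Column names are assumptions.
def scd2_apply(dim: pd.DataFrame, incoming: pd.DataFrame,
               as_of: pd.Timestamp) -> pd.DataFrame:
    current = dim[dim["is_current"]]
    merged = current.merge(incoming, on="customer_id", suffixes=("", "_new"))
    changed = merged.loc[merged["address"] != merged["address_new"],
                         "customer_id"]
    # Expire the old versions of changed rows.
    mask = dim["customer_id"].isin(changed) & dim["is_current"]
    dim.loc[mask, "valid_to"] = as_of
    dim.loc[mask, "is_current"] = False
    # Append the new versions, valid until the next change.
    new_rows = incoming[incoming["customer_id"].isin(changed)].copy()
    new_rows["valid_from"] = as_of
    new_rows["valid_to"] = pd.NaT
    new_rows["is_current"] = True
    return pd.concat([dim, new_rows], ignore_index=True)
```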
Converts legacy SQL to modular dbt models. Use when migrating SQL to dbt for: (1) Converting stored procedures, views, or raw SQL files to dbt models (2) Task mentions "migrate", "convert", "legacy SQL", "transform to dbt", or "modernize" (3) Breaking monolithic queries into modular layers (discovers project conventions first) (4) Porting existing data pipelines or ETL to dbt patterns. Checks for existing models/sources, then builds and validates layer by layer.
OmniStudio Data Mapper (formerly DataRaptor) creation and validation with 100-point scoring. Use when building Extract, Transform, Load, or Turbo Extract Data Mappers, mapping Salesforce object fields, or reviewing existing Data Mapper configurations. TRIGGER when: user creates Data Mappers, configures field mappings, works with OmniDataTransform metadata, or asks about DataRaptor/Data Mapper patterns. DO NOT TRIGGER when: building Integration Procedures (use sf-industry-commoncore-integration-procedure), authoring OmniScripts (use sf-industry-commoncore-omniscript), or analyzing cross-component dependencies (use sf-industry-commoncore-omnistudio-analyze).
Quick data freshness check. Use when the user asks if data is up to date, when a table was last updated, if data is stale, or needs to verify data currency before using it.
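A minimal version of such a check might look like the following, assuming timestamps are stored as ISO-8601 UTC strings; the table and column names are placeholders, and sqlite3 stands in for a real warehouse connection.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Freshness probe: is the newest row in `table` younger than max_age?
# Assumes ts_col holds ISO-8601 UTC timestamps; names are placeholders.
def is_fresh(conn: sqlite3.Connection, table: str, ts_col: str,
             max_age: timedelta = timedelta(hours=24)) -> bool:
    row = conn.execute(f"SELECT MAX({ts_col}) FROM {table}").fetchone()
    if row[0] is None:
        return False  # an empty table counts as stale
    last = datetime.fromisoformat(row[0]).replace(tzinfo=timezone.utc)
    return datetime.now(timezone.utc) - last <= max_age
```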
Implement robust batch processing systems with job queues, schedulers, background tasks, and distributed workers. Use when processing large datasets, running scheduled tasks, handling async operations, or performing resource-intensive computations.
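The core job-queue pattern can be sketched with the standard library alone; a production system would reach for Celery, RQ, or a managed cloud queue, but the shape is the same.

```python
import queue
import threading

# In-process job queue with a small worker pool; a stand-in for
# Celery/RQ-style systems, not a production design.
jobs: queue.Queue = queue.Queue(maxsize=100)

def worker() -> None:
    while True:
        job = jobs.get()
        try:
            if job is None:      # sentinel value shuts this worker down
                return
            job()                # each job is a zero-argument callable
        finally:
            jobs.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

jobs.put(lambda: print("processing batch 1"))
jobs.put(lambda: print("processing batch 2"))
jobs.join()                      # wait for all submitted jobs to finish
for _ in threads:
    jobs.put(None)               # one sentinel per worker
for t in threads:
    t.join()
```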
You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective pipelines for batch and streaming data processing.
Use when "Polars", "fast dataframe", "lazy evaluation", "Arrow backend", or asking about "pandas alternative", "parallel dataframe", "large CSV processing", "ETL pipeline", "expression API"
Use when "data pipelines", "ETL", "data warehousing", "data lakes", or asking about "Airflow", "Spark", "dbt", "Snowflake", "BigQuery", "data modeling"