Use this for SQL queries, database schema design, ETL pipelines, data transformations (pandas/Spark), and data validation.
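To make the pandas side of that scope concrete, here is a minimal sketch of a transform step with two validation rules; the file name and column names (orders.csv, order_id, customer_id, amount, order_date) are assumptions for illustration, not part of the skill.

```python
import pandas as pd

# Minimal transform-and-validate sketch. File and column names are
# illustrative assumptions.
def transform_orders(path: str) -> pd.DataFrame:
    df = pd.read_csv(path, parse_dates=["order_date"])
    df = df.dropna(subset=["order_id", "customer_id"])  # keys must exist
    df = df[df["amount"] >= 0]                          # reject negatives
    # Derive a month bucket for downstream aggregation.
    df["order_month"] = df["order_date"].dt.to_period("M").astype(str)
    return df
```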
Data engineering patterns for ETL pipelines, data warehousing, Apache Spark, and data quality validation.
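On the data quality side, a validation gate often boils down to collecting rule violations rather than silently passing bad data downstream. The sketch below assumes hypothetical order_id/customer_id columns and a 1% null threshold.

```python
import pandas as pd

# Hypothetical data quality gate: return a list of rule violations.
# Columns and the 1% threshold are assumptions for illustration.
def validate(df: pd.DataFrame) -> list[str]:
    failures = []
    if df.empty:
        failures.append("dataset is empty")
    if df["order_id"].duplicated().any():
        failures.append("duplicate order_id values")
    null_rate = df["customer_id"].isna().mean()
    if null_rate > 0.01:
        failures.append(f"customer_id null rate {null_rate:.2%} exceeds 1%")
    return failures
```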
Scalable data processing for ML workloads. Streaming execution across CPU/GPU; supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, and TensorFlow. Scales from a single machine to hundreds of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.
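A rough sketch of a Ray Data batch pipeline is below; the bucket paths and the "value" column are placeholders, and score() stands in for real model inference.

```python
import ray

# Ray Data batch pipeline sketch: lazy, streamed, parallel execution.
# Paths and column names are placeholders.
ray.init()
ds = ray.data.read_parquet("s3://bucket/events/")

def score(batch):
    # With the default batch format, `batch` is a dict of NumPy arrays;
    # a real pipeline would run a model here instead of arithmetic.
    batch["score"] = batch["value"] * 0.5
    return batch

ds = ds.map_batches(score)                 # executes lazily, in parallel
ds.write_parquet("s3://bucket/scored/")    # triggers execution
```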
Guides technology selection and implementation of AI and ML features in .NET 8+ applications using ML.NET, Microsoft.Extensions.AI (MEAI), Microsoft Agent Framework (MAF), GitHub Copilot SDK, ONNX Runtime, and OllamaSharp. Covers the full spectrum from classic ML through modern LLM orchestration to local inference. Use when adding classification, regression, clustering, anomaly detection, recommendation, LLM integration (text generation, summarization, reasoning), RAG pipelines with vector search, agentic workflows with tool calling, Copilot extensions, or custom model inference via ONNX Runtime to a .NET project. DO NOT USE when the project targets .NET Framework (this skill requires .NET 8+), when the task is pure data engineering or ETL with no ML/AI component, or when the project needs a custom deep learning training loop (use Python with PyTorch/TensorFlow, then export to ONNX for .NET inference).
Use this skill when designing data warehouses, building star or snowflake schemas, implementing slowly changing dimensions (SCDs), writing analytical SQL for Snowflake or BigQuery, creating fact and dimension tables, or planning ETL/ELT pipelines for analytics. Triggers on dimensional modeling, surrogate keys, conformed dimensions, warehouse architecture, data vault, partitioning strategies, materialized views, and any task requiring OLAP schema design or warehouse query optimization.
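As a hedged illustration of the SCD Type 2 mechanics this skill covers, the sketch below expires changed dimension rows and appends new versions. In a real warehouse this would typically be a MERGE statement; all column names here are assumptions, and brand-new customers are omitted for brevity.

```python
import pandas as pd

# Type 2 SCD update sketch: close the current version of changed rows,
# then append open-ended new versions. Column names are assumptions.
def scd2_apply(dim: pd.DataFrame, incoming: pd.DataFrame,
               as_of: pd.Timestamp) -> pd.DataFrame:
    current = dim[dim["is_current"]]
    merged = current.merge(incoming, on="customer_id", suffixes=("", "_new"))
    changed = merged.loc[merged["address"] != merged["address_new"],
                         "customer_id"]
    # Expire the old versions of changed rows.
    mask = dim["customer_id"].isin(changed) & dim["is_current"]
    dim.loc[mask, "valid_to"] = as_of
    dim.loc[mask, "is_current"] = False
    # Append the new versions, valid until the next change.
    new_rows = incoming[incoming["customer_id"].isin(changed)].copy()
    new_rows["valid_from"] = as_of
    new_rows["valid_to"] = pd.NaT
    new_rows["is_current"] = True
    return pd.concat([dim, new_rows], ignore_index=True)
```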
Converts legacy SQL to modular dbt models. Use when migrating SQL to dbt for: (1) Converting stored procedures, views, or raw SQL files to dbt models (2) Task mentions "migrate", "convert", "legacy SQL", "transform to dbt", or "modernize" (3) Breaking monolithic queries into modular layers (discovers project conventions first) (4) Porting existing data pipelines or ETL to dbt patterns. Checks for existing models/sources, then builds and validates layer by layer.
OmniStudio Data Mapper (formerly DataRaptor) creation and validation with 100-point scoring. Use when building Extract, Transform, Load, or Turbo Extract Data Mappers, mapping Salesforce object fields, or reviewing existing Data Mapper configurations. TRIGGER when: user creates Data Mappers, configures field mappings, works with OmniDataTransform metadata, or asks about DataRaptor/Data Mapper patterns. DO NOT TRIGGER when: building Integration Procedures (use sf-industry-commoncore-integration-procedure), authoring OmniScripts (use sf-industry-commoncore-omniscript), or analyzing cross-component dependencies (use sf-industry-commoncore-omnistudio-analyze).
Quick data freshness check. Use when the user asks if data is up to date, when a table was last updated, if data is stale, or needs to verify data currency before using it.
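A minimal version of such a check might look like the following, assuming timestamps are stored as ISO-8601 UTC strings; the table and column names are placeholders, and sqlite3 stands in for a real warehouse connection.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Freshness probe: is the newest row in `table` younger than max_age?
# Assumes ts_col holds ISO-8601 UTC timestamps; names are placeholders.
def is_fresh(conn: sqlite3.Connection, table: str, ts_col: str,
             max_age: timedelta = timedelta(hours=24)) -> bool:
    row = conn.execute(f"SELECT MAX({ts_col}) FROM {table}").fetchone()
    if row[0] is None:
        return False  # an empty table counts as stale
    last = datetime.fromisoformat(row[0]).replace(tzinfo=timezone.utc)
    return datetime.now(timezone.utc) - last <= max_age
```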
Implement robust batch processing systems with job queues, schedulers, background tasks, and distributed workers. Use when processing large datasets, running scheduled tasks, handling async operations, or performing resource-intensive computations.
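The core job-queue pattern can be sketched with the standard library alone; a production system would reach for Celery, RQ, or a managed cloud queue, but the shape is the same.

```python
import queue
import threading

# In-process job queue with a small worker pool; a stand-in for
# Celery/RQ-style systems, not a production design.
jobs: queue.Queue = queue.Queue(maxsize=100)

def worker() -> None:
    while True:
        job = jobs.get()
        try:
            if job is None:      # sentinel value shuts this worker down
                return
            job()                # each job is a zero-argument callable
        finally:
            jobs.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

jobs.put(lambda: print("processing batch 1"))
jobs.put(lambda: print("processing batch 2"))
jobs.join()                      # wait for all submitted jobs to finish
for _ in threads:
    jobs.put(None)               # one sentinel per worker
for t in threads:
    t.join()
```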
You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective pipelines for batch and streaming data processing.
Use when "Polars", "fast dataframe", "lazy evaluation", "Arrow backend", or asking about "pandas alternative", "parallel dataframe", "large CSV processing", "ETL pipeline", "expression API"
Use when "data pipelines", "ETL", "data warehousing", "data lakes", or asking about "Airflow", "Spark", "dbt", "Snowflake", "BigQuery", "data modeling"