Search Results: etl

Found 91 Skills

Data Processing | k-dense-ai/claude-scienti...

polars

Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.

Data Processing | nvidia/skills

accelerated-computing-cudf

Official NVIDIA-authored guidance for NVIDIA cuDF GPU DataFrames, pandas acceleration, dask-cuDF, ETL, joins, groupby, CSV/Parquet I/O, nullable semantics, and multi-GPU DataFrame workloads.

23 scripts | Attention

Data Processing | alirezarezvani/claude-ski...

senior-data-engineer

Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, implementing data governance, or troubleshooting data issues.

3 scripts | Attention

Data Processing | vamseeachanta/workspace-h...

airflow

Python DAG workflow orchestration using Apache Airflow for data pipelines, ETL processes, and scheduled task automation

Data Processing | patricio0312rev/skills

etl-sync-job-builder

Designs reliable ETL and data synchronization jobs with incremental updates, idempotency guarantees, watermark tracking, error handling, and retry logic. Use for "ETL jobs", "data sync", "incremental sync", or "data pipeline".

Data Processing | claude-office-skills/skil...

data-pipeline

Data pipeline and ETL automation - extract, transform, load workflows for data integration and analytics

Data Processing | jorgealves/agent_skills

python-data-pipeline-designer

Design ETL workflows with data validation using tools like Pandas, Dask, or PySpark. Use when building robust data processing systems in Python.

Data Processing | lanej/dotfiles

bigquery

Use bigquery CLI (instead of `bq`) for all Google BigQuery and GCP data warehouse operations including SQL query execution, data ingestion (streaming insert, bulk load, JSONL/CSV/Parquet), data extraction/export, dataset/table/view management, external tables, schema operations, query templates, cost estimation with dry-run, authentication with gcloud, data pipelines, ETL workflows, and MCP/LSP server integration for AI-assisted querying and editor support. Modern Rust-based replacement for the Python `bq` CLI with faster startup, better cost awareness, and streaming support. Handles both small-scale streaming inserts (<1000 rows) and large-scale bulk loading (>10MB files), with support for Cloud Storage integration.

Data Processing | dkyazzentwatwa/chatgpt-sk...

dataset-comparer

Compare two datasets to find differences, added/removed rows, changed values. Use for data validation, ETL verification, or tracking changes.

1 script | Checked

Data Processing | k1lgor/virtual-company

data-engineer

Use this for SQL queries, database schema design, ETL pipelines, data transformations (pandas/Spark), and data validation.

Data Processing | sharadchaturveda-coder/ag...

agency-data-engineer

Build reliable data pipelines and analytics-ready datasets. USE when cleaning data, designing ETL/ELT, defining contracts, or shipping reproducible data workflows.

Data Processing | daffy0208/ai-dev-standard...

data-engineer

Expert in data pipelines, ETL processes, and data infrastructure
