Found 7 Skills
Delta Lake integration with cloud storage (S3, GCS, Azure). Covers storage_options, PyArrow filesystem, time travel, and partitioned writes.
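A minimal sketch of this pattern with the `deltalake` Python package, assuming S3; the bucket, prefix, and credential values are placeholders, and the exact `storage_options` keys differ per cloud provider:

```python
# Partitioned write plus time travel with the deltalake package on S3.
# Bucket name and credentials below are placeholders (assumptions).
import pandas as pd
from deltalake import DeltaTable, write_deltalake

storage_options = {
    "AWS_REGION": "us-east-1",
    "AWS_ACCESS_KEY_ID": "<key-id>",
    "AWS_SECRET_ACCESS_KEY": "<secret>",
}

# Partitioned write: one directory per value of the "country" column.
df = pd.DataFrame({"country": ["US", "DE"], "amount": [10.0, 20.0]})
write_deltalake(
    "s3://my-bucket/sales",
    df,
    partition_by=["country"],
    mode="append",
    storage_options=storage_options,
)

# Time travel: load an earlier version of the table.
dt = DeltaTable("s3://my-bucket/sales", version=0, storage_options=storage_options)
print(dt.to_pandas())
```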
Expert-level guidance for the Databricks platform, covering Apache Spark, Delta Lake, MLflow, notebooks, and cluster management.
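A hedged sketch of a typical Databricks notebook cell that ties two of these pieces together: reading a Delta table with the notebook-provided `spark` session and logging a metric to MLflow. The table name and metric are illustrative, not part of the skill itself.

```python
# Read a Delta table with Spark and log a metric to MLflow.
# `spark` is provided automatically in Databricks notebooks.
import mlflow

df = spark.read.table("sales.transactions")  # illustrative table name
row_count = df.count()

with mlflow.start_run(run_name="daily-row-count"):
    mlflow.log_metric("row_count", row_count)
```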
Implement end-to-end Medallion Architecture (Bronze/Silver/Gold) lakehouse patterns in Microsoft Fabric using PySpark, Delta Lake, and Fabric Pipelines. Use when the user wants to: (1) design a Bronze/Silver/Gold data lakehouse, (2) set up multi-layer workspace with lakehouses for each tier, (3) build ingestion-to-analytics pipelines with data quality enforcement, (4) optimize Spark configurations per medallion layer, (5) orchestrate Bronze-to-Silver-to-Gold flows via notebooks. Triggers: "medallion architecture", "bronze silver gold", "lakehouse layers", "e2e data pipeline", "end-to-end lakehouse", "data lakehouse pattern", "multi-layer lakehouse", "build medallion", "setup medallion".
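A condensed sketch of one Bronze-to-Silver-to-Gold pass in a Fabric PySpark notebook. Lakehouse paths, table names, and the specific quality rules are assumptions to adapt to the actual workspace layout; `spark` is the session provided by the Fabric notebook runtime.

```python
# Bronze -> Silver -> Gold flow with PySpark and Delta tables (illustrative names).
from pyspark.sql import functions as F

# Bronze: land raw files as-is into a Delta table.
raw = spark.read.option("header", True).csv("Files/landing/orders/*.csv")
raw.write.format("delta").mode("append").saveAsTable("bronze_orders")

# Silver: enforce basic data quality (drop nulls, deduplicate, cast types).
silver = (
    spark.read.table("bronze_orders")
    .dropna(subset=["order_id"])
    .dropDuplicates(["order_id"])
    .withColumn("amount", F.col("amount").cast("double"))
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver_orders")

# Gold: business-level aggregates for reporting.
gold = silver.groupBy("customer_id").agg(F.sum("amount").alias("total_spend"))
gold.write.format("delta").mode("overwrite").saveAsTable("gold_customer_spend")
```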
Columnar file format patterns, including partitioning, predicate pushdown, and schema evolution.
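For illustration, a sketch of partitioned writes and predicate pushdown using Parquet via `pyarrow.dataset` (an assumed choice of format and library); the paths and columns are placeholders.

```python
# Hive-style partitioned Parquet write, then a filtered scan that prunes partitions.
import pyarrow as pa
import pyarrow.dataset as ds

table = pa.table({"event_date": ["2024-01-01", "2024-01-02"], "clicks": [3, 7]})

# Partitioned write: one directory per event_date value.
ds.write_dataset(
    table,
    "data/events",
    format="parquet",
    partitioning=["event_date"],
    partitioning_flavor="hive",
    existing_data_behavior="overwrite_or_ignore",
)

# Predicate pushdown: the filter is applied at scan time, skipping partitions.
dataset = ds.dataset("data/events", format="parquet", partitioning="hive")
print(dataset.to_table(filter=ds.field("event_date") == "2024-01-01"))
```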
Apache Spark distributed computing. Use for large-scale (big data) processing.
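A minimal PySpark sketch for context: start a local session, build a DataFrame, and run an aggregation.

```python
# Smallest useful Spark example: session, DataFrame, groupBy aggregation.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example").getOrCreate()

df = spark.createDataFrame(
    [("alice", 3), ("bob", 5), ("alice", 2)],
    ["user", "events"],
)
df.groupBy("user").agg(F.sum("events").alias("total_events")).show()

spark.stop()
```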
Analyze lakehouse data interactively using Fabric Livy sessions and PySpark/Spark SQL for advanced analytics, DataFrames, cross-lakehouse joins, Delta time-travel, and unstructured/JSON data. Use when the user explicitly asks for PySpark, Spark DataFrames, Livy sessions, or Python-based analysis — NOT for simple SQL queries. Triggers: "PySpark", "Spark SQL", "analyze with PySpark", "Spark DataFrame", "Livy session", "lakehouse with Python", "PySpark analysis", "PySpark data quality", "Delta time-travel with Spark".
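A sketch of code that could run inside such a Spark/Livy session (the session supplies `spark`); the `Tables/` path, table names, and version number are assumptions.

```python
# Delta time travel plus a Spark SQL aggregation inside a Fabric Spark session.
hist = spark.read.format("delta").option("versionAsOf", 5).load("Tables/silver_orders")

# Register the snapshot and analyze it with Spark SQL.
hist.createOrReplaceTempView("orders_v5")
spark.sql("""
    SELECT customer_id, COUNT(*) AS orders
    FROM orders_v5
    GROUP BY customer_id
    ORDER BY orders DESC
    LIMIT 10
""").show()
```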
Using DuckDB with remote cloud storage via HTTPFS extension, fsspec, and Delta Lake integration. Covers S3, GCS, Azure, and S3-compatible endpoints.
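A sketch of querying remote Parquet from DuckDB through the HTTPFS extension; the bucket, region, and credentials are placeholders.

```python
# Query Parquet files on S3 from DuckDB via the httpfs extension.
import duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")
con.execute("SET s3_region = 'us-east-1';")
con.execute("SET s3_access_key_id = '<key-id>';")
con.execute("SET s3_secret_access_key = '<secret>';")

result = con.sql(
    "SELECT COUNT(*) FROM read_parquet('s3://my-bucket/events/*.parquet')"
)
print(result.fetchall())
```

For GCS, Azure, or S3-compatible endpoints, the equivalent SET options (or an fsspec filesystem registered with `con.register_filesystem`) can stand in for the S3 settings above; recent DuckDB releases also ship a delta extension for scanning Delta tables directly.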