Found 20 Skills
Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.
Fast DataFrame library (Apache Arrow). Covers select, filter, group_by, joins, lazy evaluation, CSV/Parquet I/O, and the expression API for high-performance data analysis workflows.
Use when the user mentions "Polars", "fast dataframe", "lazy evaluation", or "Arrow backend", or asks about "pandas alternative", "parallel dataframe", "large CSV processing", "ETL pipeline", or "expression API".
Polars, a fast DataFrame library. Use for fast data processing.
Search 2.4M+ full-text PubMed Central Open Access papers for literature reviews, trends, and data extraction.
Patterns for efficient ML data pipelines using Polars, Arrow, and ClickHouse. TRIGGERS - data pipeline, polars vs pandas, arrow format, clickhouse ml, efficient loading, zero-copy, memory optimization.
Transform raw data into analytical assets using ETL/ELT patterns, SQL (dbt), Python (pandas/polars/PySpark), and orchestration (Airflow). Use when building data pipelines, implementing incremental models, migrating from pandas to polars, or orchestrating multi-step transformations with testing and quality checks.
Extract and analyze Agentforce session tracing data from Salesforce Data 360. Supports high-volume extraction (1-10M records/day), Polars-based analysis, and debugging workflows for agent sessions.
Fast in-process analytical database for SQL queries on DataFrames, CSV, Parquet, JSON files, and more. Use when the user wants to perform SQL analytics on data files or Python DataFrames (pandas, Polars), run complex aggregations, joins, or window functions, or query external data sources without loading them into memory. Best for analytical workloads, OLAP queries, and data exploration.
High-performance data analysis using Polars - load, transform, aggregate, visualize and export tabular data. Use for CSV/JSON/Parquet processing, statistical analysis, time series, and creating charts.
Exploratory Data Analysis skill for CSV and Parquet datasets with deterministic profiling, drift/anomaly scans, contract generation and validation, and optional memory writeback into skill-system-memory. The implementation is Polars-first (lazy scan for large files and early `--sample` head), includes high-cardinality guards for profile/importance/contract flows, and supports categorical correlation with Cramér's V. Use when building or reviewing tabular fraud/risk/data-quality workflows, profiling new datasets, checking leakage or drift, or saving/validating data contracts.