Search Results: pandas

Found 95 Skills

chdb-datastore

Drop-in pandas replacement with ClickHouse performance. Use `import chdb.datastore as pd` (or `from datastore import DataStore`) and write standard pandas code — same API, 10-100x faster on large datasets. Supports 16+ data sources (MySQL, PostgreSQL, S3, MongoDB, ClickHouse, Iceberg, Delta Lake, etc.) and 10+ file formats (Parquet, CSV, JSON, Arrow, ORC, etc.) with cross-source joins. Use this skill when the user wants to analyze data with pandas-style syntax, speed up slow pandas code, query remote databases or cloud storage as DataFrames, or join data across different sources — even if they don't explicitly mention chdb or DataStore. Do NOT use for raw SQL queries, ClickHouse server administration, or non-Python languages.

1 script | Checked

Data Processing | meleantonio/awesome-econ-...

python-panel-data

Panel data analysis with Python using linearmodels and pandas.

Data Processing | terrylica/cc-skills

ml-data-pipeline-architecture

Patterns for efficient ML data pipelines using Polars, Arrow, and ClickHouse. TRIGGERS - data pipeline, polars vs pandas, arrow format, clickhouse ml, efficient loading, zero-copy, memory optimization.

Data Processing | vuralserhat86/antigravity...

data_transform

Transform raw data into analytical assets using ETL/ELT patterns, SQL (dbt), Python (pandas/polars/PySpark), and orchestration (Airflow). Use when building data pipelines, implementing incremental models, migrating from pandas to polars, or orchestrating multi-step transformations with testing and quality checks.

Data Processing | eyadsibai/ltk

polars

Use when "Polars", "fast dataframe", "lazy evaluation", "Arrow backend", or asking about "pandas alternative", "parallel dataframe", "large CSV processing", "ETL pipeline", "expression API"

Data Processing | ancoleman/ai-design-compo...

transforming-data

6 scripts | Checked

Data Processing | letta-ai/skills

modernize-scientific-stack

Guide for modernizing legacy Python 2 scientific computing code to Python 3 with modern libraries. This skill should be used when migrating scientific scripts involving data processing, numerical computation, or analysis from Python 2 to Python 3, or when updating deprecated scientific computing patterns to modern equivalents (pandas, numpy, pathlib).

Data Processing | neversight/skills_feed

csv-data-analyst

Analyze CSV files, generate summary statistics, and create visualizations using Python and pandas. Use when the user uploads, attaches, or references a CSV file, asks to summarize or analyze tabular data, requests insights from CSV data, or wants to understand data structure and quality.

Data Processing | eyadsibai/ltk

dask

Use when "Dask", "parallel computing", "distributed computing", "larger than memory", or asking about "parallel pandas", "parallel numpy", "out-of-core", "multi-file processing", "cluster computing", "lazy evaluation dataframe"

Data Processing | trailofbits/skills-curate...

openai-spreadsheet

Use when tasks involve creating, editing, analyzing, or formatting spreadsheets (`.xlsx`, `.csv`, `.tsv`) using Python (`openpyxl`, `pandas`), especially when formulas, references, and formatting need to be preserved and verified. Originally from OpenAI's curated skills catalog.

4 scripts | Checked

Security & Compliance | mukul975/anthropic-cybers...

analyzing-api-gateway-access-logs

Parses API Gateway access logs (AWS API Gateway, Kong, Nginx) to detect BOLA/IDOR attacks, rate limit bypass, credential scanning, and injection attempts. Uses pandas for statistical analysis of request patterns and anomaly detection. Use when investigating API abuse or building API-specific threat detection rules.

1 script | Checked

Data Processing | tondevrel/scientific-agen...

seaborn

A Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. Great for exploring relationships between variables and visualizing distributions. Use for statistical data visualization, exploratory data analysis (EDA), relationship plots, distribution plots, categorical comparisons, regression visualization, heatmaps, cluster maps, and creating publication-quality statistical graphics from Pandas DataFrames.
