Loading...
Loading...
Found 74 Skills
Analyze datasets to extract insights, identify patterns, and generate reports. Use when exploring data, creating visualizations, or performing statistical analysis. Handles CSV, JSON, SQL queries, and Python pandas operations.
Statistical visualization with pandas integration. Use for quick exploration of distributions, relationships, and categorical comparisons with attractive defaults. Best for box plots, violin plots, pair plots, heatmaps. Built on matplotlib. For interactive plots use plotly; for publication styling use scientific-visualization.
Parallel/distributed computing. Scale pandas/NumPy beyond memory, parallel DataFrames/Arrays, multi-file processing, task graphs, for larger-than-RAM datasets and parallel workflows.
Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.
Distributed computing for larger-than-RAM pandas/NumPy workflows. Use when you need to scale existing pandas/NumPy code beyond memory or across clusters. Best for parallel file processing, distributed ML, integration with existing pandas code. For out-of-core analytics on single machine use vaex; for in-memory speed use polars.
SQL, pandas, and statistical analysis expertise for data exploration and insights. Use when: analyzing data, writing SQL queries, using pandas, performing statistical analysis, or when user mentions data analysis, SQL, pandas, statistics, or needs help exploring datasets.
Automatically generate Excel reports from data sources including CSV, databases, or Python data structures. Supports data analysis reports, business reports, data export, and template-based report generation using pandas and openpyxl. Activate when users mention Excel, spreadsheet, report generation, data export, or business reporting.
Use when working with pandas DataFrames, data cleaning, aggregation, merging, or time series analysis. Invoke for data manipulation, missing value handling, groupby operations, or performance optimization.
Data analysis best practices with pandas, numpy, matplotlib, seaborn, and Jupyter notebooks.
Guidelines for data analysis and Jupyter Notebook development with pandas, matplotlib, seaborn, and numpy.
Panel data analysis with Python using linearmodels and pandas.
Best practices for Pandas data manipulation, analysis, and DataFrame operations in Python