Search Results: dask

Found 9 Skills

Data Processing | k-dense-ai/claude-scienti...

dask

Distributed computing for larger-than-RAM pandas/NumPy workflows. Use when you need to scale existing pandas/NumPy code beyond memory or across clusters. Best for parallel file processing, distributed ML, and integration with existing pandas code. For out-of-core analytics on a single machine, use vaex; for in-memory speed, use polars.

Data Processing | davila7/claude-code-templ...

dask

Parallel/distributed computing. Scale pandas/NumPy beyond memory with parallel DataFrames/Arrays, multi-file processing, and task graphs. For larger-than-RAM datasets and parallel workflows.

Data Processing | eyadsibai/ltk

dask

Use when the request mentions "Dask", "parallel computing", "distributed computing", or "larger than memory", or asks about "parallel pandas", "parallel numpy", "out-of-core", "multi-file processing", "cluster computing", or "lazy evaluation dataframe".

Data Processing | nvidia/skills

accelerated-computing-cudf

Official NVIDIA-authored guidance for NVIDIA cuDF GPU DataFrames, pandas acceleration, dask-cuDF, ETL, joins, groupby, CSV/Parquet I/O, nullable semantics, and multi-GPU DataFrame workloads.

23 scripts/Attention

Data Processing | steadfastasart/geoscience...

xarray

N-dimensional labeled arrays for geoscience data. Read/write NetCDF, work with climate and oceanographic datasets, perform multi-dimensional analysis with labeled coordinates. Use when Claude needs to: (1) Read/write NetCDF or Zarr files, (2) Work with multidimensional arrays with labeled dimensions, (3) Analyze climate, ocean, or atmosphere data, (4) Compute temporal aggregations (daily/monthly/annual means), (5) Perform area-weighted statistics, (6) Process large datasets with Dask, (7) Apply CF conventions to scientific data.

1 scripts/Checked

Data Processing | k-dense-ai/claude-scienti...

polars

Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100 GB datasets, ETL pipelines, and as a faster pandas replacement. For larger-than-RAM data, use dask or vaex.

Data Processing | davila7/claude-code-templ...

zarr-python

Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.

Data Processing | jorgealves/agent_skills

python-data-pipeline-designer

Design ETL workflows with data validation using tools like Pandas, Dask, or PySpark. Use when building robust data processing systems in Python.

Tools & Utilities | davila7/claude-code-templ...

get-available-resources

This skill should be used at the start of any computationally intensive scientific task to detect and report available system resources (CPU cores, GPUs, memory, disk space). It creates a JSON file with resource information and strategic recommendations that inform computational approach decisions such as whether to use parallel processing (joblib, multiprocessing), out-of-core computing (Dask, Zarr), GPU acceleration (PyTorch, JAX), or memory-efficient strategies. Use this skill before running analyses, training models, processing large datasets, or any task where resource constraints matter.

1 scripts/Attention