Search Results: gpu-acceleration

Found 19 Skills

cupynumeric-migration-readiness

Pre-migration readiness assessor for porting NumPy to cuPyNumeric. Use BEFORE substantial porting work begins when the user asks whether code will scale on GPU, whether they should migrate to cuPyNumeric, which NumPy patterns transfer cleanly, what must be refactored before porting, or mentions pre-port assessment, scaling analysis, or refactor planning. Inspect the user's source code, look up NumPy usage, cross-reference the cuPyNumeric API support manifest, and distinguish distributed-scaling-friendly patterns from blockers such as unsupported APIs, scalar synchronization, host round-trips, Python/object-heavy control flow, shape/data-dependent branching, and in-place mutation hazards. Produce a verdict of READY, LIGHT REFACTOR, SIGNIFICANT REFACTOR, or NOT RECOMMENDED, with concrete refactor pointers.

24 scripts/Attention

AI & Machine Learning | adaptyvbio/protein-design...

boltz

Structure prediction using Boltz-1/Boltz-2, an open biomolecular structure predictor. Use this skill when: (1) predicting protein complex structures, (2) validating designed binders, (3) needing an open-source alternative to AF2, (4) predicting protein-ligand complexes, (5) using local GPU resources. For QC thresholds, use protein-qc. For AlphaFold2 prediction, use alphafold. For Chai prediction, use chai.

AI & Machine Learning | g1joshi/agent-skills

xgboost

XGBoost gradient boosting library. Use for tabular ML.

AI & Machine Learning | promptingcompany/nv-skill...

earth2studio-deterministic-forecast

Build deterministic forecast scripts with Earth2Studio (model, data source, IO, inference). Do NOT use for ensemble, diagnostics, data-only fetch, or install.

4 scripts/Checked

AI & Machine Learning | davila7/claude-code-templ...

optimizing-attention-flash

Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training or running transformers with long sequences (>512 tokens), encountering GPU memory issues with attention, or needing faster inference. Supports PyTorch native SDPA, the flash-attn library, H100 FP8, and sliding window attention.

DevOps & Cloud Services | josiahsiegel/claude-plugi...

modal-knowledge

Comprehensive Modal.com platform knowledge covering all features, pricing, and best practices

Frontend Development | dylantarre/animation-prin...

performance-optimization

Use when an animation runs slowly, is janky, or drops frames.

AI & Machine Learning | nvidia/skills

perf-optimization

Performance optimization coordination playbook. Contains specialist routing table, TileIR two-step pipeline, kernel generation specialist selection, prioritization criteria, and safe modification workflow. Use when the user asks to apply optimizations, write kernels, or improve performance. Covers both user-specified optimization and autopilot-driven iterative optimization.

AI & Machine Learning | orchestra-research/ai-res...

faiss

Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). Use for fast k-NN search, large-scale vector retrieval, or when you need pure similarity search without metadata. Best for high-performance applications.

Frontend Development | matthewharwood/fantasy-ph...

webgpu-canvas

WebGPU fundamentals for high-performance canvas rendering. Covers device initialization, buffer management, WGSL shaders, render pipelines, compute shaders, and web component integration. Use when building GPU-accelerated graphics, particle systems, or compute-intensive visualizations.

Data Processing | starlitnightly/omicverse

single-cell-preprocessing-with-omicverse

Walk through omicverse's single-cell preprocessing tutorials to QC PBMC3k data, normalise counts, detect HVGs, and run PCA/embedding pipelines on CPU, CPU–GPU mixed, or GPU stacks.

AI & Machine Learning | tondevrel/scientific-agen...

jax

Composable transformations of Python+NumPy programs. Differentiate, vectorize, JIT-compile to GPU/TPU. Built for high-performance machine learning research and complex scientific simulations. Use for automatic differentiation, GPU/TPU acceleration, higher-order derivatives, physics-informed machine learning, differentiable simulations, and automatic vectorization.
