All Skills

50,676 skills in total; AI & Machine Learning has 8,495.

Showing 12 of 8,495 skills

phoenix-cli

Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, review experiments, and inspect datasets. Use when debugging AI/LLM applications, analyzing trace data, working with Phoenix observability, or investigating LLM performance issues.

AI & Machine Learning | yeachan-heo/oh-my-claudec...

analyze

Deep analysis and investigation

AI & Machine Learning | roelofvheeren/elvison-os

using-superpowers

A complete software development workflow for supercharging agentic coding. Use for EVERY coding task, planning session, or when starting a new project.

AI & Machine Learning | yeachan-heo/oh-my-claudec...

trace

Show agent flow trace timeline and summary

AI & Machine Learning | coval-ai/coval-external-s...

quick-eval

Full evaluation workflow - launch a run, watch progress, and summarize results. Use for end-to-end agent testing.

AI & Machine Learning | nvidia/skills

trtllm-codebase-exploration

Systematic approach to exploring the TensorRT-LLM codebase before implementing new features or optimizations. Teaches how to discover existing infrastructure, trace code paths, and avoid reimplementing what already exists. Derived from real mistakes where ~250 lines of code were written and deleted because existing forward methods weren't discovered upfront. Use when starting any new feature, optimization, or code modification in TRT-LLM.

AI & Machine Learning | nvidia/skills

kernel-tileir-optimization

Optimize existing Triton kernels for NVIDIA TileIR backend on Blackwell GPUs (sm_100+). Adds TileIR-specific autotune configs: occupancy, num_ctas, TMA descriptors. Covers kernel classification (dot-related, norm-like, elementwise, reduction), type-specific transformations, and PTX-vs-TileIR benchmarking. Triggered by: "optimize for TileIR", "add TileIR configs", "Blackwell optimization", "TMA descriptors", "2CTA mode", "occupancy tuning". Kernels use standard `import triton`; TileIR activates via ENABLE_TILE=1 when nvtriton is installed.

AI & Machine Learning | nvidia/skills

perf-optimization

Performance optimization coordination playbook. Contains specialist routing table, TileIR two-step pipeline, kernel generation specialist selection, prioritization criteria, and safe modification workflow. Use when the user asks to apply optimizations, write kernels, or improve performance. Covers both user-specified optimization and autopilot-driven iterative optimization.

AI & Machine Learning | nvidia/skills

recipe-recommender

Recommend and customize Megatron Bridge recipes for a user's model, GPU count, and training goal. Indexes library recipes (pretrain/SFT/PEFT) and performance recipes.

AI & Machine Learning | nvidia/skills

rt-vlm

Use this skill when working with the RTVI VLM or RT-VLM microservice API on VSS 3.1. Generate dense captions and alerts for stored video files and live RTSP streams via `/v1/generate_captions_alerts`; upload media via `/v1/files`; add and remove live streams with `/v1/streams/add` and `/v1/streams/delete/{stream_id}`; call OpenAI-compatible `/v1/chat/completions`; consume Kafka caption, incident, and error topics; or debug rtvi-vlm responses. For deployment, read `references/deploy-rt-vlm-service.md` first.

AI & Machine Learning | nvidia/skills

perf-cpu-offloading

Validate and use CPU offloading in Megatron Bridge, including layer-level activation offloading and fractional optimizer state offloading with HybridDeviceOptimizer.

AI & Machine Learning | nvidia/skills

mlm-bridge-training

Run Megatron-LM (MLM) and Megatron Bridge training with mock or real data. Covers correlation testing, available recipes, and multi-GPU examples.