Total 50,865 skills; AI & Machine Learning has 8,517 skills
Showing 12 of 8,517 skills
Use when the user wants embeddings, vector indexing, retrieval, or retrieval-backed answers, including embedding-agent setup, Chroma-backed collections, collection add/query, and KB-to-answer flows.
Use when the user wants tool use, MCP access, HTTP or streaming API exposure, auto-function helpers, or wait-for-key behavior through Agently-native extension surfaces before resorting to custom wrappers.
Use when the user needs conversation continuity, memo, or restore-after-restart behavior for a request family, including session ids, chat history, request-side memory boundaries, and session-backed continuity.
AgentMail MCP server for email tools in AI assistants. Use when setting up AgentMail with Claude Desktop, Cursor, VS Code, Windsurf, or other MCP-compatible clients. Provides tools for inbox management, sending/receiving emails, and thread handling.
Full evaluation workflow: launch a run, watch progress, and summarize results. Use for end-to-end agent testing.
Systematic approach to exploring the TensorRT-LLM codebase before implementing new features or optimizations. Teaches how to discover existing infrastructure, trace code paths, and avoid reimplementing what already exists. Derived from real mistakes where ~250 lines of code were written and deleted because existing forward methods weren't discovered upfront. Use when starting any new feature, optimization, or code modification in TRT-LLM.
Optimize existing Triton kernels for NVIDIA TileIR backend on Blackwell GPUs (sm_100+). Adds TileIR-specific autotune configs: occupancy, num_ctas, TMA descriptors. Covers kernel classification (dot-related, norm-like, elementwise, reduction), type-specific transformations, and PTX-vs-TileIR benchmarking. Triggered by: "optimize for TileIR", "add TileIR configs", "Blackwell optimization", "TMA descriptors", "2CTA mode", "occupancy tuning". Kernels use standard `import triton`; TileIR activates via ENABLE_TILE=1 when nvtriton is installed.
Performance optimization coordination playbook. Contains specialist routing table, TileIR two-step pipeline, kernel generation specialist selection, prioritization criteria, and safe modification workflow. Use when the user asks to apply optimizations, write kernels, or improve performance. Covers both user-specified optimization and autopilot-driven iterative optimization.
Recommend and customize Megatron Bridge recipes for a user's model, GPU count, and training goal. Indexes library recipes (pretrain/SFT/PEFT) and performance recipes.
Use this skill when working with the RTVI VLM or RT-VLM microservice API on VSS 3.1. Generate dense captions and alerts for stored video files and live RTSP streams via `/v1/generate_captions_alerts`; upload media via `/v1/files`; add and remove live streams with `/v1/streams/add` and `/v1/streams/delete/{stream_id}`; call OpenAI-compatible `/v1/chat/completions`; consume Kafka caption, incident, and error topics; or debug rtvi-vlm responses. For deployment, read `references/deploy-rt-vlm-service.md` first.
Validate and use CPU offloading in Megatron Bridge, including layer-level activation offloading and fractional optimizer state offloading with HybridDeviceOptimizer.
Run Megatron-LM (MLM) and Megatron Bridge training with mock or real data. Covers correlation testing, available recipes, and multi-GPU examples.