Total 50,614 skills; AI & Machine Learning has 8,484 skills
Multi-step video annotation pipeline that turns raw videos into Chain-of-Thought training data — multi-level captions, structured descriptions, and QA pairs (MCQ, binary, open-ended) with reasoning traces, via VLM/LLM distillation. Use when the user wants to "create video training data", "generate video QA datasets", "build CoT reasoning traces from videos", "auto-label videos", or run the video_reasoning_annotation pipeline. Triggers include "video annotation", "video CoT", "video QA", "chain-of-thought", "video captioning pipeline", "video distillation".
Mask2Former for universal image segmentation (panoptic, instance, and semantic). Transformer-based with masked attention for high-quality segmentation results. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Mask2Former model. Trigger phrases include "train Mask2Former", "universal segmentation", "panoptic / instance / semantic segmentation", "masked-attention transformer segmenter".
DGX Cloud Lepton managed GPU compute platform with run/status/cancel interface. Use when submitting TAO jobs to DGX Cloud, dispatching training/eval/inference to Lepton GPU resources, or managing Lepton workspace deployments. Trigger phrases include "run on Lepton", "submit to DGX Cloud", "Lepton job", "managed GPU on DGX Cloud".
Runs the DEFT embed-then-mine workflow for VCN AOI iterations — embeds the gap-analysis target parquet, embeds a source pool, and mines nearest-neighbour source images for downstream augmentation. Use as the immediate next step after `tao-route-visual-changenet-samples` when expanding a real-image augmentation queue from the mining subset.
Integrate TileGym kernels into Hugging Face `transformers` models by replacing selected submodules and class implementations in the library, and by patching the init, forward, and weight-loading methods of selected classes before models are instantiated. Use when the user needs to integrate TileGym kernels into `transformers` models.
Use to run top-level VSS fusion search on archived video, or to ingest video files / RTSP streams for search. Do NOT use for ad-hoc visual Q&A (use vss-ask-video), live captioning (use vss-deploy-dense-captioning), or video summarization and reports (use vss-summarize-video).
Design task-local harnesses, eval gates, and reusable skill extraction for Claude dynamic workflow mode and other adaptive agent harnesses.
Adversarial senior-engineer review for agent-generated plans, designs, and architectures. Treats the current output as junior work, constructs a senior reviewer whose domain expertise comes from live codebase research plus web research of current best practices, diagnoses altitude failures (too vague or too granular), then rewrites the plan into a scoped, state-of-the-art version. Use when the user says "junior to senior", "senior review", "review this like a staff engineer", when a plan feels hand-wavy or lost in details, or before committing to any agent-written plan.
Build an AI agent backend with persistent memory: one Rivet Actor per conversation, queued message handling, and streaming LLM responses as realtime events.
Use this skill PROACTIVELY when the user indicates they are done working, is ending the session or wrapping up, or says goodbye. Also use when significant work has been completed and the user hasn't explicitly asked to continue. Analyzes the session to evolve skills/commands/agents and propagates useful project permissions to global settings.
A shared, file-based town square where multiple coding agents talk, coordinate, and debate — no server required. Use whenever more than one agent works the same repo (parallel Claude Code or Codex sessions, separate git worktrees, a fleet splitting a task) and they must stay out of each other's way or think together. TRIGGER on phrasings like "coordinate with the other agent/session", "post to / check the agora", "ask the other agents", "leave a message for whoever's working on X", "announce what files you're touching", "is anyone else editing this?", or any time you're about to edit shared code while other agents are live. Also trigger when an agent is stuck and wants a peer's second opinion, or when several agents each drafted a design (an API, a schema, an architecture) and the group needs to compare the proposals and converge on the best one. Works for any agent that can run a Python script, not just Claude Code.
Run a two-agent code review: spawn two fresh, clean-context agents that examine the SAME committed branch diff in parallel. One agent runs Codex's native `codex review --base` command, while the other independently reviews the code against Google's "What to look for in a code review" guidance. Merge both outputs into one agreement-ranked report. Use this whenever the user asks for "review-all", a second-opinion review, a dual review, a cross-check before a PR, or a maximum-confidence review of committed branch changes. Do not use it to APPLY fixes; it is review-only.