Found 2,227 Skills
Use when the user asks to "create an evaluator", "create evals", "create a scenario", "write a test scenario", "design a test case", "test my agent", "build eval coverage", "plan a test suite", "create red team tests", "set up test profiles", "configure conditional actions", "write a conditional action evaluator", "build a deterministic test", "design an IVR test", "IVR navigation test", "write a unit test for a voice agent", "build a regression test", "scripted scenario", "scripted voice test", "structured evaluator", "exact flow test", "sequential conditions", "fixed sequence test", or "run evals". Covers individual evaluator design, suite coverage strategy, test profiles, mock-tool data design, conditional actions (deterministic / unit test / regression / IVR navigation flows), and best practices for workflow / red-team / edge-case / deterministic test types.
Security scanner and health check for your AI agent skills tree. Identifies dead skills, missing documentation, and unsafe shell execution paths.
Extend Pydantic AI agents with batteries-included capabilities from pydantic-ai-harness — currently Code Mode, which collapses many tool calls into one sandboxed Python execution. Use when the user mentions pydantic-ai-harness, CodeMode, Monty, code mode, or tool sandboxing, when they want an agent to run agent-written Python, or when a Pydantic AI agent would benefit from orchestrating multiple tool calls in a single sandboxed script.
Loads documents fully into the main agent's context so the agent can answer questions, summarize, or work with that content in subsequent turns. Use whenever the user wants to ingest, read, study, review, absorb, or pull in documents — especially when they say things like "load these docs", "read all of these", "ingest this folder", "pull in these PDFs", "load all docs in X", or paste a list of file paths/URLs and ask you to read them. Handles local files (text, code, markdown, PDFs, notebooks, images), entire folders (recursively), and remote URLs. The skill is single-turn — once the agent reports "DONE", it deactivates until the user invokes it again.
Replace with a trigger-style description of when this skill should activate. Be specific — this is what the agent uses to decide whether to load the skill. Example: "Sui TypeScript SDK integration. Use when writing, reviewing, or debugging TypeScript code that interacts with Sui RPCs, transactions, or on-chain state."
Apply when tempted to ask 'should I do X?' on reversible work. Proceed, present the result, let the human course-correct after the fact; reserve confirmation for irreversible actions.
Design and operate an advanced AI agent memory system on HelixDB using hybrid graph + vector + BM25 search. Use when building long-term memory, user profiles, document/chunk RAG, recall/remember features, memory extraction, deduplication, consolidation, versioning, updating, forgetting/deletion, categorisation, or connector-backed ingestion. Covers tenant-safe Helix data modeling, modality decision rules, the full write/maintain lifecycle, and the product layers an agent must implement around Helix. TypeScript-first (@helix-db/helix-db); a Rust DSL variant is in EXAMPLES.rust.md.
Audit an AI agent skill for security risks before installing or trusting it. Runs a deterministic scanner (regex patterns, Python AST analysis, source-to-sink taint tracking, and YARA signatures) and then reasons about intent — catching prompt injection, credential exfiltration, persistence, memory poisoning, malicious code, supply-chain risks, and description-vs-behavior mismatch. Make sure to use this skill whenever the user wants to scan, audit, vet, review, or check the safety of a skill, plugin, SKILL.md, or agent tool — whether it is a local folder, a zip/.skill file, or a cloned repo — and whenever someone asks "is this skill safe to install?".
Run a two-agent code review: spawn two fresh, clean-context agents that examine the SAME committed branch diff in parallel. One agent runs Codex's native `codex review --base` command, while the other independently reviews the code against Google's "What to look for in a code review" guidance. Merge both outputs into one agreement-ranked report. Use this whenever the user asks for "review-all", a second-opinion review, a dual review, a cross-check before a PR, or a maximum-confidence review of committed branch changes. Do not use it to APPLY fixes; it is review-only.
Prevent feature creep when building software, apps, and AI-powered products. Use this skill when planning features, reviewing scope, building MVPs, managing backlogs, or when a user says "just one more feature." Helps developers and AI agents stay focused, ship faster, and avoid bloated products.
Manage Model Context Protocol (MCP) servers: discover, analyze, and execute tools/prompts/resources from configured MCP servers. Use when working with MCP integrations, discovering available MCP capabilities, filtering MCP tools for specific tasks, executing MCP tools programmatically, accessing MCP prompts/resources, or implementing MCP client functionality. Supports intelligent tool selection, multi-server management, and context-efficient capability discovery.
Analyze agent-user interaction transcripts to identify context network maintenance needs and guidance improvements. Use after significant agent interactions or to improve context networks.