Loading...
Loading...
Found 15 Skills
This skill should be used when the user asks to "create an agent", "add an agent", "write a subagent", "agent frontmatter", "when to use description", "agent examples", "agent tools", "agent colors", "autonomous agent", or needs guidance on agent structure, system prompts, triggering conditions, or agent development best practices for Claude Code plugins.
Build AI agents with Pydantic AI — tools, capabilities, structured output, streaming, testing, and multi-agent patterns. Use when the user mentions Pydantic AI, imports pydantic_ai, or asks to build an AI agent, add tools/capabilities, stream output, define agents from YAML, or test agent behavior.
End-to-end interactive workflow — pick a product, then either run existing tasks and environments (Path A) or set up new ones from docs, suggested tasks, credentials, and templates (Path B). Builds the experiment, attaches signals, and optionally triggers the first iteration. Trigger when users say: "set up an experiment", "create an experiment", "I want to run an experiment", "run my tasks", "setup experiment", "new experiment", "configure an experiment", or "experiment setup".
Deploy prompt-based Azure AI agents from YAML definitions to Azure AI Foundry projects. Use when users want to (1) create and deploy Azure AI agents, (2) set up Azure AI infrastructure, (3) deploy AI models to Azure, or (4) test deployed agents interactively. Handles authentication, RBAC, quotas, and deployment complexities automatically.
Use when evaluating agent performance, building test frameworks, measuring quality, or asking about "agent evaluation", "LLM-as-judge", "agent testing", "quality metrics", "evaluation rubrics", "agent benchmarks"
Creates, modifies, and manages Claude Code subagents by writing agent files with YAML frontmatter, system prompts, and tool configurations. Use when you need to "create an agent", "modify an agent", "set up a specialist", "I need an agent for [task]", "agent to handle [domain]", or "configure agent tools". Covers agent file format, YAML frontmatter, system prompts, tool restrictions, MCP integration, model selection, and testing.
Guidance for creating, running, fixing, and promoting behavioral evaluations. Use when verifying agent decision logic, debugging failures, debugging prompt steering, or adding workspace regression tests.
Use when creating specialized subagents for Claude Code plugins or the Task tool - covers description writing for auto-delegation, tool selection, prompt structure, and testing agents
Integrate a raw customer agent repo with Veris end to end. Installs or verifies veris-cli, logs in, creates or reuses a Veris environment, analyzes the repo, generates or updates `.veris/veris.yaml`, `.veris/Dockerfile.sandbox`, `.veris/.dockerignore`, configures runtime env vars, and can finish with `veris env push`. Use when a repo has no Veris setup yet, or when an existing `.veris/` integration is stale and needs to be refreshed.
Converts CXAS golden evaluations to SCRAPI SimulationEvals test cases. Use when generating high-level, goal-oriented test cases from turn-by-turn evaluation JSONs, and when enriching test expectations with inferred tool calls.
Systematic improvement of existing agents through performance analysis, prompt engineering, and continuous iteration.
Complete workflow for building, implementing, and testing goal-driven agents. Orchestrates hive-* skills. Use when starting a new agent project, unsure which skill to use, or need end-to-end guidance.