Search Results: agent-testing

Found 15 Skills

AI & Machine Learning | aiskillstore/marketplace

agent-development

This skill should be used when the user asks to "create an agent", "add an agent", "write a subagent", "agent frontmatter", "when to use description", "agent examples", "agent tools", "agent colors", "autonomous agent", or needs guidance on agent structure, system prompts, triggering conditions, or agent development best practices for Claude Code plugins.

1 script | Attention

AI & Machine Learning | pydantic/skills

building-pydantic-ai-agents

Build AI agents with Pydantic AI — tools, capabilities, structured output, streaming, testing, and multi-agent patterns. Use when the user mentions Pydantic AI, imports pydantic_ai, or asks to build an AI agent, add tools/capabilities, stream output, define agents from YAML, or test agent behavior.

AI & Machine Learning | promptingcompany/agent-sk...

setup-experiment

End-to-end interactive workflow — pick a product, then either run existing tasks and environments (Path A) or set up new ones from docs, suggested tasks, credentials, and templates (Path B). Builds the experiment, attaches signals, and optionally triggers the first iteration. Trigger when users say: "set up an experiment", "create an experiment", "I want to run an experiment", "run my tasks", "setup experiment", "new experiment", "configure an experiment", or "experiment setup".

AI & Machine Learning | johnmaeda/azure-ai-agent-...

azure-ai-agent-deploy

Deploy prompt-based Azure AI agents from YAML definitions to Azure AI Foundry projects. Use when users want to (1) create and deploy Azure AI agents, (2) set up Azure AI infrastructure, (3) deploy AI models to Azure, or (4) test deployed agents interactively. Handles authentication, RBAC, quotas, and deployment complexities automatically.

8 scripts | Attention

AI & Machine Learning | eyadsibai/ltk

agent-evaluation

Use when evaluating agent performance, building test frameworks, measuring quality, or asking about "agent evaluation", "LLM-as-judge", "agent testing", "quality metrics", "evaluation rubrics", or "agent benchmarks".

AI & Machine Learning | dawiddutoit/custom-claude

manage-agents

Creates, modifies, and manages Claude Code subagents by writing agent files with YAML frontmatter, system prompts, and tool configurations. Use when you need to "create an agent", "modify an agent", "set up a specialist", "I need an agent for [task]", "agent to handle [domain]", or "configure agent tools". Covers agent file format, YAML frontmatter, system prompts, tool restrictions, MCP integration, model selection, and testing.

3 scripts | Attention

AI & Machine Learning | google-gemini/gemini-cli

behavioral-evals

Guidance for creating, running, fixing, and promoting behavioral evaluations. Use when verifying agent decision logic, debugging failures, debugging prompt steering, or adding workspace regression tests.

AI & Machine Learning | ed3dai/ed3d-plugins

creating-an-agent

Use when creating specialized subagents for Claude Code plugins or the Task tool. Covers description writing for auto-delegation, tool selection, prompt structure, and testing agents.

Testing & QA | veris-ai/veris-skills

agent-integration

Integrate a raw customer agent repo with Veris end to end. Installs or verifies veris-cli, logs in, creates or reuses a Veris environment, analyzes the repo, generates or updates `.veris/veris.yaml`, `.veris/Dockerfile.sandbox`, and `.veris/.dockerignore`, configures runtime env vars, and can finish with `veris env push`. Use when a repo has no Veris setup yet, or when an existing `.veris/` integration is stale and needs to be refreshed.

AI & Machine Learning | googlecloudplatform/cxas-...

cxas-sim-eval

Converts CXAS golden evaluations to SCRAPI SimulationEvals test cases. Use when generating high-level, goal-oriented test cases from turn-by-turn evaluation JSONs, and when enriching test expectations with inferred tool calls.

4 scripts | Attention

AI & Machine Learning | sickn33/antigravity-aweso...

agent-orchestration-improve-agent

Systematic improvement of existing agents through performance analysis, prompt engineering, and continuous iteration.

AI & Machine Learning | adenhq/hive

hive

Complete workflow for building, implementing, and testing goal-driven agents. Orchestrates hive-* skills. Use when starting a new agent project, when unsure which skill to use, or when you need end-to-end guidance.
