Search Results: testing

Found 3,730 Skills

evaluation

This skill should be used when the user asks to "evaluate agent performance", "build test framework", "measure agent quality", "create evaluation rubrics", or mentions LLM-as-judge, multi-dimensional evaluation, agent testing, or quality gates for agent pipelines. Part of the context engineering skill suite — also activates when the user mentions "context engineering" or "context-engineering" in the context of measuring agent effectiveness.

🇺🇸 | English | Translated

1 scripts/Checked

Code Quality | vinvcn/mattpocock-skills-...

setup-pre-commit

Set up Husky pre-commit hooks in the current repo, integrating lint-staged (Prettier), type checking, and tests. Use when the user wants to add pre-commit hooks, set up Husky, configure lint-staged, or add commit-time formatting, type checking, or testing.

🇺🇸 | English | Translated

AI & Machine Learning | learnprompt/luban-skill

luban

Luban - Skill Polishing Workshop. Transform a "usable Skill" into a public Skill asset that is "understandable, installable, shareable, verifiable, and continuously evolvable". The methodology consists of five craftsman-like steps: 1. Material Inspection: First challenge whether the premise of this Skill is valid; directly state if the "material" is not worth polishing. 2. Peer Research: Search for similar Skills online to clarify its position in the ecosystem. 3. Dimension Measurement: Evaluate using three metrics - structure, actual testing, and live verification (live verification means reconciling with real running outputs; a green CI can be deceptive). 4. Iterative Refinement: Freeze the original version as a baseline; only retain changes that pass the verification gate, otherwise revert. Try to institutionalize verification methods as tools and rules in the repository. 5. Post-Release Iteration: Release is not the end; maintain a benchmark observation list, and start the next iteration based on real feedback. This tool is used when users want to upgrade, optimize, polish, productize, or release their self-developed Skills. The final deliverables include a structured Skill Polishing Report, directly replaceable rewritten segments, and a shareable "Graduation Certificate" result card that can be screenshotted. Trigger phrases include but are not limited to: "Let Luban take a look at this skill", "Polish at Luban's Workshop", "Polish my skill", "Upgrade my skill", "Optimize this skill", "Skill check-up", "Skill audit", "Productize my skill", "How to release this skill", "Benchmark against similar skills", "Why no one installs my skill", "Help me publish my skill to GitHub/ClawHub", "Improve SKILL.md". Even if users only provide a Skill directory, a GitHub repository link, or a segment of SKILL.md saying "Help me figure out how to modify it", it should be triggered as long as the context is about making the Skill more usable and shareable.
Do NOT use this for creating a new Skill from scratch (use skill-creator), regular code review (use code-review), or rewriting ordinary prompts unrelated to Skill assets.

🇨🇳 | Chinese | Translated

2 scripts/Attention
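
The "Iterative Refinement" step in the luban description above (freeze a baseline, keep only changes that pass a verification gate, otherwise revert) can be sketched roughly as follows. This is an illustrative sketch only, not luban's actual implementation: `refine`, `passes_gate`, and the sample edits are hypothetical names introduced here, and the gate is a stand-in for whatever real verification a Skill author would run.

```python
from typing import Callable, List


def refine(
    baseline: str,
    candidates: List[Callable[[str], str]],
    passes_gate: Callable[[str], bool],
) -> str:
    """Apply candidate edits one at a time, keeping only those that pass the gate.

    All names here are hypothetical; the gate stands in for real verification.
    """
    current = baseline  # the baseline string itself is never modified
    for edit in candidates:
        proposed = edit(current)
        if passes_gate(proposed):
            current = proposed  # retain the change
        # otherwise "revert" by simply not adopting the proposed version
    return current


# Hypothetical usage: one edit that passes the gate, one that fails it.
edits = [
    lambda text: text + "\n## Usage",  # adds a section
    lambda text: "",                   # wipes the document; should be rejected
]
gate = lambda text: len(text) > 0 and text.startswith("# ")
result = refine("# My Skill", edits, gate)
```

Here the second edit is discarded because its output fails the gate, so `result` keeps only the first change on top of the baseline.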

Frontend Development | onmax/nuxt-skills

ts-library

Use when authoring TypeScript libraries - covers project setup, package exports, build tooling (tsdown/unbuild), API design patterns, type inference tricks, testing, and release workflows. Patterns extracted from 20+ high-quality ecosystem libraries.

🇺🇸 | English | Translated

Mobile Development | rshankras/claude-code-app...

generators

Code generator skills that produce production-ready Swift code for common app components. Use when the user wants to add logging, analytics, onboarding, review prompts, networking, authentication, paywalls, settings, persistence, error monitoring, CI/CD pipelines, localization, push notifications, deep linking, testing, accessibility, widgets, or feature flags.

🇺🇸 | English | Translated

2 scripts/Attention

AI & Machine Learning | qodex-ai/ai-agent-skills

autonomous-agent-gaming

Build autonomous game-playing agents using AI and reinforcement learning. Covers game environments, agent decision-making, strategy development, and performance optimization. Use when creating game-playing bots, testing game AI, or building strategic decision-making systems or game theory applications.

🇺🇸 | English | Translated

10 scripts/Checked

Tools & Utilities | wordpress/agent-skills

wordpress-router

Use when the user asks about WordPress codebases (plugins, themes, block themes, Gutenberg blocks, WP core checkouts) and you need to quickly classify the repo and route to the correct workflow/skill (blocks, theme.json, REST API, WP-CLI, performance, security, testing, release packaging).

🇺🇸 | English | Translated

AI & Machine Learning | adaptationio/skrillz

bedrock-agentcore-evaluations

Amazon Bedrock AgentCore Evaluations for testing and monitoring AI agent quality. 13 built-in evaluators plus custom LLM-as-Judge patterns. Use when testing agents, monitoring production quality, setting up alerts, or validating agent behavior.

🇺🇸 | English | Translated

AI & Machine Learning | basecamp/skills

consult-outside-expert

Get a second opinion via Codex MCP. Use for stress-testing ideas, getting fresh perspective, steelmanning arguments, or iteratively refining work through expert back-and-forth. Invoke for ANY request involving external review, feedback, or consultation.

🇺🇸 | English | Translated

Project Management | justinedevs/collection

blueprintkit

Complete project planning and execution framework. Automatically includes all 14 planning sections (planning/0-Master-Index.md through planning/13-Lessons-Learned-Continuous-Improvement.md) plus all 9 Claude Skills (tech-stack-selector, architecture-decisions, code-standards-enforcer, ci-cd-pipeline-builder, agile-executor, project-risk-identifier, automation-orchestrator, webapp-testing, web-artifacts-builder). When installed, all planning templates and execution skills are immediately available.

🇺🇸 | English | Translated

10 scripts/Attention

AI & Machine Learning | michaelboeding/skills

style-guide

Analyze a codebase to extract its conventions, patterns, and style. Spawns specialized analyzer agents that each focus on one aspect (structure, naming, patterns, testing, frontend). Generates a comprehensive style guide that other skills can reference. Use when starting work on an unfamiliar codebase, or to create explicit documentation of implicit conventions.

🇺🇸 | English | Translated

AI & Machine Learning | alirezarezvani/claude-cod...

prompt-factory

World-class prompt powerhouse that generates production-ready mega-prompts for any role, industry, and task through an intelligent 7-question flow, 69 comprehensive presets across 15 professional domains (technical, business, creative, legal, finance, HR, design, customer, executive, manufacturing, R&D, regulatory, specialized-technical, research, creative-media), multiple output formats (XML/Claude/ChatGPT/Gemini), quality validation gates, and contextual best practices from OpenAI/Anthropic/Google. Supports both core and advanced modes with testing scenarios and prompt variations.

🇺🇸 | English | Translated

4 scripts/Checked