Loading...
Loading...
Found 65 Skills
Generates eval test cases from an eval suite plan (output of /eval-suite-planner) or a plain-English agent description. Supports both single-response and conversation (multi-turn) evaluation modes. Outputs a Copilot Studio test set table, a CSV file for import (single-response only), and a docx report for human review.
A comprehensive verification system for Claude Code sessions.
Help users ship products faster and with higher quality. Use when someone is planning a launch, struggling to release features, dealing with shipping velocity issues, or trying to establish better release practices.
Amazon Bedrock AgentCore Evaluations for testing and monitoring AI agent quality. 13 built-in evaluators plus custom LLM-as-Judge patterns. Use when testing agents, monitoring production quality, setting up alerts, or validating agent behavior.
Use when working with tdd workflows tdd cycle
Analyze gaps between implementation plans and actual codebase implementation for the Rust self-learning memory project
Help teams continuously improve skills, automatically identify skill optimization opportunities (such as missing essential information, format issues, version update needs, etc.), execute secure update processes (backup, modification, testing, restoration), and ensure skill quality keeps improving as the project progresses
Build comprehensive, mobile-compatible Obsidian study vaults from academic course materials with checkpoint-based workflow, error pattern recognition, and quality assurance. Battle-tested patterns from 828KB/37-file projects. Works across all subjects - CS, medicine, business, self-study.
Comprehensive testing and validation strategies for spec-driven development. Learn phase-specific validation techniques, quality gates, and testing approaches to ensure high-quality implementation.
Coordinator workflow for orchestrating dockeragents through fix-review-iterate-present loop. Use when delegating any task that produces code changes. Ensures agents achieve 10/10 quality before presenting to human.
Comprehensive autonomous development strategies including milestone planning, incremental implementation, auto-debugging, and continuous quality assurance for full development lifecycle management
自动生成全面的测试套件,包括单元测试、集成测试和E2E测试,支持Jest、Vitest、pytest、xUnit等。