Found 2 Skills
Visualizes whether skills, rules, and agent definitions are actually followed: auto-generates scenarios at 3 prompt strictness levels, runs agents, classifies behavioral sequences, and reports compliance rates with full tool-call timelines.
Evaluates and optimizes agent skills using a DSPy-powered GEPA (Generate/Evaluate/Propose/Apply) loop. Loads scenario YAML files as DSPy datasets, scores outputs with pattern-matching metrics, and optimizes prompts via BootstrapFewShot or MIPROv2 teleprompters. Also generates new scenario YAML files from skill descriptions.