Found 33 Skills
Converts CXAS golden evaluations to SCRAPI SimulationEvals test cases. Use when generating high-level, goal-oriented test cases from turn-by-turn evaluation JSONs, and when enriching test expectations with inferred tool calls.
Systematic improvement of existing agents through performance analysis, prompt engineering, and continuous iteration.
Use when creating or configuring Claude Code agents and their frontmatter.
Use when creating or editing any prompt (commands, hooks, skills, subagent instructions) to verify that it produces the desired behavior. Applies the RED-GREEN-REFACTOR cycle to prompt engineering, using subagents for isolated testing.
Use when creating or editing skills, before deployment, to verify that they work under pressure and resist rationalization. Applies the RED-GREEN-REFACTOR cycle to process documentation: run a baseline without the skill, write the skill to address the failures, then iterate to close loopholes.
Use when the user says "get started with Cekura", "set up Cekura", "onboard to Cekura", "I'm new to Cekura", "help me set up my agent", "how do I use Cekura", "walk me through Cekura", "configure my project", "first time using Cekura", or needs guidance on initial platform setup. Covers two onboarding paths: **testing** (default — build evaluators and run simulated calls) and **observability** (ingest production call logs and evaluate them).
Implementation guidance for creating individual agents in the Arcanea system with proper structure, capabilities, and integration.
Agent testing methodology: run agents with test inputs, observe the outputs, and iterate until the outputs are accurate and well-structured.
This skill should be used when the user asks to "test the triage skill", "run triage tests", "validate antithesis triage", "test:triage", or "smoke test triage". Orchestrates end-to-end testing of the antithesis-triage skill by running real triage operations via sub-agents and reviewing the results for bugs, skill compliance issues, and papercuts.