Found 31 Skills
Autonomously optimize an existing AI skill by running it repeatedly against binary evals, mutating one instruction at a time, and keeping only changes that improve pass rate. Based on Karpathy-style autoresearch, but applied to SKILL.md iteration instead of ML training. Use when optimizing a skill, benchmarking prompt quality, building evals for a skill, or running self-improvement loops on reusable agent instructions. Triggers on: skill-autoresearch, optimize this skill, improve this skill, benchmark this skill, eval my skill, run autoresearch on this skill, self-improve skill.
Continuous self-improvement through structured reflection and memory
Monitors context window health throughout a session and rides peak context quality for maximum output fidelity. Activates automatically after plan-interview and intent-framed-agent. Stays active through execution and hands off cleanly to simplify-and-harden and self-improvement when the wave completes naturally or exits via handoff. Use this skill whenever a multi-step agent task is underway and session continuity or context drift is a concern. Especially important for long-running tasks, complex refactors, or any work where degraded context would silently corrupt the output. Trigger even if the user doesn't say "context surfing" — if an agent task is running across multiple steps with intent and a plan already established, this skill is live.
Pipeline orchestrator that classifies incoming coding tasks and routes them through the correct combination of skills in the right order at the right depth. Auto-activates on any coding task. Centralizes the decision logic for which skills to use, how deep each goes, and how artifacts pass between them. Handles three pipeline variants: standard (plan-interview, intent-framed-agent, context-surfing, simplify-and-harden, self-improvement), team-based (agent-teams-simplify-and-harden), and CI (simplify-and-harden-ci, self-improvement-ci). Use this skill whenever starting any coding work — it determines the appropriate pipeline depth and variant automatically. Does not replace individual skills; dispatches to them.
Start a repo-local OptimizeSpec self-improvement change. Use when the user wants to create evals, optimize an agent with GEPA, define an agent self-improvement loop, or begin an ASI-first evaluation workflow.
Use when the user asks to "improve my agent", "self-improving agent", "auto-tune my agent", "iterate on my agent prompt", "fix my agent based on test results", "close the loop on agent quality", "auto-improve agent prompt", "use eval results to improve agent", "optimize my prompt based on failures", "rewrite my prompt", or describes agent self-improvement, prompt iteration from run results, or automated agent quality loops. Covers the full diagnose → propose → apply → re-validate loop for VAPI agents (squads + tool definitions) and for self-hosted agents (custom websocket servers, including the offline / pasted-prompt degenerate variant).
Anthropic's method for training harmless AI through self-improvement. Two-phase approach: supervised learning with self-critique/revision, then RLAIF (RL from AI Feedback). Use for safety alignment and for reducing harmful outputs without human labels. Powers Claude's safety system.
Add persistent learning and self-improvement to AI agents using the ACE framework.
HOWL v2 — Hunt, Optimize, Win, Learn. Nightly self-improvement loop for the WOLF autonomous trading strategy. Runs once per day (via cron) to review all trades from the last 24 hours, compute win rates, analyze signal quality correlation, evaluate DSL tier performance, identify missed opportunities, and produce concrete improvement suggestions for the wolf-strategy skill. v2 adds fee drag ratio (FDR) analysis, holding period bucketing, LONG vs SHORT regime detection, rotation cost tracking, cumulative drift detection, and gross vs net profit factor separation. Use when setting up daily trade review automation, analyzing trading performance, or improving an autonomous trading strategy through data-driven feedback loops. Requires Senpi MCP connection, mcporter CLI, and OpenClaw cron system.
Workflow orchestration for complex coding tasks. Use for ANY non-trivial task (3+ steps or architectural decisions) to enforce planning, subagent strategy, self-improvement, verification, elegance, and autonomous bug fixing. Triggers: multi-step implementation, bug fixes, refactoring, architectural changes, or any task requiring structured execution.
Encodes a continuous improvement loop for goal-seeking agents: EVAL, ANALYZE, RESEARCH (hypothesis + evidence + counter-arguments), IMPROVE, RE-EVAL, DECIDE. Auto-commits improvements (+2% net, no regression >5%) and reverts failures. Works with all 4 SDK implementations. Auto-activates on "improve agent", "self-improving loop", "agent eval loop", "benchmark agents", "run improvement cycle".
Use when a session produced reusable insights, when the user says "learn from this", "remember this", or "improve yourself", or after completing a complex task where patterns were discovered.
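
The continuous improvement loop entry above states its commit rule only as "+2% net, no regression >5%". A minimal sketch of one way to read that rule follows. It assumes scores are per-category pass rates in percentage points, that "net" means the mean change across categories, and that "regression" means the largest drop in any single category. The function name `decide` and both parameter names are hypothetical and do not come from the skill itself.

```python
from typing import Dict


def decide(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    min_net_gain: float = 2.0,    # assumed reading of "+2% net"
    max_regression: float = 5.0,  # assumed reading of "no regression >5%"
) -> str:
    """Return "commit" or "revert" for a candidate change.

    Both dicts map an eval category name to a pass rate in percentage points.
    """
    if not baseline or baseline.keys() != candidate.keys():
        raise ValueError("baseline and candidate must share the same non-empty categories")

    # Mean change across categories (positive means improvement).
    net_gain = sum(candidate[k] - baseline[k] for k in baseline) / len(baseline)

    # Largest drop in any single category (positive means regression).
    worst_drop = max(baseline[k] - candidate[k] for k in baseline)

    if net_gain >= min_net_gain and worst_drop <= max_regression:
        return "commit"
    return "revert"


if __name__ == "__main__":
    before = {"planning": 80.0, "tool_use": 70.0}
    after = {"planning": 86.0, "tool_use": 68.0}
    print(decide(before, after))
```

Under these assumptions, a large gain in one category cannot mask a collapse in another, because the regression check is applied per category rather than to the average.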