Found 22 Skills
Autonomous LLM training optimization with GPU support. Runs 5-minute training experiments, measures val_bpb, keeps improvements or reverts, and repeats forever. Use this skill when the user asks to "train a model autonomously", "optimize LLM training", "run ML experiments", "autoresearch with GPU", "optimize val_bpb", "autonomous ML training", "LLM pretraining loop", "setup ML autoresearch", "GPU training experiments", "pretrain from scratch", "speed up training", "lower my loss", "GPU optimization", "CUDA training", or mentions "train.py", "prepare.py", "bits per byte", "val_bpb", "NVIDIA GPU training", "RTX training", "H100 training", "autonomous model training", "consumer GPU training", "low VRAM training". Always use this skill when the user wants to autonomously optimize any ML training metric.
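A minimal sketch of the keep-or-revert loop this entry describes, assuming train.py prints its final val_bpb to stdout; `run_training`, `apply_one_mutation`, and the output format are illustrative stand-ins, not the skill's actual interface:

```python
import shutil
import subprocess

def apply_one_mutation(path: str) -> None:
    """Placeholder: in the real loop an LLM edits one hyperparameter or
    code path in `path`; stubbed here so the sketch is self-contained."""

def run_training(timeout_s: int = 300) -> float:
    """Launch one 5-minute run of train.py and parse val_bpb from its
    output. Assumes train.py prints a final line like 'val_bpb: 1.234'."""
    out = subprocess.run(["python", "train.py"], capture_output=True,
                         text=True, timeout=timeout_s)
    for line in reversed(out.stdout.splitlines()):
        if line.startswith("val_bpb:"):
            return float(line.split(":", 1)[1])
    raise RuntimeError("no val_bpb in training output")

best_bpb = run_training()                      # baseline before any edits
while True:                                    # "repeat forever"
    shutil.copy("train.py", "train.py.bak")    # snapshot for a clean revert
    apply_one_mutation("train.py")
    bpb = run_training()
    if bpb < best_bpb:                         # lower bits-per-byte is better
        best_bpb = bpb                         # keep the improvement
    else:
        shutil.move("train.py.bak", "train.py")  # revert the mutation
```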
Run metric-driven iterative optimization loops. Define a measurable goal, build measurement scaffolding, then run parallel experiments that try many approaches, measure each against hard gates and/or LLM-as-judge quality scores, keep improvements, and converge toward the best solution. Use when optimizing clustering quality, search relevance, build performance, prompt quality, or any measurable outcome that benefits from systematic experimentation. Inspired by Karpathy's autoresearch, generalized for multi-file code changes and non-ML domains.
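One way the hard-gate-plus-judge scoring could be wired up; the `Candidate` shape, `passes_hard_gates`, and `llm_judge_score` are hypothetical, sketched only to show that binary gates filter before the judge ranks survivors:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Candidate:
    name: str
    output: str  # artifact produced by one experiment

def passes_hard_gates(c: Candidate) -> bool:
    # Hard gates are binary: any single failure disqualifies the candidate.
    gates: list[Callable[[Candidate], bool]] = [
        lambda c: len(c.output) > 0,         # produced something at all
        lambda c: "ERROR" not in c.output,   # ran cleanly
    ]
    return all(gate(c) for gate in gates)

def llm_judge_score(c: Candidate) -> float:
    """Placeholder for an LLM-as-judge call returning a 0-10 quality
    score; the real skill would prompt a model here."""
    return 0.0

def best_candidate(candidates: list[Candidate]) -> Candidate | None:
    # Gates filter first; the judge only ranks what survives them.
    survivors = [c for c in candidates if passes_hard_gates(c)]
    return max(survivors, key=llm_judge_score, default=None)
```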
Autonomously optimize an existing AI skill by running it repeatedly against binary evals, mutating one instruction at a time, and keeping only changes that improve pass rate. Based on Karpathy-style autoresearch, but applied to SKILL.md iteration instead of ML training. Use when optimizing a skill, benchmarking prompt quality, building evals for a skill, or running self-improvement loops on reusable agent instructions. Triggers on: skill-autoresearch, optimize this skill, improve this skill, benchmark this skill, eval my skill, run autoresearch on this skill, self-improve skill.
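A rough sketch of the pass-rate gate, with the eval runner and the one-instruction mutator stubbed out; the eval-case shape and all helper names here are assumptions, not the skill's real format:

```python
from pathlib import Path

def run_one_eval(skill_md: str, case: dict) -> bool:
    """Placeholder: run the agent with `skill_md` loaded on one eval case
    and check its binary success condition."""
    return False

def mutate_one_instruction(skill_md: str) -> str:
    """Placeholder: an LLM rewrites exactly one instruction in the skill."""
    return skill_md

def pass_rate(skill_md: str, evals: list[dict]) -> float:
    # Each eval is strictly pass/fail; the score is the passing fraction.
    return sum(run_one_eval(skill_md, case) for case in evals) / len(evals)

skill_path = Path("SKILL.md")
evals = [{"prompt": "...", "check": "..."}]   # assumed eval-case shape
baseline = pass_rate(skill_path.read_text(), evals)

candidate = mutate_one_instruction(skill_path.read_text())
if pass_rate(candidate, evals) > baseline:    # strict improvement only
    skill_path.write_text(candidate)          # keep the winning mutation
```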
[Hyper] Optimize an existing codebase through baseline-first experiments, binary evaluation, and one-mutation-at-a-time iteration. Use for codebase autoresearch, measured bottleneck reduction, benchmarked code optimization, and evidence-backed refactors.
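A sketch of the baseline-first discipline under assumed names (`hot_path.py` stands in for whatever bottleneck is being measured): the baseline is benchmarked before any edit, and a single mutation is kept only if the test suite still passes and the median time drops:

```python
import statistics
import subprocess
import time

def benchmark(cmd: list[str], runs: int = 5) -> float:
    """Median wall time of `cmd` over several runs; the baseline is always
    measured before any mutation, so wins are evidence-backed."""
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        times.append(time.perf_counter() - t0)
    return statistics.median(times)

def tests_pass() -> bool:
    # Binary evaluation: the mutation is rejected outright if any test fails.
    return subprocess.run(["pytest", "-q"], capture_output=True).returncode == 0

workload = ["python", "hot_path.py"]   # assumed workload command
baseline = benchmark(workload)         # measure BEFORE touching the code
# ... apply exactly one mutation to the codebase here ...
if tests_pass() and benchmark(workload) < baseline:
    print("keep mutation")             # faster, and correctness preserved
else:
    print("revert mutation")           # a git checkout restores the tree
```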
Use when the user wants autonomous iteration on any task: improving metrics, completing features, running experiments, optimizing code, or working unattended. Make sure to use this skill whenever someone mentions autoresearch, autonomous loops, iterating until done, running overnight, keep improving, hill-climbing, or any measurable improvement goal, even if they don't explicitly ask for a 'loop'.
Run Karpathy-style autoresearch optimization on any content. Generates 50+ variants, scores with a 5-expert simulated panel, evolves winners through multiple rounds, outputs optimized version + full experiment log. Use when optimizing landing pages, email sequences, ad copy, headlines, form pages, CTA text, or any conversion-focused content. Triggers on "optimize this page", "run autoresearch", "score these variants", "A/B test this copy".
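The generate-score-evolve shape of that loop, with the expert panel and the variant generator stubbed as placeholders; round counts and population sizes are illustrative:

```python
import random

def panel_score(variant: str) -> float:
    """Placeholder for the simulated 5-expert panel; the real skill
    prompts five judge personas and averages their scores."""
    return random.random()

def generate_variants(seed: str, n: int) -> list[str]:
    """Placeholder: an LLM rewrites `seed` n different ways."""
    return [seed] * n

content = "original landing-page copy"
population = generate_variants(content, 50)     # 50+ initial variants
for _ in range(3):                              # several evolution rounds
    ranked = sorted(population, key=panel_score, reverse=True)
    winners = ranked[:5]                        # survivors of this round
    # The next round mutates the winners rather than starting over.
    population = winners + [v for w in winners
                              for v in generate_variants(w, 9)]
best = max(population, key=panel_score)         # final optimized version
```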
Resume a paused experiment. Check out the experiment branch, read the results history, and continue iterating.
Run a single experiment iteration. Edit the target file, evaluate, keep or discard.
Structured prompts, vault templates, and autonomous research workflows for AI-assisted genealogy using Claude Code.
Self-directed iterative research skill for Codex that continuously cycles through modify, verify, and retain or discard, repeating until a measurable goal is reached.