Search Results: experiment-automation

Found 6 Skills

ce-optimize

Run metric-driven iterative optimization loops. Define a measurable goal, build measurement scaffolding, then run parallel experiments that try many approaches, measure each against hard gates and/or LLM-as-judge quality scores, keep improvements, and converge toward the best solution. Use when optimizing clustering quality, search relevance, build performance, prompt quality, or any measurable outcome that benefits from systematic experimentation. Inspired by Karpathy's autoresearch, generalized for multi-file code changes and non-ML domains.

Automation | alirezarezvani/claude-ski...

loop

Start an autonomous experiment loop with user-selected interval (10min, 1h, daily, weekly, monthly). Uses CronCreate for scheduling.

AI & Machine Learning | aradotso/claude-code-skil...

aris-autonomous-ml-research

Use ARIS (Auto-Research-In-Sleep) for autonomous ML research — idea generation, paper review, experiment automation, and cross-model collaboration with Claude Code, Codex, or any LLM agent.

AI & Machine Learning | nvidia/skills

auto-research

Autonomous NeMo-RL research agent workflow for directed hypothesis testing and open-ended discovery. Guides agents through the full experiment lifecycle: understanding recipes and environments, wiring RL or NeMo-gym runs, launching reproducible baselines and iterations, analyzing results, preserving human oversight, and using git plus TSV logs as the research ledger.

AI & Machine Learning | zy-ning/oh_my_co-research...

experiment

Runs ML experiments reproducibly — single runs or autonomous BFS batches. Single mode: isolated venv, time-budgeted, failure-handled, logs to RESEARCH.md. BFS mode (opt-in): designs N hypotheses, runs each for a fixed budget, compares via a single verifiable metric, keeps improvements and git-resets failures — fully autonomous until done. Respects the RESEARCH.md supervision policy for notifications, approvals, and stop limits. Trigger phrases: "run experiment", "train model", "explore design space", "find best config", "autoresearch".

Automation | dabiggm0e/autoresearch-op...

autoresearch

Set up and run an autonomous experiment loop for any optimization target. Use when asked to start autoresearch or run experiments.
