Found 26 Skills
Best practices for writing AI research papers. Use when the project involves writing a research paper in the AI field.
Guides research engineering and science on LLM tokens—hypotheses about context use, tokenization, compression, and inference efficiency; rigorous benchmarks (tokens per task, quality–cost Pareto); ablation design; instrumentation and reproducible logs; and research memos that inform product decisions. Use when designing token-efficiency experiments, measuring context utilization, comparing compression or routing methods, analyzing tokenizer effects, or writing technical reports on token/cost trade-offs—not for phased cost roadmaps and owners (ai-token-improvement-plan-engineer), production context pipeline implementation (ai-context-engineer), single-prompt edits (prompt-engineer), general non-token AI research (ai-researcher), or shipping features (ai-engineer).
Defines a testable hypothesis with clear success metrics and validation approach. Use when forming assumptions to test, designing experiments, or aligning team on what success looks like.
Use after solution concepts exist to surface and prioritize assumptions behind outcomes, opportunities, or solution ideas and design experiments to test them.
OKR trees, KPI dashboards, North Star Metric, leading/lagging indicators, and experiment design. Use when setting team goals, defining success metrics, building measurement frameworks, or designing A/B experiment guardrails.
Design, plan, and analyze A/B tests with statistical rigor. Use when the user asks about A/B testing, split testing, experiment design, statistical significance, sample size calculation, test duration, multivariate testing, or conversion experiments. Trigger phrases include "A/B test", "split test", "experiment", "statistical significance", "sample size", "test duration", "which version wins", "conversion experiment", "hypothesis test", "variant testing".
Hypothesis → Prediction → Test → Revise with explicit falsification. Use for debugging, feature experimentation, performance investigation, and A/B testing design.
Guides pre-writing planning for academic papers with 4 structured steps: story design (task-challenge-insight-contribution-advantage), experiment planning (comparisons + ablations), figure design (pipeline + teaser), and 4-week timeline management. Includes counterintuitive planning tactics (write a mock rejection letter to identify weaknesses before writing, narrow before broad claims, design ablations first). Use when: user wants to plan a paper before writing, design story/contributions, plan experiments, create figure sketches, set a writing timeline, or write a pre-emptive rejection letter for planning purposes. Do NOT use for actual writing (use paper-writing), running experiments (use experiment-pipeline), self-reviewing a finished draft (use paper-review), or finding research problems (use research-ideation).
Use this skill when applying Jobs-to-be-Done, building opportunity solution trees, mapping assumptions, or validating product ideas. Triggers on product discovery, JTBD, jobs-to-be-done, opportunity solution trees, assumption mapping, experiment design, prototype testing, and any task requiring product discovery methodology.
Audit whether an ML or AI paper's experimental baselines are necessary, fair, current, and reviewer-proof. Use this skill whenever the user is planning experiments, comparing methods, choosing baselines, worried about missing SOTA or unfair comparisons, preparing a reviewer-proof experiment section, or converting a literature review into must-have, should-have, optional, and not-comparable baselines.
Product analytics expert using PostHog MCP. Triggers on requests to understand user behavior, surface insights, create dashboards, analyze funnels, track metrics, set up experiments, or answer questions about product performance. Use when working with PostHog data, discussing analytics strategy, investigating user journeys, retention, conversion, feature adoption, or when asked to help understand what's happening in the product.
Use when main results pass result-to-claim (claim_supported=yes or partial) and ablation studies are needed for paper submission. Codex designs ablations from a reviewer's perspective, CC reviews feasibility and implements.