Found 1,065 Skills
Runs external LLM code reviews (OpenAI Codex or Google Gemini CLI) on uncommitted changes, branch diffs, or specific commits. Use when the user asks for a second opinion, external review, codex review, gemini review, or mentions /second-opinion.
AI-powered design review for Figma components with weighted dual-scoring system. Evaluates Style Guide Implementation (70%) and LLM Metadata Accessibility (30%). For export, hands off to atomic-design skill.
Execute complex tasks through sequential sub-agent orchestration with intelligent model selection and LLM-as-a-judge verification.
MUST READ before running any ADK evaluation. ADK evaluation methodology — eval metrics, evalset schema, LLM-as-judge, tool trajectory scoring, and common failure causes. Use when evaluating agent quality, running adk eval, or debugging eval results. Do NOT use for API code patterns (use adk-cheatsheet), deployment (use adk-deploy-guide), or project scaffolding (use adk-scaffold).
Architect for AI-based digital solutions. Two modes: (1) ANALYZE existing repositories or code and explain their architecture to any audience, including people without technical knowledge. (2) DESIGN the complete architecture of new systems that use LLMs, RAG, agents, or fine-tuning. Use this skill when the user mentions: AI architecture, LLM system design, architectural layers, RAG architecture, tech stack for AI, vector database, architecture diagram, system components, embedding, retrieval, data pipeline, MLOps, LLMOps, evaluating approaches, RAG vs fine-tuning, designing an artificial intelligence solution, explaining a repository, explaining code, analyzing a project, what this repo does, how this system works, explain this project to me, or any variation of "what components do I need" or "explain how this works". Activate it when the user pastes code, a README, or a file structure, or mentions a GitHub repository to analyze. Also activate it when the user wants to design a new architecture.
Instrument Python LLM apps, build golden datasets, write eval-based tests, run them, and root-cause failures — covering the full eval-driven development cycle. Make sure to use this skill whenever a user is developing, testing, QA-ing, evaluating, or benchmarking a Python project that calls an LLM, even if they don't say "evals" explicitly. Use for making sure an AI app works correctly, catching regressions after prompt changes, debugging why an agent started behaving differently, or validating output quality before shipping.
Documentation reference for writing Python code using the browser-use open-source library. Use this skill whenever the user needs help with Agent, Browser, or Tools configuration, is writing code that imports from browser_use, asks about @sandbox deployment, supported LLM models, Actor API, custom tools, lifecycle hooks, MCP server setup, or monitoring/observability with Laminar or OpenLIT. Also trigger for questions about browser-use installation, prompting strategies, or sensitive data handling. Do NOT use this for Cloud API/SDK usage or pricing — use the cloud skill instead. Do NOT use this for directly automating a browser via CLI commands — use the browser-use skill instead.
Scaffolds a personal LLM Wiki from scratch — the Karpathy pattern of incrementally building a persistent, interlinked markdown knowledge base maintained by LLMs. Generates directory structure, schema file, index, log, and workflow conventions. Use when user says "create wiki", "new wiki", "bootstrap wiki", "llm wiki", "knowledge base", "start a wiki", "build a wiki", or wants to set up a structured markdown knowledge base for any domain.
Run metric-driven iterative optimization loops. Define a measurable goal, build measurement scaffolding, then run parallel experiments that try many approaches, measure each against hard gates and/or LLM-as-judge quality scores, keep improvements, and converge toward the best solution. Use when optimizing clustering quality, search relevance, build performance, prompt quality, or any measurable outcome that benefits from systematic experimentation. Inspired by Karpathy's autoresearch, generalized for multi-file code changes and non-ML domains.
Every product will be AI-powered. The question is whether you'll build it right or ship a demo that falls apart in production. This skill covers LLM integration patterns, RAG architecture, prompt engineering that scales, AI UX that users trust, and cost optimization that doesn't bankrupt you. Use when: keywords, file_patterns, code_patterns.
World-class prompt engineering skill for LLM optimization, prompt patterns, structured outputs, and AI product development. Expertise in Claude, GPT-4, prompt design patterns, few-shot learning, chain-of-thought, and AI evaluation. Includes RAG optimization, agent design, and LLM system architecture. Use when building AI products, optimizing LLM performance, designing agentic systems, or implementing advanced prompting techniques.
Accelerate LLM inference using speculative decoding, Medusa's multiple decoding heads, and lookahead decoding. Use when optimizing inference speed (1.5-3.6× speedup), reducing latency for real-time applications, or deploying models with limited compute. Covers draft models, tree-based attention, Jacobi iteration, parallel token generation, and production deployment strategies.