Loading...
Loading...
Found 1,573 Skills
Discover scientific equations from data using LLM-guided evolutionary search (LLM-SR). Multi-island algorithm with softmax-based cluster sampling, island reset, and LLM-proposed equation mutations. Use for symbolic regression and equation discovery.
LLM Wiki — persistent markdown knowledge base that compounds across sessions (Karpathy model)
Use when managing AI Hub account, API keys, balance, usage, or API endpoints. Use when user says "AI Hub", "add AI credits", "create API key", "check AI usage", "auto-recharge", "AI Hub endpoint", "AI Hub base URL", "how to use AI Hub API", "LLM API", "AI API", "OpenAI compatible", "Anthropic API", "GPT", "Claude", "Gemini", "DeepSeek", or "Grok" in the context of Zeabur.
This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment. Part of the context engineering skill suite — also activates when the user mentions "context engineering" or "context-engineering" in the context of evaluating LLM output quality.
Track LLM API costs in real-time across multiple providers. Monitor token usage, spending limits, budget alerts, and cost attribution per job or task.
Automatic LLM provider failover with fallback chains, inspired by OpenClaw/ZeroClaw model configuration.
Set up a new Obsidian knowledge base with the LLM Wiki pattern. Use when the user wants to create a wiki, second brain, personal knowledge base, initialize a vault, or says "onboard", "set up", "new wiki", or "new vault".
Run a decision through 5 AI advisors with different thinking styles, anonymous peer review, and chairman synthesis. For genuine decisions with stakes and tradeoffs — not simple questions. Based on Karpathy's LLM Council.
Scans lyrics for phrases that may match existing songs using web search and LLM knowledge. Use before release to check for unintentional borrowing.
Evaluates LLMs across 100+ benchmarks from 18+ harnesses (MMLU, HumanEval, GSM8K, safety, VLM) with multi-backend execution. Use when needing scalable evaluation on local Docker, Slurm HPC, or cloud platforms. NVIDIA's enterprise-grade platform with container-first architecture for reproducible benchmarking.
Quick single-paper lookup via AlphaXiv LLM-optimized summaries with tiered source fallback. Use when user says "explain this paper", "summarize paper", pastes an arXiv/AlphaXiv URL, or provides a bare arXiv ID for quick understanding - not for broad literature search.
Analyze code changes for security vulnerabilities using LLM reasoning and threat model patterns. Use for PR reviews, pre-commit checks, or branch comparisons.