Loading...
Loading...
Found 1,564 Skills
Recipes and configs for serving LLMs locally on RTX 3090 GPUs using vLLM, llama.cpp, and SGLang with OpenAI-compatible API
Configure RuVLLM local inference with model selection, MicroLoRA fine-tuning, and SONA adaptation
Router skill for LLMQuant portfolio-lab workflows. Use when the user needs portfolio exposure maps, what-if simulations, scenario states, or virtual portfolio comparisons.
Help users build effective AI applications. Use when someone is building with LLMs, writing prompts, designing AI features, implementing RAG, creating agents, running evals, or trying to improve AI output quality.
Master fine-tuning of large language models for specific domains and tasks. Covers data preparation, training techniques, optimization strategies, and evaluation methods. Use when adapting models for specialized applications, reducing inference costs, or improving domain-specific performance.
Implement real-time streaming UI patterns for AI chat applications. Use when adding response lifecycle handlers, progress indicators, client effects, or thread state synchronization. Covers onResponseStart/End, onEffect, ProgressUpdateEvent, and client tools. NOT when building basic chat without real-time feedback.
LLM integration patterns for function calling, streaming responses, local inference with Ollama, and fine-tuning customization. Use when implementing tool use, SSE streaming, local model deployment, LoRA/QLoRA fine-tuning, or multi-provider LLM APIs.
Best practices for LLM-assisted coding. Declarative workflows, simplicity, tenacity.
Build, validate, and deploy LLM-as-Judge evaluators for automated quality assessment of LLM pipeline outputs. Use this skill whenever the user wants to: create an automated evaluator for subjective or nuanced failure modes, write a judge prompt for Pass/Fail assessment, split labeled data for judge development, measure judge alignment (TPR/TNR), estimate true success rates with bias correction, or set up CI evaluation pipelines. Also trigger when the user mentions "judge prompt", "automated eval", "LLM evaluator", "grading prompt", "alignment metrics", "true positive rate", or wants to move from manual trace review to automated evaluation. This skill covers the full lifecycle: prompt design → data splitting → iterative refinement → success rate estimation.
LLM app development with RAG, prompt engineering, vector databases, and AI agents
Monitor LLMs and agentic apps: performance, token/cost, response quality, and workflow orchestration. Use when the user asks about LLM monitoring, GenAI observability, or AI cost/quality.
Use Claude Code's full tool system with any OpenAI-compatible LLM — GPT-4o, DeepSeek, Gemini, Ollama, and 200+ models via environment variable configuration.