Found 1,564 Skills
Router skill for LLMQuant portfolio workflows. Use when the user needs company profiles, thesis tracking, theme research, watchlist monitoring, or alert management.
LLM-as-judge methodology for comparing code implementations across repositories. Scores implementations on functionality, security, test quality, overengineering, and dead code using weighted rubrics. Used by /beagle:llm-judge command.
Patterns and architectures for building AI agents and workflows with LLMs. Use when designing systems that involve tool use, multi-step reasoning, autonomous decision-making, or orchestration of LLM-driven tasks.
Use when the request mentions "LLM inference", "serving LLM", "vLLM", "llama.cpp", "GGUF", "text generation", "model serving", "inference optimization", "KV cache", "continuous batching", "speculative decoding", "local LLM", or "CPU inference".
Use when building an LLM-powered app that needs cost control via model routing, budget tracking, retry, and prompt caching.
Build LLM applications using Dify's visual workflow platform. Use when creating AI chatbots, implementing RAG pipelines, developing agents with tools, managing knowledge bases, deploying LLM apps, or building workflows with drag-and-drop. Supports hundreds of LLMs and Docker/Kubernetes deployment.
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
Enterprise LLM fine-tuning with LoRA, QLoRA, and other PEFT techniques.
Use when adding LangChain-based LLM routes or services in Python or Next.js stacks; pair with architect-stack-selector.
vLLM Ascend plugin for LLM inference serving on Huawei Ascend NPUs. Use for offline batch inference, API server deployment, quantization inference (with msmodelslim quantized models), tensor/pipeline parallelism for distributed serving, and OpenAI-compatible API endpoints. Supports Qwen, DeepSeek, GLM, and LLaMA models with Ascend-optimized kernels.
LLM fine-tuning expert for LoRA, QLoRA, dataset preparation, and training optimization.
A session continuity loop where the frog is disposable but the pad is not.