LLM and ML model deployment for inference. Use when serving models in production, building AI APIs, or optimizing inference. Covers vLLM (LLM serving), TensorRT-LLM (GPU optimization), Ollama (local), BentoML (ML deployment), Triton (multi-model), LangChain (orchestration), LlamaIndex (RAG), and streaming patterns.
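A minimal sketch of one serving pattern this skill covers: vLLM's OpenAI-compatible server (started with `vllm serve <model>`) consumed from TypeScript via the `openai` client, with token streaming. The port, model name, and dummy API key are assumptions; adjust them to your deployment.

```typescript
import OpenAI from "openai";

// vLLM's OpenAI-compatible server defaults to http://localhost:8000/v1.
// The API key is unused unless the server was started with --api-key.
const client = new OpenAI({
  baseURL: "http://localhost:8000/v1",
  apiKey: "not-needed-for-local-vllm",
});

async function main() {
  const response = await client.chat.completions.create({
    model: "meta-llama/Llama-3.1-8B-Instruct", // whichever model the server loaded
    messages: [{ role: "user", content: "Summarize vLLM in one sentence." }],
    stream: true, // token streaming, one of the patterns the skill covers
  });
  for await (const chunk of response) {
    process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
  }
}

main();
```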
Use when "LangChain", "LLM chains", "ReAct agents", "tool calling", or asking about "RAG pipelines", "conversation memory", "document QA", "agent tools", "LangSmith"
Build LLM applications using Dify's visual workflow platform. Use when creating AI chatbots, implementing RAG pipelines, developing agents with tools, managing knowledge bases, deploying LLM apps, or building workflows with drag-and-drop. Supports hundreds of LLMs and deployment via Docker or Kubernetes.
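For orientation, a sketch of calling a published Dify app over its REST `chat-messages` endpoint. The base URL, environment variable, and user ID are assumptions for illustration; `response_mode` can be "blocking" or "streaming".

```typescript
// Placeholder base URL: use https://api.dify.ai/v1 or your self-hosted instance.
const DIFY_API_URL = "https://api.dify.ai/v1";
const DIFY_API_KEY = process.env.DIFY_API_KEY!;

async function askDify(query: string): Promise<string> {
  const res = await fetch(`${DIFY_API_URL}/chat-messages`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${DIFY_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: {},                // values for any variables the app defines
      query,                     // the end-user message
      response_mode: "blocking", // return the full answer in one response
      user: "demo-user",         // stable ID used for conversation tracking
    }),
  });
  const data = await res.json();
  return data.answer;
}

async function main() {
  console.log(await askDify("What does our refund policy say?"));
}

main();
```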
Use this skill when crafting LLM prompts, implementing chain-of-thought reasoning, designing few-shot examples, building RAG pipelines, or optimizing prompt performance. Triggers on prompt design, system prompts, few-shot learning, chain-of-thought, prompt chaining, RAG (retrieval-augmented generation), prompt templates, structured output, and any task requiring effective LLM interaction patterns.
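A small illustration of two of these patterns together: a hypothetical `buildFewShotPrompt` helper that interleaves few-shot examples with a chain-of-thought cue. All names and example data here are invented for demonstration.

```typescript
interface Example {
  input: string;
  reasoning: string;
  output: string;
}

// Assemble a few-shot prompt whose examples model the reasoning format,
// ending with a "think step by step" cue to elicit chain-of-thought.
function buildFewShotPrompt(task: string, examples: Example[], query: string): string {
  const shots = examples
    .map((ex) => `Input: ${ex.input}\nReasoning: ${ex.reasoning}\nOutput: ${ex.output}`)
    .join("\n\n");
  return `${task}\n\n${shots}\n\nInput: ${query}\nReasoning: Let's think step by step.`;
}

const prompt = buildFewShotPrompt(
  "Classify the sentiment of each review as positive or negative.",
  [
    {
      input: "The battery died after a week.",
      reasoning: "A product failing quickly is a complaint.",
      output: "negative",
    },
  ],
  "Setup took minutes and it just works."
);
console.log(prompt);
```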
Use this skill when working with Mastra - the TypeScript AI framework for building agents, workflows, tools, and AI-powered applications. Triggers on creating agents, defining workflows, configuring memory, RAG pipelines, MCP client/server setup, voice integration, evals/scorers, deployment, and Mastra CLI commands. Also triggers on "mastra dev", "mastra build", "mastra init", Mastra Studio, or any Mastra package imports.
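A minimal Mastra agent sketch, assuming the `@mastra/core` and `@ai-sdk/openai` packages and an OpenAI key in the environment; the agent name, instructions, and model choice are placeholders.

```typescript
import { Agent } from "@mastra/core/agent";
import { openai } from "@ai-sdk/openai";

// A minimal agent; configuration here is illustrative, not prescriptive.
const supportAgent = new Agent({
  name: "support-agent",
  instructions: "You answer product questions concisely.",
  model: openai("gpt-4o-mini"),
});

async function main() {
  const result = await supportAgent.generate("How do I reset my password?");
  console.log(result.text);
}

main();
```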
LLM app development with RAG, prompt engineering, vector databases, and AI agents.
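As a sketch of the RAG loop behind several of these skills: embed documents, retrieve the closest one by cosine similarity, and ground the model's answer in it. The toy in-memory store, document text, and OpenAI model names are assumptions, not a production design.

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Toy in-memory corpus standing in for a real vector database.
const docs = [
  "Our API rate limit is 100 requests per minute.",
  "Refunds are processed within 5 business days.",
];

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

async function embed(texts: string[]): Promise<number[][]> {
  const res = await client.embeddings.create({
    model: "text-embedding-3-small",
    input: texts,
  });
  return res.data.map((d) => d.embedding);
}

async function answer(question: string): Promise<string> {
  const [qVec, ...docVecs] = await embed([question, ...docs]);
  // Retrieve the single most similar document as context.
  const best = docVecs
    .map((v, i) => ({ i, score: cosine(qVec, v) }))
    .sort((a, b) => b.score - a.score)[0];
  const res = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: `Answer using only this context: ${docs[best.i]}` },
      { role: "user", content: question },
    ],
  });
  return res.choices[0].message.content ?? "";
}

async function main() {
  console.log(await answer("How fast are refunds?"));
}

main();
```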