Loading...
Loading...
Found 1,564 Skills
Run LLMs and AI models on Cloudflare's GPU network with Workers AI. Includes Llama 4, Gemma 3, Mistral 3.1, Flux images, BGE embeddings, streaming, and AI Gateway. Handles 2025 breaking changes. Prevents 7 documented errors. Use when: implementing LLM inference, images, RAG, or troubleshooting AI_ERROR, rate limits, max_tokens, BGE pooling, context window, neuron billing, Miniflare AI binding, NSFW filter, num_steps.
Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from Artificial Analysis API, and running custom model evaluations with vLLM/lighteval. Works with the model-index metadata format.
Create PydanticAI agents with type-safe dependencies, structured outputs, and proper configuration. Use when building AI agents, creating chat systems, or integrating LLMs with Pydantic validation.
Configure LLM providers, use fallback models, handle streaming, and manage model settings in PydanticAI. Use when selecting models, implementing resilience, or optimizing API calls.
Perform 12-Factor Agents compliance analysis on any codebase. Use when evaluating agent architecture, reviewing LLM-powered systems, or auditing agentic applications against the 12-Factor methodology.
Configure LLM models and providers for Letta agents and servers. Use when setting model handles, adjusting temperature/tokens, configuring provider-specific settings, setting up BYOK providers, or configuring self-hosted deployments with environment variables.
Extract text from PDFs for LLM consumption. Use when processing PDFs for RAG, document analysis, or text extraction. Supports API services (Mistral OCR) and local tools (PyMuPDF, pdfplumber). Handles text-based PDFs, tables, and scanned documents with OCR.
Master context engineering for AI agent systems. Use when designing agent architectures, debugging context failures, optimizing token usage, implementing memory systems, building multi-agent coordination, evaluating agent performance, or developing LLM-powered pipelines. Covers context fundamentals, degradation patterns, optimization techniques (compaction, masking, caching), compression strategies, memory architectures, multi-agent patterns, LLM-as-Judge evaluation, tool design, and project development.
Expert guidance for deep learning, transformers, diffusion models, and LLM development with PyTorch, Transformers, Diffusers, and Gradio.
LangGraph tool calling patterns. Use when binding tools to LLMs, implementing ToolNode for execution, dynamic tool selection, or adding approval gates to tool calls.
Deploy GPU workloads to RunPod serverless and pods - vLLM endpoints, A100/H100 setup, scale-to-zero, cost optimization. Use when: deploy to RunPod, GPU serverless, vLLM endpoint, scale to zero, A100 deployment, H100 setup, serverless handler, GPU cost optimization.
Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, review experiments, and inspect datasets. Use when debugging AI/LLM applications, analyzing trace data, working with Phoenix observability, or investigating LLM performance issues.