Loading...
Loading...
Found 776 Skills
Expert in Langfuse - the open-source LLM observability platform. Covers tracing, prompt management, evaluation, datasets, and integration with LangChain, LlamaIndex, and OpenAI. Essential for debugging, monitoring, and improving LLM applications in production. Use when: langfuse, llm observability, llm tracing, prompt management, llm evaluation.
Expert prompt engineering for LLM applications including prompt design, optimization, RAG systems, agent architectures, and AI product development.
Quickly test and compare LLM models via OpenRouter. Find the fastest/cheapest model, compare response quality. Trigger words: openrouter, test model, compare models, find fastest model, find cheapest model
You are an expert prompt engineer specializing in crafting effective prompts for LLMs through advanced techniques including constitutional AI, chain-of-thought reasoning, and model-specific optimizati
Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval.
This skill should be used when the user asks to "refine a prompt", "optimize a prompt", "improve my prompt", "rewrite prompt for LLM", "craft a better prompt", or mentions prompt engineering, prompt optimization, or appending to PROMPT.md.
Access and interact with Large Language Models from the command line using Simon Willison's llm CLI tool. Supports OpenAI, Anthropic, Gemini, Llama, and dozens of other models via plugins. Features include chat sessions, embeddings, structured data extraction with schemas, prompt templates, conversation logging, and tool use. This skill is triggered when the user says things like "run a prompt with llm", "use the llm command", "call an LLM from the command line", "set up llm API keys", "install llm plugins", "create embeddings", or "extract structured data from text".
USE FOR RAG/LLM grounding. Returns pre-extracted web content (text, tables, code) optimized for LLMs. GET + POST. Adjust max_tokens/count based on complexity. Supports Goggles, local/POI. For AI answers use answers. Recommended for anyone building AI/agentic applications.
Build AI agents on Cloudflare Workers with MCP integration, tool use, and LLM providers.
Guidelines for creating high-quality datasets for LLM post-training (SFT/DPO/RLHF). Use when preparing data for fine-tuning, evaluating data quality, or designing data collection strategies.
Calculate training costs for Tinker fine-tuning jobs. Use when estimating costs for Tinker LLM training, counting tokens in datasets, or comparing Tinker model training prices. Tokenizes datasets using the correct model tokenizer and provides accurate cost estimates.
Use when evaluating LLMs, running benchmarks like MMLU/HumanEval/GSM8K, setting up evaluation pipelines, or asking about "NeMo Evaluator", "LLM benchmarking", "model evaluation", "MMLU", "HumanEval", "GSM8K", "benchmark harnesses"