Loading...
Loading...
Found 1,065 Skills
Calculate training costs for Tinker fine-tuning jobs. Use when estimating costs for Tinker LLM training, counting tokens in datasets, or comparing Tinker model training prices. Tokenizes datasets using the correct model tokenizer and provides accurate cost estimates.
INVOKE THIS SKILL when building evaluation pipelines for LangSmith. Covers three core components: (1) Creating Evaluators - LLM-as-Judge, custom code; (2) Defining Run Functions - how to capture outputs and trajectories from your agent; (3) Running Evaluations - locally with evaluate() or auto-run via LangSmith. Uses the langsmith CLI tool.
Guides LLM agents through large-scale coding tasks using a spec-driven, phase-by-phase methodology covering requirement definition, planning, algorithm design, and implementation with OOP principles and language-specific coding standards. Use when starting a new software project, implementing a complex feature, refactoring existing code, or when you need a disciplined step-by-step approach to any non-trivial coding task.
Analyzes and generates llms.txt files -- the emerging standard for helping AI systems understand website structure and content. Can validate existing llms.txt files or generate new ones from scratch by crawling the site.
INVOKE THIS SKILL when creating, reading, updating, or deleting Arize AI integrations. Covers listing integrations, creating integrations for any supported LLM provider (OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Vertex AI, Gemini, NVIDIA NIM, custom), updating credentials or metadata, and deleting integrations using the ax CLI.
Attach judges to AI Config variations for automatic LLM-as-a-judge evaluation. Create custom judges, configure sampling rates, and monitor quality scores.
Provides AI and machine learning techniques for CTF challenges. Use when attacking ML models, crafting adversarial examples, performing model extraction, prompt injection, membership inference, training data poisoning, fine-tuning manipulation, neural network analysis, LoRA adapter exploitation, LLM jailbreaking, or solving AI-related puzzles.
Use when implementing speech-to-text, audio transcription, real-time streaming STT, audio intelligence features, or voice AI using AssemblyAI APIs or SDKs. Use when user mentions AssemblyAI, voice agents, transcription, speaker diarization, PII redaction of audio, LLM Gateway for audio understanding, or applying LLMs to transcripts. Also use when building voice agents with LiveKit or Pipecat that need speech-to-text, or when the user is working with any audio/video processing pipeline that could benefit from transcription, even if they don't mention AssemblyAI by name.
Use Crawl4AI for web crawling, markdown extraction, and LLM-powered structured extraction through OpenRouter. Use when the user mentions Crawl4AI, unclecode/crawl4ai, wants website data extracted with Crawl4AI, or needs an agent to crawl pages and turn them into structured JSON with OpenRouter-backed models.
Complete reference for the Galileo AI platform TypeScript/JS SDK for evaluating, observing, and protecting GenAI applications. Use when building Node.js or TypeScript applications that need LLM evaluation, production observability, tracing, or runtime guardrails with Galileo.
Migrate an application with hardcoded LLM prompts to a full LaunchDarkly AI Configs implementation in five stages: extract prompts, wrap in the AI SDK, add tools, add tracking, add evals/judges. Use when the user wants to externalize model/prompt configuration, move from direct provider calls (OpenAI, Anthropic, Bedrock, Gemini) to a managed AI Config, or stage a full hardcoded-to-LaunchDarkly migration.
Run cross-framework agent comparisons using evaluatorq from orqkit — compares any combination of agents (orq.ai, LangGraph, CrewAI, OpenAI Agents SDK, Vercel AI SDK) head-to-head on the same dataset with LLM-as-a-judge scoring. Use when comparing agents, benchmarking, or wanting side-by-side evaluation. Do NOT use when comparing only orq.ai configurations with no external agents (use run-experiment instead).