Found 20 Skills
Interact with Langfuse and access its documentation. Use when needing to (1) query or modify Langfuse data programmatically via the CLI — traces, prompts, datasets, scores, sessions, and any other API resource, (2) look up Langfuse documentation, concepts, integration guides, or SDK usage, or (3) understand how any Langfuse feature works. This skill covers CLI-based API access (via npx) and multiple documentation retrieval methods.
Generate deep links to traces, spans, and sessions in the Arize UI. Use when the user wants a clickable URL to open a specific trace, span, or session.
Set up Sentry AI Agent Monitoring in any project. Use when asked to monitor LLM calls, track AI agents, or instrument OpenAI/Anthropic/Vercel AI/LangChain/Google GenAI. Detects installed AI SDKs and configures appropriate integrations.
Read production traces, identify what's failing, and build failure taxonomies using open coding and axial coding methodology. Use when debugging agent or pipeline quality, investigating "why are my outputs bad?", or before building any evaluator — error analysis must come first. Do NOT use when you already have identified failure modes and need evaluators (use build-evaluator) or datasets (use generate-synthetic-dataset).
Comprehensive LLM audit. Model currency, prompt quality, evals, observability, CI/CD. Ensures all LLM-powered features follow best practices and are properly instrumented. Auto-invoke when: model names/versions mentioned, AI provider config, prompt changes, .env with AI keys, aiProviders.ts or prompts.ts modified, AI-related PRs. CRITICAL: Training data lags months. ALWAYS web search before LLM decisions.
List Langfuse sessions. Use when checking user sessions, analyzing conversation flows, or monitoring session activity.
Integrate Portkey AI Gateway into TypeScript/JavaScript applications. Use when building LLM apps with observability, caching, fallbacks, load balancing, or routing across 200+ LLM providers.
Expert skill for using Future AGI — the open-source end-to-end platform for evaluating, observing, and improving LLM and AI agent applications with tracing, evals, simulations, datasets, gateway, and guardrails.
A module covering AI adoption strategy, Build vs Buy, prioritization, governance/security, and a 6-month scaling roadmap.
Instrument LLM applications with Langfuse tracing. Use when setting up Langfuse, adding observability to LLM calls, or auditing existing instrumentation.
Fetch, organize, and analyze LangSmith traces for debugging and evaluation. Use when you need to: query traces/runs by project, metadata, status, or time window; download traces to JSON; organize outcomes into passed/failed/error buckets; analyze token/message/tool-call patterns; compare passed vs failed behavior; or investigate benchmark and production failures.
Multi-agent systems with LangGraph - supervisor/swarm/handoff/router patterns, state coordination, Deep Agents, guardrails, testing, observability, deployment. Use when building multi-agent workflows, coordinating agents, or needing cost-optimized orchestration. Uses Claude, DeepSeek, Gemini (no OpenAI).