Found 1,066 Skills
DeepEval evaluation workflow for AI agents and LLM applications. TRIGGER when the user wants to evaluate or improve an AI agent, tool-using workflow, multi-turn chatbot, RAG pipeline, or LLM app; add evals; generate datasets or goldens; use deepeval generate; use deepeval test run; add tracing or @observe; send results to Confident AI; monitor production; run online evals; inspect traces; or iterate on prompts, tools, retrieval, or agent behavior from eval failures. AI agents are the primary use case. Covers Python SDK, pytest eval suites, CLI generation, tracing, Confident AI reporting, and agent-driven improvement loops. DO NOT TRIGGER for unrelated generic pytest, non-AI test setup, or non-DeepEval observability work unless the user asks to compare or migrate to DeepEval.
Comprehensive testing doctrine for software and AI systems — covers positive patterns, anti-patterns, gates for coding agents writing tests, CI discipline, and an LLM/agent evaluation primer. Use when authoring or reviewing tests, adding mocks, deciding test placement, generating tests via agents, debugging flaky CI, designing eval suites for LLM features, or rebuilding a brittle test suite. Contains 12 positive patterns (selector hierarchy, table-driven, builders, real-system gates), 25 anti-patterns across Brittleness, Flakiness, Mock-misuse, Process, and AI-specific families, 7 mandatory gates for agents writing tests, flaky-test taxonomy with quarantine workflow, contract / property / mutation testing patterns, and an oracle-ladder primer for LLM-as-judge and agent eval. Language-agnostic — pseudo-code only. Don't use for general code review, library-specific debugging unrelated to tests, non-testing CI pipeline design, or production observability.
Install and configure LLMem for an agent harness. Handles CLI install, plugin deployment, skill registration, and provider setup. Triggers on: "install llmem", "set up memory", "configure memory", "add llmem to harness", "memory setup".
Manage LLMem — structured memory system with SQLite-backed factual memory, semantic search, and background dreaming (decay, boost, promote, merge). Use when the user wants to: (1) add, search, update, or delete memories, (2) generate context for injection, (3) check memory stats, (4) run background consolidation/dream. Triggers on: "memory", "remember", "recall", "llmem", "memories", "forget", "consolidate memories", "dream".
Guide for using Microsoft MarkItDown - a Python utility for converting files to Markdown. Use when converting PDF, Word, PowerPoint, Excel, images, audio, HTML, CSV, JSON, XML, ZIP, YouTube URLs, EPubs, Jupyter notebooks, RSS feeds, or Wikipedia pages to Markdown format. Also use for document processing pipelines, LLM preprocessing, or text extraction tasks.
Return public original model architecture diagrams for user-specified LLM, VLM, MoE, diffusion, OCR, and SGLang/sgl-cookbook model families. Use when the user asks for a model structure chart, architecture diagram, or rendered image link for a specific model such as DeepSeek, GLM, Qwen, Kimi, MiniMax, Step, Hunyuan, or Qwen3-VL.
Reference documentation for Jiekou AI model services, covering the LLM API (OpenAI-compatible), Image/Video/Audio APIs, integration solutions, authentication, billing, pricing, rate limiting, and troubleshooting. Suitable for questions such as "How do I integrate Jiekou AI with tools such as the OpenAI SDK or LangChain?" and for issues such as Jiekou AI request failures.
Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).
Run Claude Code CLI, VS Code, or JetBrains ACP through a local proxy that routes to NVIDIA NIM, Kimi, OpenRouter, DeepSeek, or local LLMs
Use ARIS (Auto-Research-In-Sleep) for autonomous ML research — idea generation, paper review, experiment automation, and cross-model collaboration with Claude Code, Codex, or any LLM agent.
Chinese public opinion analytics platform that aggregates 26 trending lists from 15 platforms and adds LLM-powered sentiment analysis, topic clustering, and multi-channel push alerts
Browser automation MCP server using Playwright's accessibility tree for LLM-friendly web interaction