Loading...
Loading...
Found 48 Skills
Reference guide for permanent free-tier LLM APIs with rate limits, model lists, and OpenAI-compatible integration patterns.
Use to select models to run locally with llama.cpp and GGUF on CPU, Mac Metal, CUDA, or ROCm. Covers finding GGUFs, quant selection, running servers, exact GGUF file lookup, conversion, and OpenAI-compatible local serving.
Reference Documentation for Jiekou AI Model Services, covering LLM API (OpenAI-compatible), Image/Video/Audio APIs, integration solutions, authentication/billing/pricing/rate limiting, and troubleshooting. Suitable for questions like "How to integrate Jiekou AI into tools such as OpenAI SDK / LangChain?" and issues like Jiekou AI request failures.
Use this skill when working with the RTVI VLM or RT-VLM microservice API on VSS 3.1. Generate dense captions and alerts for stored video files and live RTSP streams via `/v1/generate_captions_alerts`; upload media via `/v1/files`; add and remove live streams with `/v1/streams/add` and `/v1/streams/delete/{stream_id}`; call OpenAI-compatible `/v1/chat/completions`; consume Kafka caption, incident, and error topics; or debug rtvi-vlm responses. For deployment, read `references/deploy-rt-vlm-service.md` first.
Summarize a video by calling the VLM NIM or the Long Video Summarization (LVS) microservice directly. For short videos (under 60s) call the VLM's OpenAI-compatible chat completions endpoint; for long videos (60s or longer) call the LVS microservice. Use when asked to summarize a video, describe what happens in a video, analyze a recording, call or debug LVS summarize/model/health/recommended-config/metrics endpoints, or configure and troubleshoot the LVS service that backs long-video summarization.
[QwenCloud] Generate text, have conversations, write code, reason, and call functions with Qwen models. TRIGGER when: user asks to chat with Qwen, generate text, write code with Qwen, use Qwen function calling, or explicitly invokes this skill by name (e.g. use qwencloud-text). DO NOT TRIGGER when: general coding questions without Qwen, non-Qwen AI model usage (OpenAI, Gemini, etc.), image/video understanding (use qwencloud-vision), image/video/audio generation.
Use when managing AI Hub account, API keys, balance, usage, or API endpoints. Use when user says "AI Hub", "add AI credits", "create API key", "check AI usage", "auto-recharge", "AI Hub endpoint", "AI Hub base URL", "how to use AI Hub API", "LLM API", "AI API", "OpenAI compatible", "Anthropic API", "GPT", "Claude", "Gemini", "DeepSeek", or "Grok" in the context of Zeabur.
Evaluates LLMs across 100+ benchmarks from 18+ harnesses (MMLU, HumanEval, GSM8K, safety, VLM) with multi-backend execution. Use when needing scalable evaluation on local Docker, Slurm HPC, or cloud platforms. NVIDIA's enterprise-grade platform with container-first architecture for reproducible benchmarking.
High-level map of the Venice.ai API - base URL, authentication modes, endpoint categories, response headers, pricing model, error shape, and versioning. Load this first when starting any Venice integration.
Use major AI models (Claude, ChatGPT, Gemini, DeepSeek, Qwen, etc.) without API tokens by leveraging browser authentication instead of paid API keys
Local proxy that lets OpenAI Codex CLI/desktop talk to MiMo, DeepSeek, and other LLMs via Responses API translation
Configure MeiGen plugin provider and API keys. Use this when the user runs /meigen:setup, asks to "configure meigen", "set up image generation", "add API key", or needs help configuring the plugin.