Total 50,658 skills, AI & Machine Learning has 8491 skills
Showing 12 of 8491 skills
Generate speech from text using Google Gemini TTS models via scripts/. Use for text-to-speech, audio generation, voice synthesis, multi-speaker conversations, and creating audio content. Supports multiple voices and streaming. Triggers on "text to speech", "TTS", "generate audio", "voice synthesis", "speak this text".
Meta-agent for creating new custom agents, skills, and MCP integrations. Expert in agent design, MCP development, skill architecture, and rapid prototyping. Activate on 'create agent', 'new skill', 'MCP server', 'custom tool', 'agent design'. NOT for using existing agents (invoke them directly), general coding (use language-specific skills), or infrastructure setup (use deployment-engineer).
模型自动降级与故障切换。当主模型请求失败、超时、达到速率限制或配额耗尽时,自动切换到备用模型,确保服务连续性。支持多供应商、多优先级的智能模型选择,提供健康监控、自动重试和错误恢复机制。
Guide for video analysis and frame-level event detection tasks using OpenCV and similar libraries. This skill should be used when detecting events in videos (jumps, movements, gestures), extracting frames, analyzing motion patterns, or implementing computer vision algorithms on video data. It provides verification strategies and helps avoid common pitfalls in video processing workflows.
Run Claude Code programmatically without interactive UI. Triggers on: headless, CLI automation, --print, output-format, stream-json, CI/CD, scripting.
Expert in designing and building autonomous AI agents. Masters tool use, memory systems, planning strategies, and multi-agent orchestration. Use when "build agent, AI agent, autonomous agent, tool use, function calling, multi-agent, agent memory, agent planning, langchain agent, crewai, autogen, claude agent sdk, ai-agents, langchain, autogen, crewai, tool-use, function-calling, autonomous, llm, orchestration" mentioned.
Generate and transcribe speech using Google's Gemini-TTS and Chirp 3 models. Supports Text-to-Speech (Single/Multi-speaker), Instant Custom Voice, and Speech-to-Text (Transcription/Diarization).
Generate AI images with DALL-E - create images from text descriptions, edit existing images, and manage creations
Train ML models with scikit-learn, PyTorch, TensorFlow. Use for classification/regression, neural networks, hyperparameter tuning, or encountering overfitting, underfitting, convergence issues.
Reviews and validates agent skills against best practices. Triggers on "review this skill", "check my skill", "validate skill", "is this skill well-written", or when creating/editing skills.
Production voice AI agents with sub-500ms latency. Groq LLM, Deepgram STT, Cartesia TTS, Twilio integration. No OpenAI. Use when: voice agent, phone bot, STT, TTS, Deepgram, Cartesia, Twilio, voice AI, speech to text, IVR, call center, voice latency.
Platform APIs for model management, pricing, and usage tracking