Found 20 Skills
Expert in building voice AI applications - from real-time voice agents to voice-enabled apps. Covers OpenAI Realtime API, Vapi for voice agents, Deepgram for transcription, ElevenLabs for synthesis, LiveKit for real-time infrastructure, and WebRTC fundamentals. Knows how to build low-latency, production-ready voice experiences. Use when: voice ai, voice agent, speech to text, text to speech, realtime voice.
Build real-time conversational AI voice engines using async worker pipelines, streaming transcription, LLM agents, and TTS synthesis with interrupt handling and multi-provider support
Build voice-enabled AI applications with speech recognition, text-to-speech, and voice-based interactions. Supports multiple voice providers and real-time processing. Use when creating voice assistants, voice-controlled applications, audio interfaces, or hands-free AI systems.
Architecting real-time Voice AI agents.
Production voice AI agents with sub-500ms latency. Groq LLM, Deepgram STT, Cartesia TTS, Twilio integration. No OpenAI. Use when: voice agent, phone bot, STT, TTS, Deepgram, Cartesia, Twilio, voice AI, speech to text, IVR, call center, voice latency.
Integrate Shengwang products: ConvoAI voice agents, RTC audio/video, RTM messaging, Cloud Recording, and token generation. Use when the user mentions Shengwang, 声网, ConvoAI, RTC, RTM, voice agent, AI agent, video call, live streaming, recording, token, or any Shengwang product task.
Build real-time voice AI applications using Azure AI Voice Live SDK (azure-ai-voicelive). Use this skill when creating Python applications that need real-time bidirectional audio communication with Azure AI, including voice assistants, voice-enabled chatbots, real-time speech-to-speech translation, voice-driven avatars, or any WebSocket-based audio streaming with AI models. Supports Server VAD (Voice Activity Detection), turn-based conversation, function calling, MCP tools, avatar integration, and transcription.
Build voice AI agents with LiveKit Cloud and the Agents SDK. Use when the user asks to "build a voice agent", "create a LiveKit agent", "add voice AI", "implement handoffs", "structure agent workflows", or is working with LiveKit Agents SDK. Provides opinionated guidance for the recommended path: LiveKit Cloud + LiveKit Inference. REQUIRES writing tests for all implementations.
Build LiveKit Agent backends in TypeScript or JavaScript. Use this skill when creating voice AI agents, voice assistants, or any realtime AI application using LiveKit's Node.js Agents SDK (@livekit/agents-js). Covers AgentSession, Agent class, function tools with zod, STT/LLM/TTS models, turn detection, and realtime models.
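Applies vox.ai development best practices. Use when (1) designing, writing, or refactoring system prompts for Korean voice agents (templates, {{...}} variable injection, filler options, Character normalization, tool/silence actions, testing/operations), or (2) connecting the vox MCP server (https://mcp.tryvox.co/, Streamable HTTP, OAuth or API token) to ChatGPT, Claude Desktop, Claude Code, Cursor, OpenCode, Codex, VS Code Copilot, and similar clients.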
Build conversational AI voice agents with ElevenLabs Platform. Configure agents, tools, RAG knowledge bases, agent versioning with A/B testing, and MCP security. React, React Native, or Swift SDKs. Prevents 34 documented errors. Use when: building voice agents, AI phone systems, agent versioning/branching, MCP security, or troubleshooting deprecated @11labs packages, webhook errors, CSP violations, localhost allowlist, tool parsing errors.
Build conversational AI voice agents with ElevenLabs Platform using React, JavaScript, React Native, or Swift SDKs. Configure agents, tools (client/server/MCP), RAG knowledge bases, multi-voice, and Scribe real-time STT. Use when: building voice chat interfaces, implementing AI phone agents with Twilio, configuring agent workflows or tools, adding RAG knowledge bases, testing with CLI "agents as code", or troubleshooting deprecated @11labs packages, Android audio cutoff, CSP violations, dynamic variables, or WebRTC config. Keywords: ElevenLabs Agents, ElevenLabs voice agents, AI voice agents, conversational AI, @elevenlabs/react, @elevenlabs/client, @elevenlabs/react-native, @elevenlabs/elevenlabs-js, @elevenlabs/agents-cli, elevenlabs SDK, voice AI, TTS, text-to-speech, ASR, speech recognition, turn-taking model, WebRTC voice, WebSocket voice, ElevenLabs conversation, agent system prompt, agent tools, agent knowledge base, RAG voice agents, multi-voice agents, pronunciation dictionary, voice speed control, elevenlabs scribe, @11labs deprecated, Android audio cutoff, CSP violation elevenlabs, dynamic variables elevenlabs, case-sensitive tool names, webhook authentication