Found 125 Skills
Char (formerly Hyprnote) platform help — open-source, bot-free, local-first AI meeting notepad with system audio capture, markdown output, plugin SDK, and optional cloud STT/LLM (GPL-3.0). Use when setting up Char on macOS for the first time, speaker identification not working in group meetings, configuring local-only transcription with Cactus or Ollama for full offline use, choosing between Char's cloud STT providers (Deepgram, AssemblyAI, Soniox, OpenAI, etc.), app not launching or bouncing on dock without opening, telemetry concerns with PostHog or Sentry in a local-first app, building a Char plugin or using the automation hooks system, comparing Char to Granola or Meetily or Fathom for privacy, or configuring the CLI for template management. Do NOT use for picking between note-takers generally (use /sales-note-taker) or reviewing a single call for coaching (use /sales-call-review).
OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.
Use when the user asks to convert, check, or enforce Chinese terminology, for example "使用繁體中文", "使用台灣用語", "轉換成台灣用語", "確保都是台灣用語", "統一台灣用語", "改成台灣用語", "用台灣的說法", "簡體轉繁體", "繁體轉簡體", "全部改成繁體", "轉成台灣繁體". Covers checking or enforcing Taiwan/Hong Kong/China terminology, Simplified/Traditional conversion, and phonetic transcription (Pinyin/Bopomofo).
Route audio, video, transcript, subtitle, and edit-prep requests into the right media-understanding workflow before execution. Use this when the user wants transcription, subtitle generation, beat mapping, B-roll planning, or edit-ready outputs and the first question is which skill and model chain should run.
Text-to-speech, speech-to-text, voice conversion, and audio processing using EachLabs AI models. Supports ElevenLabs TTS, Whisper transcription with diarization, and RVC voice conversion. Use when the user needs TTS, transcription, or voice conversion.
Read, watch, and listen to video/audio files. Extract key frames to "see" videos, and extract audio to "hear" them via Whisper transcription. Use when a user sends a video or audio file and asks about its content, such as what is in it or what someone said.
Extract, transcribe, and translate YouTube video transcripts using the YouTubeTranscript.dev V2 API. Supports captions, ASR audio transcription, batch processing (up to 100 videos), translation to 100+ languages, and multiple output formats. Use when working with YouTube videos, subtitles, captions, or video-to-text conversion.
Shadow platform help — bot-free AI meeting assistant capturing audio + screen on macOS, on-device transcription, autopilot meeting detection, AI summaries/action items/follow-up emails, Skills system for custom post-meeting tasks. Use when setting up Shadow for the first time, Shadow not detecting meetings automatically, Shadow using too much CPU or memory on Mac, Shadow speaker attribution is wrong, Shadow screen capture not working, Shadow free tier ran out of AI meetings, choosing between Shadow and Granola or Jamie or Bluedot for bot-free recording, or exporting Shadow notes to Markdown or Zapier. Do NOT use for choosing between all AI note-takers (use /sales-note-taker) or reviewing a call for coaching (use /sales-call-review).
Corrects speech-to-text transcription errors in meeting notes, lectures, and interviews using dictionary rules and AI. Learns patterns to build personalized correction databases. Use when working with transcripts containing ASR/STT errors, homophones, or Chinese/English mixed content requiring cleanup.
Uses a local FunASR service to transcribe audio or video files into timestamped Markdown files. Supports common formats such as mp4, mov, mp3, wav, and m4a. Use when users need speech-to-text conversion, meeting minutes, video subtitles, or podcast transcription.
@copilotkit/runtime — mount a fetch-native CopilotRuntime on any JS server, wire middleware, pick an AgentRunner, instantiate BuiltInAgent (Factory Mode with TanStack AI is the preferred default) or plug in any of 12 external agent frameworks (Mastra, LangGraph, CrewAI Crews/Flows, PydanticAI, ADK, LlamaIndex, Agno, AWS Strands, MS Agent Framework, AG2, A2A), enable Intelligence mode for durable threads + websocket, register server-side tools via defineTool, and wire voice transcription. Uses the fetch-based createCopilotRuntimeHandler primitive — the Express/Hono adapters are discouraged. Load the reference under references/ that matches your task.
Video understanding for any model — native passthrough for small files, frame extraction + audio transcription fallback for large files. Use when the user asks to analyze, describe, or understand a video file (e.g. "what's in this video", "summarize this clip", "transcribe this recording").