Found 103 Skills
Multimodal AI processing via the Google Gemini API (2M-token context window). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection, segmentation, visual Q&A), video (scene detection, 6hr max, YouTube URLs, temporal analysis), documents (PDF extraction, tables, forms, charts), image generation (text-to-image, editing). Actions: transcribe, analyze, extract, caption, detect, segment, generate from media. Keywords: Gemini API, audio transcription, image captioning, OCR, object detection, video analysis, PDF extraction, text-to-image, multimodal, speech recognition, visual Q&A, scene detection, YouTube transcription, table extraction, form processing, image generation, Imagen. Use when: transcribing audio/video, analyzing images/screenshots, extracting data from PDFs, processing YouTube videos, generating images from text, implementing multimodal AI features.
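Recordings longer than the 9.5-hour audio ceiling above have to be split before upload. A minimal sketch of the segmentation step, using only the standard library; the helper name and the 30-minute segment length are illustrative assumptions, not part of the Gemini API:

```python
# Split a long recording into segments that fit Gemini's audio limit.
# The 9.5-hour ceiling comes from the skill description; the helper
# name and 30-minute segment length are hypothetical choices.
GEMINI_AUDIO_LIMIT_S = 9.5 * 3600   # 9.5 hours, in seconds
SEGMENT_LENGTH_S = 30 * 60          # hypothetical per-chunk length

def plan_segments(duration_s: float) -> list[tuple[float, float]]:
    """Return (start, end) pairs covering the whole recording."""
    if duration_s <= GEMINI_AUDIO_LIMIT_S:
        return [(0.0, duration_s)]     # fits in one upload
    segments = []
    start = 0.0
    while start < duration_s:
        end = min(start + SEGMENT_LENGTH_S, duration_s)
        segments.append((start, end))
        start = end
    return segments
```

Each (start, end) pair would then be cut with a tool like ffmpeg and uploaded as its own request; per-segment transcripts are concatenated afterwards.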
Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High-quality transcription with multiple model sizes.
ML-powered karaoke app in Rust using Bevy, WhisperX, and Demucs for stem separation, lyrics transcription, and pitch scoring.
Connect to PAXS AI platform to create meetings, upload recordings, and generate transcriptions and meeting notes. Use this skill when a user wants to transcribe audio, create meeting notes, or interact with the PAXS platform.
Download videos from social media URLs (X/Twitter, YouTube, Instagram, TikTok, etc.) using yt-dlp. Use when saving a video locally, extracting content for transcription, or archiving video references.
Execute TwinMind primary workflow: Meeting transcription and summary generation. Use when implementing meeting capture, building transcription features, or automating meeting documentation. Trigger with phrases like "twinmind transcription workflow", "meeting transcription", "capture meeting with twinmind".
Transcribe audio/video using trx CLI and post-process results with agent corrections. Use when: (1) user wants to transcribe a video or audio file, (2) user shares a YouTube/Twitter/Instagram URL for transcription, (3) user says "transcribe", "subtitles", "srt", "transcript", (4) user wants to fix/clean up a whisper transcription, (5) user asks to extract text from a video.
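Writing a corrected transcript back out as subtitles means emitting SRT cues with millisecond timestamps. A minimal sketch of the format, pure standard library; trx's own output options may differ:

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT 'HH:MM:SS,mmm' timestamp."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Render (start, end, text) cues as an SRT document."""
    blocks = [
        f"{i}\n{srt_timestamp(a)} --> {srt_timestamp(b)}\n{text}\n"
        for i, (a, b, text) in enumerate(cues, start=1)
    ]
    return "\n".join(blocks)
```

Agent corrections (fixing misheard words, punctuation, speaker names) operate on the text field of each cue; the timestamps are preserved so the cleaned `.srt` stays in sync with the video.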
Summarize Hebrew tech lectures and meetings from transcription files into structured Markdown. Use when the user asks to summarize a transcription, create meeting notes, summarize a lecture/presentation in Hebrew, or mentions סיכום/תמלול/הרצאה (summary/transcription/lecture).
Jiminny platform help — conversation intelligence, revenue intelligence, AI notetaker, sales coaching, and automatic CRM logging. Use when setting up Jiminny call recording or transcription, configuring Jiminny CRM sync to Salesforce or HubSpot, connecting Jiminny to a dialer like Aircall or Dialpad, troubleshooting calls not appearing in Jiminny or tagging delays, pulling activity data from the Jiminny API, comparing Jiminny vs Gong pricing or features, or evaluating Jiminny for pipeline visibility. Do NOT use for general coaching program design (use /sales-coaching) or comparing standalone AI note-takers (use /sales-note-taker).
alfred_ (get-alfred.ai) platform help — AI executive assistant that autonomously triages email, drafts replies, extracts tasks, and manages calendar. Use when inbox is overwhelming and you need AI to triage email overnight, spending too much time on email replies and want AI drafting in your voice, need tasks auto-extracted from emails into a kanban board, want focus time protected on your calendar, need a daily brief of what matters today, or comparing alfred_ to Fyxer AI, Superhuman, Lindy, or Reclaim. Do NOT use for meeting recording or transcription (use /sales-note-taker) or meeting scheduling and booking pages (use /sales-meeting-scheduler).
Char (formerly Hyprnote) platform help — open-source, bot-free, local-first AI meeting notepad with system audio capture, markdown output, plugin SDK, and optional cloud STT/LLM (GPL-3.0). Use when setting up Char on macOS for the first time, speaker identification not working in group meetings, configuring local-only transcription with Cactus or Ollama for full offline use, choosing between Char's cloud STT providers (Deepgram, AssemblyAI, Soniox, OpenAI, etc.), app not launching or bouncing on dock without opening, telemetry concerns with PostHog or Sentry in a local-first app, building a Char plugin or using the automation hooks system, comparing Char to Granola or Meetily or Fathom for privacy, or configuring the CLI for template management. Do NOT use for picking between note-takers generally (use /sales-note-taker) or reviewing a single call for coaching (use /sales-call-review).
People.ai (now Backstory) platform help — automatic activity capture, deal intelligence, pipeline health, revenue forecasting, MCP integration, Salesforce/Dynamics/Oracle CRM sync. Use when reps aren't logging activities and CRM data is stale, deals are slipping without warning and you need early risk signals, forecast accuracy is poor because it's based on gut not data, evaluating People.ai vs Gong vs Clari vs Revenue.io for revenue intelligence, activity data isn't tying back to pipeline or revenue outcomes, or you want to connect People.ai to AI agents via MCP. Do NOT use for conversation intelligence with call recording and transcription (use /sales-gong or /sales-note-taker), building outbound sequences (use /sales-cadence), or general CRM data cleanup strategy (use /sales-data-hygiene).