Search Results: dio

Found 1,612 Skills

AI & Machine Learning | developerpaxs/paxs-skills

paxs-api

Connect to the PAXS AI platform to create meetings, upload recordings, and generate transcriptions and meeting notes. Use this skill when a user wants to transcribe audio, create meeting notes, or interact with the PAXS platform.

1 script | Checked

Tools & Utilities | rameerez/claude-code-star...

transcribe-video

Generate subtitles (SRT/VTT) and plain text transcripts from video or audio files using AWS Transcribe. Use when creating captions, extracting spoken content, generating transcripts for notes, or making video content searchable.

AI & Machine Learning | daymade/claude-code-skill...

asr-transcribe-to-text

Transcribe audio and video files to text using a remote ASR service (Qwen3-ASR or OpenAI-compatible endpoint). Extracts audio from video, sends it to a configurable ASR endpoint, and outputs clean text. Use when the user wants to transcribe recordings, convert audio/video to text, do speech-to-text, or mentions ASR, Qwen ASR, 转录, 语音转文字, 录音转文字, or has a meeting recording, lecture, interview, or screen recording to transcribe.

1 script | Checked

AI & Machine Learning | cinience/alicloud-skills

aliyun-wan-digital-human

Use when generating talking, singing, or presentation videos from a single character image and audio with Alibaba Cloud Model Studio digital-human model `wan2.2-s2v`. Use when creating narrated avatar videos, singing portraits, or broadcast-style talking-head clips.

1 script | Checked

Testing & QA | cinience/alicloud-skills

alicloud-ai-search-multimodal-embedding-test

Minimal multimodal embedding smoke test for Model Studio VL embedding models.

AI & Machine Learning | cinience/alicloud-skills

aliyun-qwen-asr

Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.

1 script | Checked

AI & Machine Learning | minimax-ai/skills

minimax-music-gen

Use when the user wants to generate music, songs, or audio tracks. Triggers on phrases like "generate a song", "make music", "create a track", "写首歌", "生成音乐", "来一首歌", "帮我做首歌", "纯音乐", "cover", "唱一首", or any request involving music creation, song writing, lyrics generation, or audio production. Also triggers when the user provides lyrics and wants them turned into a song, or describes a mood/scene and wants background music. Even casual requests like "给我来点音乐" or "I want a chill beat" should trigger this skill. Do NOT use for playback of existing music files, music theory questions, or music recommendation without generation.

6 scripts | Attention

Marketing & Growth | sharadchaturveda-coder/ag...

agency-podcast-strategist

Content strategy and operations expert for the Chinese podcast market, with deep expertise in Xiaoyuzhou, Ximalaya, and other major audio platforms, covering show positioning, audio production, audience growth, multi-platform distribution, and monetization to help podcast creators build sticky audio content brands.

AI & Machine Learning | nexu-io/open-design

speech

Generate spoken audio from text using OpenAI's API with built-in voices. Useful for narrated explainers, lecture audio, and quick voiceover tracks.

AI & Machine Learning | rediumvex/ai-video-genera...

seedance-podcast-visual

Generate podcast clip visualization video prompts for Seedance 2.0 on Higgsfield. Use for podcast clip videos, audio-to-visual content, audiogram alternatives, podcast highlight reels, interview clip visuals, or any video that transforms audio content into engaging visual format. Triggers on podcast, audio clip, audiogram, interview clip, sound bite, audio visual, podcast video, episode highlight, podcast clip.

AI & Machine Learning | deepgram/skills

api

Deepgram API reference for speech-to-text, text-to-speech, voice agents, audio intelligence, and account management. Use whenever building with Deepgram APIs — REST or WebSocket. Covers authentication, all endpoints, query parameters, request/response schemas, and WebSocket message formats. Reference files are organized by domain: listen (STT), speak (TTS), agent (voice agents), read (text/audio intelligence), models, projects, auth, and self-hosted.

Code Quality | existential-birds/beagle

go-code-review

Reviews Go code for idiomatic patterns, error handling, concurrency safety, and common mistakes. Use when reviewing .go files, checking error handling, goroutine usage, or interface design.
