subtitle-generation

Subtitle Generation

Generate professional subtitles and captions for videos using each::sense. This skill produces accurate transcriptions, multi-language subtitles, and animated captions, and exports them in formats suited to social media, video production, and accessibility.

Features

  • Auto-Generated Subtitles: Automatic speech-to-text transcription with accurate timing
  • Multi-Language Generation: Generate subtitles in multiple languages from audio
  • Animated Captions: TikTok/Instagram-style animated word-by-word captions
  • SRT/VTT Export: Standard subtitle formats for editing and distribution
  • Speaker Diarization: Identify and label different speakers in conversations
  • Subtitle Translation: Translate existing subtitles to other languages
  • Burned-In Subtitles: Render subtitles directly into video (hardcoded)
  • Karaoke Style: Word-by-word highlighting for music and lyric videos
  • Timing Adjustment: Fine-tune subtitle timing and synchronization
  • Batch Processing: Generate subtitles for multiple videos at once

Quick Start

bash
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Generate subtitles for this video with accurate timestamps",
    "mode": "max",
    "file_urls": ["https://example.com/my-video.mp4"]
  }'

Subtitle Formats & Outputs

| Format | Extension | Use Case |
|---|---|---|
| SRT | .srt | Universal, most video players and editors |
| VTT | .vtt | Web video, HTML5 players, YouTube |
| Burned-In | .mp4 | Social media, no player support needed |
| JSON | .json | Custom applications, programmatic access |
| ASS/SSA | .ass | Advanced styling, anime subtitles |
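
The two plain-text formats differ mainly in syntax: a WebVTT file starts with a `WEBVTT` header line and uses a period before the milliseconds in each timestamp, where SRT uses a comma. As a rough sketch of that relationship, a simple SRT file can be converted locally as shown below. This is not part of each::sense; the function name `srt_to_vtt` is made up for this example, and it ignores styling tags and other edge cases.

```shell
# Hypothetical helper: read SRT on stdin, write WebVTT on stdout.
srt_to_vtt() {
  # A WebVTT file must begin with this header, followed by a blank line.
  printf 'WEBVTT\n\n'
  # Rewrite "HH:MM:SS,mmm --> HH:MM:SS,mmm" so the milliseconds follow a period.
  # Only timing lines match, so commas in the subtitle text are left alone.
  sed -E 's/^([0-9]{2}:[0-9]{2}:[0-9]{2}),([0-9]{3}) --> ([0-9]{2}:[0-9]{2}:[0-9]{2}),([0-9]{3})/\1.\2 --> \3.\4/'
}

# Example: convert a one-cue SRT document.
printf '1\n00:00:01,000 --> 00:00:02,500\nHello, world\n' | srt_to_vtt
```

Cue numbers are left in place; WebVTT accepts them as optional cue identifiers.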

Use Case Examples

1. Auto-Generate Subtitles from Video

Automatically transcribe speech from a video file with accurate word-level timestamps.
bash
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Transcribe this video and generate subtitles with accurate timestamps. Output as SRT format. The video contains English speech.",
    "mode": "max",
    "file_urls": ["https://example.com/interview-video.mp4"]
  }'

2. Multi-Language Subtitle Generation

Generate subtitles in multiple languages directly from the audio.
bash
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Generate subtitles for this video in English, Spanish, and French. Provide separate SRT files for each language. The original audio is in English.",
    "mode": "max",
    "file_urls": ["https://example.com/product-demo.mp4"]
  }'

3. Animated/Styled Captions (TikTok Style)

Create eye-catching animated captions with word-by-word highlighting, popular on TikTok and Instagram Reels.
bash
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Add TikTok-style animated captions to this video. Use bold white text with black outline, word-by-word pop animation, centered at the bottom third of the screen. Make it trendy and engaging.",
    "mode": "max",
    "file_urls": ["https://example.com/short-form-content.mp4"]
  }'

4. SRT/VTT Export

Generate clean subtitle files in standard formats for use in video editors or streaming platforms.
bash
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Transcribe this video and export subtitles in both SRT and VTT formats. Ensure proper line breaks (max 42 characters per line, 2 lines max). Include timestamps accurate to milliseconds.",
    "mode": "max",
    "file_urls": ["https://example.com/documentary.mp4"]
  }'

5. Speaker Diarization (Identify Speakers)

Generate subtitles that identify and label different speakers in conversations, interviews, or podcasts.
bash
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Generate subtitles for this podcast with speaker diarization. There are 2 speakers - identify them as Speaker 1 and Speaker 2 (or Host and Guest if you can determine roles). Format each line with the speaker label.",
    "mode": "max",
    "file_urls": ["https://example.com/podcast-episode.mp4"]
  }'

6. Subtitle Translation

Translate existing subtitles from one language to another while preserving timing.
bash
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Translate these English subtitles to Japanese. Preserve the original timing and format. Ensure natural Japanese phrasing rather than literal translation.",
    "mode": "max",
    "file_urls": ["https://example.com/original-subtitles.srt"]
  }'

7. Burned-In Subtitles

Render subtitles directly into the video file (hardcoded/embedded) so they appear without needing player support.
bash
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Generate subtitles for this video and burn them directly into the video. Use white text with black background box, Arial font, positioned at bottom center. Output a new video file with embedded subtitles.",
    "mode": "max",
    "file_urls": ["https://example.com/social-media-clip.mp4"]
  }'

8. Word-by-Word Karaoke Style

Create karaoke-style subtitles with word-by-word highlighting, perfect for music videos and lyric content.
bash
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Create karaoke-style subtitles for this music video. Display lyrics with word-by-word highlighting as they are sung. Use a gradient color change effect (from white to yellow) for the currently sung word. Center the text on screen.",
    "mode": "max",
    "file_urls": ["https://example.com/music-video.mp4"]
  }'

9. Subtitle Timing Adjustment

Fine-tune subtitle timing for better synchronization with audio.
bash
# First, upload video and generate initial subtitles
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Generate subtitles for this video",
    "mode": "max",
    "session_id": "subtitle-timing-project",
    "file_urls": ["https://example.com/video-with-delay.mp4"]
  }'

# Then adjust timing in the same session
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "The subtitles are appearing 500 milliseconds too early. Shift all subtitle timings forward by 500ms and regenerate the SRT file.",
    "session_id": "subtitle-timing-project"
  }'
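
For a fixed offset like this one, the same correction can also be applied locally to an SRT file you already have. The sketch below is an illustration only, not how each::sense performs the adjustment. `shift_srt` is a hypothetical helper that adds a millisecond offset (positive means later) to every cue and clamps negative times to zero.

```shell
# Hypothetical helper: shift every SRT timestamp on stdin by $1 milliseconds.
shift_srt() {
  awk -v off="$1" '
    # Convert "HH:MM:SS,mmm" to a millisecond count.
    function to_ms(t,   p) {
      split(t, p, "[:,]")
      return ((p[1] * 60 + p[2]) * 60 + p[3]) * 1000 + p[4]
    }
    # Convert a millisecond count back to "HH:MM:SS,mmm", clamping at zero.
    function fmt(ms,   h, m, s) {
      if (ms < 0) ms = 0
      h = int(ms / 3600000); ms -= h * 3600000
      m = int(ms / 60000);   ms -= m * 60000
      s = int(ms / 1000);    ms -= s * 1000
      return sprintf("%02d:%02d:%02d,%03d", h, m, s, ms)
    }
    # Timing lines look like "start --> end"; everything else passes through.
    / --> / { print fmt(to_ms($1) + off) " --> " fmt(to_ms($3) + off); next }
    { print }
  '
}

# Example: subtitles appear 500 ms too early, so move them 500 ms later.
printf '1\n00:00:01,000 --> 00:00:02,500\nHello\n' | shift_srt 500
```

A negative argument, such as `shift_srt -500`, moves cues earlier instead.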

10. Batch Subtitle Generation

Generate subtitles for multiple videos in a single workflow.
bash
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Generate English subtitles for all these videos. Output SRT files for each. Use consistent formatting across all videos: max 2 lines, 42 characters per line, minimum 1 second display time per subtitle.",
    "mode": "max",
    "file_urls": [
      "https://example.com/episode-01.mp4",
      "https://example.com/episode-02.mp4",
      "https://example.com/episode-03.mp4"
    ]
  }'

Best Practices

Transcription Quality

  • Clear Audio: Best results with clear speech and minimal background noise
  • Language Hint: Specify the source language for better accuracy
  • Speaker Count: Mention number of speakers for better diarization
  • Context: Provide context about the content (technical terms, names) for accuracy

Subtitle Formatting

  • Line Length: Keep lines under 42 characters for readability
  • Duration: Each subtitle should display for 1-7 seconds
  • Lines Per Subtitle: Maximum 2 lines per subtitle block
  • Reading Speed: Target 150-180 words per minute for comfortable reading
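
The line-length limit above is easy to check mechanically on a generated file. As a minimal sketch, the helper below reports every text line in an SRT file that exceeds a limit (42 by default) and returns a non-zero status if any are found. The name `check_srt_lines` and its output format are assumptions made for this example. Depending on the awk implementation, `length` may count bytes rather than characters for non-ASCII text.

```shell
# Hypothetical helper: flag SRT text lines on stdin longer than $1 (default 42).
check_srt_lines() {
  awk -v max="${1:-42}" '
    / --> / { next }                 # skip timing lines
    length($0) > max {
      printf "line %d: %d characters\n", NR, length($0)
      bad = 1
    }
    END { exit bad + 0 }             # status 1 if any line was too long
  '
}

# Example: the text line below is over the limit, so it is reported.
printf '1\n00:00:01,000 --> 00:00:04,000\nThis subtitle line is clearly much longer than the recommended limit\n' \
  | check_srt_lines || echo "some lines exceed the limit"
```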

Animated Captions

  • Font Choice: Bold, sans-serif fonts work best for short-form content
  • Contrast: Use outlines or shadows for visibility on any background
  • Position: Keep captions inside the safe zone, clear of platform UI elements
  • Animation: Subtle animations are more readable than dramatic effects

Translation

  • Cultural Adaptation: Request localization, not just translation
  • Timing Flexibility: Some languages need more time to read
  • Character Limits: CJK languages often need fewer characters per line

Prompt Tips for Subtitle Generation

When requesting subtitles, include these details in your prompt:
  1. Source Language: What language is spoken in the video?
  2. Target Format: SRT, VTT, burned-in, or animated?
  3. Style Requirements: Font, color, position, animation
  4. Speaker Info: Number of speakers, roles if known
  5. Special Terms: Technical vocabulary, names, brands
  6. Output Languages: Single language or multiple translations

Example Prompt Structure

"Generate [format] subtitles for this video.
Source language: [language].
[Number] speakers: [roles if known].
Style: [font, color, position requirements].
Special terms to recognize: [names, technical words].
Additional requirements: [line length, timing, etc.]"

Mode Selection

Ask your users before generating:
"Do you want fast & cheap, or high quality?"

| Mode | Best For | Speed | Quality |
|---|---|---|---|
| max | Final subtitles, professional content, accuracy-critical | Slower | Highest |
| eco | Quick drafts, review copies, bulk processing | Faster | Good |
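
One way to act on the user's answer is to map it to the `mode` field before sending the request. The sketch below only builds the JSON request body. `build_payload` is a hypothetical helper, and it assumes the message and URL contain no characters that need JSON escaping (such as double quotes or backslashes).

```shell
# Hypothetical helper: build a request body for a given mode, message, and file URL.
build_payload() {
  # Accept only the two modes listed in the table above.
  case "$1" in
    max|eco) ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
  printf '{"message": "%s", "mode": "%s", "file_urls": ["%s"]}\n' "$2" "$1" "$3"
}

# Example: a quick draft in eco mode.
build_payload eco "Generate draft subtitles for this video" "https://example.com/my-video.mp4"
```

The output can be passed to curl with `-d "$(build_payload ...)"` in place of the inline JSON used in the examples above.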

Multi-Turn Subtitle Refinement

Use session_id to iterate on subtitle generation:
bash
# Initial subtitle generation
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Generate subtitles for this video with speaker identification",
    "session_id": "subtitle-project-001",
    "file_urls": ["https://example.com/interview.mp4"]
  }'

# Refine based on feedback
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Change Speaker 1 label to John and Speaker 2 to Sarah. Also fix the spelling of TensorFlow wherever it appears.",
    "session_id": "subtitle-project-001"
  }'

# Add styling and export
curl -X POST https://sense.eachlabs.run/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Now create a burned-in version with the corrected subtitles. Use yellow text for John and cyan for Sarah.",
    "session_id": "subtitle-project-001"
  }'

Language Support

each::sense supports subtitle generation in 50+ languages including:

| Language | Code | Notes |
|---|---|---|
| English | en | US, UK, AU variants |
| Spanish | es | Latin American and European |
| French | fr | France and Canadian |
| German | de | |
| Japanese | ja | |
| Korean | ko | |
| Chinese | zh | Simplified and Traditional |
| Arabic | ar | RTL support |
| Hindi | hi | |
| Portuguese | pt | Brazilian and European |

Error Handling

| Error | Cause | Solution |
|---|---|---|
| Failed to create prediction: HTTP 422 | Insufficient balance | Top up at eachlabs.ai |
| Transcription quality low | Poor audio quality | Provide cleaner audio source |
| Language detection failed | Mixed languages or unclear speech | Specify source language explicitly |
| Timeout | Long video or complex processing | Set client timeout to minimum 10 minutes |

Client Configuration

Important: Subtitle generation can take significant time for long videos.
  • Minimum timeout: 10 minutes (600 seconds)
  • Recommended: Set timeout based on video length (2-3 minutes per minute of video)
  • Streaming: Use SSE event handling to show progress
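
As a rough sketch of the sizing rule above, the helper below takes the video length in seconds, allows 3 seconds of timeout per second of video (the upper end of the 2-3 minute guideline), and never goes below the 600-second minimum. The name `timeout_for` is made up for this example.

```shell
# Hypothetical helper: suggest a client timeout (seconds) for a video of $1 seconds.
timeout_for() {
  t=$(( $1 * 3 ))                 # 3 minutes of timeout per minute of video
  if [ "$t" -lt 600 ]; then
    t=600                         # enforce the 10-minute minimum
  fi
  echo "$t"
}

timeout_for 90      # short clip: falls back to the 600-second minimum
timeout_for 1200    # 20-minute video: 3600 seconds
```

With curl, the result can be passed to the standard `--max-time` option, and `--no-buffer` makes curl print SSE events as they arrive instead of buffering them.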

Related Skills

  • each-sense: Core API documentation
  • video-generation: Generate videos with built-in captions
  • voice-audio: Audio processing and speech synthesis
  • video-edit: Video editing and post-production