transcribe-video

Video Transcription Skill

Generate subtitles and transcripts from $ARGUMENTS (a video or audio file path, optionally followed by a language code like en-US or es-ES) using AWS Transcribe. Outputs .srt, .vtt, and .txt files next to the source file.
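
The argument handling described above can be sketched as follows. This is a minimal sketch, not the skill's actual parser: the function name parse_arguments, the variables INPUT, LANG_CODE, and OUTPUT_BASE, and the language-code pattern are all assumptions made for illustration. It treats the last space-separated word as a language code only when it looks like one, so file paths containing spaces still work.

```bash
# Hypothetical helper: split "path [language-code]" into its parts.
# Sets INPUT, LANG_CODE, and OUTPUT_BASE as global variables.
parse_arguments() {
  local args="$1" last
  last="${args##* }"   # text after the final space (whole string if no space)
  # Assumed pattern: two or three lowercase letters, a hyphen, two uppercase letters.
  if [ "$last" != "$args" ] && printf '%s' "$last" | grep -Eq '^[a-z]{2,3}-[A-Z]{2}$'; then
    INPUT="${args% *}"  # everything before the final space
    LANG_CODE="$last"
  else
    INPUT="$args"
    LANG_CODE="en-US"   # default when no language code is given
  fi
  OUTPUT_BASE="${INPUT%.*}"  # input path without its extension
}
```

With OUTPUT_BASE set, the three output files would be "$OUTPUT_BASE.srt", "$OUTPUT_BASE.vtt", and "$OUTPUT_BASE.txt".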

Process

  1. Verify prerequisites - check that ffmpeg and the aws CLI are installed and configured
  2. Extract audio from the video as MP3 using ffmpeg
  3. Create temporary S3 bucket, upload audio
  4. Run AWS Transcribe job with SRT and VTT subtitle output
  5. Download results and generate plain text transcript
  6. Clean up all AWS resources - delete S3 bucket, Transcribe job, and temp files. No recurring costs.

Prerequisites

  • ffmpeg installed (brew install ffmpeg)
  • aws CLI installed and configured with valid credentials (brew install awscli && aws configure)
  • AWS credentials need permissions for: s3:* (create/delete buckets), transcribe:* (start/delete jobs)
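
These prerequisites can be checked before any AWS resources are created. The sketch below is one possible way to do it; the function names missing_tools and check_prerequisites are hypothetical. It relies on aws sts get-caller-identity, which fails when no valid credentials are configured, but it does not verify the specific S3 or Transcribe permissions.

```bash
# Print the name of every tool in the argument list that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Return non-zero if ffmpeg or aws is missing, or if AWS credentials do not work.
check_prerequisites() {
  local missing
  missing=$(missing_tools ffmpeg aws)
  if [ -n "$missing" ]; then
    echo "Missing required tools: $missing" >&2
    return 1
  fi
  # Only confirms that credentials are valid, not which permissions they carry.
  if ! aws sts get-caller-identity >/dev/null 2>&1; then
    echo "AWS credentials are missing or invalid; run: aws configure" >&2
    return 1
  fi
}
```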

Step-by-Step

Step 1: Extract audio

bash
ffmpeg -i "input.mp4" -vn -acodec mp3 -q:a 2 "/tmp/transcribe-audio.mp3" -y

Step 2: Create temp S3 bucket and upload

bash
BUCKET="tmp-transcribe-$(date +%s)"
aws s3 mb "s3://$BUCKET" --region us-east-1
aws s3 cp "/tmp/transcribe-audio.mp3" "s3://$BUCKET/audio.mp3"
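
S3 bucket names are globally unique, so a name built only from the current second can collide if two runs start at the same time. A minimal sketch of a slightly more collision-resistant name, assuming a hypothetical helper called make_bucket_name that appends the shell's process ID:

```bash
# Hypothetical helper: build a temporary bucket name from the epoch time and shell PID.
# The result uses only lowercase letters, digits, and hyphens, and stays well under 63 characters.
make_bucket_name() {
  printf 'tmp-transcribe-%s-%s\n' "$(date +%s)" "$$"
}
```

Usage would be BUCKET="$(make_bucket_name)" in place of the assignment above.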

Step 3: Start transcription job

bash
JOB_NAME="tmp-job-$(date +%s)"
aws transcribe start-transcription-job \
  --transcription-job-name "$JOB_NAME" \
  --language-code en-US \
  --media-format mp3 \
  --media "MediaFileUri=s3://$BUCKET/audio.mp3" \
  --subtitles "Formats=srt,vtt" \
  --output-bucket-name "$BUCKET" \
  --region us-east-1
Language codes: en-US, es-ES, fr-FR, de-DE, pt-BR, ja-JP, zh-CN, it-IT, ko-KR, etc. Default to en-US if not specified.
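
The command above hard-codes en-US. A sketch of the same call with the language code passed in and defaulting to en-US could look like this; the wrapper name start_job and its positional parameters are illustrative assumptions, while the flags are the ones already used above.

```bash
# Hypothetical wrapper: start_job JOB_NAME BUCKET [LANGUAGE_CODE]
start_job() {
  local job="$1" bucket="$2" lang="${3:-en-US}"  # fall back to en-US
  aws transcribe start-transcription-job \
    --transcription-job-name "$job" \
    --language-code "$lang" \
    --media-format mp3 \
    --media "MediaFileUri=s3://$bucket/audio.mp3" \
    --subtitles "Formats=srt,vtt" \
    --output-bucket-name "$bucket" \
    --region us-east-1
}
```

For example, start_job "$JOB_NAME" "$BUCKET" es-ES would request a Spanish transcription.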

Step 4: Poll until complete

bash
while true; do
  STATUS=$(aws transcribe get-transcription-job \
    --transcription-job-name "$JOB_NAME" \
    --region us-east-1 \
    --query 'TranscriptionJob.TranscriptionJobStatus' \
    --output text)
  if [ "$STATUS" = "COMPLETED" ] || [ "$STATUS" = "FAILED" ]; then break; fi
  sleep 5
done
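
The loop above exits on FAILED without reporting it and waits forever if the job never finishes. The following sketch adds a timeout and distinct exit codes. The function name wait_for_job, the default timeout of 1800 seconds, and the exit code convention are assumptions chosen for illustration.

```bash
# Hypothetical helper: wait_for_job JOB_NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 on COMPLETED, 1 on FAILED, 2 on timeout.
wait_for_job() {
  local job="$1" timeout="${2:-1800}" interval="${3:-5}" waited=0 status
  while [ "$waited" -lt "$timeout" ]; do
    status=$(aws transcribe get-transcription-job \
      --transcription-job-name "$job" \
      --region us-east-1 \
      --query 'TranscriptionJob.TranscriptionJobStatus' \
      --output text)
    case "$status" in
      COMPLETED) return 0 ;;
      FAILED)
        # Print the service's failure reason to stderr before giving up.
        aws transcribe get-transcription-job \
          --transcription-job-name "$job" \
          --region us-east-1 \
          --query 'TranscriptionJob.FailureReason' \
          --output text >&2
        return 1 ;;
    esac
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "Timed out after ${timeout}s waiting for job $job" >&2
  return 2
}
```

Checking the return value before Step 5 avoids trying to download subtitle files that a failed job never produced.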

Step 5: Download subtitle files

Save .srt and .vtt next to the original file:
bash
aws s3 cp "s3://$BUCKET/$JOB_NAME.srt" "/path/to/input.srt"
aws s3 cp "s3://$BUCKET/$JOB_NAME.vtt" "/path/to/input.vtt"

Step 6: Generate plain text transcript

Download the JSON result and extract the full transcript text:
bash
aws s3 cp "s3://$BUCKET/$JOB_NAME.json" "/tmp/transcribe-result.json"
Then use a tool to extract the .results.transcripts[0].transcript field from the JSON and save it as a .txt file next to the original.
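
The source does not name the extraction tool. One possible sketch, assuming either jq or python3 is available (neither is listed in the prerequisites), with a hypothetical helper named extract_transcript:

```bash
# Hypothetical helper: extract_transcript RESULT_JSON OUTPUT_TXT
# Uses jq when installed, otherwise falls back to python3.
extract_transcript() {
  local json="$1" out="$2"
  if command -v jq >/dev/null 2>&1; then
    # -r prints the raw string without JSON quoting.
    jq -r '.results.transcripts[0].transcript' "$json" > "$out"
  else
    python3 -c 'import json, sys
with open(sys.argv[1]) as f:
    print(json.load(f)["results"]["transcripts"][0]["transcript"])' "$json" > "$out"
  fi
}
```

For example, extract_transcript "/tmp/transcribe-result.json" "/path/to/input.txt" writes the transcript next to the original file.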

Step 7: Clean up everything

IMPORTANT: Always clean up to avoid recurring S3 storage costs.
bash
# Delete S3 bucket and all contents
aws s3 rb "s3://$BUCKET" --force --region us-east-1
# Delete the transcription job
aws transcribe delete-transcription-job --transcription-job-name "$JOB_NAME" --region us-east-1
# Delete temp audio file
rm -f "/tmp/transcribe-audio.mp3" "/tmp/transcribe-result.json"
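
If an earlier step fails partway through, the cleanup commands may never run. A common shell pattern is to wrap them in a function and register it with trap so it runs whenever the script exits. The sketch below assumes the whole workflow runs as a single script and that BUCKET and JOB_NAME are only set once the corresponding resource exists; the function name cleanup is illustrative.

```bash
# Best-effort cleanup: each step is skipped if its resource was never created,
# and a failure in one step does not stop the others.
cleanup() {
  if [ -n "${BUCKET:-}" ]; then
    aws s3 rb "s3://$BUCKET" --force --region us-east-1 || true
  fi
  if [ -n "${JOB_NAME:-}" ]; then
    aws transcribe delete-transcription-job \
      --transcription-job-name "$JOB_NAME" --region us-east-1 || true
  fi
  rm -f "/tmp/transcribe-audio.mp3" "/tmp/transcribe-result.json"
}

# Run cleanup when the script exits, whether it succeeded or failed.
trap cleanup EXIT
```

Registering the trap near the top of the script, before the bucket is created, covers failures in every later step.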

Real-World Results (Reference)

From actual transcription runs:
| Video | Duration | Audio Size | Transcribe Time | Subtitle Segments |
| --- | --- | --- | --- | --- |
| X/Twitter clip | 2:40 | 2.5 MB | ~20 seconds | 83 |
| Screen recording | 18:45 | 11.4 MB | ~60 seconds | 500+ |

Key Insights

  1. AWS Transcribe is fast - even 19-minute videos complete in about a minute
  2. Short-form content (tweets, reels) transcribes almost instantly
  3. Cost is negligible - AWS Transcribe charges ~$0.024/min, so a 19-min video costs ~$0.46
  4. Cleanup is critical - always delete the S3 bucket to avoid storage charges
  5. SRT is most compatible - works with most video players and editors; VTT is better for web

Output Files

original-video.mp4
original-video.srt          # Subtitles with timestamps (most compatible)
original-video.vtt          # Web-optimized subtitles (for HTML5 <track>)
original-video.txt          # Plain text transcript (no timestamps)

After Transcription

  1. Verify all output files exist: ls -lh /path/to/original-video.{srt,vtt,txt}
  2. Report the number of subtitle segments and total duration
  3. Confirm all AWS resources have been cleaned up (no S3 buckets, no Transcribe jobs remaining)
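
The segment count and total duration in step 2 can be read from the .srt file itself. This is a rough sketch with hypothetical helper names: it counts timing lines of the form "HH:MM:SS,mmm --> HH:MM:SS,mmm" and treats the end time of the last cue as the duration, which is an approximation if the media continues after the final spoken line.

```bash
# Number of subtitle cues: each cue has exactly one timing line.
srt_segment_count() {
  grep -Ec '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} --> ' "$1"
}

# End timestamp of the final cue, e.g. 00:18:45,120.
srt_end_time() {
  grep -E '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} --> ' "$1" \
    | tail -n 1 | tr -d '\r' | awk '{print $3}'
}
```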