transcribe-video

Video Transcription Skill

Generate subtitles and transcripts from `$ARGUMENTS` (a video or audio file path, optionally followed by a language code like `en-US` or `es-ES`) using AWS Transcribe. Outputs `.srt`, `.vtt`, and `.txt` files next to the source file.
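
The argument format described above can be sketched as a small parser. This is only an illustration: the function name `parse_transcribe_args` and the variables `INPUT_FILE` and `LANGUAGE_CODE` are hypothetical, and the language check is a loose shape test (`xx-XX` or `xxx-XX`), not a list of languages AWS Transcribe supports.

```shell
# Hypothetical helper: split one argument string into a media path and an
# optional trailing language code, falling back to en-US when none is given.
parse_transcribe_args() {
  args=$1
  last=${args##* }   # text after the final space (the whole string if none)
  case $last in
    [a-z][a-z]-[A-Z][A-Z]|[a-z][a-z][a-z]-[A-Z][A-Z])
      # Only treat it as a language code if a path precedes it.
      if [ "$last" != "$args" ]; then
        LANGUAGE_CODE=$last
        INPUT_FILE=${args% *}   # everything before the final space
        return 0
      fi ;;
  esac
  LANGUAGE_CODE=en-US
  INPUT_FILE=$args
}

# Example:
#   parse_transcribe_args "/videos/my clip.mp4 es-ES"
#   sets INPUT_FILE="/videos/my clip.mp4" and LANGUAGE_CODE="es-ES"
```

Because only the last space-separated word is inspected, paths containing spaces stay intact.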

Process

1. Verify prerequisites: check that `ffmpeg` and the `aws` CLI are installed and configured
2. Extract audio from the video as MP3 using ffmpeg
3. Create temporary S3 bucket, upload audio
4. Run AWS Transcribe job with SRT and VTT subtitle output
5. Download results and generate plain text transcript
6. Clean up all AWS resources: delete S3 bucket, Transcribe job, and temp files. No recurring costs.

Prerequisites

- `ffmpeg` installed (`brew install ffmpeg`)
- `aws` CLI installed and configured with valid credentials (`brew install awscli && aws configure`)
- AWS credentials need permissions for: `s3:*` (create/delete buckets), `transcribe:*` (start/delete jobs)
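
A preflight check along these lines could catch a missing tool or invalid credentials before any AWS resource is created. It is a sketch, not part of the skill itself: the function name `check_prerequisites` is hypothetical, and `aws sts get-caller-identity` only confirms that the credentials are valid, not that they carry the S3 and Transcribe permissions listed above.

```shell
# Hypothetical preflight check. Returns 1 if a tool is missing and 2 if the
# aws CLI cannot authenticate with the configured credentials.
check_prerequisites() {
  for tool in ffmpeg aws; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing prerequisite: $tool" >&2
      return 1
    fi
  done
  # get-caller-identity fails when no valid credentials are configured.
  if ! aws sts get-caller-identity >/dev/null 2>&1; then
    echo "aws CLI is installed but its credentials are not valid" >&2
    return 2
  fi
}

# Example: check_prerequisites || exit 1
```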

Step-by-Step

Step 1: Extract audio

bash

ffmpeg -i "input.mp4" -vn -acodec mp3 -q:a 2 "/tmp/transcribe-audio.mp3" -y
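
If the input path comes from user arguments, it may help to fail early with a clear message when the file does not exist. The wrapper below is a minimal sketch around the same ffmpeg call; the name `extract_audio` is hypothetical.

```shell
# Hypothetical wrapper: validate the input path, then run the ffmpeg
# command from this step with the given input and output paths.
extract_audio() {
  input=$1
  output=$2
  if [ ! -f "$input" ]; then
    echo "input file not found: $input" >&2
    return 1
  fi
  # -vn drops the video stream; -y overwrites an existing output file.
  ffmpeg -i "$input" -vn -acodec mp3 -q:a 2 "$output" -y
}

# Example: extract_audio "input.mp4" "/tmp/transcribe-audio.mp3"
```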

Step 2: Create temp S3 bucket and upload

bash

BUCKET="tmp-transcribe-$(date +%s)"
aws s3 mb "s3://$BUCKET" --region us-east-1
aws s3 cp "/tmp/transcribe-audio.mp3" "s3://$BUCKET/audio.mp3"
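
S3 bucket names are shared across all AWS accounts, so a name built only from the current timestamp can occasionally collide. One possible variation, shown here as a sketch, appends a few random bytes; the function name `make_bucket_name` is hypothetical.

```shell
# Hypothetical helper: build a bucket name from the epoch time plus 8 random
# hex characters. The result is lowercase and well under S3's 63-character limit.
make_bucket_name() {
  suffix=$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')
  echo "tmp-transcribe-$(date +%s)-$suffix"
}

# Example: BUCKET=$(make_bucket_name)
```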

Step 3: Start transcription job

bash

JOB_NAME="tmp-job-$(date +%s)"
aws transcribe start-transcription-job \
  --transcription-job-name "$JOB_NAME" \
  --language-code en-US \
  --media-format mp3 \
  --media "MediaFileUri=s3://$BUCKET/audio.mp3" \
  --subtitles "Formats=srt,vtt" \
  --output-bucket-name "$BUCKET" \
  --region us-east-1

Language codes: `en-US`, `es-ES`, `fr-FR`, `de-DE`, `pt-BR`, `ja-JP`, `zh-CN`, `it-IT`, `ko-KR`, etc. Default to `en-US` if not specified.

Step 4: Poll until complete

bash

while true; do
  STATUS=$(aws transcribe get-transcription-job \
    --transcription-job-name "$JOB_NAME" \
    --region us-east-1 \
    --query 'TranscriptionJob.TranscriptionJobStatus' \
    --output text)
  if [ "$STATUS" = "COMPLETED" ] || [ "$STATUS" = "FAILED" ]; then break; fi
  sleep 5
done
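
The loop above exits on both `COMPLETED` and `FAILED` without telling them apart, and it never stops if the status call itself keeps failing. A variation that distinguishes the outcomes and gives up after a fixed number of polls might look like the sketch below. The function name `wait_for_job`, its return codes, and the default of 120 polls are assumptions; `JOB_NAME` is expected to be set as in Step 3.

```shell
# Hypothetical helper: poll until the job finishes.
# Returns 0 on COMPLETED, 1 on FAILED, 2 if max_polls is reached first.
wait_for_job() {
  max_polls=${1:-120}   # assumed default: 120 polls
  interval=${2:-5}      # seconds between polls, as in the loop above
  polls=0
  while [ "$polls" -lt "$max_polls" ]; do
    status=$(aws transcribe get-transcription-job \
      --transcription-job-name "$JOB_NAME" \
      --region us-east-1 \
      --query 'TranscriptionJob.TranscriptionJobStatus' \
      --output text)
    case $status in
      COMPLETED) return 0 ;;
      FAILED)
        # Print the reason reported by the service before giving up.
        aws transcribe get-transcription-job \
          --transcription-job-name "$JOB_NAME" \
          --region us-east-1 \
          --query 'TranscriptionJob.FailureReason' \
          --output text >&2
        return 1 ;;
    esac
    polls=$((polls + 1))
    sleep "$interval"
  done
  echo "timed out waiting for $JOB_NAME" >&2
  return 2
}
```

Checking the return code lets the caller skip the download steps after a failure while still running the cleanup in Step 7.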

Step 5: Download subtitle files

Save `.srt` and `.vtt` next to the original file:

bash

aws s3 cp "s3://$BUCKET/$JOB_NAME.srt" "/path/to/input.srt"
aws s3 cp "s3://$BUCKET/$JOB_NAME.vtt" "/path/to/input.vtt"
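
The `/path/to/input` placeholder can be derived from the source file by stripping its final extension. A minimal sketch, assuming a hypothetical helper named `output_base` and a hypothetical `INPUT_FILE` variable holding the source path:

```shell
# Hypothetical helper: print the input path without its final extension,
# so "$(output_base "$INPUT_FILE").srt" lands next to the source file.
output_base() {
  dir=$(dirname -- "$1")
  name=$(basename -- "$1")
  case $name in
    ?*.*) name=${name%.*} ;;   # strip the last extension; leave dotfiles alone
  esac
  echo "$dir/$name"
}

# Example:
#   OUT=$(output_base "$INPUT_FILE")
#   aws s3 cp "s3://$BUCKET/$JOB_NAME.srt" "$OUT.srt"
#   aws s3 cp "s3://$BUCKET/$JOB_NAME.vtt" "$OUT.vtt"
```

Splitting the directory from the file name first avoids mangling paths whose directories contain dots.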

Step 6: Generate plain text transcript

Download the JSON result and extract the full transcript text:

bash

aws s3 cp "s3://$BUCKET/$JOB_NAME.json" "/tmp/transcribe-result.json"

Then use a JSON tool such as `jq` to extract the `.results.transcripts[0].transcript` field from the JSON and save it as a `.txt` file next to the original.
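
One way to do the extraction, assuming `jq` is available (it is not among the prerequisites listed earlier, so this is only one option). The function name `extract_transcript` is hypothetical.

```shell
# Hypothetical helper: write the full transcript text from a Transcribe
# result JSON to a text file. -r prints the raw string without quotes;
# -e makes jq exit non-zero if the field is missing or null.
extract_transcript() {
  json=$1
  txt=$2
  jq -er '.results.transcripts[0].transcript' "$json" > "$txt"
}

# Example: extract_transcript "/tmp/transcribe-result.json" "/path/to/input.txt"
```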

Step 7: Clean up everything

IMPORTANT: Always clean up to avoid recurring S3 storage costs.

bash

# Delete S3 bucket and all contents
aws s3 rb "s3://$BUCKET" --force --region us-east-1

# Delete the transcription job
aws transcribe delete-transcription-job --transcription-job-name "$JOB_NAME" --region us-east-1

# Delete temp audio and JSON result files
rm -f "/tmp/transcribe-audio.mp3" "/tmp/transcribe-result.json"
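
When the steps run as a single script, the cleanup can be registered as an exit handler so that it still runs if an earlier step fails. This is a sketch under that assumption; the function name `cleanup` is hypothetical, and `BUCKET` and `JOB_NAME` are expected to be set by Steps 2 and 3.

```shell
# Best-effort cleanup: each step is skipped when its variable is empty,
# and a failing step does not prevent the remaining ones from running.
cleanup() {
  if [ -n "${BUCKET:-}" ]; then
    aws s3 rb "s3://$BUCKET" --force --region us-east-1 || true
  fi
  if [ -n "${JOB_NAME:-}" ]; then
    aws transcribe delete-transcription-job \
      --transcription-job-name "$JOB_NAME" --region us-east-1 || true
  fi
  rm -f "/tmp/transcribe-audio.mp3" "/tmp/transcribe-result.json"
}

# Run cleanup whenever the script exits, including after an error.
trap cleanup EXIT
```

Registering the trap before the bucket is created means a failure between Step 2 and Step 6 does not leave billable resources behind.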

Real-World Results (Reference)

From actual transcription runs:

| Video | Duration | Audio Size | Transcribe Time | Subtitle Segments |
|---|---|---|---|---|
| X/Twitter clip | 2:40 | 2.5 MB | ~20 seconds | 83 |
| Screen recording | 18:45 | 11.4 MB | ~60 seconds | 500+ |

Key Insights

- AWS Transcribe is fast: even a 19-minute video completes in about a minute
- Short-form content (tweets, reels) transcribes quickly: the 2:40 clip above took about 20 seconds
- Cost is negligible: AWS Transcribe charges ~$0.024/min, so a 19-min video costs ~$0.46
- Cleanup is critical: always delete the S3 bucket to avoid storage charges
- SRT is most compatible: it works with most video players and editors; VTT is better for web
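
The cost figure above is simple arithmetic: duration in minutes times the per-minute rate. A small sketch, using the approximate $0.024/min rate quoted in this section as an assumption (actual pricing may differ by region and usage tier); the function name `estimate_cost` is hypothetical.

```shell
# Hypothetical helper: estimate the Transcribe cost in USD for a duration
# given in seconds, at the approximate $0.024/min rate quoted above.
estimate_cost() {
  awk -v seconds="$1" 'BEGIN { printf "%.2f\n", seconds / 60 * 0.024 }'
}

# Example: estimate_cost 1140   # 19 minutes, prints 0.46
```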

Output Files

original-video.mp4
original-video.srt          # Subtitles with timestamps (most compatible)
original-video.vtt          # Web-optimized subtitles (for HTML5 <track>)
original-video.txt          # Plain text transcript (no timestamps)

After Transcription

1. Verify all output files exist: `ls -lh /path/to/original-video.{srt,vtt,txt}`
2. Report the number of subtitle segments and total duration
3. Confirm all AWS resources have been cleaned up (no S3 buckets, no Transcribe jobs remaining)
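
The reporting and confirmation steps above could be sketched as follows. The function names are hypothetical, `ffprobe` is assumed to be available alongside `ffmpeg`, and the leftover check assumes the `tmp-transcribe-` and `tmp-job-` name prefixes used in Steps 2 and 3.

```shell
# Count subtitle cues in an SRT file: each cue has exactly one timing line
# of the form "HH:MM:SS,mmm --> HH:MM:SS,mmm".
count_srt_segments() {
  grep -c -E '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} --> ' "$1"
}

# Print the media duration in seconds, as reported by ffprobe.
media_duration_seconds() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# List any temporary buckets or jobs still present; after a successful
# cleanup neither command is expected to list anything.
list_leftovers() {
  aws s3 ls | grep 'tmp-transcribe-' || true
  aws transcribe list-transcription-jobs --region us-east-1 \
    --job-name-contains 'tmp-job-' \
    --query 'TranscriptionJobSummaries[].TranscriptionJobName' --output text
}
```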
