Usage:

    /mk-youtube-audio-transcribe <audio_file> [model] [language] [--force]

| Parameter | Required | Default | Description |
|---|---|---|---|
| audio_file | Yes | - | Path to audio file |
| model | No | auto | Model: auto, tiny, base, small, medium, large-v3, large-v3-turbo, belle-zh, kotoba-ja, kotoba-ja-q5 |
| language | No | auto | Language code: en, ja, zh, auto (auto-detect) |
| --force | No | false | Force re-transcribe even if cached file exists |

Examples:

    /mk-youtube-audio-transcribe /path/to/audio/video.m4a
    /mk-youtube-audio-transcribe video.m4a auto zh
    /mk-youtube-audio-transcribe video.m4a auto ja
    /mk-youtube-audio-transcribe audio.mp3 small en
    /mk-youtube-audio-transcribe podcast.wav medium ja

Script:

    {baseDir}/scripts/transcribe.sh "<audio_file>" "<model>" "<language>"

Output files:

    {baseDir}/data/<filename>.json
    {baseDir}/data/<filename>.txt

Flow:

    ┌─────────────────────────────┐
    │        transcribe.sh        │
    │  audio_file, [model], [lang]│
    └──────────────┬──────────────┘
                   │
                   ▼
    ┌─────────────────────────────┐
    │  ffmpeg: convert to WAV     │
    │  16kHz, mono, pcm_s16le     │
    └──────────────┬──────────────┘
                   │
                   ▼
    ┌─────────────────────────────┐
    │  whisper-cli: transcribe    │
    │  with Metal acceleration    │
    └──────────────┬──────────────┘
                   │
                   ▼
    ┌─────────────────────────────┐
    │  Save to files              │
    │  .json (full) + .txt        │
    └──────────────┬──────────────┘
                   │
                   ▼
    ┌─────────────────────────────┐
    │  Return file paths          │
    │  {file_path, text_file_path}│
    └─────────────────────────────┘
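
The ffmpeg step in the flow (16kHz, mono, pcm_s16le) can be sketched as a command builder. This is a minimal illustration, not the contents of `transcribe.sh`: the function name `build_ffmpeg_command` and the exact flag set are assumptions, chosen only because `-ar`, `-ac`, and `-c:a` are the standard ffmpeg options for sample rate, channel count, and audio codec.

```python
import shlex
from pathlib import Path


def build_ffmpeg_command(audio_file: str, wav_file: str) -> list[str]:
    """Return an ffmpeg argv that converts audio to 16 kHz mono 16-bit PCM WAV.

    Hypothetical helper: the real script may pass different or extra flags.
    """
    return [
        "ffmpeg",
        "-y",                  # overwrite the output file if it already exists
        "-i", audio_file,      # input audio in any format ffmpeg can decode
        "-ar", "16000",        # resample to 16 kHz
        "-ac", "1",            # downmix to a single (mono) channel
        "-c:a", "pcm_s16le",   # signed 16-bit little-endian PCM
        wav_file,
    ]


source = "video.m4a"
cmd = build_ffmpeg_command(source, str(Path(source).with_suffix(".wav")))
print(shlex.join(cmd))
```

Returning the argv as a list rather than a single string avoids shell quoting problems when the audio path contains spaces; the list can be handed to `subprocess.run` as-is.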

Success:

    {
      "status": "success",
      "file_path": "{baseDir}/data/20091025__VIDEO_ID.json",
      "text_file_path": "{baseDir}/data/20091025__VIDEO_ID.txt",
      "language": "en",
      "duration": "3:32",
      "model": "medium",
      "char_count": 12345,
      "line_count": 100,
      "text_char_count": 10000,
      "text_line_count": 50,
      "cached": false,
      "video_id": "dQw4w9WgXcQ",
      "title": "Video Title",
      "channel": "Channel Name",
      "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    }

Cache hit:

    {
      "status": "success",
      "file_path": "{baseDir}/data/20091025__VIDEO_ID.json",
      "cached": true,
      ...
    }

Error:

    {
      "status": "error",
      "message": "Error description"
    }

Unknown model:

    {
      "status": "error",
      "error_code": "UNKNOWN_MODEL",
      "message": "Unknown model: invalid-name",
      "available_models": ["tiny", "base", "small", "medium", "large-v3", "large-v3-turbo", "belle-zh", "kotoba-ja", "kotoba-ja-q5"]
    }

On `UNKNOWN_MODEL`, choose a model name from `available_models`.

Model not found:

    {
      "status": "error",
      "error_code": "MODEL_NOT_FOUND",
      "message": "Model 'medium' not found. Please download it first.",
      "model": "medium",
      "model_size": "1.4GB",
      "download_url": "https://huggingface.co/...",
      "download_command": "curl -L --progress-bar -o '/path/to/models/ggml-medium.bin' 'https://...' 2>&1"
    }

On `MODEL_NOT_FOUND`, run `download_command` with `timeout: 1800000`.

Model corrupted:

    {
      "status": "error",
      "error_code": "MODEL_CORRUPTED",
      "message": "Model 'medium' is corrupted or incomplete. Please re-download.",
      "model": "medium",
      "model_size": "1.4GB",
      "expected_sha256": "6c14d5adee5f86394037b4e4e8b59f1673b6cee10e3cf0b11bbdbee79c156208",
      "actual_sha256": "def456...",
      "model_path": "/path/to/models/ggml-medium.bin",
      "download_command": "rm '/path/to/models/ggml-medium.bin' && curl -L --progress-bar -o '/path/to/models/ggml-medium.bin' 'https://...' 2>&1"
    }

On `MODEL_CORRUPTED`, run `download_command` with `timeout: 1800000`.

| Field | Description |
|---|---|
| `file_path` | Absolute path to JSON file (with segments) |
| `text_file_path` | Absolute path to plain text file |
| `language` | Detected language code |
| `duration` | Audio duration |
| `model` | Model used for transcription |
| `char_count` | Character count of JSON file |
| `line_count` | Line count of JSON file |
| `text_char_count` | Character count of plain text file |
| `text_line_count` | Line count of plain text file |
| `video_id` | YouTube video ID (from centralized metadata store) |
| `title` | Video title (from centralized metadata store) |
| `channel` | Channel name (from centralized metadata store) |
| `url` | Full video URL (from centralized metadata store) |
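
A caller might branch on `status` and `error_code` roughly as follows. This is a sketch under assumptions: the function name `next_action` and the wording of the returned strings are made up for illustration, and only the keys shown in the response examples are relied on.

```python
import json


def next_action(response_text: str) -> str:
    """Map a transcribe response to a short description of what to do next.

    Hypothetical helper; it only reads keys that appear in the documented responses.
    """
    resp = json.loads(response_text)
    if resp.get("status") == "success":
        source = "cache" if resp.get("cached") else "fresh run"
        return f"use {resp['file_path']} ({source})"
    code = resp.get("error_code")
    if code == "UNKNOWN_MODEL":
        # The error lists the valid names, so suggest those.
        return "retry with one of: " + ", ".join(resp.get("available_models", []))
    if code in ("MODEL_NOT_FOUND", "MODEL_CORRUPTED"):
        # Both errors carry a ready-made download command.
        return "run: " + resp["download_command"]
    # Plain errors only have a message.
    return "report: " + resp.get("message", "unknown error")


sample = json.dumps({
    "status": "error",
    "error_code": "UNKNOWN_MODEL",
    "message": "Unknown model: invalid-name",
    "available_models": ["tiny", "base"],
})
print(next_action(sample))
```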

Filename format: `{YYYYMMDD}__{video_id}.{ext}`, for example `20091025__dQw4w9WgXcQ.json`.

JSON file format (at `file_path`):

    {
      "text": "Full transcription text...",
      "language": "en",
      "duration": "3:32",
      "model": "medium",
      "segments": [
        {
          "start": "00:00:00.000",
          "end": "00:00:05.000",
          "text": "First segment..."
        }
      ]
    }

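
The filename pattern and the segment timestamps can be unpacked with a few lines of Python. This is an illustrative sketch: `parse_filename` and `timestamp_to_seconds` are hypothetical names, and the code assumes the `{YYYYMMDD}__{video_id}.{ext}` and `HH:MM:SS.mmm` shapes shown in the examples hold for every file.

```python
import re
from pathlib import Path

# Eight digits, a double underscore, then the video ID (assumed layout).
FILENAME_RE = re.compile(r"^(?P<date>\d{8})__(?P<video_id>.+)$")


def parse_filename(path: str) -> dict[str, str]:
    """Split '{YYYYMMDD}__{video_id}.{ext}' into date, video_id and ext."""
    p = Path(path)
    match = FILENAME_RE.match(p.stem)
    if match is None:
        raise ValueError(f"unexpected filename: {p.name}")
    return {
        "date": match["date"],
        "video_id": match["video_id"],
        "ext": p.suffix.lstrip("."),
    }


def timestamp_to_seconds(ts: str) -> float:
    """Convert a segment timestamp such as '00:00:05.000' into seconds."""
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


parts = parse_filename("20091025__dQw4w9WgXcQ.json")
segment = {"start": "00:00:00.000", "end": "00:00:05.000", "text": "First segment..."}
length = timestamp_to_seconds(segment["end"]) - timestamp_to_seconds(segment["start"])
print(parts, length)
```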
| Model | Size | RAM | Speed | Accuracy |
|---|---|---|---|---|
| auto | - | - | - | Auto-select based on language (default) |
| tiny | 74MB | ~273MB | Fastest | Low |
| base | 141MB | ~388MB | Fast | Medium |
| small | 465MB | ~852MB | Moderate | Good |
| medium | 1.4GB | ~2.1GB | Slow | High |
| large-v3 | 2.9GB | ~3.9GB | Slowest | Best |
| large-v3-turbo | 1.5GB | ~2.1GB | Moderate | High (optimized for speed) |
| Model | Language | Size | Description |
|---|---|---|---|
| belle-zh | Chinese | 1.5GB | BELLE-2 Chinese-specialized model |
| kotoba-ja | Japanese | 1.4GB | kotoba-tech Japanese-specialized model |
| kotoba-ja-q5 | Japanese | 513MB | Quantized version (faster, smaller) |

With `model` set to `auto`, the model is selected by language:

| Language | Auto-Selected Model |
|---|---|
| zh | belle-zh (Chinese-specialized) |
| ja | kotoba-ja (Japanese-specialized) |
| others | medium (general purpose) |

Example: `/mk-youtube-audio-transcribe video.m4a auto zh` selects `belle-zh`.
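
The auto-selection table amounts to a small lookup with a default. The sketch below mirrors the table only; the names `AUTO_MODELS`, `DEFAULT_MODEL`, and `resolve_model` are hypothetical and not taken from `transcribe.sh`.

```python
# Language-specific models from the auto-selection table.
AUTO_MODELS = {"zh": "belle-zh", "ja": "kotoba-ja"}
# Every other language falls back to the general-purpose model.
DEFAULT_MODEL = "medium"


def resolve_model(model: str = "auto", language: str = "auto") -> str:
    """Return the model to run: an explicit choice wins, otherwise pick by language."""
    if model != "auto":
        return model
    return AUTO_MODELS.get(language, DEFAULT_MODEL)


print(resolve_model("auto", "zh"))
print(resolve_model("small", "zh"))
```

An explicit model name is passed through unchanged, so `small en` in the usage examples is unaffected by the lookup.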

Notes:

A cached result is returned with `cached: true`; pass `--force` to re-transcribe.

The transcript is written to `file_path` (JSON) and `text_file_path` (plain text).

On `MODEL_NOT_FOUND`, run `download_command` with `timeout: 1800000`.

Next step: pass `text_file_path` to `/mk-youtube-transcript-summarize`:

    /mk-youtube-transcript-summarize <text_file_path>