# Transcribe Refiner - Caption Cleanup Engine
Transform raw auto-generated captions into clean, readable transcripts with zero content loss.
## Core Purpose
Auto-generated captions (Zoom, YouTube, Teams, etc.) are messy: fragmented sentences, timestamps everywhere, speaker tags on every line, filler words, transcription errors. This skill reconstructs them into coherent, flowing text that can be consumed by humans or downstream skills (like lecture-alchemist).
## Critical Rules
### Zero Content Loss
Every substantive statement, technical term, concept, question, and answer from the raw captions MUST appear in the output. Only noise is removed, never content.
**Remove:** Timestamps, redundant speaker tags, filler words (um, uh, basically, right?, you know), technical interruptions ("can you hear me?", "let me share my screen"), duplicate sentences from reconnection.

**Preserve:** Every teaching point, code reference, question asked, answer given, tangent with value, name, URL, command, or technical term.
### Smart Error Correction
Auto-captions make predictable errors. Fix them using domain context:
| Common Error | Likely Correct | Domain Clue |
|---|---|---|
| "lowest function" | "loss function" | AI/ML context |
| "wait" | "weight" | neural network context |
| "epic" | "epoch" | training context |
| "by Torch" | "PyTorch" | ML framework |
| "relaunch bowl" | "relaunch poll" | Zoom context |
| "solidity" vs "Solidity" | capitalize if Web3 | Web3 context |
| "know JS" | "Node.js" | WebDev context |
| "react" vs "React" | capitalize if framework | WebDev context |
When uncertain about a correction, keep the original and flag it: `[unclear: "original text"]`
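A minimal sketch of how such a correction pass could be applied in code. The misrecognitions and domain labels come from the table above; the `CORRECTIONS` dictionary and function names are illustrative, not part of the skill's defined interface:

```python
import re

# Per-domain correction maps built from the table above (illustrative subset).
CORRECTIONS = {
    "ai-ml": {"lowest function": "loss function", "epic": "epoch", "by torch": "PyTorch"},
    "webdev": {"know js": "Node.js"},
}

def apply_corrections(text: str, domain: str) -> str:
    """Replace known misrecognitions for the detected domain, case-insensitively."""
    for wrong, right in CORRECTIONS.get(domain, {}).items():
        # Word boundaries prevent rewriting substrings inside larger words.
        text = re.sub(r"\b" + re.escape(wrong) + r"\b", right, text, flags=re.IGNORECASE)
    return text

def flag_unclear(phrase: str) -> str:
    """Wrap a phrase we are not confident about instead of guessing."""
    return f'[unclear: "{phrase}"]'
```

Anything not covered by the map falls through to `flag_unclear`, matching the rule above: keep the original rather than invent a correction.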
### Speaker Handling
- Identify unique speakers from tags
- Normalize names (e.g., `[rishabh]` → **Rishabh:**)
- Only include speaker attribution at natural conversation changes
- For single-speaker lectures, omit speaker tags entirely after initial identification
- For Q&A, clearly mark **Student:** and **Instructor:**
## Input Formats
| Format | Characteristics | Handling |
|---|---|---|
| Zoom captions (.txt) | Timestamped lines with speaker tags | Strip timestamps, merge fragments |
| YouTube (.vtt/.srt) | Numbered blocks with timecodes | Strip timecodes and sequence numbers |
| Otter.ai | Speaker-labeled paragraphs | Normalize speaker labels |
| Teams | Timestamped speaker blocks | Strip timestamps, merge |
| Raw paste | Mixed format | Auto-detect and clean |
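For the .vtt/.srt row, the handling could look like the sketch below: drop the header, sequence numbers, and `00:00:01.000 --> 00:00:03.000` timecode lines, keeping only caption text. The regex is illustrative; real VTT files may also carry cue settings after the timecode, which this still skips because the match is anchored at line start:

```python
import re

# A cue timing line: "HH:MM:SS.mmm --> HH:MM:SS.mmm" (SRT uses commas instead of dots).
TIMECODE = re.compile(r"^\d{2}:\d{2}:\d{2}[.,]\d{3}\s+-->\s+\d{2}:\d{2}:\d{2}[.,]\d{3}")

def strip_vtt(raw: str) -> str:
    """Drop WEBVTT header, sequence numbers, and timecode lines; keep caption text."""
    kept = []
    for line in raw.splitlines():
        line = line.strip()
        if not line or line == "WEBVTT" or line.isdigit() or TIMECODE.match(line):
            continue
        kept.append(line)
    return " ".join(kept)
```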
## Processing Steps
- **Strip noise** - Remove timestamps, sequence numbers, formatting artifacts
- **Merge fragments** - Join broken sentences across caption blocks
- **Remove filler** - Strip "um", "uh", "basically", "right?", "you know" (but keep them when they carry meaning, like "right?" as a genuine question)
- **Fix transcription errors** - Use domain context to correct obvious misrecognitions
- **Remove technical interruptions** - "Can you hear me?", "Let me share my screen", "Is my screen visible?", connection issues
- **Form paragraphs** - Group related sentences into natural paragraphs by topic
- **Identify sections** - Insert `---` breaks at major topic transitions
- **Normalize Q&A** - Clearly separate questions from instruction
- **Add metadata header** - Speaker(s), estimated duration, domain detected
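The filler-removal and fragment-merging steps above can be sketched as follows. The word list comes from the steps; dropping only comma-delimited "you know"/"basically" asides is a rough heuristic for "keep fillers that carry meaning", not the full judgment the skill describes:

```python
import re

# Standalone "um"/"uh", or ", you know,"-style asides set off by commas.
FILLERS = re.compile(r"\b(um|uh)\b[,.]?\s*|,\s*(you know|basically)\s*,", flags=re.IGNORECASE)

def remove_filler(text: str) -> str:
    """Strip filler tokens, then collapse any doubled spaces left behind."""
    cleaned = FILLERS.sub(lambda m: ", " if m.group(2) else "", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()

def merge_fragments(blocks: list[str]) -> str:
    """Join caption blocks, continuing a sentence unless the previous block ended one."""
    out = ""
    for block in blocks:
        sep = " " if not out or out[-1] not in ".!?" else "\n"
        out = (out + sep + block.strip()).strip()
    return out
```

A sentence-final "right?" survives this pass because it is not in the comma-delimited pattern, which approximates the "genuine question" exception.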
## Output Format
```markdown
# Transcript: [Topic/Title if identifiable]

**Speaker(s):** [Name(s)]
**Estimated Duration:** [from timestamp range]
**Domain:** [Auto-detected: WebDev / AI-ML / Web3 / DSA / General]
**Cleaning Notes:** [e.g., "Fixed 12 transcription errors, removed ~45 filler instances"]

[Clean, flowing paragraphs organized by topic]

[Natural paragraph breaks at topic changes]

---

[Next topic section]

## Q&A Segments

**Student:** [Question]
**Instructor:** [Answer]
```
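Rendering the metadata header of that template from values collected during cleaning might look like this (the function name and parameter names are illustrative):

```python
def metadata_header(speakers: list[str], duration: str, domain: str, notes: str) -> str:
    """Render the metadata block of the output template above."""
    return (
        f"**Speaker(s):** {', '.join(speakers)}\n"
        f"**Estimated Duration:** {duration}\n"
        f"**Domain:** {domain}\n"
        f"**Cleaning Notes:** {notes}\n"
    )
```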
## Topic Inventory (Anti-Loss System)
This is the critical mechanism that prevents data loss across the pipeline. After cleaning, generate a Topic Inventory at the end of the output: a manifest of every substantive item found in the transcript.
```markdown
## Topic Inventory

### Concepts Mentioned
- [Concept] - paragraph [N]
- [Concept] - paragraph [N]
...

### Technical Terms Introduced
- [term]: first mentioned in paragraph [N]
...

### Code/Commands Referenced
- [code snippet or command] - paragraph [N]
...

### Questions Asked (Q&A)
- Q: [question summary] - paragraph [N]
...

### Names/Resources Mentioned
- [name, URL, tool, book, etc.]
...

### Corrections Applied
| Original Caption | Corrected To | Confidence |
|---|---|---|
| "lowest function" | "loss function" | High |
| "epic" | "epoch" | High |
| [unclear text] | [kept as-is] | Low |

### Stats
- Raw caption blocks: [N]
- Substantive paragraphs produced: [N]
- Filler instances removed: [N]
- Transcription errors corrected: [N]
- Uncertain corrections flagged: [N]
```

This inventory travels to the next stage (lecture-alchemist) for cross-verification. Every item in this inventory MUST appear in the final notes.
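The cross-verification step could be sketched as a simple containment check (the function name is illustrative; a real check might match on normalized terms rather than raw substrings):

```python
def verify_inventory(inventory_items: list[str], final_notes: str) -> list[str]:
    """Return inventory items missing from the final notes (case-insensitive containment)."""
    notes = final_notes.lower()
    return [item for item in inventory_items if item.lower() not in notes]
```

A downstream skill like lecture-alchemist would treat any non-empty return value as a content-loss failure and restore the missing items.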
## Timestamp Anchors
Preserve approximate timestamps as hidden anchors for key topic transitions. Format:
```markdown
<!-- T:20:36:30 --> Neural network architecture introduction
<!-- T:20:45:12 --> Activation functions
<!-- T:21:03:45 --> Training loop
```

These allow the reader to jump back to the recording at specific points.
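Generating such anchors from (timestamp, topic) pairs collected at topic transitions might look like this (the function name is illustrative; the comment format is the one shown above):

```python
def anchor_line(timestamp: str, topic: str) -> str:
    """Render a hidden HTML-comment anchor for a topic transition."""
    return f"<!-- T:{timestamp} --> {topic}"
```

Because the anchor is an HTML comment, it stays invisible in rendered markdown while remaining searchable in the source.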
## Quality Checklist
Before output, verify:
- Every teaching point from raw input is in the output
- Topic Inventory is complete and accurate
- Transcription errors corrected using domain context
- Uncertain corrections flagged with `[unclear: ...]`
- Filler words removed without losing meaning
- Sentences properly merged (no mid-word breaks)
- Q&A segments clearly separated
- Technical interruptions removed
- Timestamp anchors placed at topic transitions
- Output reads as natural, flowing text
## Pipeline Position
This skill is Stage 1 in the lecture processing pipeline:
1. **transcribe-refiner** (this) → clean transcript + Topic Inventory
2. **lecture-alchemist** → structured study notes (verifies against inventory)
3. **concept-cartographer** → visual diagrams (verifies against inventory)
4. **obsidian-markdown** → Obsidian vault formatting