Clean and reconstruct raw auto-generated captions (Zoom, YouTube, Teams, Google Meet, Otter.ai, etc.) into readable, coherent transcripts. Use when the user provides raw caption files (.txt, .vtt, .srt), meeting transcripts with timestamps and speaker tags, or asks to clean up/refine a transcript. Handles: timestamp removal, speaker tag normalization, filler word removal, broken sentence reconstruction, transcription error correction, paragraph formation. Preserves every piece of substantive content while removing noise. Trigger phrases: 'clean this transcript', 'refine captions', 'fix this transcript', 'process Zoom captions', 'clean up meeting notes'.
npx skill4agent add prakharmnnit/skills-and-personas transcribe-refiner

| Common Error | Likely Correct | Domain Clue |
|---|---|---|
| "lowest function" | "loss function" | AI/ML context |
| "wait" | "weight" | neural network context |
| "epic" | "epoch" | training context |
| "by Torch" | "PyTorch" | ML framework |
| "relaunch bowl" | "relaunch poll" | Zoom context |
| "solidity" vs "Solidity" | capitalize if Web3 | Web3 context |
| "know JS" | "Node.js" | WebDev context |
| "react" vs "React" | capitalize if framework | WebDev context |
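
The table above can be applied mechanically once a domain has been detected. The following is a minimal sketch, not the skill's actual implementation: the `CORRECTIONS` dictionary and the `apply_corrections` function are hypothetical names, and only a few unambiguous rows from the table are included. Entries such as "wait" → "weight" are left out on purpose, because a blanket replacement would damage ordinary uses of the word; those need a judgment call from context.

```python
import re

# Hypothetical lookup built from a few rows of the table above, keyed by domain.
CORRECTIONS = {
    "AI-ML": {
        "lowest function": "loss function",
        "epic": "epoch",
        "by Torch": "PyTorch",
    },
    "WebDev": {
        "know JS": "Node.js",
    },
}


def apply_corrections(text, domain):
    """Replace known mis-transcriptions for one domain.

    Returns the corrected text and a log of (original, corrected, count)
    tuples, which could feed a "Corrections Applied" report.
    """
    log = []
    for wrong, right in CORRECTIONS.get(domain, {}).items():
        # Whole-phrase, case-insensitive match so "Epic" is caught but "epicenter" is not.
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
        text, count = pattern.subn(right, text)
        if count:
            log.append((wrong, right, count))
    return text, log


cleaned, log = apply_corrections(
    "We minimise the lowest function for one epic using by Torch.", "AI-ML"
)
print(cleaned)
print(log)
```

Keying the lookup by domain keeps a correction like "epic" → "epoch" from firing in a transcript that is not about model training.
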
[unclear: "original text"]
[rishabh]
**Rishabh:**
**Student:**
**Instructor:**

| Format | Characteristics | Handling |
|---|---|---|
| Zoom captions (.txt) | | Strip timestamps, merge fragments |
| YouTube (.vtt/.srt) | Numbered blocks with timecodes | Strip timecodes and sequence numbers |
| Otter.ai | Speaker-labeled paragraphs | Normalize speaker labels |
| Teams | Timestamped speaker blocks | Strip timestamps, merge |
| Raw paste | Mixed format | Auto-detect and clean |
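
The "strip timecodes and sequence numbers" handling for .vtt/.srt input can be sketched as follows. This is an illustrative assumption about how such a step might look, not the skill's own code: `TIMECODE` and `strip_cues` are hypothetical names, and the sketch is deliberately simple (it also drops any caption line consisting only of digits, and any line starting with `WEBVTT`).

```python
import re

# Matches cue timing lines such as "00:00:01.000 --> 00:00:04.000" (VTT)
# or "00:00:01,000 --> 00:00:04,000" (SRT). The hours field is optional.
TIMECODE = re.compile(
    r"^(?:\d{2,}:)?\d{2}:\d{2}[.,]\d{3}\s+-->\s+(?:\d{2,}:)?\d{2}:\d{2}[.,]\d{3}"
)


def strip_cues(raw):
    """Drop headers, sequence numbers, and timecodes; merge caption text."""
    kept = []
    for line in raw.splitlines():
        s = line.strip()
        if not s:
            continue  # blank separator between cues
        if s.startswith("WEBVTT"):
            continue  # VTT file header
        if s.isdigit():
            continue  # SRT sequence number (simplification: also drops digit-only captions)
        if TIMECODE.match(s):
            continue  # cue timing line
        kept.append(s)
    # Join fragments into one running string for later sentence reconstruction.
    return " ".join(kept)


sample = """WEBVTT

1
00:00:01.000 --> 00:00:04.000
so today we will

2
00:00:04.000 --> 00:00:07.500
talk about epochs
"""
print(strip_cues(sample))  # so today we will talk about epochs
```
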
---
# Transcript: [Topic/Title if identifiable]
**Speaker(s):** [Name(s)]
**Estimated Duration:** [from timestamp range]
**Domain:** [Auto-detected: WebDev / AI-ML / Web3 / DSA / General]
**Cleaning Notes:** [e.g., "Fixed 12 transcription errors, removed ~45 filler instances"]
---
[Clean, flowing paragraphs organized by topic]
[Natural paragraph breaks at topic changes]
---
[Next topic section]
---
## Q&A Segments
**Student:** [Question]
**Instructor:** [Answer]

## Topic Inventory
### Concepts Mentioned
1. [Concept] - paragraph [N]
2. [Concept] - paragraph [N]
...
### Technical Terms Introduced
- [term]: first mentioned in paragraph [N]
...
### Code/Commands Referenced
- [code snippet or command] - paragraph [N]
...
### Questions Asked (Q&A)
- Q: [question summary] - paragraph [N]
...
### Names/Resources Mentioned
- [name, URL, tool, book, etc.]
...
### Corrections Applied
| Original Caption | Corrected To | Confidence |
|-----------------|-------------|------------|
| "lowest function" | "loss function" | High |
| "epic" | "epoch" | High |
| [unclear text] | [kept as-is] | Low |
### Stats
- Raw caption blocks: [N]
- Substantive paragraphs produced: [N]
- Filler instances removed: [N]
- Transcription errors corrected: [N]
- Uncertain corrections flagged: [N]

<!-- T:20:36:30 --> Neural network architecture introduction
<!-- T:20:45:12 --> Activation functions
<!-- T:21:03:45 --> Training loop

[unclear: ...]