# Podcast Content Editing

Generate transcript → Generate review draft (topic-level outline + sentence-level deletion suggestions) → User review → Execute editing
## Quick Start

- User: Help me cut the nonsense in the podcast
- User: content edit
- User: Generate a transcript and mark the content to be deleted
## Input

- Audio/video files
- (Optional) List of speaker names

## Output

- Review draft: a single file containing the content outline + full transcript + deletion marks
- After confirmation: editing is executed

Note: No separate transcript file is output; the review draft already includes the full transcript.
## Two Levels for Collaborative Use

| Level | Position | Granularity | Applicable Scenario |
|---|---|---|---|
| Topic-level | Part I of the review draft (content outline) | 5-30 minutes per block | Quick rough cut: delete entire off-topic/chit-chat segments |
| Sentence-level | Part IV of the review draft (main text) | Inline sentence marking | Fine adjustment: view context |
## Workflow
1. Transcribe audio (FunASR + sentence-level timestamps + speaker diarization)
↓
2. Silence detection (FFmpeg silencedetect, identify long blank segments)
↓
3. Generate transcript (with speaker labels)
↓
4. AI analysis: Identify topic structure + mark suggested deletions
↓
5. Output review draft (content outline + silence segments + deletion suggestions)
↓
【User directly modifies deletion marks on the review draft】
↓
6. Execute editing → /podcastcut-edit (parse deletion marks from review draft)
## Technical Notes

| Feature | Implementation |
|---|---|
| Transcription | FunASR (must use the full model path; see code below) |
| Timestamps | Sentence-level (returned automatically by FunASR) |
| Speaker diarization | FunASR's built-in CAM++ model |
| Silence detection | FFmpeg (threshold -40dB, minimum duration 3s) |
### ⚠️ Must Use the Script for Transcription

Do not write your own code; call the existing scripts directly:

```bash
# Transcribe (outputs podcast_transcript.json)
python ~/.claude/skills/podcastcut-content/scripts/transcribe.py <audio file> <output directory>

# Generate transcript (outputs podcast_逐字稿.md)
python ~/.claude/skills/podcastcut-content/scripts/generate_transcript.py \
    <transcript.json> <output.md> '{"0":"响歌歌","1":"麦雅","2":"安安"}'
```

Why can't you write your own code? FunASR needs the full model path plus all four models (ASR + VAD + Punc + Speaker) to return sentence-level timestamps and speaker IDs; simplified model names will cause transcription to fail.
For performance references and common issues, see `tips/转录最佳实践.md`.
Why sentence-level instead of character-level? Character-level timestamps combined with speaker diarization are unstable for long audio (OOM, alignment errors after segmentation). This Skill only deletes entire sentences; finer-grained deletions (half sentences, filler words) are left to /podcastcut-transcribe.
## Transcript Format

```markdown
**Maia** 00:05
Let's start.

**响歌歌** 00:06
Really?

**Maia** 00:08
Great, OK. Hello everyone, welcome to today's 5:15 Podcast, I'm host Maia. Today we're going to talk about something fun.

**响歌歌** 00:20
I'm host 响歌歌. Alright, let's get started.
```
## Format Rules

| Element | Format |
|---|---|
| Speaker | Name in bold |
| Timestamp | MM:SS or H:MM:SS (start time of the speaker's turn) |
| Content | Sentences from the same speaker are concatenated into one paragraph; no line break per sentence |
| Speaker change | Add a blank line |
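The rules above can be sketched as a small formatter. This is only a sketch: `fmt_ts` and `render_transcript` are hypothetical helpers, and the sentence objects are assumed to follow the sentence-level JSON format shown later in this document.

```python
def fmt_ts(seconds: float) -> str:
    """Format seconds as MM:SS, or H:MM:SS beyond one hour."""
    total = int(seconds)
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}" if h else f"{m:02d}:{s:02d}"


def render_transcript(sentences, names):
    """Render sentences as bold speaker + start time, with all sentences
    from the same speaker concatenated into one paragraph."""
    blocks = []  # each block: [name, start_time, [sentence texts...]]
    for s in sentences:
        name = names.get(str(s["spk"]), f"spk{s['spk']}")
        if blocks and blocks[-1][0] == name:
            blocks[-1][2].append(s["text"])  # same speaker: concatenate
        else:
            blocks.append([name, s["start"], [s["text"]]])
    # "".join (no separator) suits Chinese text; blank line between speakers
    return "\n\n".join(f"**{n}** {fmt_ts(t)}\n" + "".join(parts)
                       for n, t, parts in blocks)
```

The blank line between blocks implements the "speaker change" rule, and the single start timestamp per block implements "start time of the speaker's turn".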
## Review Draft Format

The review draft integrates the topic-level outline and sentence-level deletion suggestions, so users can complete the review in one file.

```markdown
# Podcast Review Draft

**File**: podcast.mp3
**Total Duration**: 2:08:07

---

## I. Content Outline (Topic-level)

| # | Topic | Time Range | Duration | AI Suggestion | Notes |
|---|------|------|------|--------|------|
| 1 | Opening small talk + technical debugging | 00:00 - 04:45 | 04:45 | 🗑️ Delete | Recording preparation, technical issues |
| 2 | Official opening + guest introduction | 04:45 - 07:00 | 02:15 | ✅ Keep | Official start of the podcast |
| 3 | Chit-chat: Guest background | 05:48 - 07:01 | 01:13 | 🗑️ Delete | Irrelevant to the topic |
| 4 | Topic discussion | 07:01 - 40:00 | 32:59 | ✅ Keep | Core content |
| 5 | Recording discussion (middle) | 49:59 - 51:13 | 01:14 | 🗑️ Delete | Discussion of editing matters |

**Statistics**: Suggested keep 2:03:21 | Suggested delete 08:12
**Operation**: `Delete topics 1, 3, 5` or `Keep only topics 2, 4`

---

## II. Silence Segments

| # | Time Range | Duration | Location |
|---|------|------|----------|
| 1 | 12:34 - 12:48 | 00:14 | Between Topic 2 and Topic 3 |
| 2 | 35:20 - 35:58 | 00:38 | Guest's thinking pause |
| 3 | 1:02:15 - 1:02:45 | 00:30 | Mid-recording disconnection/silence |

**Statistics**: 3 silence segments, total duration 01:22
**Operation**: `Delete all silence` or `Delete silence 1, 3` (keep 2, which may be an intentional pause)

---

## III. Statistics

- Total sentences: 3390
- Suggested deletions: 377 instances
- Silence segments: 3 instances (01:22)

### By Type

- Opening small talk: 31 instances
- Chit-chat - personal background: 23 instances
- Technical debugging: 15 instances
- Recording discussion: 6 instances
- Privacy - company name: 5 instances
- Privacy - location: 4 instances
- Privacy - school name: 3 instances

---

## IV. Main Text (Transcript + Deletion Marks)

**⚠️ Must include the full transcript, from the first sentence to the last, no content can be omitted!**

Incorrect practice: `(The following content is topic discussion, keep...)` ← Not allowed!
Correct practice: Output every sentence, whether or not it is marked for deletion.

The full transcript follows. Content suggested for deletion by AI is marked with ~~strikethrough~~ and the reason is noted. Sentences from the same speaker are concatenated; no line break per sentence.

**响歌歌** 00:00
~~Alright, right, you shouldn't hear any noise. Because I remember last time we used this there was noise, just like that psychological counseling session, um, right, this time it should be better.~~ `[Delete: Opening small talk]`

**麦雅** 00:23
~~I'll turn on this dog, okay~~ `[Delete: Opening small talk]`

**安安** 00:27
~~I'll also turn on my self-introduction.~~ `[Delete: Opening small talk]`

...

**麦雅** 04:50
Hello everyone, welcome to today's 5:15 Podcast, I'm host 麦雅.

**响歌歌** 04:58
I'm host 响歌歌.

**麦雅** 05:02
Today we have a special guest, 安安.

**安安** 05:08
Hello everyone, I'm 安安.

...

**安安** 15:32
~~When I worked at Google before~~ `[Delete: Privacy - company name]` When I worked before, I encountered a similar situation.

...

**麦雅** 49:59
~~Should we cut this segment?~~ `[Delete: Recording discussion]`

**响歌歌** 50:02
~~Hmm, let's check later.~~ `[Delete: Recording discussion]`

**安安** 50:05
~~I think we can keep it.~~ `[Delete: Recording discussion]`

...

(Full transcript continues...)
```
## Structure Description

| Section | Content | Purpose |
|---|---|---|
| I. Content Outline | Topic-level table | Quickly understand structure; delete entire blocks |
| II. Silence Segments | List of long blank segments | Delete silent stretches |
| III. Statistics | Deletion counts summarized by type | See the scale of deletions at a glance |
| IV. Main Text | Full transcript + inline deletion marks | View context; review sentence by sentence |
## Topic Identification Rules

| Topic Type | Identification Method |
|---|---|
| Opening small talk | Content before the official opening ("Hello everyone") |
| Official opening | Paragraph starting with "Hello/Hello everyone" |
| Chit-chat | Discussion of personal background irrelevant to the topic |
| Topic discussion | Core content around the podcast topic |
| Recording discussion | Paragraphs discussing editing or content selection |
| Closing | Concluding remarks like "Alright, that's it for today" |
## Deletion Types

### ⚠️ Division of Labor Principle

| Skill | Focus | Processed Content | Timestamp Granularity |
|---|---|---|---|
| /podcastcut-content (this Skill) | Content semantics | Opening chit-chat, off-topic, privacy, redundancy, long silence | Sentence-level |
| /podcastcut-transcribe | Verbal-error technicalities | Filler words, verbal errors, short pauses, half-sentence deletion | Character-level |

This Skill focuses on the content level: what to delete and what to keep is a semantic judgment, deleting entire sentences.
Verbal-error identification is the technical level: it requires finer-grained rules (repeated characters, pause patterns) and character-level timestamps.

Why this division of labor?
- Sentence-level transcription + speaker diarization = accurate speaker identification
- Character-level transcription + speaker diarization = prone to speaker misalignment (OOM on long audio, alignment difficulties after segmentation)
- First delete large segments (sentence-level), then process the remaining content (character-level)
### Content Deletion Types (Processed by This Skill)

| Type | Mark | Example |
|---|---|---|
| Opening small talk | `[Delete: Opening small talk]` | "Shall we start?" "Can you hear me?" |
| Closing chit-chat | `[Delete: Closing chit-chat]` | "Alright, that's it" "Bye" |
| Recording-related | `[Delete: Recording-related]` | "Re-record this segment" "Cut this later" |
| Off-topic content | `[Delete: Off-topic/chit-chat]` | Discussion irrelevant to the topic |
| Redundant repetition | `[Delete: Redundant repetition]` | Large segments repeating the same point |
| Privacy - company | `[Delete: Privacy - company name]` | "I work at Google" |
| Privacy - personal name | `[Delete: Privacy - personal name]` | "My colleague Zhang San said" |
| Privacy - location | `[Delete: Privacy - location]` | "I live in xxx" |
| Long silence | Listed separately in Section II of the review draft | Silence over 3 seconds |
### Verbal Error Deletion Types (Processed by /podcastcut-transcribe)

| Type | Description |
|---|---|
| Fillers/modal particles | "Um", "I mean", "Then", "Right right right" |
| Verbal errors | Misspoken words and their corrections |
| Short pauses | Small pauses within sentences (< 3 seconds) |

Note: Long silence (≥3 seconds) is processed by this Skill; short pauses are processed by /podcastcut-transcribe.

Why not process fillers here? Identifying fillers requires finer-grained rules (continuous repetition, pause patterns), which is a technical rather than content-semantic task.
## AI Analysis Method

### ⚠️ Must Use Claude for Semantic Analysis
Keyword matching is not sufficient! Rule-based methods cannot identify:
- Semantic off-topic/chit-chat (no obvious keywords)
- Chit-chat after guest introduction (where they live, graduation year, school conditions)
- Hidden recording discussions (no keywords like "cut")
Must use Claude to analyze the transcript in segments, prioritize quality.
### Analysis Workflow
1. Split the transcript into 15-minute segments
2. Send each segment to Claude for analysis, identify content suggested for deletion
3. Claude returns: sentence index + deletion type + reason
4. Merge results from all segments, generate review draft
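Step 1 above can be sketched as bucketing sentences by start time. This is only a sketch: `split_segments` is a hypothetical helper, and it assumes segment boundaries may fall at any sentence start.

```python
def split_segments(sentences, window=15 * 60):
    """Bucket sentences into consecutive ~15-minute windows by start time."""
    buckets = []
    for s in sentences:
        idx = int(s["start"] // window)  # which 15-minute window this sentence starts in
        while len(buckets) <= idx:
            buckets.append([])
        buckets[idx].append(s)
    return [b for b in buckets if b]  # drop empty windows (e.g. long silence)
```

Each bucket is then rendered as text and sent to Claude with the analysis prompt; per-segment results are merged by sentence index.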
### Claude Analysis Prompt

For each transcript segment, use the following prompt:

```
You are a podcast content review assistant. Analyze the following transcript and identify sentences suggested for deletion.

## Deletion Types

1. **Opening small talk**: Chit-chat and technical debugging before the official opening ("Hello everyone")
2. **Recording discussion**: Discussion of editing, recording status, technical issues, "Should we cut this segment"
3. **Privacy - company name**: Mention of specific company names (Google, Meta, ByteDance, etc.)
4. **Privacy - school name**: Mention of specific school names (Stanford, Tsinghua, etc.)
5. **Privacy - location**: Mention of specific locations (Palo Alto, Silicon Valley, etc.)
6. **Privacy - personal name**: Mention of specific personal names (non-public figures)
7. **Off-topic/chit-chat**: Discussion irrelevant to the podcast topic (personal background chit-chat, geographic discussion, etc.)
8. **Redundant repetition**: Repeating the same point multiple times, large segments of repetition

## Output Format

For each sentence suggested for deletion, output:
- Sentence timestamp
- Deletion type
- Reason (brief description)

Only mark sentences that need deletion; skip those that don't.

## Transcript

{transcript_segment}
```
### Detailed Deletion Type Explanations

| Type | Identification Key Points |
|---|---|
| Opening small talk | All content before the official opening, including technical debugging and chit-chat |
| Recording discussion | "Cut", "Did we record that?", "This is too sensitive", "Cut later" |
| Privacy information | Company names, school names, locations, personal names |
| Off-topic/chit-chat | Irrelevant to the topic: where they live, when they arrived, how the school is |
| Redundant repetition | The same meaning repeated more than 3 times |
### Chit-chat Detection Focus
Chit-chat after guest introduction is particularly easy to miss, watch for these signals:
- Sudden appearance of place names, school names, years
- "Which area do you live in" "When did you arrive" "How is it over there"
- Multiple consecutive sentences discussing non-topic content (geography, school, city comparison)
### Recording Discussion Detection Focus
Not just at the opening! May appear throughout the podcast:
- Technical issues: "Can you hear me", "Disconnected", "Headphone battery dead"
- Content concerns: "Too low-key", "Don't want to share", "Don't mention details"
- Editing discussion: "Cut later", "Should we keep this"
## Silence Detection Method

Use FFmpeg's `silencedetect` filter to detect long blank segments.

### Detection Command

```bash
ffmpeg -i video.mp4 -af "silencedetect=noise=-40dB:d=3" -f null - 2>&1 | grep silencedetect
```
| Parameter | Description | Recommended Value |
|---|---|---|
| `noise` | Silence threshold (volume below this counts as silence) | -40dB |
| `d` | Minimum silence duration (seconds) | 3 (content editing focuses on large blanks) |
### Output Parsing

```
[silencedetect @ 0x...] silence_start: 752.341
[silencedetect @ 0x...] silence_end: 766.512 | silence_duration: 14.171
```

Parse `silence_start` and `silence_end` to generate the list of silence segments.
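A minimal parser for this log might look like the following. This is a sketch: `parse_silence` is a hypothetical helper, and the regex assumes the exact log format shown above.

```python
import re


def parse_silence(log: str):
    """Pair silence_start/silence_end values from ffmpeg silencedetect output."""
    starts = [float(v) for v in re.findall(r"silence_start: ([\d.]+)", log)]
    ends = [float(v) for v in re.findall(r"silence_end: ([\d.]+)", log)]
    # ffmpeg emits start/end lines in order, so zipping pairs them up
    return [{"start": s, "end": e, "duration": round(e - s, 3)}
            for s, e in zip(starts, ends)]
```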
### Threshold Selection

| Scenario | noise | d (minimum duration) |
|---|---|---|
| Content editing (this Skill) | -40dB | 3 seconds |
| Verbal error identification (fine-grained) | -50dB | 0.5 seconds |

Why 3 seconds? Pauses shorter than 3 seconds may be natural thinking gaps and are not recommended for deletion.
## Output Files

```
podcast_transcript.json  # Sentence-level timestamps + speakers (used for editing)
podcast_审查稿.md         # Review draft (includes full transcript + deletion marks)
```

⚠️ Only the review draft is delivered; no separate transcript file. Section IV ("Main Text") of the review draft is the full transcript, so there is no need to output it separately.
## Sentence-level JSON Format

```json
{
  "file": "podcast.mp3",
  "duration": 3600.5,
  "sentences": [
    {"text": "Hello everyone,", "start": 0.50, "end": 1.20, "spk": 0},
    {"text": "Welcome to today's podcast.", "start": 1.20, "end": 2.80, "spk": 0},
    {"text": "I'm host Xiao Ming.", "start": 2.80, "end": 3.90, "spk": 1},
    ...
  ]
}
```
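Counting sentences per speaker ID in this file is a quick sanity check for the diarization problems described under Speaker Diarization (the same person split across several IDs). `spk_counts` is a hypothetical helper, not part of the existing scripts.

```python
import json
from collections import Counter


def spk_counts(path: str) -> Counter:
    """Count sentences per speaker ID in podcast_transcript.json."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return Counter(s["spk"] for s in data["sentences"])
```

A 3-person conversation that shows 4 IDs, one of them with very few sentences, is the "same person split into multiple IDs" symptom.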
## ⚠️ Review Draft and Deletion List Must Be Synchronized

Users may directly modify deletion marks in the review draft (adding or removing strikethrough), making any previously generated deletion list outdated.

Rules:
- The review draft is the authoritative source for user review
- Before executing editing, re-parse the deletion marks from the review draft
- Do not rely on a potentially outdated deletion list

Parsing method: Scan the text marked with `~~strikethrough~~` in the review draft and match it against the timestamps in transcript.json.
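A sketch of that parsing step, assuming deletion marks follow the ``~~text~~ `[Delete: reason]` `` format used in the review draft. Matching struck text back to transcript.json by exact sentence text is an assumption here; a real implementation might match more tolerantly.

```python
import re

# One strikethrough span followed by its reason tag
MARK = re.compile(r"~~(.+?)~~\s*`\[Delete: ([^\]]+)\]`")


def parse_marks(review_md: str):
    """Extract (struck text, reason) pairs from the review draft."""
    return MARK.findall(review_md)


def to_timestamps(marks, sentences):
    """Match struck text back to sentence timestamps in transcript.json."""
    cuts = []
    for text, reason in marks:
        for s in sentences:
            if s["text"] in text:  # a struck span may cover several sentences
                cuts.append({"start": s["start"], "end": s["end"], "reason": reason})
    return cuts
```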
## Relationship with Other Skills

```
/podcastcut-content    → Content editing (semantic level) ← This Skill
/podcastcut-edit       → Execute editing
/podcastcut-transcribe → Verbal error identification (technical level, optional)
/podcastcut-subtitle   → Generate subtitles
```
Recommended Workflow:
Original video
↓
/podcastcut-content ← Mark large segments (small talk, off-topic, redundancy, privacy)
↓
/podcastcut-edit ← Execute deletion, output v2
↓
【Optional】Need to process verbal errors?
↓ Yes
/podcastcut-transcribe ← Identify verbal errors, filler words, silence
↓
/podcastcut-edit ← Execute deletion, output v3
↓
Completed
Why delete content first then process verbal errors?
- After deleting large segments, the video becomes shorter
- Verbal error identification transcription is faster, review scope is smaller
- No need to process verbal errors in deleted large segments
## Speaker Diarization

FunASR has built-in speaker diarization (the CAM++ model) and automatically outputs speaker IDs.
### Workflow
FunASR transcription (enable spk_model)
↓
Output sentences with speaker IDs (speaker 0, speaker 1...)
↓
Search self-introduction phrases to confirm the real name corresponding to the ID
↓
Replace with real names when generating the review draft
### ⚠️ Speaker Mapping Confirmation Method

Do not directly use the order provided by the user! Search the transcription results for self-introduction phrases to confirm the mapping:

```python
# Search key phrases to determine the speaker mapping
key_phrases = ["I'm the host", "I'm xxx", "Hello everyone, I'm"]
for s in sentences:
    for phrase in key_phrases:
        if phrase in s['text']:
            print(f"spk{s['spk']}: {s['text']}")  # confirm which person each spk ID corresponds to
```
### Common Issues

| Issue | Cause | Solution |
|---|---|---|
| Same person split into multiple IDs | FunASR identification instability | Map multiple IDs to the same name |
| More IDs than actual speakers | As above | Merge redundant IDs based on self-introductions |
| Speaker IDs misaligned after segmented transcription | Each segment's IDs reset independently | Prefer full transcription; avoid segmentation |
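Once the mapping is confirmed, merging redundant IDs is a many-to-one relabeling. The mapping below is hypothetical (it assumes spk1 and spk3 were confirmed to be the same person via self-introductions), and `relabel` is a hypothetical helper.

```python
# Hypothetical mapping: spk1 and spk3 were confirmed (via self-introduction
# phrases) to be the same person, so both map to one name.
ID_TO_NAME = {0: "麦雅", 1: "响歌歌", 2: "安安", 3: "响歌歌"}


def relabel(sentences, id_to_name):
    """Replace numeric speaker IDs with confirmed names, merging duplicates."""
    return [{**s, "spk": id_to_name.get(s["spk"], f"spk{s['spk']}")}
            for s in sentences]
```

Unmapped IDs keep a `spkN` placeholder so that stray IDs remain visible during review.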
### Limitations

| Condition | Effect |
|---|---|
| 2-10 person conversation | Good |
| Audio < 30s | Degraded |
| More than 10 speakers | Degraded |
| Segmented transcription | Speaker IDs may be inconsistent |
## Progress TodoList
Create at startup:
- [ ] Transcribe audio (FunASR, sentence-level + speaker diarization)
- [ ] Silence detection (FFmpeg silencedetect)
- [ ] Generate transcript
- [ ] AI analysis: Identify topic structure + mark suggested deletions
- [ ] Output review draft (including silence segments)
- [ ] Wait for user confirmation
## Example Dialogue
User: Help me cut the nonsense in the podcast, speakers are Maia and 响歌歌
AI: Alright, I'll process this podcast.
1. Transcribing audio...
2. Detecting silence...
3. Generating transcript...
4. Analyzing content...
Review draft generated: podcast_审查稿.md
=== Content Outline (Topic-level) ===
| # | Topic | Duration | AI Suggestion |
|---|------|------|--------|
| 1 | Opening small talk | 04:45 | 🗑️ Delete |
| 2 | Official opening | 02:15 | ✅ Keep |
| 3 | Chit-chat: Guest background | 01:13 | 🗑️ Delete |
| 4 | Topic discussion | 32:59 | ✅ Keep |
=== Sentence-level Statistics (By Type) ===
- Opening small talk: 12 instances
- Recording discussion: 8 instances
- Privacy information: 5 instances
- Off-topic chit-chat: 3 instances
Please check the deletion marks in the review draft, and tell me to execute editing after adjustment.
User: [Added/removed some deletion marks in the review draft] Alright, cut according to the review draft
AI: Alright, parsing deletion marks from the review draft...
- Found 25 deletion marks
- Total deletion duration: 06:32
Executing editing...
## Feedback Records

### 2026-01-31 (Late Night)
- Speaker ID misalignment caused by segmented transcription: Speaker IDs reset independently for each segment during segmented transcription, resulting in different IDs for the same person after merging
- Cause: 2-hour audio split into 13 segments to avoid OOM, each segment's speaker ID starts from 0
- Solution: Prioritize full transcription (2-hour audio takes about 16 minutes, no OOM)
- Updated: Added "Segmented vs Full Transcription" section to tips/转录最佳实践.md
- FunASR may identify the same person as multiple IDs: 3-person conversation identified as 4 speaker IDs
- Performance: 响歌歌 was split into spk1 (60 sentences) and spk3 (560 sentences)
- Solution: Search self-introduction phrases ("I'm host xxx") to confirm mapping, merge multiple IDs
- Updated: Added confirmation method and common issues to the "Speaker Diarization" section of SKILL.md
- Updated performance data: 2-hour podcast tested to take 16 minutes, 3390 sentences (previous estimate of 12 minutes, 800 sentences was low)
- Updated: Performance reference table in tips/转录最佳实践.md
### 2026-01-31 (Evening)
- Incomplete review draft content: AI took a shortcut and wrote "(The following content is topic discussion, keep...)" instead of the full transcript
- Updated: Clearly marked in the "Section IV" of the review draft format that full content must be output, no omissions allowed
- Redundant transcript file output: Users only need one review draft (which already includes the full transcript)
- Updated: Removed from the output files section, clearly stated to only output the review draft
### 2026-01-31
- Deleted "User Confirmation Method" section: Users actually operate by directly modifying deletion marks in the review draft, no need for command-based operations
- Old workflow: Users input commands like "Delete topics 1, 3" "Delete all silence"
- Actual workflow: Users add/remove in the review draft, then say "Cut according to the review draft"
- Updated: Flowchart, example dialogue, removed command-based operation instructions
- FunASR call parameter error caused transcription failure: simplified model names cannot return the required results
- Incorrect: simplified model names
- Correct: must use the full model path + VAD + Punc + Speaker (four models)
- Cause: SKILL.md only listed simplified parameters; actual execution improvised and used an incorrect API
- Updated: Included the complete call code in SKILL.md, with incorrect and correct usage compared
- Lesson: Executable code must be written out in full in SKILL.md; do not write only parameter names and let AI assemble the call
### 2026-01-25
- Reverted to sentence-level timestamps: Character-level + speaker diarization is unstable for long audio
- Issue: After character-level transcription and merging speaker information, speaker alignment errors occurred ("I'm host 麦雅" was attributed to 响歌歌)
- Cause:
- Speaker diarization OOM for long audio (2-hour audio → 234MB WAV)
- Segmented speaker diarization returned 0 sentences (API format issue)
- Character-level transcription has no punctuation, sentence boundaries are unnatural
- Updated: Reverted to sentence-level timestamps, this Skill only deletes entire sentences
- Half-sentence deletion is left to /podcastcut-transcribe (character-level)
### 2026-01-24 (Evening)
- Attempted to upgrade to character-level timestamps: Solved the problem that sentence-level cannot accurately delete parts of sentences
- Issue: Deleting "Um, I can talk about why" would also delete the second half "This episode's guest was actually invited by 响歌歌"
- Attempt: Use 30s segmentation + `timestamp_granularity="character"` to obtain character-level timestamps
- Result: Character-level transcription succeeded, but speaker diarization failed, leading to speaker alignment errors
- Final decision: Revert to sentence-level (see 2026-01-25 feedback)
### 2026-01-24
- Added silence detection function: Use FFmpeg silencedetect to identify long blank segments (≥3 seconds), listed separately in the review draft for user confirmation to delete
- Review draft marked for deletion but not actually cut: entire sentences were marked for deletion in the review draft, but only part of the content appeared in the deletion list
- Case: the review draft struck out ~~Um, this is why specifically...~~, but the deletion list contained only part of that sentence
- Updated: Emphasized that the review draft and deletion list must stay synchronized; re-parse from the review draft before executing editing
### 2026-01-18 (Afternoon)
- Adjusted transcript/review draft format: Content from the same speaker is concatenated, no line breaks per sentence
- Original format: One line per sentence
- New format: All sentences from the same speaker are in one paragraph
- Advantage: More compact, better reading experience
### 2026-01-18
- Must use Claude for semantic analysis: Rule-based keyword matching is not sufficient in quality
- Issue: Cannot identify semantic off-topic/chit-chat, hidden recording discussions
- Updated: Added "AI Analysis Method" section, clearly stated that Claude must be used to analyze the transcript in segments
- Included: Analysis workflow, Claude prompt template, detailed deletion type explanations
### 2026-01-17 (Evening)

- Adjusted the format of Section II of the review draft: changed to full transcript + inline deletion marks
- Original format: grouped by topic → deletion suggestions listed under each topic (table form)
- New format: full transcript as the main text, deleted content marked inline with ~~strikethrough~~ and the reason
- Advantage: retains full context; users can see the surrounding text before deciding whether to delete
### 2026-01-17 (Afternoon)
- Large segment deletion not clean: Consecutive sentences marked for deletion, but sentence-by-sentence deletion retained blank spaces between sentences
- Cause: Each sentence processed independently, no merging of consecutive deletions with the same reason
- Updated: Clarified the division of labor; this Skill works at sentence-level, while filler words and verbal tics are processed by /podcastcut-transcribe
- Editing rules have been synchronized to /podcastcut-edit
- Do not mark filler words at sentence-level: Sentence-level timestamps are not precise enough, deleting filler words is easy to cause accidental deletion
- Updated: Divided deletion types into "sentence-level" and "character-level", clarified division of labor
### 2026-01-17 (Morning)
- Recording discussions and technical debugging may appear anywhere in the podcast, not just at the opening
- Updated: Recording-related detection changed to full-process detection, added technical issue keywords and continuous paragraph detection
- Chit-chat after guest introduction (where they live, when they arrived, how the school is) is easy to miss
- Updated: Added signal patterns (place names, school names, year-related conversations) to off-topic/chit-chat detection
- Sentence-by-sentence review is inefficient, users want to see the global structure and delete entire blocks
- Added: Review draft integrates topic-level outline and sentence-level deletion suggestions, complete review in one file