# Forensic Audio Research: Audio Voice Recovery Best Practices
Comprehensive audio forensics and voice recovery guide providing CSI-level capabilities for recovering voice from low-quality, low-volume, or damaged audio recordings. Contains 45 rules across 8 categories, prioritized by impact to guide audio enhancement, forensic analysis, and transcription workflows.
## When to Apply
Reference these guidelines when:
- Recovering voice from noisy or low-quality recordings
- Enhancing audio for transcription or legal evidence
- Performing forensic audio authentication
- Analyzing recordings for tampering or splices
- Building automated audio processing pipelines
- Transcribing difficult or degraded speech
## Rule Categories by Priority
| Priority | Category | Impact | Prefix | Rules |
|---|---|---|---|---|
| 1 | Signal Preservation & Analysis | CRITICAL | `signal-` | 5 |
| 2 | Noise Profiling & Estimation | CRITICAL | `noise-` | 5 |
| 3 | Spectral Processing | HIGH | `spectral-` | 6 |
| 4 | Voice Isolation & Enhancement | HIGH | `voice-` | 7 |
| 5 | Temporal Processing | MEDIUM-HIGH | `temporal-` | 5 |
| 6 | Transcription & Recognition | MEDIUM | `transcribe-` | 5 |
| 7 | Forensic Authentication | MEDIUM | `forensic-` | 5 |
| 8 | Tool Integration & Automation | LOW-MEDIUM | `tool-` | 7 |
## Quick Reference
### 1. Signal Preservation & Analysis (CRITICAL)

- `signal-preserve-original`: Never modify the original recording
- `signal-lossless-format`: Use lossless formats for processing
- `signal-sample-rate`: Preserve the native sample rate
- `signal-bit-depth`: Use maximum bit depth for processing
- `signal-analyze-first`: Analyze before processing
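The first two rules above can be sketched as a small helper that copies the evidence file into a working copy and records its SHA-256, so every later step can prove the original is untouched (function and field names here are illustrative, not part of the bundled scripts):

```python
import hashlib
import shutil
from pathlib import Path

def make_working_copy(original: str, workdir: str = ".") -> dict:
    """Hash the evidence file, then process only a copy of it.

    Returns a record suitable for the case file; field names are illustrative.
    """
    src = Path(original)
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    work = Path(workdir) / f"working_{src.name}"
    shutil.copy2(src, work)  # copy2 also preserves file timestamps
    return {"original": str(src), "working": str(work), "sha256": digest}

def verify_original(original: str, recorded_sha256: str) -> bool:
    """Re-hash the original and compare against the recorded checksum."""
    return hashlib.sha256(Path(original).read_bytes()).hexdigest() == recorded_sha256
```

Run the verification after every processing milestone; any mismatch means the evidence file was touched.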
### 2. Noise Profiling & Estimation (CRITICAL)

- `noise-profile-silence`: Extract noise profile from silent segments
- `noise-identify-type`: Identify noise type before reduction
- `noise-adaptive-estimation`: Use adaptive estimation for non-stationary noise
- `noise-snr-assessment`: Measure SNR before and after
- `noise-avoid-overprocessing`: Avoid over-processing and musical artifacts
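A rough before/after SNR number of the kind `noise-snr-assessment` asks for can be estimated with NumPy, assuming a noise-only reference segment (such as leading silence) is available. This is a sketch, not a calibrated forensic measurement:

```python
import numpy as np

def estimate_snr_db(signal: np.ndarray, noise_segment: np.ndarray) -> float:
    """Estimate SNR in dB using a noise-only segment as the noise-power
    reference. Assumes both arrays hold float samples from the same recording."""
    noise_power = np.mean(noise_segment.astype(np.float64) ** 2)
    signal_power = np.mean(signal.astype(np.float64) ** 2)
    if noise_power == 0:
        return float("inf")
    return 10.0 * np.log10(signal_power / noise_power)
```

Measure before and after processing on the same segments so the two numbers are comparable.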
### 3. Spectral Processing (HIGH)

- `spectral-subtraction`: Apply spectral subtraction for stationary noise
- `spectral-wiener-filter`: Use a Wiener filter for optimal noise estimation
- `spectral-notch-filter`: Apply notch filters for tonal interference
- `spectral-band-limiting`: Apply frequency band limiting for speech
- `spectral-equalization`: Use forensic equalization to restore intelligibility
- `spectral-declip`: Repair clipped audio before other processing
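As an illustration of `spectral-subtraction` (not the exact method any particular tool implements), a minimal magnitude-domain version with a spectral floor to limit musical-noise artifacts might look like:

```python
import numpy as np

def spectral_subtract(x, noise, frame=512, hop=256, floor=0.05):
    """Basic magnitude spectral subtraction: average the noise magnitude
    spectrum from a noise-only clip, subtract it frame by frame, and keep a
    spectral floor so bins are never driven fully to zero. Sketch only."""
    win = np.hanning(frame)
    # Average noise magnitude spectrum over frames of the noise-only segment
    n_frames = [np.abs(np.fft.rfft(win * noise[i:i + frame]))
                for i in range(0, len(noise) - frame, hop)]
    noise_mag = np.mean(n_frames, axis=0)
    out = np.zeros(len(x))
    norm = np.zeros(len(x))
    for i in range(0, len(x) - frame, hop):
        spec = np.fft.rfft(win * x[i:i + frame])
        mag = np.abs(spec)
        # Subtract the noise estimate; clamp to a fraction of the original
        clean = np.maximum(mag - noise_mag, floor * mag)
        out[i:i + frame] += np.fft.irfft(clean * np.exp(1j * np.angle(spec)), frame)
        norm[i:i + frame] += win
    return out / np.maximum(norm, 1e-8)  # overlap-add normalization
```

Production tools (e.g. SoX `noisered`, FFmpeg `afftdn`) use more refined estimators, but the frame-subtract-floor structure is the same idea.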
### 4. Voice Isolation & Enhancement (HIGH)

- `voice-rnnoise`: Use RNNoise for real-time ML denoising
- `voice-dialogue-isolate`: Use source separation for complex backgrounds
- `voice-formant-preserve`: Preserve formants during pitch manipulation
- `voice-dereverb`: Apply dereverberation for room echo
- `voice-enhance-speech`: Use AI speech enhancement services for quick results
- `voice-vad-segment`: Use VAD for targeted processing
- `voice-frequency-boost`: Boost frequency regions for specific phonemes
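A crude energy-based VAD illustrates the idea behind `voice-vad-segment`. Production work would use a trained detector (for example WebRTC VAD or Silero VAD), but the per-frame decision structure is similar:

```python
import numpy as np

def simple_energy_vad(x, sr, frame_ms=30, threshold_db=-35):
    """Flag frames whose RMS level (relative to the clip's peak) exceeds a
    threshold. A stand-in sketch, not a real voice activity detector."""
    frame = max(1, int(sr * frame_ms / 1000))
    peak = np.max(np.abs(x)) or 1.0
    flags = []
    for i in range(0, len(x) - frame + 1, frame):
        rms = np.sqrt(np.mean((x[i:i + frame] / peak) ** 2))
        level_db = 20 * np.log10(max(rms, 1e-10))
        flags.append(level_db > threshold_db)
    return flags
```

The flags can then drive targeted processing: denoise only speech frames, gate only non-speech frames.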
### 5. Temporal Processing (MEDIUM-HIGH)

- `temporal-dynamic-range`: Use dynamic range compression for level consistency
- `temporal-noise-gate`: Apply a noise gate to silence non-speech segments
- `temporal-time-stretch`: Use time stretching for intelligibility
- `temporal-transient-repair`: Repair transient damage (clicks, pops, dropouts)
- `temporal-silence-trim`: Trim silence and normalize before export
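The `temporal-noise-gate` rule reduces to a per-frame threshold decision. A minimal sketch, assuming float audio in [-1, 1] and omitting the attack/release ramps a real gate (such as FFmpeg's `agate`) adds to avoid clicks:

```python
import numpy as np

def noise_gate(x, sr, threshold_db=-45.0, frame_ms=10):
    """Zero out frames whose RMS level falls below the threshold (in dBFS).
    Core decision only; real gates smooth transitions between states."""
    frame = max(1, int(sr * frame_ms / 1000))
    y = x.astype(np.float64).copy()
    for i in range(0, len(y), frame):
        seg = y[i:i + frame]
        rms = np.sqrt(np.mean(seg ** 2)) if len(seg) else 0.0
        if 20 * np.log10(max(rms, 1e-12)) < threshold_db:
            y[i:i + frame] = 0.0
    return y
```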
### 6. Transcription & Recognition (MEDIUM)

- `transcribe-whisper`: Use Whisper for noise-robust transcription
- `transcribe-multipass`: Use multi-pass transcription for difficult audio
- `transcribe-segment`: Segment audio for targeted transcription
- `transcribe-confidence`: Track confidence scores for uncertain words
- `transcribe-hallucination`: Detect and filter ASR hallucinations
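For `transcribe-confidence`, low-confidence words should be flagged for human review rather than silently accepted. A sketch, assuming a hypothetical list of per-word confidence records similar to what word-level ASR output provides:

```python
def flag_uncertain_words(words, threshold=0.6):
    """Bracket low-confidence words in the transcript so a reviewer can
    check them against the audio. `words` is a hypothetical list of
    {"word", "confidence"} dicts."""
    out = []
    for w in words:
        text = w["word"]
        if w["confidence"] < threshold:
            text = f"[{text}?]"  # mark for review
        out.append(text)
    return " ".join(out)
```

For legal evidence, flagged words should be resolved by listening, never by guessing from context.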
### 7. Forensic Authentication (MEDIUM)

- `forensic-enf-analysis`: Use ENF analysis for timestamp verification
- `forensic-metadata`: Extract and verify audio metadata
- `forensic-tampering`: Detect audio tampering and splices
- `forensic-chain-custody`: Document chain of custody for evidence
- `forensic-speaker-id`: Extract speaker characteristics for identification
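A chain-of-custody entry (`forensic-chain-custody`) can be as simple as a hashed, timestamped record per action, appended to a log that is never rewritten. The field names here are illustrative, not a standard schema:

```python
import hashlib
import json
from datetime import datetime, timezone

def custody_entry(path, action, operator, tool=""):
    """One chain-of-custody record: who did what to which file, when, and
    the file's SHA-256 at that moment."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {
        "file": path,
        "sha256": digest,
        "action": action,
        "operator": operator,
        "tool": tool,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

def append_log(entry, logpath="custody_log.jsonl"):
    """Append-only JSONL log; earlier entries are never modified."""
    with open(logpath, "a") as f:
        f.write(json.dumps(entry) + "\n")
```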
### 8. Tool Integration & Automation (LOW-MEDIUM)

- `tool-ffmpeg-essentials`: Master essential FFmpeg audio commands
- `tool-sox-commands`: Use SoX for advanced audio manipulation
- `tool-python-pipeline`: Build Python audio processing pipelines
- `tool-audacity-workflow`: Use Audacity for visual analysis and manual editing
- `tool-install-guide`: Install the audio forensic toolchain
- `tool-batch-automation`: Automate batch processing workflows
- `tool-quality-assessment`: Measure audio quality metrics
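For `tool-batch-automation`, building the full command list before running anything lets a batch job be reviewed (and logged) first. The directory layout and filter chain below are placeholders:

```python
from pathlib import Path

def build_enhance_commands(in_dir, out_dir, filters="highpass=f=80,afftdn=nr=12"):
    """Build, but do not run, one ffmpeg command per WAV file in `in_dir`.
    Returning argument lists (not shell strings) avoids quoting bugs."""
    cmds = []
    for wav in sorted(Path(in_dir).glob("*.wav")):
        out = Path(out_dir) / f"{wav.stem}_enhanced.wav"
        cmds.append(["ffmpeg", "-i", str(wav), "-af", filters, str(out)])
    return cmds
```

Each list can then be passed to `subprocess.run`, with the exact command recorded in the chain-of-custody log.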
## Essential Tools
| Tool | Purpose | Install |
|---|---|---|
| FFmpeg | Format conversion, filtering | |
| SoX | Noise profiling, effects | |
| Whisper | Speech transcription | |
| librosa | Python audio analysis | |
| noisereduce | ML noise reduction | |
| Audacity | Visual editing | |
## Workflow Scripts (Recommended)
Use the bundled scripts to generate objective baselines, create a workflow plan, and verify results.

- `scripts/preflight_audio.py`: Generate a forensic preflight report (JSON or Markdown).
- `scripts/plan_from_preflight.py`: Create a workflow plan template from the preflight report.
- `scripts/compare_audio.py`: Compare objective metrics between baseline and processed audio.
Example usage:

```bash
# 1) Analyze and capture baseline metrics
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2) Generate a workflow plan template
python3 skills/.experimental/audio-voice-recovery/scripts/plan_from_preflight.py --preflight preflight.json --out plan.md

# 3) Compare baseline vs processed metrics
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
## Forensic Preflight Workflow (Do This Before Any Changes)
Align preflight with SWGDE Best Practices for the Enhancement of Digital Audio (20-a-001) and SWGDE Best Practices for Forensic Audio (08-a-001).
Establish an objective baseline state and plan the workflow so processing does not introduce clipping, artifacts, or false "done" confidence.
Use `scripts/preflight_audio.py` to capture baseline metrics and preserve the report with the case file.

Capture and record before processing:
- Record evidence identity and integrity: path, filename, file size, SHA-256 checksum, source, format/container, codec
- Record signal integrity: sample rate, bit depth, channels, duration
- Measure baseline loudness and levels: LUFS/LKFS, true peak, peak, RMS, dynamic range, DC offset
- Detect clipping and document clipped-sample percentage, peak headroom, exact time ranges
- Identify noise profile: stationary vs non-stationary, dominant noise bands, SNR estimate
- Locate the region of interest (ROI) and document time ranges and changes over time
- Inspect spectral content and estimate speech-band energy and intelligibility risk
- Scan for temporal defects: dropouts, discontinuities, splices, drift
- Evaluate channel correlation and phase anomalies (if stereo)
- Extract and preserve metadata: timestamps, device/model tags, embedded notes
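Several of the baseline numbers above (peak, RMS, DC offset, clipped-sample percentage) can be computed directly. A sketch, assuming float samples in [-1, 1]; the bundled preflight script is the authoritative tool:

```python
import numpy as np

def baseline_metrics(x: np.ndarray, clip_level=0.999):
    """Objective baseline numbers for the preflight report.

    `clip_level` treats samples at or above 99.9% of full scale as clipped,
    an illustrative heuristic rather than a fixed standard."""
    x = x.astype(np.float64)
    return {
        "peak": float(np.max(np.abs(x))),
        "rms": float(np.sqrt(np.mean(x ** 2))),
        "dc_offset": float(np.mean(x)),
        "clipped_pct": float(100.0 * np.mean(np.abs(x) >= clip_level)),
    }
```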
Procedure:
- Prepare a forensic working copy, verify hashes, and preserve the original untouched.
- Locate ROI and target signal; document exact time ranges and changes across the recording.
- Assess challenges to intelligibility and signal quality; map challenges to mitigation strategies.
- Identify required processing and plan a workflow order that avoids unwanted artifacts.
- Generate a plan draft with `scripts/plan_from_preflight.py` and complete it with case-specific decisions.
- Measure baseline loudness and true peak per ITU-R BS.1770 / EBU R 128 and record peak/RMS/DC offset.
- Detect clipping and dropouts; if clipping is present, declip first or pause and document limitations.
- Inspect spectral content and noise type; collect representative noise profile segments and estimate SNR.
- If stereo, evaluate channel correlation and phase; document anomalies.
- Create a baseline listening log (multiple devices) and define success criteria for intelligibility and listenability.
Failure-pattern guardrails:
- Do not process until every preflight field is captured.
- Document every process, setting, software version, and time segment to enable repeatability.
- Compare each processed output to the unprocessed input and assess progress toward intelligibility and listenability.
- Avoid over-processing; review removed signal (filter residue) to avoid removing target signal components.
- Keep intermediate files uncompressed and preserve sample rate/bit depth when moving between tools.
- Perform a final review against the original; if unsatisfactory, revise or stop and report limitations.
- If the request is not achievable, communicate limitations and do not declare completion.
- Require objective metrics and A/B listening before declaring completion.
- Do not rely solely on objective metrics; corroborate with critical listening.
- Take listening breaks to avoid ear fatigue during extended reviews.
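The comparison guardrails above can be backed by simple objective numbers. This sketch reports the RMS level change and the residual difference energy between input and output; it complements critical listening, it does not replace it:

```python
import numpy as np

def ab_report(before: np.ndarray, after: np.ndarray) -> dict:
    """Minimal objective A/B comparison of two aligned float signals.
    Field names are illustrative."""
    n = min(len(before), len(after))
    b = before[:n].astype(np.float64)
    a = after[:n].astype(np.float64)
    rms = lambda v: np.sqrt(np.mean(v ** 2))
    return {
        # Positive means the output is louder than the input
        "rms_change_db": float(20 * np.log10(max(rms(a), 1e-12) / max(rms(b), 1e-12))),
        # Energy of what processing changed (includes removed noise AND removed signal)
        "residual_db": float(20 * np.log10(max(rms(a - b), 1e-12))),
    }
```

A large residual with little intelligibility gain is a hint of over-processing; auditioning the difference signal itself shows what was removed.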
## Quick Enhancement Pipeline
```bash
# 1. Analyze original (run preflight and capture baseline metrics)
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2. Create working copy with checksum
cp evidence.wav working.wav
sha256sum evidence.wav > evidence.sha256

# 3. Apply enhancement
ffmpeg -i working.wav -af "
  highpass=f=80,
  adeclick=w=55:o=75,
  afftdn=nr=12:nf=-30:nt=w,
  equalizer=f=2500:t=q:w=1:g=3,
  loudnorm=I=-16:TP=-1.5:LRA=11
" enhanced.wav

# 4. Transcribe
whisper enhanced.wav --model large-v3 --language en

# 5. Verify original unchanged
sha256sum -c evidence.sha256

# 6. Verify improvement (objective comparison + A/B listening)
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
## How to Use
Read individual reference files for detailed explanations and code examples:
- Section definitions: category structure and impact levels
- Rule template: template for adding new rules
## Reference Files
| File | Description |
|---|---|
| AGENTS.md | Complete compiled guide with all rules |
| references/_sections.md | Category definitions and ordering |
| assets/templates/_template.md | Template for new rules |
| metadata.json | Version and reference information |