audio-voice-recovery

Forensic Audio Research Audio Voice Recovery Best Practices

Comprehensive audio forensics and voice recovery guide for recovering speech from low-quality, low-volume, or damaged audio recordings. Contains 45 rules across 8 categories, prioritized by impact, to guide audio enhancement, forensic analysis, and transcription workflows.

When to Apply

Reference these guidelines when:
  • Recovering voice from noisy or low-quality recordings
  • Enhancing audio for transcription or legal evidence
  • Performing forensic audio authentication
  • Analyzing recordings for tampering or splices
  • Building automated audio processing pipelines
  • Transcribing difficult or degraded speech

Rule Categories by Priority

| Priority | Category | Impact | Prefix | Rules |
|---|---|---|---|---|
| 1 | Signal Preservation & Analysis | CRITICAL | signal- | 5 |
| 2 | Noise Profiling & Estimation | CRITICAL | noise- | 5 |
| 3 | Spectral Processing | HIGH | spectral- | 6 |
| 4 | Voice Isolation & Enhancement | HIGH | voice- | 7 |
| 5 | Temporal Processing | MEDIUM-HIGH | temporal- | 5 |
| 6 | Transcription & Recognition | MEDIUM | transcribe- | 5 |
| 7 | Forensic Authentication | MEDIUM | forensic- | 5 |
| 8 | Tool Integration & Automation | LOW-MEDIUM | tool- | 7 |

Quick Reference

1. Signal Preservation & Analysis (CRITICAL)

  • signal-preserve-original
    - Never modify original recording
  • signal-lossless-format
    - Use lossless formats for processing
  • signal-sample-rate
    - Preserve native sample rate
  • signal-bit-depth
    - Use maximum bit depth for processing
  • signal-analyze-first
    - Analyze before processing

2. Noise Profiling & Estimation (CRITICAL)

  • noise-profile-silence
    - Extract noise profile from silent segments
  • noise-identify-type
    - Identify noise type before reduction
  • noise-adaptive-estimation
    - Use adaptive estimation for non-stationary noise
  • noise-snr-assessment
    - Measure SNR before and after
  • noise-avoid-overprocessing
    - Avoid over-processing and musical artifacts
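
As a rough illustration of the noise-profile-silence and noise-snr-assessment rules, the sketch below estimates SNR from two hand-picked segments: one containing only background noise and one containing speech plus noise. The function names and the assumption that samples are floats in [-1.0, 1.0] are illustrative and not taken from the bundled scripts. The estimate is only meaningful when the noise is reasonably stationary between the two segments.

```python
import math


def mean_power(samples):
    """Average power (mean of squared samples) of a non-empty segment."""
    if not samples:
        raise ValueError("segment must not be empty")
    return sum(s * s for s in samples) / len(samples)


def estimate_snr_db(speech_segment, noise_segment):
    """Estimate SNR in dB from a speech+noise segment and a noise-only segment.

    Assumes speech and noise are uncorrelated, so the speech power is
    approximated as (total power - noise power).
    """
    noise_power = mean_power(noise_segment)
    total_power = mean_power(speech_segment)
    signal_power = max(total_power - noise_power, 0.0)
    if noise_power == 0.0:
        return float("inf")
    if signal_power == 0.0:
        return float("-inf")
    return 10.0 * math.log10(signal_power / noise_power)
```

Running the same function on the original and the processed file, with identical time ranges for both segments, gives a before/after pair that can be recorded alongside the listening notes.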

3. Spectral Processing (HIGH)

  • spectral-subtraction
    - Apply spectral subtraction for stationary noise
  • spectral-wiener-filter
    - Use Wiener filter for MMSE-optimal noise suppression
  • spectral-notch-filter
    - Apply notch filters for tonal interference
  • spectral-band-limiting
    - Apply frequency band limiting for speech
  • spectral-equalization
    - Use forensic equalization to restore intelligibility
  • spectral-declip
    - Repair clipped audio before other processing
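
To make the spectral-notch-filter rule concrete, here is a minimal sketch of a second-order notch for a single tonal interferer such as mains hum. It uses the widely published biquad notch formulas (the "Audio EQ Cookbook" form). The function names are hypothetical, and samples are assumed to be floats for one channel. A production workflow would more likely use an established DSP library.

```python
import math


def notch_coefficients(center_hz, sample_rate_hz, q=30.0):
    """Return normalized biquad (b, a) coefficients for a notch at center_hz."""
    if not 0.0 < center_hz < sample_rate_hz / 2.0:
        raise ValueError("center_hz must lie between 0 and the Nyquist frequency")
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def apply_biquad(samples, b, a):
    """Filter samples with a direct-form I biquad; returns a new list."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out
```

A higher `q` gives a narrower notch that removes less of the neighbouring speech energy but takes longer to settle. Harmonics of the hum would each need their own notch.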

4. Voice Isolation & Enhancement (HIGH)

  • voice-rnnoise
    - Use RNNoise for real-time ML denoising
  • voice-dialogue-isolate
    - Use source separation for complex backgrounds
  • voice-formant-preserve
    - Preserve formants during pitch manipulation
  • voice-dereverb
    - Apply dereverberation for room echo
  • voice-enhance-speech
    - Use AI speech enhancement services for quick results
  • voice-vad-segment
    - Use VAD for targeted processing
  • voice-frequency-boost
    - Boost frequency regions for specific phonemes

5. Temporal Processing (MEDIUM-HIGH)

  • temporal-dynamic-range
    - Use dynamic range compression for level consistency
  • temporal-noise-gate
    - Apply noise gate to silence non-speech segments
  • temporal-time-stretch
    - Use time stretching for intelligibility
  • temporal-transient-repair
    - Repair transient damage (clicks, pops, dropouts)
  • temporal-silence-trim
    - Trim silence and normalize before export

6. Transcription & Recognition (MEDIUM)

  • transcribe-whisper
    - Use Whisper for noise-robust transcription
  • transcribe-multipass
    - Use multi-pass transcription for difficult audio
  • transcribe-segment
    - Segment audio for targeted transcription
  • transcribe-confidence
    - Track confidence scores for uncertain words
  • transcribe-hallucination
    - Detect and filter ASR hallucinations

7. Forensic Authentication (MEDIUM)

  • forensic-enf-analysis
    - Use ENF analysis for timestamp verification
  • forensic-metadata
    - Extract and verify audio metadata
  • forensic-tampering
    - Detect audio tampering and splices
  • forensic-chain-custody
    - Document chain of custody for evidence
  • forensic-speaker-id
    - Extract speaker characteristics for identification

8. Tool Integration & Automation (LOW-MEDIUM)

  • tool-ffmpeg-essentials
    - Master essential FFmpeg audio commands
  • tool-sox-commands
    - Use SoX for advanced audio manipulation
  • tool-python-pipeline
    - Build Python audio processing pipelines
  • tool-audacity-workflow
    - Use Audacity for visual analysis and manual editing
  • tool-install-guide
    - Install audio forensic toolchain
  • tool-batch-automation
    - Automate batch processing workflows
  • tool-quality-assessment
    - Measure audio quality metrics

Essential Tools

| Tool | Purpose | Install |
|---|---|---|
| FFmpeg | Format conversion, filtering | brew install ffmpeg |
| SoX | Noise profiling, effects | brew install sox |
| Whisper | Speech transcription | pip install openai-whisper |
| librosa | Python audio analysis | pip install librosa |
| noisereduce | ML noise reduction | pip install noisereduce |
| Audacity | Visual editing | brew install audacity |

Workflow Scripts (Recommended)

Use the bundled scripts to generate objective baselines, create a workflow plan, and verify results.
  • scripts/preflight_audio.py
    - Generate a forensic preflight report (JSON or Markdown).
  • scripts/plan_from_preflight.py
    - Create a workflow plan template from the preflight report.
  • scripts/compare_audio.py
    - Compare objective metrics between baseline and processed audio.
Example usage:

1) Analyze and capture baseline metrics

python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

2) Generate a workflow plan template

python3 skills/.experimental/audio-voice-recovery/scripts/plan_from_preflight.py --preflight preflight.json --out plan.md

3) Compare baseline vs processed metrics

python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md

Forensic Preflight Workflow (Do This Before Any Changes)

Align preflight with SWGDE Best Practices for the Enhancement of Digital Audio (20-a-001) and SWGDE Best Practices for Forensic Audio (08-a-001). Establish an objective baseline state and plan the workflow so that processing does not introduce clipping or artifacts, or create false confidence that the work is done. Use scripts/preflight_audio.py to capture baseline metrics and preserve the report with the case file.
Capture and record before processing:
  • Record evidence identity and integrity: path, filename, file size, SHA-256 checksum, source, format/container, codec
  • Record signal integrity: sample rate, bit depth, channels, duration
  • Measure baseline loudness and levels: LUFS/LKFS, true peak, peak, RMS, dynamic range, DC offset
  • Detect clipping and document clipped-sample percentage, peak headroom, exact time ranges
  • Identify noise profile: stationary vs non-stationary, dominant noise bands, SNR estimate
  • Locate the region of interest (ROI) and document time ranges and changes over time
  • Inspect spectral content and estimate speech-band energy and intelligibility risk
  • Scan for temporal defects: dropouts, discontinuities, splices, drift
  • Evaluate channel correlation and phase anomalies (if stereo)
  • Extract and preserve metadata: timestamps, device/model tags, embedded notes
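
The first two capture items (evidence identity and signal integrity) can be sketched with the Python standard library alone. This is an illustration rather than the behaviour of scripts/preflight_audio.py. The function names and dictionary keys are assumptions, and the `wave` module only reads PCM WAV files, so other containers would need a different reader.

```python
import hashlib
import wave


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def wav_properties(path):
    """Read sample rate, bit depth, channel count, and duration from a PCM WAV header."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate if rate else 0.0,
        }
```

Both functions open the file read-only, so they can be run against the original evidence file without modifying it.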
Procedure:
  1. Prepare a forensic working copy, verify hashes, and preserve the original untouched.
  2. Locate ROI and target signal; document exact time ranges and changes across the recording.
  3. Assess challenges to intelligibility and signal quality; map challenges to mitigation strategies.
  4. Identify required processing and plan a workflow order that avoids unwanted artifacts. Generate a plan draft with scripts/plan_from_preflight.py and complete it with case-specific decisions.
  5. Measure baseline loudness and true peak per ITU-R BS.1770 / EBU R 128 and record peak/RMS/DC offset.
  6. Detect clipping and dropouts; if clipping is present, declip first or pause and document limitations.
  7. Inspect spectral content and noise type; collect representative noise profile segments and estimate SNR.
  8. If stereo, evaluate channel correlation and phase; document anomalies.
  9. Create a baseline listening log (multiple devices) and define success criteria for intelligibility and listenability.
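
Steps 5 and 6 call for peak, RMS, DC offset, and clipping figures. The sketch below shows one way to compute simple versions of these for a single channel, assuming samples have already been decoded to floats in [-1.0, 1.0]. The function name and the `clip_threshold` default are illustrative assumptions. The peak reported is the sample peak, not the oversampled true peak defined in ITU-R BS.1770, and integrated loudness (LUFS) is not computed here.

```python
import math


def level_metrics(samples, clip_threshold=0.999):
    """Compute sample peak, RMS, DC offset, and clipped-sample share.

    samples: non-empty sequence of floats in [-1.0, 1.0] for one channel.
    clip_threshold: absolute value at or above which a sample counts as clipped.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    count = len(samples)
    dc_offset = sum(samples) / count
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / count)
    clipped = sum(1 for s in samples if abs(s) >= clip_threshold)

    def to_dbfs(value):
        # Digital silence has no finite dBFS value.
        return 20.0 * math.log10(value) if value > 0 else float("-inf")

    return {
        "peak_dbfs": to_dbfs(peak),
        "rms_dbfs": to_dbfs(rms),
        "dc_offset": dc_offset,
        "clipped_percent": 100.0 * clipped / count,
    }
```

Counting samples at or near full scale is only a first-pass clipping indicator. Recordings clipped before a gain reduction flatten below full scale and need waveform inspection to confirm.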
Failure-pattern guardrails:
  • Do not process until every preflight field is captured.
  • Document every process, setting, software version, and time segment to enable repeatability.
  • Compare each processed output to the unprocessed input and assess progress toward intelligibility and listenability.
  • Avoid over-processing; review removed signal (filter residue) to avoid removing target signal components.
  • Keep intermediate files uncompressed and preserve sample rate/bit depth when moving between tools.
  • Perform a final review against the original; if unsatisfactory, revise or stop and report limitations.
  • If the request is not achievable, communicate limitations and do not declare completion.
  • Require objective metrics and A/B listening before declaring completion.
  • Do not rely solely on objective metrics; corroborate with critical listening.
  • Take listening breaks to avoid ear fatigue during extended reviews.
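
One way to satisfy the guardrail about documenting every process, setting, software version, and time segment is an append-only log with one JSON object per line. The sketch below is a suggestion, not a format used by the bundled scripts. Every field name and both function names are hypothetical.

```python
import json
from datetime import datetime, timezone


def make_log_entry(step, tool, version, settings, time_range_s, input_sha256, output_sha256):
    """Build one processing-log record (all field names are illustrative)."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "step": step,
        "tool": tool,
        "version": version,
        "settings": settings,
        "time_range_s": list(time_range_s),
        "input_sha256": input_sha256,
        "output_sha256": output_sha256,
    }


def append_log_entry(path, entry):
    """Append the record as a single JSON line so earlier entries are never rewritten."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, sort_keys=True) + "\n")
```

Recording the input and output hashes for each step lets a reviewer confirm which intermediate file a given setting was applied to.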

Quick Enhancement Pipeline


1. Analyze original (run preflight and capture baseline metrics)

python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

2. Create working copy with checksum

cp evidence.wav working.wav
sha256sum evidence.wav > evidence.sha256

3. Apply enhancement

ffmpeg -i working.wav -af "
highpass=f=80,
adeclick=w=55:o=75,
afftdn=nr=12:nf=-30:nt=w,
equalizer=f=2500:t=q:w=1:g=3,
loudnorm=I=-16:TP=-1.5:LRA=11
" enhanced.wav
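
When the filter chain has to be adjusted per case, it can help to keep each stage and its settings as data and render the `-af` string from that. The sketch below rebuilds the chain used in the command above. `build_filter_chain`, `build_ffmpeg_command`, and `STAGES` are hypothetical names, and the code only constructs the argument list without running FFmpeg.

```python
def build_filter_chain(stages):
    """Render (filter_name, options) pairs as an FFmpeg -af filter string."""
    parts = []
    for name, options in stages:
        rendered = ":".join(f"{key}={value}" for key, value in options.items())
        parts.append(f"{name}={rendered}" if rendered else name)
    return ",".join(parts)


def build_ffmpeg_command(src, dst, stages):
    """Return the argument list for an FFmpeg run; the caller decides when to execute it."""
    return ["ffmpeg", "-i", src, "-af", build_filter_chain(stages), dst]


# Same stages and values as the shell command above.
STAGES = [
    ("highpass", {"f": 80}),
    ("adeclick", {"w": 55, "o": 75}),
    ("afftdn", {"nr": 12, "nf": -30, "nt": "w"}),
    ("equalizer", {"f": 2500, "t": "q", "w": 1, "g": 3}),
    ("loudnorm", {"I": -16, "TP": -1.5, "LRA": 11}),
]
```

The resulting list can be passed to `subprocess.run(..., check=True)`, and the same `STAGES` structure can be written to the case notes so the exact settings are reproducible.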

4. Transcribe

whisper enhanced.wav --model large-v3 --language en

5. Verify original unchanged

sha256sum -c evidence.sha256

6. Verify improvement (objective comparison + A/B listening)

python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md

How to Use

Read individual reference files for detailed explanations and code examples:
  • Section definitions - Category structure and impact levels
  • Rule template - Template for adding new rules

Reference Files

| File | Description |
|---|---|
| AGENTS.md | Complete compiled guide with all rules |
| references/_sections.md | Category definitions and ordering |
| assets/templates/_template.md | Template for new rules |
| metadata.json | Version and reference information |