audio-analyzer

Audio Analyzer

A toolkit for analyzing audio files. It extracts tempo, musical key, frequency content, and loudness metrics, and generates visualizations of the results.

Quick Start

python
from scripts.audio_analyzer import AudioAnalyzer

# Analyze an audio file
analyzer = AudioAnalyzer("song.mp3")
analyzer.analyze()

# Get all analysis results
results = analyzer.get_results()
print(f"BPM: {results['tempo']['bpm']}")
print(f"Key: {results['key']['key']} {results['key']['mode']}")

# Generate visualizations
analyzer.plot_waveform("waveform.png")
analyzer.plot_spectrogram("spectrogram.png")

# Full report
analyzer.save_report("analysis_report.json")

Features

  • Tempo/BPM Detection: Accurate beat tracking with confidence score
  • Key Detection: Musical key and mode (major/minor) identification
  • Frequency Analysis: Spectrum, dominant frequencies, frequency bands
  • Loudness Metrics: RMS, peak, LUFS, dynamic range
  • Waveform Visualization: Multi-channel waveform plots
  • Spectrogram: Time-frequency visualization with customization
  • Chromagram: Pitch class visualization for harmonic analysis
  • Beat Grid: Visual beat markers overlaid on waveform
  • Export Formats: JSON report, PNG/SVG visualizations

API Reference

Initialization

python
# From file
analyzer = AudioAnalyzer("audio.mp3")

# With custom sample rate
analyzer = AudioAnalyzer("audio.wav", sr=44100)

Analysis Methods

python
# Run full analysis
analyzer.analyze()

# Individual analyses
analyzer.analyze_tempo()      # BPM and beat positions
analyzer.analyze_key()        # Musical key detection
analyzer.analyze_loudness()   # RMS, peak, LUFS
analyzer.analyze_frequency()  # Spectrum analysis
analyzer.analyze_dynamics()   # Dynamic range

Results Access

python
# Get all results as dict
results = analyzer.get_results()

# Individual results
tempo = analyzer.get_tempo()        # {'bpm': 120, 'confidence': 0.85, 'beats': [...]}
key = analyzer.get_key()            # {'key': 'C', 'mode': 'major', 'confidence': 0.72}
loudness = analyzer.get_loudness()  # {'rms_db': -14.2, 'peak_db': -0.5, 'lufs': -14.0}
freq = analyzer.get_frequency()     # {'dominant_freq': 440, 'spectrum': [...]}
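The per-analysis dictionaries can be condensed into a one-line summary for logs or file listings. The sketch below assumes the dictionary returned by get_results() nests them under the keys 'tempo', 'key', and 'loudness'. The first two appear in the Quick Start; 'loudness' is an assumption, and format_summary is a hypothetical helper, not part of the tool.

```python
def format_summary(results: dict) -> str:
    """Build a one-line summary from a results dict (hypothetical helper)."""
    parts = []
    tempo = results.get("tempo")
    if tempo:
        parts.append(f"{tempo['bpm']:.1f} BPM")
    key = results.get("key")
    if key:
        parts.append(f"{key['key']} {key['mode']}")
    loudness = results.get("loudness")  # assumed key name
    if loudness:
        parts.append(f"{loudness['lufs']:.1f} LUFS")
    return " | ".join(parts)


# Sample values copied from the inline comments in the block above.
sample = {
    "tempo": {"bpm": 120, "confidence": 0.85},
    "key": {"key": "C", "mode": "major", "confidence": 0.72},
    "loudness": {"rms_db": -14.2, "peak_db": -0.5, "lufs": -14.0},
}
print(format_summary(sample))
```

Analyses that were not run are simply skipped, so the helper also works after a partial run such as analyze_tempo() alone, provided the missing entries are absent or empty.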

Visualization Methods

python
# Waveform
analyzer.plot_waveform(
    output="waveform.png",
    figsize=(12, 4),
    color="#1f77b4",
    show_rms=True
)

# Spectrogram
analyzer.plot_spectrogram(
    output="spectrogram.png",
    figsize=(12, 6),
    cmap="magma",       # viridis, plasma, inferno, magma
    freq_scale="log",   # linear, log, mel
    max_freq=8000       # Hz
)

# Chromagram (pitch classes)
analyzer.plot_chromagram(
    output="chromagram.png",
    figsize=(12, 4)
)

# Onset strength / beat grid
analyzer.plot_beats(
    output="beats.png",
    figsize=(12, 4),
    show_strength=True
)

# Combined dashboard
analyzer.plot_dashboard(
    output="dashboard.png",
    figsize=(14, 10)
)

Export

python
# JSON report with all analysis results
analyzer.save_report("report.json")

# Summary text
summary = analyzer.get_summary()
print(summary)

Analysis Details

Tempo Detection

Uses a beat-tracking algorithm to detect:
  • BPM: Beats per minute (tempo)
  • Beat positions: Timestamps of detected beats, in seconds
  • Confidence: Reliability score (0-1)
python
tempo = analyzer.get_tempo()
# {
#     'bpm': 128.0,
#     'confidence': 0.89,
#     'beats': [0.0, 0.469, 0.938, 1.406, ...],  # seconds
#     'beat_count': 256
# }
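The BPM value and the beat timestamps are two views of the same information: tempo is 60 divided by the typical gap between consecutive beats. The sketch below illustrates that relationship using the median gap. It is not necessarily how the tool computes its BPM, and bpm_from_beats is a hypothetical helper.

```python
from statistics import median


def bpm_from_beats(beats: list[float]) -> float:
    """Estimate tempo as 60 divided by the median gap between beats."""
    if len(beats) < 2:
        raise ValueError("need at least two beat timestamps")
    intervals = [later - earlier for earlier, later in zip(beats, beats[1:])]
    return 60.0 / median(intervals)


beats = [0.0, 0.469, 0.938, 1.406]  # seconds, from the example output above
print(round(bpm_from_beats(beats)))
```

The median is used instead of the mean so that a single missed or spurious beat does not skew the estimate.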

Key Detection

Analyzes harmonic content to identify:
  • Key: Tonic note (C, C#, D, etc.)
  • Mode: Major or minor
  • Confidence: Detection confidence
  • Key profile: Correlation score for each candidate key
python
key = analyzer.get_key()
# {
#     'key': 'A',
#     'mode': 'minor',
#     'confidence': 0.76,
#     'profile': {'C': 0.12, 'C#': 0.08, ...}
# }
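One common way to turn harmonic content into a key estimate is template correlation: a 12-bin chroma vector is correlated against a major and a minor profile rotated to each of the 12 tonics, and the best match wins. The sketch below uses the Krumhansl-Kessler profiles and is only an illustration of that general technique. The source does not say which method or profiles the tool uses, and estimate_key is a hypothetical function.

```python
import math

# Krumhansl-Kessler key profiles, indexed by semitones above the tonic.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def estimate_key(chroma):
    """Return (key, mode, correlation) for a 12-bin chroma vector (C first)."""
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            # Rotate the profile so its tonic weight lands on pitch class `tonic`.
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            score = pearson(chroma, rotated)
            if best is None or score > best[2]:
                best = (NOTES[tonic], mode, score)
    return best


# Synthetic chroma shaped exactly like the minor profile with tonic A.
chroma = [MINOR[(pc - 9) % 12] for pc in range(12)]
key, mode, score = estimate_key(chroma)
print(key, mode, round(score, 2))
```

A completely flat chroma vector has zero variance and would cause a division by zero in pearson, so real input should be checked before calling it. Percussive material tends to produce nearly flat chroma, which is consistent with key detection being less reliable on drums.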

Loudness Metrics

Loudness analysis covers:
  • RMS dB: Root mean square level
  • Peak dB: Maximum sample level
  • LUFS: Integrated loudness (broadcast standard)
  • Dynamic range: Level difference between loud and quiet sections, in dB
python
loudness = analyzer.get_loudness()
# {
#     'rms_db': -14.2,
#     'peak_db': -0.3,
#     'lufs': -14.0,
#     'dynamic_range_db': 12.5,
#     'crest_factor': 8.2
# }
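The sample-level metrics follow directly from their definitions: RMS is the square root of the mean squared sample, peak is the largest absolute sample, and both are converted to decibels relative to full scale with 20·log10. The sketch below assumes samples normalized to the range -1.0 to 1.0 and expresses the crest factor in dB. The source does not state the unit of its crest_factor field, and level_metrics is a hypothetical helper. LUFS is left out because it needs frequency weighting and gating.

```python
import math


def level_metrics(samples):
    """Return RMS, peak, and crest factor in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    rms_db = 20 * math.log10(rms)
    peak_db = 20 * math.log10(peak)
    return {
        "rms_db": rms_db,
        "peak_db": peak_db,
        # Crest factor expressed in dB here; the tool's unit is not stated.
        "crest_factor_db": peak_db - rms_db,
    }


# Ten periods of a full-scale sine wave, 100 samples per period.
sine = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
metrics = level_metrics(sine)
print({name: round(value, 2) for name, value in metrics.items()})
```

A full-scale sine has an RMS about 3 dB below its peak, which makes it a convenient sanity check. Digital silence would raise a math domain error in log10, so an all-zero signal needs a guard in real use.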

Frequency Analysis

Spectrum analysis includes:
  • Dominant frequency: Strongest frequency component
  • Frequency bands: Energy in bass, mid, and treble ranges
  • Spectral centroid: "Brightness" of the audio
  • Spectral rolloff: Frequency below which 85% of the spectral energy lies
python
freq = analyzer.get_frequency()
# {
#     'dominant_freq': 440.0,
#     'spectral_centroid': 2150.3,
#     'spectral_rolloff': 4200.5,
#     'bands': {
#         'sub_bass': -28.5,   # 20-60 Hz
#         'bass': -18.2,       # 60-250 Hz
#         'low_mid': -12.1,    # 250-500 Hz
#         'mid': -10.8,        # 500-2000 Hz
#         'high_mid': -14.3,   # 2000-4000 Hz
#         'high': -22.1        # 4000-20000 Hz
#     }
# }
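Each of these descriptors is a simple statistic over a spectrum. The sketch below computes them from a list of bin frequencies and per-bin energies, using the band edges listed above. The function names are hypothetical, band energies are returned as raw sums instead of dB, and whether the tool weights by energy or by magnitude is not stated in the source.

```python
# Band edges in Hz, taken from the example output above.
BANDS = {
    "sub_bass": (20, 60), "bass": (60, 250), "low_mid": (250, 500),
    "mid": (500, 2000), "high_mid": (2000, 4000), "high": (4000, 20000),
}


def spectral_summary(freqs, energy, rolloff_fraction=0.85):
    """Dominant frequency, centroid, and rolloff of one spectrum."""
    total = sum(energy)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    # Dominant frequency: the bin holding the most energy.
    peak_bin = max(range(len(energy)), key=lambda i: energy[i])
    # Centroid: energy-weighted mean frequency.
    centroid = sum(f * e for f, e in zip(freqs, energy)) / total
    # Rolloff: first bin where cumulative energy reaches the target fraction.
    cumulative, rolloff = 0.0, freqs[-1]
    for f, e in zip(freqs, energy):
        cumulative += e
        if cumulative >= rolloff_fraction * total:
            rolloff = f
            break
    return {
        "dominant_freq": freqs[peak_bin],
        "spectral_centroid": centroid,
        "spectral_rolloff": rolloff,
    }


def band_energy(freqs, energy):
    """Sum the energy of all bins falling inside each named band."""
    return {
        name: sum(e for f, e in zip(freqs, energy) if low <= f < high)
        for name, (low, high) in BANDS.items()
    }


freqs = [100, 200, 300, 400]   # Hz, toy spectrum
energy = [1.0, 4.0, 1.0, 2.0]
print(spectral_summary(freqs, energy))
print(band_energy(freqs, energy))
```

In practice the inputs would come from an FFT of the signal, for example the bin frequencies and squared magnitudes of a short-time Fourier transform averaged over time.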

CLI Usage

bash
# Full analysis with all visualizations
python audio_analyzer.py --input song.mp3 --output-dir ./analysis/

# Just tempo and key
python audio_analyzer.py --input song.mp3 --analyze tempo key --output report.json

# Generate specific visualization
python audio_analyzer.py --input song.mp3 --plot spectrogram --output spec.png

# Dashboard view
python audio_analyzer.py --input song.mp3 --dashboard --output dashboard.png

# Batch analyze directory
python audio_analyzer.py --input-dir ./songs/ --output-dir ./reports/

CLI Arguments

| Argument | Description | Default |
|---|---|---|
| --input | Input audio file | Required |
| --input-dir | Directory of audio files | - |
| --output | Output file path | - |
| --output-dir | Output directory | . |
| --analyze | Analysis types: tempo, key, loudness, frequency, all | all |
| --plot | Plot type: waveform, spectrogram, chromagram, beats, dashboard | - |
| --format | Output format: json, txt | json |
| --sr | Sample rate for analysis (Hz) | 22050 |
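For readers wiring up a similar command line, the argument table maps naturally onto argparse. The sketch below is an illustration only, not the script's actual parser. It assumes that --input and --input-dir are alternatives (the batch example in CLI Usage omits --input) and that --analyze accepts several values, as in "--analyze tempo key". The --dashboard flag from the usage examples is left out because the table does not list it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argument parser mirroring the table above (illustrative sketch)."""
    parser = argparse.ArgumentParser(prog="audio_analyzer.py")
    # Assumption: exactly one of --input / --input-dir must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="Input audio file")
    source.add_argument("--input-dir", help="Directory of audio files")
    parser.add_argument("--output", help="Output file path")
    parser.add_argument("--output-dir", default=".", help="Output directory")
    parser.add_argument(
        "--analyze", nargs="+", default=["all"],
        choices=["tempo", "key", "loudness", "frequency", "all"],
        help="Analysis types",
    )
    parser.add_argument(
        "--plot",
        choices=["waveform", "spectrogram", "chromagram", "beats", "dashboard"],
        help="Plot type",
    )
    parser.add_argument("--format", choices=["json", "txt"], default="json",
                        help="Output format")
    parser.add_argument("--sr", type=int, default=22050,
                        help="Sample rate for analysis (Hz)")
    return parser


args = build_parser().parse_args(
    ["--input", "song.mp3", "--analyze", "tempo", "key"]
)
print(args.analyze, args.sr, args.output_dir)
```

Using nargs="+" lets several analysis types follow a single --analyze flag, and the choices lists make argparse reject unknown analysis or plot names with a usage message.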

Examples

Song Analysis

python
from scripts.audio_analyzer import AudioAnalyzer

analyzer = AudioAnalyzer("track.mp3")
analyzer.analyze()

# Print the headline metrics
key = analyzer.get_key()
print(f"Tempo: {analyzer.get_tempo()['bpm']:.1f} BPM")
print(f"Key: {key['key']} {key['mode']}")
print(f"Loudness: {analyzer.get_loudness()['lufs']:.1f} LUFS")

# Save the combined dashboard image
analyzer.plot_dashboard("track_analysis.png")

Podcast Quality Check

python
from scripts.audio_analyzer import AudioAnalyzer

analyzer = AudioAnalyzer("podcast.mp3")
analyzer.analyze_loudness()

# Target range used here: -20 to -16 LUFS
loudness = analyzer.get_loudness()
if loudness['lufs'] > -16:
    print("Warning: Audio may be too loud for podcast standards")
elif loudness['lufs'] < -20:
    print("Warning: Audio may be too quiet")
else:
    print("Loudness is within podcast standards (-20 to -16 LUFS)")

Batch Analysis

python
import os
from scripts.audio_analyzer import AudioAnalyzer

results = []
for filename in os.listdir("./songs"):
    # Match extensions case-insensitively so "TRACK.MP3" is included
    if filename.lower().endswith(('.mp3', '.wav', '.flac')):
        analyzer = AudioAnalyzer(os.path.join("./songs", filename))
        analyzer.analyze()
        key = analyzer.get_key()
        results.append({
            'file': filename,
            'bpm': analyzer.get_tempo()['bpm'],
            'key': f"{key['key']} {key['mode']}",
            'lufs': analyzer.get_loudness()['lufs']
        })

# Sort by BPM for DJ set
results.sort(key=lambda x: x['bpm'])
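Once collected, the batch results are a plain list of dictionaries, so they can be written out with the standard library. The sketch below serializes such a list to CSV text, sorted by BPM. The helper results_to_csv is hypothetical and the two rows are made-up values for illustration, not output from the tool.

```python
import csv
import io


def results_to_csv(results: list[dict]) -> str:
    """Serialize batch results to CSV text, slowest track first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["file", "bpm", "key", "lufs"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(sorted(results, key=lambda row: row["bpm"]))
    return buffer.getvalue()


# Made-up rows for illustration; real rows come from the batch loop above.
rows = [
    {"file": "b.mp3", "bpm": 128.0, "key": "A minor", "lufs": -9.5},
    {"file": "a.wav", "bpm": 100.0, "key": "C major", "lufs": -14.0},
]
print(results_to_csv(rows))
```

Writing to an in-memory buffer keeps the helper easy to test; the returned text can be saved with an ordinary file write.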

Supported Formats

Input formats (via librosa/soundfile):
  • MP3
  • WAV
  • FLAC
  • OGG
  • M4A/AAC
  • AIFF
Output formats:
  • JSON (analysis report)
  • PNG (visualizations)
  • SVG (visualizations)
  • TXT (summary)

Dependencies

librosa>=0.10.0
soundfile>=0.12.0
matplotlib>=3.7.0
numpy>=1.24.0
scipy>=1.10.0

Limitations

  • Key detection works best with melodic content (less accurate for drums/percussion)
  • BPM detection may struggle with free-tempo or complex time signatures
  • Very short clips (<5 seconds) may have reduced accuracy
  • LUFS calculation is simplified (not full ITU-R BS.1770-4)