audio-analyzer
Audio Analyzer
A comprehensive toolkit for analyzing audio files. Extract detailed information about audio including tempo, musical key, frequency content, loudness metrics, and generate professional visualizations.
Quick Start
python
from scripts.audio_analyzer import AudioAnalyzer
# Analyze an audio file
analyzer = AudioAnalyzer("song.mp3")
analyzer.analyze()
# Get all analysis results
results = analyzer.get_results()
print(f"BPM: {results['tempo']['bpm']}")
print(f"Key: {results['key']['key']} {results['key']['mode']}")
# Generate visualizations
analyzer.plot_waveform("waveform.png")
analyzer.plot_spectrogram("spectrogram.png")
# Full report
analyzer.save_report("analysis_report.json")
Features
- Tempo/BPM Detection: Accurate beat tracking with confidence score
- Key Detection: Musical key and mode (major/minor) identification
- Frequency Analysis: Spectrum, dominant frequencies, frequency bands
- Loudness Metrics: RMS, peak, LUFS, dynamic range
- Waveform Visualization: Multi-channel waveform plots
- Spectrogram: Time-frequency visualization with customization
- Chromagram: Pitch class visualization for harmonic analysis
- Beat Grid: Visual beat markers overlaid on waveform
- Export Formats: JSON report, PNG/SVG visualizations
API Reference
Initialization
python
# From file
analyzer = AudioAnalyzer("audio.mp3")
# With custom sample rate
analyzer = AudioAnalyzer("audio.wav", sr=44100)
Analysis Methods
python
# Run full analysis
analyzer.analyze()
# Individual analyses
analyzer.analyze_tempo()      # BPM and beat positions
analyzer.analyze_key()        # Musical key detection
analyzer.analyze_loudness()   # RMS, peak, LUFS
analyzer.analyze_frequency()  # Spectrum analysis
analyzer.analyze_dynamics()   # Dynamic range
Results Access
python
# Get all results as dict
results = analyzer.get_results()
# Individual results
tempo = analyzer.get_tempo()        # {'bpm': 120, 'confidence': 0.85, 'beats': [...]}
key = analyzer.get_key()            # {'key': 'C', 'mode': 'major', 'confidence': 0.72}
loudness = analyzer.get_loudness()  # {'rms_db': -14.2, 'peak_db': -0.5, 'lufs': -14.0}
freq = analyzer.get_frequency()     # {'dominant_freq': 440, 'spectrum': [...]}
Visualization Methods
python
# Waveform
analyzer.plot_waveform(
    output="waveform.png",
    figsize=(12, 4),
    color="#1f77b4",
    show_rms=True
)
# Spectrogram
analyzer.plot_spectrogram(
    output="spectrogram.png",
    figsize=(12, 6),
    cmap="magma",        # viridis, plasma, inferno, magma
    freq_scale="log",    # linear, log, mel
    max_freq=8000        # Hz
)
# Chromagram (pitch classes)
analyzer.plot_chromagram(
    output="chromagram.png",
    figsize=(12, 4)
)
# Onset strength / beat grid
analyzer.plot_beats(
    output="beats.png",
    figsize=(12, 4),
    show_strength=True
)
# Combined dashboard
analyzer.plot_dashboard(
    output="dashboard.png",
    figsize=(14, 10)
)
Export
python
# JSON report with all analysis
analyzer.save_report("report.json")
# Summary text
summary = analyzer.get_summary()
print(summary)
Analysis Details
Tempo Detection
Uses a beat-tracking algorithm to detect:
- BPM: Beats per minute (tempo)
- Beat positions: Timestamps of detected beats
- Confidence: Reliability score (0-1)
python
tempo = analyzer.get_tempo()
# {
#     'bpm': 128.0,
#     'confidence': 0.89,
#     'beats': [0.0, 0.469, 0.938, 1.406, ...],  # seconds
#     'beat_count': 256
# }
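As a sanity check on these fields, the BPM can be re-derived from the beat timestamps. The helper below is a minimal sketch: `bpm_from_beats` is a hypothetical name, not part of AudioAnalyzer, and it uses the median inter-beat interval, which may differ from how the analyzer computes tempo internally.

```python
from statistics import median


def bpm_from_beats(beats):
    """Estimate BPM from beat timestamps (seconds) via the median inter-beat interval.

    Hypothetical helper for illustration; not part of AudioAnalyzer.
    """
    if len(beats) < 2:
        raise ValueError("need at least two beat timestamps")
    # Time between each pair of consecutive beats.
    intervals = [later - earlier for earlier, later in zip(beats, beats[1:])]
    # The median is less sensitive than the mean to a single mistimed beat.
    return 60.0 / median(intervals)


# First four beat times from the example return value in this section.
example_beats = [0.0, 0.469, 0.938, 1.406]
estimated_bpm = bpm_from_beats(example_beats)
print(f"Estimated tempo: {estimated_bpm:.1f} BPM")
```

An estimate far from the reported `bpm` would suggest an unsteady beat grid, which is one situation where the confidence score is worth checking.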
Key Detection
Analyzes harmonic content to identify:
- Key: Root note (C, C#, D, etc.)
- Mode: Major or minor
- Confidence: Detection confidence
- Key profile: Correlation with each key
python
key = analyzer.get_key()
# {
#     'key': 'A',
#     'mode': 'minor',
#     'confidence': 0.76,
#     'profile': {'C': 0.12, 'C#': 0.08, ...}
# }
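One common way to obtain per-key correlations like the `profile` field is template matching against the Krumhansl-Kessler key profiles. This document does not say which method AudioAnalyzer uses, so the sketch below is an assumption rather than its implementation; `pearson` and `estimate_key` are hypothetical names, and the input is assumed to be a 12-element pitch-class energy vector with index 0 = C.

```python
from math import sqrt

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler key profiles, index 0 = tonic.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


def estimate_key(chroma):
    """Return (note, mode, correlation) for the best of the 24 candidate keys."""
    best = None
    for tonic in range(12):
        # Rotate the chroma vector so the candidate tonic sits at index 0.
        rotated = chroma[tonic:] + chroma[:tonic]
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            score = pearson(rotated, profile)
            if best is None or score > best[2]:
                best = (NOTES[tonic], mode, score)
    return best


# Synthetic chroma: the minor profile shifted so that A (index 9) is the tonic.
chroma_a_minor = [0.0] * 12
for offset, weight in enumerate(MINOR):
    chroma_a_minor[(9 + offset) % 12] = weight

detected = estimate_key(chroma_a_minor)
print(detected[0], detected[1])
```

Percussive material produces a nearly flat chroma vector, so all 24 correlations end up close together; that is consistent with the lower accuracy noted under Limitations.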
Loudness Metrics
Comprehensive loudness analysis:
- RMS dB: Root mean square level
- Peak dB: Maximum sample level
- LUFS: Integrated loudness (broadcast standard)
- Dynamic Range: Difference between loud and quiet sections
python
loudness = analyzer.get_loudness()
# {
#     'rms_db': -14.2,
#     'peak_db': -0.3,
#     'lufs': -14.0,
#     'dynamic_range_db': 12.5,
#     'crest_factor': 8.2
# }
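The RMS and peak levels can be illustrated on raw samples. This is a sketch under stated assumptions, not AudioAnalyzer's code: `level_metrics` is a hypothetical helper, samples are assumed to be floats in [-1.0, 1.0], and the crest factor is expressed here in dB as peak minus RMS (the unit of the `crest_factor` field is not stated in this document). LUFS is deliberately left out, since it requires the frequency weighting and gating defined in ITU-R BS.1770.

```python
import math


def level_metrics(samples):
    """Return RMS level, peak level, and crest factor in dB for float samples.

    Hypothetical helper; levels are relative to full scale (1.0 == 0 dB).
    """
    if not samples:
        raise ValueError("samples must not be empty")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    floor = 1e-12  # avoids log10(0) on digital silence
    rms_db = 20 * math.log10(max(rms, floor))
    peak_db = 20 * math.log10(max(peak, floor))
    return {
        "rms_db": rms_db,
        "peak_db": peak_db,
        "crest_factor_db": peak_db - rms_db,  # assumption: crest factor in dB
    }


# One second of a full-scale 440 Hz sine wave sampled at 44.1 kHz.
sample_rate = 44100
sine = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
metrics = level_metrics(sine)
print({name: round(value, 2) for name, value in metrics.items()})
```

A full-scale sine has an RMS of 1/√2, so its RMS level sits about 3 dB below its peak; real program material usually shows a much larger gap.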
Frequency Analysis
Spectrum analysis including:
- Dominant frequency: Strongest frequency component
- Frequency bands: Energy in bass, mid, treble
- Spectral centroid: "Brightness" of audio
- Spectral rolloff: Frequency below which 85% of the spectral energy lies
python
freq = analyzer.get_frequency()
# {
#     'dominant_freq': 440.0,
#     'spectral_centroid': 2150.3,
#     'spectral_rolloff': 4200.5,
#     'bands': {
#         'sub_bass': -28.5,   # 20-60 Hz
#         'bass': -18.2,       # 60-250 Hz
#         'low_mid': -12.1,    # 250-500 Hz
#         'mid': -10.8,        # 500-2000 Hz
#         'high_mid': -14.3,   # 2000-4000 Hz
#         'high': -22.1        # 4000-20000 Hz
#     }
# }
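The centroid and rolloff definitions can be made concrete with a small sketch. It assumes the spectrum is already available as parallel lists of bin frequencies (Hz) and per-bin energy; `spectral_centroid` and `spectral_rolloff` here are hypothetical stand-alone helpers, and AudioAnalyzer may weight by magnitude rather than energy.

```python
def spectral_centroid(freqs, energy):
    """Energy-weighted mean frequency: the 'center of mass' of the spectrum."""
    total = sum(energy)
    if total == 0:
        return 0.0
    return sum(f * e for f, e in zip(freqs, energy)) / total


def spectral_rolloff(freqs, energy, fraction=0.85):
    """Lowest bin frequency at which the cumulative energy reaches `fraction` of the total."""
    total = sum(energy)
    if total == 0:
        return 0.0
    threshold = fraction * total
    cumulative = 0.0
    for f, e in zip(freqs, energy):
        cumulative += e
        if cumulative >= threshold:
            return f
    return freqs[-1]


# Toy four-bin spectrum with most of its energy in the low bins.
bin_freqs = [100.0, 200.0, 300.0, 400.0]
bin_energy = [4.0, 3.0, 2.0, 1.0]

centroid = spectral_centroid(bin_freqs, bin_energy)
rolloff = spectral_rolloff(bin_freqs, bin_energy)
print(f"centroid={centroid} Hz, rolloff={rolloff} Hz")
```

In this toy spectrum the cumulative energy first reaches 85% of the total at the third bin, so the rolloff lands at 300 Hz while the centroid sits at 200 Hz.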
CLI Usage
bash
# Full analysis with all visualizations
python audio_analyzer.py --input song.mp3 --output-dir ./analysis/
# Just tempo and key
python audio_analyzer.py --input song.mp3 --analyze tempo key --output report.json
# Generate specific visualization
python audio_analyzer.py --input song.mp3 --plot spectrogram --output spec.png
# Dashboard view
python audio_analyzer.py --input song.mp3 --dashboard --output dashboard.png
# Batch analyze directory
python audio_analyzer.py --input-dir ./songs/ --output-dir ./reports/
CLI Arguments
| Argument | Description | Default |
|---|---|---|
| `--input` | Input audio file | Required |
| `--input-dir` | Directory of audio files | - |
| `--output` | Output file path | - |
| `--output-dir` | Output directory | |
| `--analyze` | Analysis types: tempo, key, loudness, frequency, all | |
| `--plot` | Plot type: waveform, spectrogram, chromagram, beats, dashboard | - |
| | Output format: json, txt | |
| | Sample rate for analysis | |
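The flags used in the CLI Usage commands could be wired up with `argparse` roughly as follows. This is only a sketch: the actual parser in `audio_analyzer.py` is not shown in this document, `build_parser` is a hypothetical name, treating `--input` and `--input-dir` as mutually exclusive is an assumption, and the output-format and sample-rate options are omitted because their flag names do not appear here.

```python
import argparse


def build_parser():
    """Build an illustrative parser covering the flags seen in the usage examples."""
    parser = argparse.ArgumentParser(description="Analyze audio files (illustrative sketch)")

    # Assumption: exactly one of a single file or a directory must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="Input audio file")
    source.add_argument("--input-dir", help="Directory of audio files")

    parser.add_argument("--output", help="Output file path")
    parser.add_argument("--output-dir", help="Output directory")
    parser.add_argument(
        "--analyze",
        nargs="+",
        choices=["tempo", "key", "loudness", "frequency", "all"],
        help="One or more analysis types",
    )
    parser.add_argument(
        "--plot",
        choices=["waveform", "spectrogram", "chromagram", "beats", "dashboard"],
        help="Plot type to generate",
    )
    parser.add_argument("--dashboard", action="store_true", help="Render the combined dashboard")
    return parser


# Parse the "just tempo and key" command from the usage examples.
args = build_parser().parse_args(
    ["--input", "song.mp3", "--analyze", "tempo", "key", "--output", "report.json"]
)
print(args.input, args.analyze, args.output)
```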
Examples
Song Analysis
python
analyzer = AudioAnalyzer("track.mp3")
analyzer.analyze()
print(f"Tempo: {analyzer.get_tempo()['bpm']:.1f} BPM")
print(f"Key: {analyzer.get_key()['key']} {analyzer.get_key()['mode']}")
print(f"Loudness: {analyzer.get_loudness()['lufs']:.1f} LUFS")
analyzer.plot_dashboard("track_analysis.png")
Podcast Quality Check
python
analyzer = AudioAnalyzer("podcast.mp3")
analyzer.analyze_loudness()
loudness = analyzer.get_loudness()
if loudness['lufs'] > -16:
    print("Warning: Audio may be too loud for podcast standards")
elif loudness['lufs'] < -20:
    print("Warning: Audio may be too quiet")
else:
    print("Loudness is within podcast standards (-16 to -20 LUFS)")
Batch Analysis
python
import os
from scripts.audio_analyzer import AudioAnalyzer
results = []
for filename in os.listdir("./songs"):
    if filename.endswith(('.mp3', '.wav', '.flac')):
        analyzer = AudioAnalyzer(f"./songs/{filename}")
        analyzer.analyze()
        results.append({
            'file': filename,
            'bpm': analyzer.get_tempo()['bpm'],
            'key': f"{analyzer.get_key()['key']} {analyzer.get_key()['mode']}",
            'lufs': analyzer.get_loudness()['lufs']
        })
# Sort by BPM for DJ set
results.sort(key=lambda x: x['bpm'])
Supported Formats
Input formats (via librosa/soundfile):
- MP3
- WAV
- FLAC
- OGG
- M4A/AAC
- AIFF
Output formats:
- JSON (analysis report)
- PNG (visualizations)
- SVG (visualizations)
- TXT (summary)
Dependencies
librosa>=0.10.0
soundfile>=0.12.0
matplotlib>=3.7.0
numpy>=1.24.0
scipy>=1.10.0
Limitations
- Key detection works best with melodic content (less accurate for drums/percussion)
- BPM detection may struggle with free-tempo or complex time signatures
- Very short clips (<5 seconds) may have reduced accuracy
- LUFS calculation is simplified (not full ITU-R BS.1770-4)