Audio Analyzer

A toolkit for analyzing audio files. It extracts tempo, musical key, frequency content, and loudness metrics, and generates visualizations such as waveforms and spectrograms.

Quick Start

python

from scripts.audio_analyzer import AudioAnalyzer

# Analyze an audio file
analyzer = AudioAnalyzer("song.mp3")
analyzer.analyze()

# Get all analysis results
results = analyzer.get_results()
print(f"BPM: {results['tempo']['bpm']}")
print(f"Key: {results['key']['key']} {results['key']['mode']}")

# Generate visualizations
analyzer.plot_waveform("waveform.png")
analyzer.plot_spectrogram("spectrogram.png")

# Full report
analyzer.save_report("analysis_report.json")

Features

Tempo/BPM Detection: Accurate beat tracking with confidence score
Key Detection: Musical key and mode (major/minor) identification
Frequency Analysis: Spectrum, dominant frequencies, frequency bands
Loudness Metrics: RMS, peak, LUFS, dynamic range
Waveform Visualization: Multi-channel waveform plots
Spectrogram: Time-frequency visualization with customization
Chromagram: Pitch class visualization for harmonic analysis
Beat Grid: Visual beat markers overlaid on waveform
Export Formats: JSON report, PNG/SVG visualizations

API Reference

Initialization

python

# From file
analyzer = AudioAnalyzer("audio.mp3")

# With custom sample rate
analyzer = AudioAnalyzer("audio.wav", sr=44100)

Analysis Methods

python

# Run full analysis
analyzer.analyze()

# Individual analyses
analyzer.analyze_tempo()      # BPM and beat positions
analyzer.analyze_key()        # Musical key detection
analyzer.analyze_loudness()   # RMS, peak, LUFS
analyzer.analyze_frequency()  # Spectrum analysis
analyzer.analyze_dynamics()   # Dynamic range

Results Access

python

# Get all results as dict
results = analyzer.get_results()

# Individual results
tempo = analyzer.get_tempo()        # {'bpm': 120, 'confidence': 0.85, 'beats': [...]}
key = analyzer.get_key()            # {'key': 'C', 'mode': 'major', 'confidence': 0.72}
loudness = analyzer.get_loudness()  # {'rms_db': -14.2, 'peak_db': -0.5, 'lufs': -14.0}
freq = analyzer.get_frequency()     # {'dominant_freq': 440, 'spectrum': [...]}

Visualization Methods

python

# Waveform
analyzer.plot_waveform(
    output="waveform.png",
    figsize=(12, 4),
    color="#1f77b4",
    show_rms=True
)

# Spectrogram
analyzer.plot_spectrogram(
    output="spectrogram.png",
    figsize=(12, 6),
    cmap="magma",       # viridis, plasma, inferno, magma
    freq_scale="log",   # linear, log, mel
    max_freq=8000       # Hz
)

# Chromagram (pitch classes)
analyzer.plot_chromagram(
    output="chromagram.png",
    figsize=(12, 4)
)

# Onset strength / beat grid
analyzer.plot_beats(
    output="beats.png",
    figsize=(12, 4),
    show_strength=True
)

# Combined dashboard
analyzer.plot_dashboard(
    output="dashboard.png",
    figsize=(14, 10)
)

Export

python

# JSON report with all analysis
analyzer.save_report("report.json")

# Summary text
summary = analyzer.get_summary()
print(summary)

Analysis Details

Tempo Detection

Uses a beat tracking algorithm to detect:

BPM: Beats per minute (tempo)
Beat positions: Timestamps of detected beats
Confidence: Reliability score (0-1)

python

tempo = analyzer.get_tempo()
# {
#     'bpm': 128.0,
#     'confidence': 0.89,
#     'beats': [0.0, 0.469, 0.938, 1.406, ...],  # seconds
#     'beat_count': 256
# }
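The BPM value and the beat timestamps are two views of the same information, so one can be cross-checked against the other. The following is a minimal sketch, not the analyzer's own implementation: the helper name `bpm_from_beats` is hypothetical, and using the median interval is an assumption chosen for robustness against an occasional missed beat.

```python
from statistics import median


def bpm_from_beats(beat_times):
    """Estimate tempo in BPM from beat timestamps given in seconds.

    Hypothetical helper for illustration; not part of AudioAnalyzer.
    """
    if len(beat_times) < 2:
        raise ValueError("need at least two beats to estimate tempo")
    # Inter-beat intervals: time elapsed between consecutive beats.
    intervals = [later - earlier for earlier, later in zip(beat_times, beat_times[1:])]
    # The median is less sensitive than the mean to a missed or extra beat.
    return 60.0 / median(intervals)


# Beat positions taken from the example result above.
beats = [0.0, 0.469, 0.938, 1.406]
print(f"{bpm_from_beats(beats):.1f} BPM")
```

With the four beats from the example result, the estimate lands close to the reported 128 BPM.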

Key Detection

Analyzes harmonic content to identify:

Key: Root note (C, C#, D, etc.)
Mode: Major or minor
Confidence: Detection confidence
Key profile: Correlation with each key

python

key = analyzer.get_key()
# {
#     'key': 'A',
#     'mode': 'minor',
#     'confidence': 0.76,
#     'profile': {'C': 0.12, 'C#': 0.08, ...}
# }
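A "correlation with each key" is commonly computed by comparing a 12-bin chroma vector against rotated key profiles. The sketch below shows that general technique using the Krumhansl-Kessler profiles; whether `AudioAnalyzer` uses these exact profiles is an assumption, and `pearson` and `estimate_key` are hypothetical names introduced only for illustration.

```python
from math import sqrt

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler key profiles, indexed by semitones above the tonic.
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y) if std_x and std_y else 0.0


def estimate_key(chroma):
    """Return (key, mode, correlation) for the best-matching of 24 keys."""
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
            # Rotate the profile so its tonic entry sits at pitch class `tonic`.
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            score = pearson(chroma, rotated)
            if best is None or score > best[2]:
                best = (PITCH_CLASSES[tonic], mode, score)
    return best


# Synthetic chroma shaped exactly like the minor profile with tonic A (index 9).
a_minor_chroma = [MINOR_PROFILE[(pc - 9) % 12] for pc in range(12)]
key, mode, score = estimate_key(a_minor_chroma)
print(key, mode)
```

In practice the chroma vector would be an average over time of the chromagram, and the best correlation can serve as a rough confidence value.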

Loudness Metrics

Comprehensive loudness analysis:

RMS dB: Root mean square level
Peak dB: Maximum sample level
LUFS: Integrated loudness (broadcast standard)
Dynamic Range: Difference between loud and quiet sections

python

loudness = analyzer.get_loudness()
# {
#     'rms_db': -14.2,
#     'peak_db': -0.3,
#     'lufs': -14.0,
#     'dynamic_range_db': 12.5,
#     'crest_factor': 8.2
# }
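RMS and peak levels follow directly from the sample values. The sketch below assumes samples normalized to the range -1.0 to 1.0 (so 0 dB is full scale) and uses hypothetical helper names; it does not cover LUFS, which needs the frequency weighting and gating defined in ITU-R BS.1770. Expressing the crest factor as peak minus RMS in dB is one common convention, assumed here, and may differ from how the analyzer reports it.

```python
import math

FLOOR_DB = -120.0  # assumed floor so that digital silence does not produce -inf


def rms_db(samples):
    """RMS level in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square <= 0.0:
        return FLOOR_DB
    # 10*log10(mean square) is the same as 20*log10(RMS amplitude).
    return max(FLOOR_DB, 10.0 * math.log10(mean_square))


def peak_db(samples):
    """Peak sample level in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    if peak <= 0.0:
        return FLOOR_DB
    return max(FLOOR_DB, 20.0 * math.log10(peak))


def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB (assumed convention)."""
    return peak_db(samples) - rms_db(samples)


# Ten full periods of a sine wave at half of full scale.
sine = [0.5 * math.sin(2 * math.pi * k / 100) for k in range(1000)]
print(f"peak {peak_db(sine):.2f} dB, RMS {rms_db(sine):.2f} dB, "
      f"crest {crest_factor_db(sine):.2f} dB")
```

A pure sine wave has a crest factor of about 3 dB; music with strong transients typically shows a noticeably larger value.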

Frequency Analysis

Spectrum analysis including:

Dominant frequency: Strongest frequency component
Frequency bands: Energy in bass, mid, treble
Spectral centroid: "Brightness" of audio
Spectral rolloff: Frequency below which 85% of the spectral energy lies

python

freq = analyzer.get_frequency()
# {
#     'dominant_freq': 440.0,
#     'spectral_centroid': 2150.3,
#     'spectral_rolloff': 4200.5,
#     'bands': {
#         'sub_bass': -28.5,   # 20-60 Hz
#         'bass': -18.2,       # 60-250 Hz
#         'low_mid': -12.1,    # 250-500 Hz
#         'mid': -10.8,        # 500-2000 Hz
#         'high_mid': -14.3,   # 2000-4000 Hz
#         'high': -22.1        # 4000-20000 Hz
#     }
# }
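The centroid, rolloff, and band lookup can be illustrated on a small magnitude spectrum. This is a sketch under stated assumptions, not the analyzer's code: the function names are hypothetical, the band edges are copied from the comments in the example result, and "energy" is taken to mean squared magnitude.

```python
# Band edges in Hz, copied from the example result above.
BANDS = [
    ("sub_bass", 20, 60),
    ("bass", 60, 250),
    ("low_mid", 250, 500),
    ("mid", 500, 2000),
    ("high_mid", 2000, 4000),
    ("high", 4000, 20000),
]


def band_of(freq_hz):
    """Name of the band containing freq_hz, or None if it is out of range."""
    for name, low, high in BANDS:
        if low <= freq_hz < high:
            return name
    return None


def spectral_centroid(freqs, magnitudes):
    """Magnitude-weighted mean frequency (the 'brightness' measure)."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total


def spectral_rolloff(freqs, magnitudes, fraction=0.85):
    """Lowest bin frequency at which cumulative energy reaches `fraction`.

    Assumes a non-empty spectrum with freqs in ascending order.
    """
    energies = [m * m for m in magnitudes]
    threshold = fraction * sum(energies)
    cumulative = 0.0
    for f, e in zip(freqs, energies):
        cumulative += e
        if cumulative >= threshold:
            return f
    return freqs[-1]


# Toy spectrum: four equally strong bins.
freqs = [100.0, 200.0, 300.0, 400.0]
mags = [1.0, 1.0, 1.0, 1.0]
print(spectral_centroid(freqs, mags), spectral_rolloff(freqs, mags), band_of(440.0))
```

For the flat toy spectrum the centroid sits at the midpoint of the bins, and the rolloff falls on the last bin because the first three bins hold only 75% of the energy.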

CLI Usage

bash

# Full analysis with all visualizations
python audio_analyzer.py --input song.mp3 --output-dir ./analysis/

# Just tempo and key
python audio_analyzer.py --input song.mp3 --analyze tempo key --output report.json

# Generate specific visualization
python audio_analyzer.py --input song.mp3 --plot spectrogram --output spec.png

# Dashboard view
python audio_analyzer.py --input song.mp3 --dashboard --output dashboard.png

# Batch analyze directory
python audio_analyzer.py --input-dir ./songs/ --output-dir ./reports/

CLI Arguments

| Argument | Description | Default |
|----------|-------------|---------|
| `--input` | Input audio file | Required unless `--input-dir` is given |
| `--input-dir` | Directory of audio files | - |
| `--output` | Output file path | - |
| `--output-dir` | Output directory | `.` |
| `--analyze` | Analysis types: tempo, key, loudness, frequency, all | `all` |
| `--plot` | Plot type: waveform, spectrogram, chromagram, beats, dashboard | - |
| `--format` | Output format: json, txt | `json` |
| `--sr` | Sample rate for analysis | `22050` |
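The argument table maps naturally onto an `argparse` definition. The sketch below is an illustration of how such a parser could be declared, not the script's actual source: `build_parser` is a hypothetical name, treating `--input` and `--input-dir` as mutually exclusive is an assumption, and only the arguments listed in the table are covered.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="audio_analyzer.py",
        description="Analyze audio files and generate reports or plots.",
    )
    # Assumption: exactly one of a single file or a directory must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="Input audio file")
    source.add_argument("--input-dir", help="Directory of audio files")

    parser.add_argument("--output", help="Output file path")
    parser.add_argument("--output-dir", default=".", help="Output directory")
    parser.add_argument(
        "--analyze",
        nargs="+",  # accepts several types, e.g. --analyze tempo key
        default=["all"],
        choices=["tempo", "key", "loudness", "frequency", "all"],
        help="Analysis types",
    )
    parser.add_argument(
        "--plot",
        choices=["waveform", "spectrogram", "chromagram", "beats", "dashboard"],
        help="Plot type",
    )
    parser.add_argument("--format", default="json", choices=["json", "txt"],
                        help="Output format")
    parser.add_argument("--sr", type=int, default=22050,
                        help="Sample rate for analysis")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--input", "song.mp3", "--analyze", "tempo", "key"])
print(args.analyze, args.sr, args.format)
```

Using `nargs="+"` is what allows `--analyze tempo key` to be written with space-separated values, as in the usage examples.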

Examples

Song Analysis

python

analyzer = AudioAnalyzer("track.mp3")
analyzer.analyze()

print(f"Tempo: {analyzer.get_tempo()['bpm']:.1f} BPM")
print(f"Key: {analyzer.get_key()['key']} {analyzer.get_key()['mode']}")
print(f"Loudness: {analyzer.get_loudness()['lufs']:.1f} LUFS")

analyzer.plot_dashboard("track_analysis.png")

Podcast Quality Check

python

analyzer = AudioAnalyzer("podcast.mp3")
analyzer.analyze_loudness()

loudness = analyzer.get_loudness()
if loudness['lufs'] > -16:
    print("Warning: Audio may be too loud for podcast standards")
elif loudness['lufs'] < -20:
    print("Warning: Audio may be too quiet")
else:
    print("Loudness is within podcast standards (-16 to -20 LUFS)")

Batch Analysis

python

import os
from scripts.audio_analyzer import AudioAnalyzer

results = []
for filename in os.listdir("./songs"):
    if filename.endswith(('.mp3', '.wav', '.flac')):
        analyzer = AudioAnalyzer(f"./songs/{filename}")
        analyzer.analyze()
        results.append({
            'file': filename,
            'bpm': analyzer.get_tempo()['bpm'],
            'key': f"{analyzer.get_key()['key']} {analyzer.get_key()['mode']}",
            'lufs': analyzer.get_loudness()['lufs']
        })

# Sort by BPM for DJ set
results.sort(key=lambda x: x['bpm'])

Supported Formats

Input formats (via librosa/soundfile):

MP3
WAV
FLAC
OGG
M4A/AAC
AIFF

Output formats:

JSON (analysis report)
PNG (visualizations)
SVG (visualizations)
TXT (summary)

Dependencies

librosa>=0.10.0
soundfile>=0.12.0
matplotlib>=3.7.0
numpy>=1.24.0
scipy>=1.10.0

Limitations

Key detection works best with melodic content (less accurate for drums/percussion)
BPM detection may struggle with free-tempo or complex time signatures
Very short clips (<5 seconds) may have reduced accuracy
LUFS calculation is simplified (not full ITU-R BS.1770-4)
