Audio Analyzer

A toolkit for analyzing audio files. It extracts tempo, musical key, frequency content, and loudness metrics, and generates waveform, spectrogram, chromagram, and beat-grid visualizations.

Quick Start

python

from scripts.audio_analyzer import AudioAnalyzer

# Analyze an audio file
analyzer = AudioAnalyzer("song.mp3")
analyzer.analyze()

# Get all analysis results
results = analyzer.get_results()
print(f"BPM: {results['tempo']['bpm']}")
print(f"Key: {results['key']['key']} {results['key']['mode']}")

# Generate visualizations
analyzer.plot_waveform("waveform.png")
analyzer.plot_spectrogram("spectrogram.png")

# Full report
analyzer.save_report("analysis_report.json")

Features

Tempo/BPM Detection: Beat tracking with a confidence score
Key Detection: Musical key and mode (major/minor) identification
Frequency Analysis: Spectrum, dominant frequencies, frequency bands
Loudness Metrics: RMS, peak, LUFS, dynamic range
Waveform Visualization: Multi-channel waveform plots
Spectrogram: Time-frequency visualization with configurable colormap, frequency scale, and maximum frequency
Chromagram: Pitch class visualization for harmonic analysis
Beat Grid: Visual beat markers overlaid on waveform
Export Formats: JSON report, PNG/SVG visualizations

API Reference

Initialization

python

# From file
analyzer = AudioAnalyzer("audio.mp3")

# With custom sample rate
analyzer = AudioAnalyzer("audio.wav", sr=44100)

Analysis Methods

python

# Run full analysis
analyzer.analyze()

# Individual analyses
analyzer.analyze_tempo()      # BPM and beat positions
analyzer.analyze_key()        # Musical key detection
analyzer.analyze_loudness()   # RMS, peak, LUFS
analyzer.analyze_frequency()  # Spectrum analysis
analyzer.analyze_dynamics()   # Dynamic range

Results Access

python

# Get all results as dict
results = analyzer.get_results()

# Individual results
tempo = analyzer.get_tempo()        # {'bpm': 120, 'confidence': 0.85, 'beats': [...]}
key = analyzer.get_key()            # {'key': 'C', 'mode': 'major', 'confidence': 0.72}
loudness = analyzer.get_loudness()  # {'rms_db': -14.2, 'peak_db': -0.5, 'lufs': -14.0}
freq = analyzer.get_frequency()     # {'dominant_freq': 440, 'spectrum': [...]}

Visualization Methods

python

# Waveform
analyzer.plot_waveform(
    output="waveform.png",
    figsize=(12, 4),
    color="#1f77b4",
    show_rms=True
)

# Spectrogram
analyzer.plot_spectrogram(
    output="spectrogram.png",
    figsize=(12, 6),
    cmap="magma",           # viridis, plasma, inferno, magma
    freq_scale="log",       # linear, log, mel
    max_freq=8000           # Hz
)

# Chromagram (pitch classes)
analyzer.plot_chromagram(
    output="chromagram.png",
    figsize=(12, 4)
)

# Onset strength / beat grid
analyzer.plot_beats(
    output="beats.png",
    figsize=(12, 4),
    show_strength=True
)

# Combined dashboard
analyzer.plot_dashboard(
    output="dashboard.png",
    figsize=(14, 10)
)

Export

python

# JSON report with all analysis
analyzer.save_report("report.json")

# Summary text
summary = analyzer.get_summary()
print(summary)

Analysis Details

Tempo Detection

Uses a beat-tracking algorithm to detect:

BPM: Beats per minute (tempo)
Beat positions: Timestamps of detected beats
Confidence: Reliability score (0-1)

python

tempo = analyzer.get_tempo()
# {
#     'bpm': 128.0,
#     'confidence': 0.89,
#     'beats': [0.0, 0.469, 0.938, 1.406, ...],  # seconds
#     'beat_count': 256
# }
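
The relationship between the `beats` timestamps and the `bpm` value can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the analyzer's actual algorithm: `bpm_from_beats` is a hypothetical helper, and using the median inter-beat interval is an assumption chosen so that a few misplaced beats do not skew the estimate.

```python
from statistics import median


def bpm_from_beats(beats):
    """Estimate tempo in BPM from beat timestamps given in seconds.

    Hypothetical helper for illustration; not part of AudioAnalyzer.
    """
    if len(beats) < 2:
        raise ValueError("need at least two beats to estimate tempo")
    # Inter-beat intervals between consecutive beats, in seconds.
    intervals = [later - earlier for earlier, later in zip(beats, beats[1:])]
    # The median is less sensitive to an occasional missed or doubled beat.
    return 60.0 / median(intervals)


beats = [0.0, 0.5, 1.0, 1.5, 2.0]  # one beat every half second
print(f"{bpm_from_beats(beats):.1f} BPM")
```

Beats spaced 0.469 seconds apart, as in the sample output above, work out to roughly 128 BPM by the same calculation.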

Key Detection

Analyzes harmonic content to identify:

Key: Root note (C, C#, D, etc.)
Mode: Major or minor
Confidence: Detection confidence
Key profile: Correlation with each key

python

key = analyzer.get_key()
# {
#     'key': 'A',
#     'mode': 'minor',
#     'confidence': 0.76,
#     'profile': {'C': 0.12, 'C#': 0.08, ...}
# }
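
One common way to produce a per-key correlation like `profile` is to compare an averaged 12-bin chroma vector against rotated key templates. The sketch below assumes the Krumhansl-Kessler major and minor profiles and Pearson correlation; whether the analyzer uses this exact method is not stated here, and `estimate_key`, `rotate`, and `pearson` are hypothetical names.

```python
from math import sqrt

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler key profiles, indexed by semitones above the tonic.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y) if std_x and std_y else 0.0


def rotate(profile, tonic):
    """Shift a tonic-relative profile so index 0 corresponds to C."""
    return [profile[(pc - tonic) % 12] for pc in range(12)]


def estimate_key(chroma):
    """Return (note, mode, correlation) for the best-matching key.

    chroma: 12 average pitch-class energies, index 0 = C.
    """
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            score = pearson(chroma, rotate(profile, tonic))
            if best is None or score > best[2]:
                best = (NOTES[tonic], mode, score)
    return best


# A chroma vector shaped exactly like the A-minor template.
note, mode, score = estimate_key(rotate(MINOR, 9))
print(note, mode)
```

Relative keys such as C major and A minor share most of their pitch classes, so their correlations tend to be close; this is one reason a confidence value is worth checking.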

Loudness Metrics

Comprehensive loudness analysis:

RMS dB: Root mean square level
Peak dB: Maximum sample level
LUFS: Integrated loudness in the unit used by broadcast standards (simplified calculation; see Limitations)
Dynamic Range: Difference between loud and quiet sections

python

loudness = analyzer.get_loudness()
# {
#     'rms_db': -14.2,
#     'peak_db': -0.3,
#     'lufs': -14.0,
#     'dynamic_range_db': 12.5,
#     'crest_factor': 8.2
# }
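
The level-based metrics above can be sketched directly from raw samples. This is an illustrative calculation only: `level_metrics` is a hypothetical helper, samples are assumed to be floats in the range -1 to 1, and the crest factor is assumed to be expressed in dB as peak minus RMS, which may not match the convention behind the `crest_factor` field. LUFS is not computed here because it needs frequency weighting and gating.

```python
import math


def level_metrics(samples, floor_db=-120.0):
    """Return RMS level, peak level, and crest factor in dB.

    Hypothetical helper; levels are relative to full scale (1.0).
    Silent input is reported at floor_db instead of negative infinity.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)

    def to_db(value):
        return 20.0 * math.log10(value) if value > 0 else floor_db

    rms_db, peak_db = to_db(rms), to_db(peak)
    return {
        "rms_db": rms_db,
        "peak_db": peak_db,
        "crest_factor_db": peak_db - rms_db,  # assumption: crest factor in dB
    }


# Ten full cycles of a full-scale sine wave across 1000 samples.
sine = [math.sin(2 * math.pi * 10 * n / 1000) for n in range(1000)]
print(level_metrics(sine))
```

A full-scale sine should come out with an RMS level near -3 dB and a crest factor near 3 dB, which makes it a convenient sanity check.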

Frequency Analysis

Spectrum analysis including:

Dominant frequency: Strongest frequency component
Frequency bands: Energy in bass, mid, treble
Spectral centroid: "Brightness" of audio
Spectral rolloff: Frequency below which 85% of the spectral energy lies

python

freq = analyzer.get_frequency()
# {
#     'dominant_freq': 440.0,
#     'spectral_centroid': 2150.3,
#     'spectral_rolloff': 4200.5,
#     'bands': {
#         'sub_bass': -28.5,      # 20-60 Hz
#         'bass': -18.2,          # 60-250 Hz
#         'low_mid': -12.1,       # 250-500 Hz
#         'mid': -10.8,           # 500-2000 Hz
#         'high_mid': -14.3,      # 2000-4000 Hz
#         'high': -22.1           # 4000-20000 Hz
#     }
# }
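
Spectral centroid and rolloff can both be derived from a single magnitude spectrum. The sketch below is a simplified illustration: `spectral_centroid` and `spectral_rolloff` here are hypothetical standalone functions, they assume a non-empty list of bin frequencies in ascending order, and the rolloff is computed on squared magnitudes (energy), which is an assumption about how the 85% threshold is applied.

```python
def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency in Hz (the "brightness")."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, mags)) / total


def spectral_rolloff(freqs, mags, fraction=0.85):
    """Lowest bin frequency at which cumulative energy reaches `fraction`."""
    energy = [m * m for m in mags]
    threshold = fraction * sum(energy)
    running = 0.0
    for f, e in zip(freqs, energy):
        running += e
        if running >= threshold:
            return f
    return freqs[-1]


# Toy spectrum with four bins; most of the energy sits in the lowest bin.
freqs = [100.0, 200.0, 300.0, 400.0]
mags = [3.0, 1.0, 0.0, 0.0]
print(spectral_centroid(freqs, mags), spectral_rolloff(freqs, mags))
```

In practice these values are usually computed per frame and then averaged over the whole file.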

CLI Usage

bash

# Full analysis with all visualizations
python audio_analyzer.py --input song.mp3 --output-dir ./analysis/

# Just tempo and key
python audio_analyzer.py --input song.mp3 --analyze tempo key --output report.json

# Generate specific visualization
python audio_analyzer.py --input song.mp3 --plot spectrogram --output spec.png

# Dashboard view
python audio_analyzer.py --input song.mp3 --dashboard --output dashboard.png

# Batch analyze directory
python audio_analyzer.py --input-dir ./songs/ --output-dir ./reports/

CLI Arguments

Argument	Description	Default
`--input`	Input audio file	Required unless `--input-dir` is given
`--input-dir`	Directory of audio files	-
`--output`	Output file path	-
`--output-dir`	Output directory	`.`
`--analyze`	Analysis types: tempo, key, loudness, frequency, all	`all`
`--plot`	Plot type: waveform, spectrogram, chromagram, beats, dashboard	-
`--dashboard`	Generate the combined dashboard image (see CLI Usage)	-
`--format`	Output format: json, txt	`json`
`--sr`	Sample rate for analysis	`22050`
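
For readers wiring up a similar interface, the table maps naturally onto `argparse`. The parser below is a sketch, not the script's actual source: `build_parser` is a hypothetical function, the mutually exclusive `--input` / `--input-dir` group is an assumption inferred from the usage examples, and `--dashboard` is included because it appears in those examples.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments (illustrative)."""
    parser = argparse.ArgumentParser(description="Analyze audio files")

    # Assumption: exactly one of --input / --input-dir must be supplied.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="Input audio file")
    source.add_argument("--input-dir", help="Directory of audio files")

    parser.add_argument("--output", help="Output file path")
    parser.add_argument("--output-dir", default=".", help="Output directory")
    parser.add_argument(
        "--analyze",
        nargs="+",
        default=["all"],
        choices=["tempo", "key", "loudness", "frequency", "all"],
        help="Analysis types to run",
    )
    parser.add_argument(
        "--plot",
        choices=["waveform", "spectrogram", "chromagram", "beats", "dashboard"],
        help="Plot type to generate",
    )
    parser.add_argument("--dashboard", action="store_true",
                        help="Generate the combined dashboard")
    parser.add_argument("--format", default="json", choices=["json", "txt"],
                        help="Output format")
    parser.add_argument("--sr", type=int, default=22050,
                        help="Sample rate for analysis")
    return parser


# Parse an explicit argument list, equivalent to the "just tempo and key" example.
args = build_parser().parse_args(
    ["--input", "song.mp3", "--analyze", "tempo", "key", "--output", "report.json"]
)
print(args.analyze, args.sr)
```

Using `nargs="+"` is what lets `--analyze tempo key` accept several analysis types in one flag.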

Examples

Song Analysis

python

from scripts.audio_analyzer import AudioAnalyzer

analyzer = AudioAnalyzer("track.mp3")
analyzer.analyze()

# Fetch each result once and reuse it
tempo = analyzer.get_tempo()
key = analyzer.get_key()
loudness = analyzer.get_loudness()

print(f"Tempo: {tempo['bpm']:.1f} BPM")
print(f"Key: {key['key']} {key['mode']}")
print(f"Loudness: {loudness['lufs']:.1f} LUFS")

analyzer.plot_dashboard("track_analysis.png")

Podcast Quality Check

python

from scripts.audio_analyzer import AudioAnalyzer

analyzer = AudioAnalyzer("podcast.mp3")
analyzer.analyze_loudness()  # only the loudness analysis is needed here

loudness = analyzer.get_loudness()
if loudness['lufs'] > -16:
    print("Warning: Audio may be too loud for podcast standards")
elif loudness['lufs'] < -20:
    print("Warning: Audio may be too quiet")
else:
    print("Loudness is within podcast standards (-20 to -16 LUFS)")

Batch Analysis

python

import os
from scripts.audio_analyzer import AudioAnalyzer

AUDIO_EXTENSIONS = ('.mp3', '.wav', '.flac')

results = []
for filename in sorted(os.listdir("./songs")):
    # Match extensions case-insensitively so files like "Track.MP3" are not skipped
    if filename.lower().endswith(AUDIO_EXTENSIONS):
        analyzer = AudioAnalyzer(os.path.join("./songs", filename))
        analyzer.analyze()
        key = analyzer.get_key()
        results.append({
            'file': filename,
            'bpm': analyzer.get_tempo()['bpm'],
            'key': f"{key['key']} {key['mode']}",
            'lufs': analyzer.get_loudness()['lufs']
        })

# Sort by BPM for DJ set
results.sort(key=lambda x: x['bpm'])

Supported Formats

Input formats (via librosa/soundfile):

MP3
WAV
FLAC
OGG
M4A/AAC
AIFF

Output formats:

JSON (analysis report)
PNG (visualizations)
SVG (visualizations)
TXT (summary)

Dependencies

librosa>=0.10.0
soundfile>=0.12.0
matplotlib>=3.7.0
numpy>=1.24.0
scipy>=1.10.0

Limitations

Key detection works best with melodic content (less accurate for drums/percussion)
BPM detection may struggle with free-tempo or complex time signatures
Analysis of very short clips (<5 seconds) may be less accurate
LUFS calculation is simplified (not a full ITU-R BS.1770-4 implementation)
