High-performance drop-in replacement for librosa — audio analysis and music information retrieval in Rust
sonara
High-performance audio analysis library for Python, written in Rust.
A drop-in replacement for librosa with roughly 3-4x faster feature extraction (see Performance), parallel batch analysis, and built-in perceptual features for playlist generation.
sonara — from Latin sonare, "to sound, to resonate"
Installation
pip install sonara
Requires Python 3.9+. Pre-built wheels are available for Linux, macOS (Intel and Apple Silicon), and Windows.
Build from source:
git clone https://github.com/kkollsga/sonara.git
cd sonara
pip install maturin
maturin develop --release
Quick Start
import sonara
import numpy as np
# Load audio
y, sr = sonara.load("track.mp3", sr=22050)
# STFT
D = sonara.stft(y)
S_db = sonara.amplitude_to_db(np.abs(D))
# Mel spectrogram + MFCC
mel = sonara.melspectrogram(y=y, sr=22050.0)
mfcc = sonara.mfcc(y=y, sr=22050.0, n_mfcc=13)
# Beat tracking
tempo, beats = sonara.beat_track(y=y, sr=22050)
# Chroma
chroma = sonara.chroma_stft(y=y, sr=22050.0)
# Pitch estimation
f0, voiced, prob = sonara.pyin(y, fmin=65.0, fmax=2093.0, sr=22050)
Analysis Pipeline
sonara includes a fused analysis pipeline that extracts all features in a single optimized pass. Three modes control the depth of analysis:
Modes
| Mode | Features | Time (10s track) | Use case |
|---|---|---|---|
| compact | 11 core features | ~1.2 ms | Fast scanning, metadata |
| playlist | 22+ features incl. perceptual | ~3.3 ms | Playlist generation, music discovery |
| playlist + accurate | Same features, higher precision | ~12 ms | When accuracy matters more than speed |
Compact mode (default)
Core signal features, always computed:
r = sonara.analyze_file("track.mp3", mode="compact")
r['bpm'] # Tempo (BPM)
r['beats'] # Beat frame positions
r['onset_frames'] # Onset positions
r['onset_density'] # Onsets per second
r['rms_mean'] # Average loudness (RMS)
r['rms_max'] # Peak loudness (RMS)
r['loudness_lufs'] # Integrated loudness (LUFS, ITU-R BS.1770-4)
r['dynamic_range_db'] # Loudness range (p95 - p5, dB)
r['spectral_centroid_mean'] # Brightness (Hz)
r['zero_crossing_rate'] # Percussiveness proxy
r['duration_sec'] # Track length
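To make two of these summary values concrete, here is a minimal pure-Python sketch of how a "p95 - p5" dynamic range and an onsets-per-second density could be computed. It assumes the range is taken over per-frame RMS levels converted to dB; the README does not say which loudness series sonara actually uses, and the function names below are illustrative, not part of the sonara API.

```python
import math

def percentile(values, q):
    """Linearly interpolated percentile (q in 0..100) of a non-empty sequence."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return data[lo] * (1.0 - frac) + data[hi] * frac

def dynamic_range_db(frame_rms, floor=1e-10):
    """Spread between the 95th and 5th percentile of frame levels, in dB.

    Assumption: levels are 20*log10(RMS) per frame; sonara may use a
    different loudness measure internally.
    """
    levels = [20.0 * math.log10(max(r, floor)) for r in frame_rms]
    return percentile(levels, 95) - percentile(levels, 5)

def onset_density(onset_frames, duration_sec):
    """Number of detected onsets divided by track length in seconds."""
    return len(onset_frames) / duration_sec
```

A track with constant loudness yields a range near 0 dB, while a track that alternates between quiet and loud passages yields a large value.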
Playlist mode
Everything for playlist generation: spectral features, MFCCs (timbre fingerprint), chroma (harmony), plus perceptual features:
r = sonara.analyze_file("track.mp3", mode="playlist")
# Perceptual features (0.0 - 1.0)
r['energy'] # Perceived intensity (loudness + brightness + activity)
r['danceability'] # Beat regularity + tempo sweet spot + rhythm
r['valence'] # Mood (0 = sad/dark, 1 = happy/bright)
r['acousticness'] # Acoustic vs electronic character
# Musical key
r['key'] # e.g. "C major", "A minor"
r['key_confidence'] # How confident the key detection is (0.0 - 1.0)
# Spectral features
r['spectral_bandwidth_mean'] # Frequency spread
r['spectral_rolloff_mean'] # Frequency below which 85% of energy sits
r['spectral_flatness_mean'] # Tonal (0) vs noise-like (1)
r['spectral_contrast_mean'] # Peak-valley ratio per band (7 values)
r['mfcc_mean'] # Timbre fingerprint (13 coefficients)
r['chroma_mean'] # Pitch class distribution (12 values)
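The rolloff and flatness descriptions above correspond to simple per-frame formulas. The sketch below shows them for a single frame in pure Python, assuming the input is a list of non-negative energies per frequency bin. It illustrates the definitions only; the helper names are hypothetical and sonara's internal implementation is not shown in this README.

```python
import math

def spectral_rolloff(freqs, energy, roll_percent=0.85):
    """Lowest bin frequency at which cumulative energy reaches roll_percent of the total."""
    threshold = roll_percent * sum(energy)
    cumulative = 0.0
    for f, e in zip(freqs, energy):
        cumulative += e
        if cumulative >= threshold:
            return f
    return freqs[-1]

def spectral_flatness(energy, floor=1e-10):
    """Geometric mean divided by arithmetic mean: near 0 for tonal, near 1 for noise-like."""
    vals = [max(e, floor) for e in energy]  # floor avoids log(0)
    geometric = math.exp(sum(math.log(v) for v in vals) / len(vals))
    arithmetic = sum(vals) / len(vals)
    return geometric / arithmetic
```

The `_mean` values returned by the pipeline would then be averages of such per-frame numbers over the whole track.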
Accurate flag
Trades speed for precision on select features:
r = sonara.analyze_file("track.mp3", mode="playlist", accurate=True)
| Feature | Default | Accurate |
|---|---|---|
| Chroma | Mel-band approximation (fast, +/-1 semitone) | Proper chroma filterbank (exact) |
| Spectral contrast | Mel sub-bands | Log-spaced frequency bands on magnitude spectrum |
| Danceability | Beat heuristic | Detrended Fluctuation Analysis (Streich & Herrera 2005) |
Custom feature selection
Cherry-pick specific features regardless of mode:
r = sonara.analyze_file("track.mp3", features=["bpm", "energy", "key", "loudness_lufs"])
Valid feature names: bpm, beats, onsets, rms, dynamic_range, centroid, zcr, onset_density, bandwidth, rolloff, flatness, contrast, mfcc, chroma, energy, danceability, key, valence, acousticness
Batch analysis
Analyze entire music libraries in parallel using all CPU cores:
import sonara
from pathlib import Path
files = [str(p) for p in Path("~/Music").expanduser().rglob("*.mp3")]  # pathlib does not expand "~" on its own
results = sonara.analyze_batch(files, mode="playlist")
for r in results:
    print(f"{r['bpm']:5.0f} BPM | {r['energy']:.2f} energy | "
          f"{r['key']:>10} | {r['loudness_lufs']:6.1f} LUFS | "
          f"{r['valence']:.2f} valence")
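Once a batch has been analyzed, the perceptual features can drive a simple playlist ordering. The following is a rough sketch, not part of sonara: it assumes each result is a dict carrying the four 0.0 to 1.0 perceptual keys listed under Playlist mode, and it greedily picks the nearest unvisited track by Euclidean distance. The helper names are hypothetical.

```python
import math

# Perceptual keys documented for playlist mode, each in the range 0.0 - 1.0.
KEYS = ("energy", "danceability", "valence", "acousticness")

def feature_vector(result):
    """Extract the perceptual features of one analysis result as a list of floats."""
    return [float(result[k]) for k in KEYS]

def distance(a, b):
    """Euclidean distance between two results in perceptual-feature space."""
    return math.dist(feature_vector(a), feature_vector(b))

def order_by_similarity(results, start=0):
    """Greedy nearest-neighbour ordering: each track is followed by its closest unused track."""
    remaining = list(results)
    playlist = [remaining.pop(start)]
    while remaining:
        last = playlist[-1]
        nearest = min(range(len(remaining)), key=lambda i: distance(last, remaining[i]))
        playlist.append(remaining.pop(nearest))
    return playlist
```

Greedy ordering is quadratic in the number of tracks and is not globally optimal, but it tends to give smooth transitions for libraries of a few thousand tracks.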
Display
import sonara
import sonara.display as display
import matplotlib.pyplot as plt
y, sr = sonara.load("track.mp3", sr=22050)
mel = sonara.melspectrogram(y=y, sr=22050.0)
mel_db = sonara.power_to_db(mel)
fig, ax = plt.subplots()
display.specshow(mel_db, x_axis='time', y_axis='mel', sr=22050, ax=ax)
plt.show()
Performance
All arithmetic uses f32 precision (matching native decoder format), with a parallelized fused FFT pipeline and fast-path 2:1 decimation for the common 44100 Hz to 22050 Hz resampling case.
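To illustrate why the 2:1 case admits a fast path, here is a small pure-Python sketch of half-band decimation. It assumes a Hamming-windowed sinc design; sonara's actual filter length and design are not stated in this README, so treat the numbers as illustrative. In a half-band filter every even-offset tap except the centre is zero, so nearly half the multiplications can be skipped, and only every second output sample needs to be computed.

```python
import math

def halfband_taps(half_len=15):
    """Hamming-windowed half-band low-pass (cutoff at a quarter of the sample rate)."""
    taps = []
    for n in range(-half_len, half_len + 1):
        if n == 0:
            ideal = 0.5
        elif n % 2 == 0:
            ideal = 0.0  # half-band property: even offsets are exactly zero
        else:
            ideal = math.sin(math.pi * n / 2) / (math.pi * n)
        window = 0.54 + 0.46 * math.cos(math.pi * n / half_len)
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC

def decimate_by_2(x, taps):
    """Filter and keep every second sample, evaluating the filter only where needed."""
    half = len(taps) // 2
    out = []
    for center in range(0, len(x), 2):
        acc = 0.0
        for k, h in enumerate(taps):
            if h == 0.0:
                continue  # skip the zero taps
            idx = center + k - half
            if 0 <= idx < len(x):  # treat samples outside the signal as zero
                acc += h * x[idx]
        out.append(acc)
    return out
```

A constant input passes through unchanged away from the edges, while a signal alternating at the original Nyquist frequency is strongly attenuated, which is what prevents aliasing after the sample rate is halved.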
vs librosa
| Feature | Speedup |
|---|---|
| Mel spectrogram | ~3x |
| MFCC | ~3x |
| Beat tracking | ~4x |
| Onset detection | ~3x |
| Spectral centroid | ~3x |
| Cold start (first call) | ~20-30x |
| Batch analysis (parallel) | ~5x |
Analysis pipeline benchmarks (10s signal, Apple Silicon)
| Mode | Time | Features |
|---|---|---|
| compact | 1.2 ms | 11 core features |
| playlist | 3.3 ms | 22+ features incl. perceptual |
| playlist + accurate | 12.4 ms | Same, with accurate chroma/DFA |
Key optimizations
- f32 precision — halves memory bandwidth vs f64, matches Symphonia's native decode format (zero-cost conversion)
- Fused single-pass pipeline — one FFT per frame simultaneously produces mel, centroid, RMS, bandwidth, rolloff, flatness
- Parallel FFT frames — rayon parallelism across frames (for signals > 32 frames)
- Sparse mel projection — triangular mel filters are ~97% zeros; only non-zero weights multiplied
- Fast 2:1 decimation — half-band FIR filter for 44100-to-22050 Hz instead of full sinc resampling
- Thread-local FFT cache — plan and scratch buffer reuse with RefCell (no mutex contention)
- Mel filterbank caching — reused across calls in batch processing
- K-weighted LUFS — two-biquad IIR filter, single-pass (~0.05ms per second of audio)
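The sparse mel projection idea can be sketched in a few lines. The example below is a simplified illustration in pure Python, not sonara's Rust code: each triangular filter is stored as the index of its first non-zero bin plus its non-zero weights, and the filter edges are given directly as bin indices rather than derived from a mel scale. The function names are hypothetical.

```python
def triangle(start, center, end):
    """Build one triangular filter as (first_bin, weights), keeping only non-zero weights.

    The filter rises from 0 at `start` to 1 at `center` and falls back to 0 at `end`.
    """
    weights = []
    for b in range(start + 1, end):
        if b <= center:
            weights.append((b - start) / (center - start))
        else:
            weights.append((end - b) / (end - center))
    return start + 1, weights

def sparse_project(power, filters):
    """Apply each filter to a power spectrum, touching only the bins it covers."""
    out = []
    for first, weights in filters:
        out.append(sum(w * power[first + i] for i, w in enumerate(weights)))
    return out
```

With a dense filterbank matrix, each band would cost one multiplication per FFT bin. Here the cost per band is proportional to the width of its triangle, which is where the saving comes from when most weights are zero.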
API Compatibility
sonara implements 92 of librosa's top-level functions with matching signatures:
Core Audio: load, stft, istft, resample, to_mono, tone, chirp, clicks
Features: melspectrogram, mfcc, chroma_stft, tonnetz, spectral_centroid, spectral_bandwidth, spectral_rolloff, spectral_flatness, spectral_contrast, rms, zero_crossing_rate
Rhythm: beat_track, onset_detect, onset_strength, tempo
Pitch: yin, pyin, piptrack, estimate_tuning
Transforms: cqt, vqt, icqt, hybrid_cqt, pseudo_cqt, griffinlim
Conversions: hz_to_mel, mel_to_hz, hz_to_midi, midi_to_hz, note_to_hz, note_to_midi, fft_frequencies, mel_frequencies, cqt_frequencies, and 30+ more
Effects: time_stretch, pitch_shift, trim, split, preemphasis, deemphasis
Notation: key_to_notes, key_to_degrees, mela_to_svara, thaat_to_degrees, hz_to_svara_h, hz_to_svara_c
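For reference, the standard formulas behind a few of the conversion helpers can be written in pure Python as below. This is a sketch of the well-known definitions, not sonara's source. The mel functions use the HTK formula, which librosa exposes through `htk=True`; librosa's default is the Slaney-style scale, and this README does not state which one sonara defaults to. The `_htk` suffix is added here for clarity and is not an API name.

```python
import math

def hz_to_midi(freq_hz):
    """MIDI note number for a frequency, with A4 = 440 Hz = note 69."""
    return 12.0 * math.log2(freq_hz / 440.0) + 69.0

def midi_to_hz(note):
    """Frequency in Hz for a (possibly fractional) MIDI note number."""
    return 440.0 * 2.0 ** ((note - 69.0) / 12.0)

def hz_to_mel_htk(freq_hz):
    """HTK mel scale: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz_htk(mel):
    """Inverse of the HTK mel formula."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
```

These are useful as a quick sanity check when comparing outputs between sonara and librosa.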
Architecture
sonara is a two-crate Rust workspace:
- sonara: Pure Rust core library (~10,000 LOC)
- sonara-python: PyO3 bindings (~1,000 LOC)
sonara/src/
  analyze.rs - Fused analysis pipeline (compact/playlist/full modes)
  perceptual.rs - LUFS, energy, danceability, key detection, valence, acousticness
  beat.rs - Beat tracking (Ellis 2007 DP algorithm)
  onset.rs - Onset detection (spectral flux + peak picking)
  core/
    audio.rs - Audio I/O, resampling, fast 2:1 decimation
    spectrum.rs - STFT, power spectrogram, dB conversions
    fft.rs - FFT with thread-local plan caching
    pitch.rs - YIN / pYIN pitch estimation
    convert.rs - Hz/mel/MIDI/note conversions, frequency weighting
  feature/
    spectral.rs - Mel, MFCC, chroma, centroid, bandwidth, rolloff, flatness, contrast
  dsp/
    windows.rs - Window functions (Hann, Hamming, Blackman, Kaiser)
    iir.rs - IIR filters (lfilter, sosfiltfilt)
    filters.rs - Mel/chroma filterbanks
License