Ultra-fast, CPU-only classical audio feature extraction (BPM, Key, Loudness, Energy, Duration)
@ohnrshyp/dsp
Ultra-fast, CPU-only classical audio feature extraction library.
This module provides fast, lightweight audio signal processing utilities. It runs on standard CPU hardware and requires neither a GPU nor deep learning models, which makes it well suited to high-throughput batch upload pipelines.
🚀 Key Features
- ⏱️ BPM & Tempo Tracking: Identifies the audio tempo (beats per minute) alongside a temporal stability-based confidence score.
- 🎹 Musical Key & Scale Detection: Recognizes the global pitch center and scale mode (Major/Minor) using the Krumhansl-Schmuckler chromagram correlation algorithm.
- 🔊 RMS Energy & Loudness: Computes integrated root-mean-square (RMS) energy and estimated loudness levels in decibels (dB).
- 📊 Dynamic Range: Calculates the dB spread between the 95th and 10th percentiles of frame-wise RMS energy.
- 💃 Danceability Estimator: Computes a composite danceability index based on tempo stability and acoustic energy distribution.
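The energy and dynamic range features listed above can be illustrated with a small, dependency-free sketch. It is not the package's implementation: the function names, the non-overlapping framing, the frame length, and the percentile interpolation rule are all assumptions chosen for clarity.

```javascript
// Sketch only: frame-wise RMS and percentile-based dynamic range.
// frameLength and the non-overlapping framing are illustrative assumptions.
function frameRms(samples, frameLength = 2048) {
  const rms = [];
  for (let start = 0; start + frameLength <= samples.length; start += frameLength) {
    let sumSquares = 0;
    for (let i = start; i < start + frameLength; i++) {
      sumSquares += samples[i] * samples[i];
    }
    rms.push(Math.sqrt(sumSquares / frameLength));
  }
  return rms;
}

// Percentile with linear interpolation between the two nearest ranks.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = (p / 100) * (sorted.length - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (rank - lo);
}

// Amplitude to decibels, clamped to avoid log10(0) on silent frames.
const toDb = (amplitude, floor = 1e-10) => 20 * Math.log10(Math.max(amplitude, floor));

// dB spread between the 95th and 10th percentiles of frame-wise RMS.
function dynamicRangeDb(samples, frameLength = 2048) {
  const rms = frameRms(samples, frameLength);
  return toDb(percentile(rms, 95)) - toDb(percentile(rms, 10));
}
```

A signal whose loud half has ten times the amplitude of its quiet half yields a dynamic range of about 20 dB under this definition.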
🧬 Architectural & Mathematical Design
The core processing pipeline has two parts: a Node.js wrapper that runs the analysis in a child process, and a Python DSP engine built on Librosa and NumPy that does the signal processing.
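The child-process bridge can be pictured roughly as follows. This is a minimal sketch, assuming the Python side prints a single JSON document to stdout; `runJsonCommand` is a hypothetical helper, not part of the package's API, and the real wrapper may be asynchronous.

```javascript
const { spawnSync } = require('node:child_process');

// Sketch only: run an external command and parse its stdout as JSON.
// Throws if the process cannot start, times out, or exits with a non-zero code.
function runJsonCommand(command, args, timeoutMs = 60000) {
  const result = spawnSync(command, args, { encoding: 'utf8', timeout: timeoutMs });
  if (result.error) {
    throw result.error; // e.g. ENOENT when the binary is missing
  }
  if (result.status !== 0) {
    throw new Error(`Process exited with code ${result.status}: ${result.stderr.trim()}`);
  }
  return JSON.parse(result.stdout);
}

// Hypothetical usage (the script name is an assumption):
// const analysis = runJsonCommand('python3', ['analyze.py', '/abs/path/track.mp3']);
```

Keeping the exchange to a single JSON payload on stdout means the two runtimes share only a process boundary and a result schema.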
1. Tempo (BPM) & Confidence
The BPM estimation pipeline first computes the onset strength envelope of the input signal:
$$\text{Onset Strength}(t) = \sum_{f} \max(0, S(f, t) - S(f, t-1))$$
where $S(f, t)$ is the log-mel spectrogram at frequency band $f$ and frame $t$. It then runs a Fourier-based tempogram or auto-correlation over that envelope:
$$\text{Tempogram}(\tau, t) = \sum_{n} \text{Onset Strength}(n) \cdot W(n - t) \cdot e^{-j 2\pi \tau n}$$
where $W$ is a window function centered on frame $t$ and $\tau$ is the candidate tempo frequency. The dominant peak yields the primary BPM. The confidence score measures the dominance of the selected tempo frequency relative to the average spectral energy across the tempogram.
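To make the onset strength formula concrete, here is a small sketch that computes the envelope from a precomputed spectrogram and then picks a tempo. It uses the auto-correlation variant, not the Fourier tempogram, and the function names, the tempo search range, and the `S[t][f]` input layout are assumptions for illustration only.

```javascript
// Sketch only. S[t][f] is a (log-)magnitude spectrogram: one array of band values per frame.
// Onset strength: sum over bands of the positive frame-to-frame increase.
function onsetStrength(S) {
  const envelope = [0]; // frame 0 has no predecessor
  for (let t = 1; t < S.length; t++) {
    let flux = 0;
    for (let f = 0; f < S[t].length; f++) {
      flux += Math.max(0, S[t][f] - S[t - 1][f]);
    }
    envelope.push(flux);
  }
  return envelope;
}

// Pick the lag (in frames) with the strongest auto-correlation inside the
// allowed tempo range, then convert that lag to beats per minute.
function estimateBpm(envelope, framesPerSecond, minBpm = 60, maxBpm = 200) {
  const minLag = Math.max(1, Math.round((60 * framesPerSecond) / maxBpm));
  const maxLag = Math.min(envelope.length - 1, Math.round((60 * framesPerSecond) / minBpm));
  let bestLag = minLag;
  let bestScore = -Infinity;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let score = 0;
    for (let n = lag; n < envelope.length; n++) {
      score += envelope[n] * envelope[n - lag];
    }
    if (score > bestScore) {
      bestScore = score;
      bestLag = lag;
    }
  }
  return (60 * framesPerSecond) / bestLag;
}
```

For example, an envelope with one onset every 20 frames at 40 frames per second has a period of 0.5 s, which corresponds to 120 BPM.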
2. Krumhansl-Schmuckler Key Detection
A Constant-Q Transform (CQT) translates the audio signal into a 12-bin chromagram (pitch class profile). The average energy across time forms a 12-dimensional chroma vector $C$. This vector is normalized and rotated $12$ times (for each semitone shift) to correlate against major and minor key templates defined by Krumhansl & Schmuckler:
- Major Profile: [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
- Minor Profile: [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
The Pearson correlation coefficient $r$ is computed for each pitch class rotation $i$:
$$r_i = \frac{\sum (C_i - \bar{C_i})(T - \bar{T})}{\sqrt{\sum (C_i - \bar{C_i})^2 \sum (T - \bar{T})^2}}$$
where $C_i$ is the chroma vector rotated by $i$ semitones and $T$ is the major or minor template. The rotation and template that maximize $r_i$ define the musical key and scale mode. The confidence represents the correlation normalized to a $[0, 1]$ range.
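The template matching step can be sketched in a few lines. The profiles are the ones quoted in this section; everything else is an assumption made for illustration: the function names, the convention that chroma bin 0 is C, and the mapping of $r$ to a confidence via $(r + 1) / 2$ (the document does not say how the normalization is done).

```javascript
const MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88];
const MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17];
const PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// Pearson correlation coefficient between two equal-length vectors.
function pearson(x, y) {
  const n = x.length;
  const meanX = x.reduce((a, b) => a + b, 0) / n;
  const meanY = y.reduce((a, b) => a + b, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let k = 0; k < n; k++) {
    const dx = x[k] - meanX;
    const dy = y[k] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

// Sketch only: try all 12 tonics against both templates and keep the best match.
// Assumes chroma[0] is pitch class C and that the chroma vector is not constant.
function detectKey(chroma) {
  let best = { key: null, mode: null, r: -Infinity };
  for (let tonic = 0; tonic < 12; tonic++) {
    // Rotate so that index 0 lines up with the candidate tonic.
    const rotated = chroma.map((_, k) => chroma[(k + tonic) % 12]);
    for (const [mode, template] of [['major', MAJOR_PROFILE], ['minor', MINOR_PROFILE]]) {
      const r = pearson(rotated, template);
      if (r > best.r) best = { key: PITCH_CLASSES[tonic], mode, r };
    }
  }
  // Assumed normalization of r from [-1, 1] to [0, 1].
  return { ...best, confidence: (best.r + 1) / 2 };
}
```

Feeding in a chroma vector that is exactly the minor profile shifted up nine semitones returns A minor with a correlation of essentially 1.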
📦 Installation
Node.js (NPM Package)
npm install @ohnrshyp/dsp
Python (PyPI Package)
pip install orbit-dsp
Host Dependencies
This package delegates core DSP tasks to a local Python executable. Ensure Python 3.8+ is installed on the host along with the required libraries:
pip install librosa numpy scipy
🛠️ Node.js API Reference
analyze(input, [options])
Performs comprehensive acoustic analysis of the input audio source.
- Parameters:
  - input (Buffer|string): Raw binary buffer or absolute path to the target audio file.
  - options (Object, optional):
    - maxLength (number): Limit processing to the first $N$ seconds of the file. Default is 120.
    - stemsDir (string|null): Optional path to a directory containing separated stems (e.g. Demucs vocal/bass/other stems) to significantly improve pitch key resolution.
    - verbose (boolean): Enable diagnostic logging. Default is false.
- Returns: Promise<Object> containing the following schema:
  {
    "bpm": { "value": 128.0, "confidence": 0.8415 },
    "key": { "value": "A minor", "key": "A", "mode": "minor", "confidence": 0.7912 },
    "energy": 0.6843,
    "loudness_db": -14.21,
    "dynamic_range_db": 11.45,
    "duration": 240.5,
    "sample_rate": 22050,
    "analyzed_length": 120.0,
    "key_detection_source": "mix_hpss",
    "processingTimeMs": 420
  }
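As an illustration of how the result fields relate to each other, the sketch below condenses a result object into a short summary. `summarizeAnalysis` is a hypothetical helper, not part of the package. The 0.5 confidence threshold is arbitrary, and reading `analyzed_length` as "seconds actually analyzed" is an inference from the `maxLength` option, not something the documentation states outright.

```javascript
// Sketch only: derive a few convenience values from an analyze() result.
function summarizeAnalysis(result, minConfidence = 0.5) {
  const coverage = result.duration > 0
    ? Math.min(1, result.analyzed_length / result.duration)
    : 0;
  return {
    label: `${result.bpm.value.toFixed(1)} BPM, ${result.key.value}`,
    // True when only part of the file was analyzed (assumed meaning of analyzed_length).
    truncated: result.analyzed_length < result.duration,
    coverage,
    lowConfidence:
      result.bpm.confidence < minConfidence || result.key.confidence < minConfidence,
  };
}
```

With the sample schema above, about half of the 240.5-second track is covered by the default 120-second window, so `truncated` is `true`.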
calculateDanceability(analysisResult)
Estimates danceability from the extracted BPM and energy metrics using a normalized sigmoid correlation.
- Parameters:
  - analysisResult (Object): The JSON output returned from analyze().
- Returns: number between 0 and 1.
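The package does not document its exact danceability formula. The sketch below is therefore a purely hypothetical stand-in that shows how a sigmoid can squash a tempo and energy score into the open interval between 0 and 1. The function name, the Gaussian tempo preference around 120 BPM, and every constant are assumptions; results will not match `calculateDanceability()`.

```javascript
// Logistic function: maps any real number into (0, 1).
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// Sketch only, NOT the library's formula. All constants are illustrative assumptions.
function sketchDanceability(analysis, { idealBpm = 120, bpmWidth = 40, steepness = 4 } = {}) {
  // Tempo fit in (0, 1]: highest at idealBpm, falling off on both sides.
  const tempoFit = Math.exp(-(((analysis.bpm.value - idealBpm) / bpmWidth) ** 2));
  // Weight the tempo fit by how stable the tempo estimate is, then add energy.
  const raw = tempoFit * analysis.bpm.confidence + analysis.energy; // roughly 0..2
  // Center the raw score at 1 and squash it into (0, 1).
  return sigmoid(steepness * (raw - 1));
}
```

In this sketch the score rises monotonically with energy and with BPM confidence, which matches the intuition that steady, energetic tracks are easier to dance to.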
checkPythonEnvironment()
Verifies that the Python binary and dependencies (librosa, numpy) are available and operational.
- Returns: Promise<Object>:
  {
    "available": true,
    "message": "Python environment ready for audio DSP analysis",
    "details": {
      "pythonVersion": "Python 3.10.8",
      "packages": ["librosa", "numpy"]
    }
  }
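A check of this kind can be approximated by asking the interpreter to import the required modules and inspecting the exit code. The sketch below is an assumption about how such a probe might work, not the package's code; `probePython` and its return messages are hypothetical, and it runs synchronously for brevity.

```javascript
const { spawnSync } = require('node:child_process');

// Sketch only: report whether a Python binary can import the given modules.
function probePython(pythonBin = 'python3', modules = ['librosa', 'numpy']) {
  const script = `import ${modules.join(', ')}`;
  const result = spawnSync(pythonBin, ['-c', script], { encoding: 'utf8', timeout: 30000 });
  if (result.error) {
    // The binary is missing, not executable, or the probe timed out.
    return { available: false, message: `Could not run ${pythonBin}: ${result.error.message}` };
  }
  if (result.status !== 0) {
    // Python started but at least one import failed.
    return { available: false, message: `Import check failed: ${result.stderr.trim()}` };
  }
  return { available: true, message: 'Python environment ready' };
}
```

Running the probe once at startup lets a service fail fast with a clear message before any audio is submitted for analysis.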
💻 Code Examples
Analyzing a raw audio buffer and estimating danceability
const dsp = require('@ohnrshyp/dsp');
const fs = require('fs');

async function run() {
  // Check host environment first
  const env = await dsp.checkPythonEnvironment();
  if (!env.available) {
    console.error('Environment check failed:', env.message);
    console.log('Please run:', env.details?.install);
    return;
  }

  const audioBuffer = fs.readFileSync('house-track.mp3');

  try {
    const analysis = await dsp.analyze(audioBuffer, {
      maxLength: 90, // process only first 90s for ultra-fast response
      verbose: true
    });
    const danceScore = dsp.calculateDanceability(analysis);

    console.log(`BPM: ${analysis.bpm.value} (Confidence: ${analysis.bpm.confidence})`);
    console.log(`Key: ${analysis.key.value}`);
    console.log(`Loudness: ${analysis.loudness_db} dB`);
    console.log(`Estimated Danceability: ${(danceScore * 100).toFixed(1)}%`);
  } catch (error) {
    console.error('Analysis failed:', error.message);
  }
}

run();
📄 License
Licensed under the Apache License, Version 2.0 (the "License"). See LICENSE in the project root for details.