analyzeAudio
Measure one or more aspects of one or more audio files.
Note well: FFmpeg & FFprobe binaries must be in PATH
Download options for FFmpeg and FFprobe are listed at ffmpeg.org.
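A quick way to confirm this requirement before running any analysis is to look up both binaries on PATH. The following is a minimal sketch using only the Python standard library; the helper name listMissingBinaries is hypothetical and not part of analyzeAudio.

```python
import shutil


def listMissingBinaries(listBinaryNames=('ffmpeg', 'ffprobe')):
    """Return the binary names that cannot be found on PATH."""
    return [name for name in listBinaryNames if shutil.which(name) is None]


listMissing = listMissingBinaries()
if listMissing:
    print(f"Not found on PATH: {', '.join(listMissing)}")
else:
    print('FFmpeg and FFprobe are both on PATH.')
```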
Some ways to use this package
The top-level package re-exports a small high-level API:
| API | Purpose |
|---|---|
| analyzeAudioFile(pathFilename, listAspectNames) | Analyze one file and return one result per requested registered aspect name. |
| analyzeAudioListPathFilenames(listPathFilenames, listAspectNames, CPUlimit=None) | Analyze many files in parallel and return one row per completed file. |
| getListAvailableAudioAspects() | Return the sorted list of registered aspect names. |
| audioAspects | Registry of aspect name -> analyzer callable + required parameter names. |
| dataTabularTOpathFilenameDelimited(...) | Write batch results to a delimited text file. |
Use analyzeAudioFile to measure one or more registered aspects of a single audio file
from analyzeAudio import analyzeAudioFile
listAspectNames = [
    'LUFS integrated',
    'RMS peak',
    'SRMR mean',
    'Spectral Flatness mean',
]
listMeasurements = analyzeAudioFile(pathFilename, listAspectNames)
dictionaryMeasurements = dict(zip(listAspectNames, listMeasurements, strict=True))
analyzeAudioFile preserves the order of listAspectNames. If a requested aspect name is not registered, the matching return entry is 'not found'.
The registered names are case-sensitive, and some very similar names refer to different measurements. For example, Spectral Flatness mean and Spectral flatness mean are different entries, and so are Zero-crossing rate mean and Zero-crossings rate.
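Because near-identical names are easy to mistype, it may help to check requested names against the registry before analyzing anything. This is a sketch only: reportUnregisteredNames is a hypothetical helper, and listAvailableNames is a stand-in containing just the four names mentioned above. With the package installed, getListAvailableAudioAspects() would supply the real list.

```python
import difflib


def reportUnregisteredNames(listAspectNames, listAvailableNames):
    """Map each unregistered requested name to its closest registered names."""
    setAvailable = set(listAvailableNames)
    return {
        aspectName: difflib.get_close_matches(aspectName, listAvailableNames, n=3, cutoff=0.8)
        for aspectName in listAspectNames
        if aspectName not in setAvailable
    }


# Stand-in for getListAvailableAudioAspects(); not the full registry.
listAvailableNames = [
    'Spectral Flatness mean',
    'Spectral flatness mean',
    'Zero-crossing rate mean',
    'Zero-crossings rate',
]

dictionarySuggestions = reportUnregisteredNames(
    ['spectral flatness mean', 'Zero-crossings rate'],
    listAvailableNames,
)
for aspectName, listSuggestions in dictionarySuggestions.items():
    print(f'{aspectName!r} is not registered; similar names: {listSuggestions}')
```

A suggestion list with more than one entry is a hint to read the names carefully, since each one is a distinct measurement.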
Use analyzeAudioListPathFilenames to measure one or more aspects for many audio files
from analyzeAudio import analyzeAudioListPathFilenames, dataTabularTOpathFilenameDelimited
listAspectNames = ['LUFS integrated', 'Spectral Flatness mean']
rowsListFilenameAspectValues = analyzeAudioListPathFilenames(listPathFilenames, listAspectNames)
dataTabularTOpathFilenameDelimited(
    pathFilenameOutput,
    rowsListFilenameAspectValues,
    ['pathFilename', *listAspectNames],
)
Each returned row starts with the file path converted to POSIX text, followed by the requested values. The rows are returned in worker-completion order rather than the original input order.
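If the original input order matters, the rows can be reordered afterward by matching on the first column. The sketch below assumes that the POSIX text in each row equals pathlib's as_posix() form of the corresponding input path, which this README does not confirm. The helper name sortRowsLikeInput and the sample values are illustrative.

```python
from pathlib import PurePath


def sortRowsLikeInput(rows, listPathFilenames):
    """Reorder rows to follow listPathFilenames; inputs with no row are skipped."""
    dictionaryRowByPath = {row[0]: row for row in rows}
    listKeys = [PurePath(pathFilename).as_posix() for pathFilename in listPathFilenames]
    return [dictionaryRowByPath[key] for key in listKeys if key in dictionaryRowByPath]


# Stand-in rows in a made-up completion order; the values are illustrative only.
listPathFilenames = ['audio/a.wav', 'audio/b.wav', 'audio/c.wav']
rowsListFilenameAspectValues = [
    ['audio/c.wav', -16.0, -40.5],
    ['audio/a.wav', -14.2, -38.1],
]

rowsInputOrder = sortRowsLikeInput(rowsListFilenameAspectValues, listPathFilenames)
for row in rowsInputOrder:
    print(row)
```

Inputs without a matching row are skipped rather than padded, so the result can be shorter than listPathFilenames.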
Use getListAvailableAudioAspects and audioAspects to inspect the registry or call an analyzer directly
from analyzeAudio import audioAspects, getListAvailableAudioAspects
print(getListAvailableAudioAspects())
print(audioAspects['Chromagram mean']['analyzerParameters'])
SI_SDR_channelsMean = audioAspects['SI-SDR mean']['analyzer'](
    pathFilenameAudioFile,
    pathFilenameDifferentAudioFile,
)
Use audioAspects[name]['analyzerParameters'] to see what inputs a registered analyzer expects. This is especially useful when a registered analyzer needs more than the single pathFilename accepted by analyzeAudioFile, such as a comparison between two files or a metric that expects tensors.
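To get an overview of which analyzers share the same inputs, the registry can be grouped by parameter names. The sketch below runs against a stand-in dictionary with the documented shape (aspect name mapped to an entry with 'analyzer' and 'analyzerParameters'). The aspect names and parameter names in it are hypothetical, and it assumes 'analyzerParameters' is an iterable of strings.

```python
def groupAspectsByParameters(registry):
    """Group aspect names by the tuple of parameter names their analyzer expects."""
    dictionaryGroups = {}
    for aspectName, entry in registry.items():
        keyParameters = tuple(entry['analyzerParameters'])
        dictionaryGroups.setdefault(keyParameters, []).append(aspectName)
    return {key: sorted(listNames) for key, listNames in dictionaryGroups.items()}


def analyzerPlaceholder(*arguments):
    """Placeholder analyzer used only by the stand-in registry."""
    return 0.0


# Stand-in registry: every name and parameter below is hypothetical.
registryStandIn = {
    'hypothetical aspect B': {'analyzer': analyzerPlaceholder, 'analyzerParameters': ['pathFilename']},
    'hypothetical aspect A': {'analyzer': analyzerPlaceholder, 'analyzerParameters': ['pathFilename']},
    'hypothetical comparison': {
        'analyzer': analyzerPlaceholder,
        'analyzerParameters': ['pathFilenameAlpha', 'pathFilenameBeta'],
    },
}

dictionaryGroups = groupAspectsByParameters(registryStandIn)
for keyParameters, listNames in dictionaryGroups.items():
    print(keyParameters, listNames)
```

With the package installed, passing audioAspects in place of registryStandIn should produce the same kind of grouping for the real registry, provided the entries have the shape assumed here.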
Use the lower-level analyzer modules when you want data arrays or tensors instead of one float
Most registered names ending in mean are scalar summaries. If you want the full feature array or tensor instead, import the lower-level analyzer function directly:
- analyzeAudio.analyzersUseFilename
  - filename-based scalar analyzers, including comparisons such as SI-SDR mean
- analyzeAudio.analyzersUseWaveform
  - analyzeTempogram -> full tempogram array
  - analyzeRMS -> framewise RMS-in-dB array
  - analyzeTempo -> tempo array
  - analyzeZeroCrossingRate -> framewise zero-crossing-rate array
- analyzeAudio.analyzersUseSpectrogram
  - analyzeChromagram -> chromagram matrix
  - analyzeSpectralContrast -> spectral-contrast array
  - analyzeSpectralBandwidth -> spectral-bandwidth array
  - analyzeSpectralCentroid -> spectral-centroid array
  - analyzeSpectralFlatness -> spectral-flatness-in-dB array
- analyzeAudio.analyzersUseTensor
  - analyzeSRMR -> torch.Tensor of SRMR values
The matching ...Mean functions return one float summary instead. getListAvailableAudioAspects() lists the registered public aspect names, not every lower-level helper function.
import numpy
import soundfile
from analyzeAudio.analyzersUseWaveform import analyzeTempogram
with soundfile.SoundFile(pathFilename) as readSoundFile:
    sampleRate = readSoundFile.samplerate
    waveform = readSoundFile.read(dtype='float32').astype(numpy.float32).T
tempogram = analyzeTempogram(waveform, sampleRate)
Use whatMeasurements to list registered measurements from the command line
whatMeasurements
This prints the same sorted registry names returned by getListAvailableAudioAspects().
Reference materials
A Spectral-Flatness Measure for Studying the Autocorrelation Method of Linear Prediction of Speech Analysis
- Common name: spectral flatness
- BibTeX citation.
- DOI: 10.1109/TASSP.1974.1162572
- Implementation:
- librosa/librosa.feature.spectral_flatness
Perceptual Effects of Spectral Modifications on Musical Timbres
Robust Entropy-Based Endpoint Detection for Speech Recognition in Noisy Environments
- Common name: spectral entropy
- BibTeX citation.
- DOI: 10.21437/ICSLP.1998-527
- Proceedings: ISCA Archive
Realtime Chord Recognition of Musical Sound: A System Using Common Lisp Music
- Common name: chroma features
- BibTeX citation.
- Proceedings: University of Michigan ICMC archive
- Implementation:
- librosa/librosa.feature.chroma_stft
A Robust Audio Classification and Segmentation Method
- BibTeX citation.
- Technical report: Microsoft Research
- Implementations:
Music Type Classification by Spectral Contrast Feature
- Common name: spectral contrast
- BibTeX citation.
- DOI: 10.1109/ICME.2002.1035731
- Free PDF: Tsinghua University
- Implementation:
- librosa/librosa.feature.spectral_contrast
A Speech/Music Discriminator Based on RMS and Zero-Crossings
- Common names: RMS, zero-crossing rate
- BibTeX citation.
- DOI: 10.1109/TMM.2004.840604
- Free author proof: University of Crete
- Implementation:
- librosa/librosa.feature.rms
- librosa/librosa.feature.zero_crossing_rate
Performance Measurement in Blind Audio Source Separation
- Common name: BSS Eval SDR
- BibTeX citation.
- DOI: 10.1109/TSA.2005.858005
- Free author PDF: IRISA
- Implementations:
Automatic Chord Recognition from Audio Using a HMM with Supervised Learning
- BibTeX citation.
- Proceedings: ISMIR 2006
- Free PDF: Stanford CCRMA
- Implementation:
- librosa/librosa.feature.chroma_stft
Cyclic Tempogram: A Mid-Level Tempo Representation for Music Signals
- Common name: tempogram
- BibTeX citation.
- DOI: 10.1109/ICASSP.2010.5495219
- Free author PDF: AudioLabs Erlangen
- Implementations:
- librosa/librosa.feature.tempogram
- Vamp Tempogram Plugin
A Non-Intrusive Quality and Intelligibility Measure of Reverberant and Dereverberated Speech
- Common name: SRMR
- BibTeX citation.
- DOI: 10.1109/TASL.2010.2052247
- Free author PDF: MUSEA Lab
- Implementation:
Signal Processing for Music Analysis
- BibTeX citation.
- DOI: 10.1109/JSTSP.2011.2112333
- Free author PDF: Columbia University
- Implementation:
The Timbre Toolbox: Extracting Audio Descriptors from Musical Signals
- BibTeX citation.
- DOI: 10.1121/1.3642604
- Free PDF: McGill University
- Implementations:
Blind Audio Watermarking Technique Based on Two Dimensional Cellular Automata
- Common name: APSNR reference
- BibTeX citation.
- DOI: 10.14257/ijsia.2016.10.9.18
- Free repository copy: Universidad Autonoma de Madrid
- Implementation:
SDR - Half-Baked or Well Done?
- Common name: SI-SDR
- BibTeX citation. TeX Source with precise formulas for AI agents.
- DOI: 10.1109/ICASSP.2019.8683855
- Free author PDF: Jonathan Le Roux
- Implementations:
Loudness Range: A Measure to Supplement EBU R 128 Loudness Normalisation
- Common name: LUFS
- BibTeX citation.
- Standard: EBU Tech 3342
- Free PDF: EBU
- Implementation:
Algorithms to Measure Audio Programme Loudness and True-Peak Audio Level
- Common name: True peak
- BibTeX citation.
- Standard: ITU-R BS.1770-5
- Free PDF: ITU
- Implementation:
An Overview on Sound Features in Time and Frequency Domain
- BibTeX citation.
- DOI: 10.2478/ijasitels-2023-0006
- Open access article: Reference Global
Packages and documentation
- FFmpeg documentation
- librosa/librosa
- Lightning-AI/torchmetrics
- sigsep/sigsep-mus-eval
- mir-evaluation/mir_eval
Source Distribution
File details
Details for the file analyzeaudio-0.1.1.tar.gz.
File metadata
- Download URL: analyzeaudio-0.1.1.tar.gz
- Upload date:
- Size: 42.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 035ac776a87cf2b43a1f545672760469c2f0f0c7048fba99959beb5931e4a5c5 |
| MD5 | 925dadc862ed6fa45298dbfb7d4fd351 |
| BLAKE2b-256 | 20af9f8bcdb4d1ca3317340b2bb6bd0588e8b9a9a86d6499783f21c56902854e |