analyzeAudio

Measure one or more aspects of one or more audio files.

Note well: FFmpeg & FFprobe binaries must be in PATH

Download options for FFmpeg and FFprobe are listed at ffmpeg.org.
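Because both binaries must be discoverable, a quick pre-flight check can catch a missing one before any analysis runs. The following is a minimal sketch using only the Python standard library; the helper name `findMissingBinaries` is hypothetical and is not part of analyzeAudio.

```python
import shutil


def findMissingBinaries(listBinaryNames=('ffmpeg', 'ffprobe')):
    """Return the names that shutil.which cannot locate in PATH."""
    return [binaryName for binaryName in listBinaryNames if shutil.which(binaryName) is None]


listMissing = findMissingBinaries()
if listMissing:
    print(f"Not found in PATH: {', '.join(listMissing)}")
else:
    print('FFmpeg and FFprobe are both available.')
```

An empty list means both binaries were found in the PATH of the current Python process, which may differ from the PATH of an interactive shell.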

pip install analyzeAudio
uv add analyzeAudio

Some ways to use this package

The top-level package re-exports a small high-level API:

| API | Purpose |
| --- | --- |
| analyzeAudioFile(pathFilename, listAspectNames) | Analyze one file and return one result per requested registered aspect name. |
| analyzeAudioListPathFilenames(listPathFilenames, listAspectNames, CPUlimit=None) | Analyze many files in parallel and return one row per completed file. |
| getListAvailableAudioAspects() | Return the sorted list of registered aspect names. |
| audioAspects | Registry of aspect name -> analyzer callable + required parameter names. |
| dataTabularTOpathFilenameDelimited(...) | Write batch results to a delimited text file. |

Use analyzeAudioFile to measure one or more registered aspects of a single audio file

from analyzeAudio import analyzeAudioFile

listAspectNames = [
    'LUFS integrated',
    'RMS peak',
    'SRMR mean',
    'Spectral Flatness mean',
]

listMeasurements = analyzeAudioFile(pathFilename, listAspectNames)
dictionaryMeasurements = dict(zip(listAspectNames, listMeasurements, strict=True))

analyzeAudioFile preserves the order of listAspectNames. If a requested aspect name is not registered, the matching return entry is 'not found'.

The registered names are case-sensitive, and some very similar names refer to different measurements. For example, Spectral Flatness mean and Spectral flatness mean are different entries, and so are Zero-crossing rate mean and Zero-crossings rate.
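One way to catch a mistyped name before running an analysis is to compare the requested names against the registered ones and suggest near matches. This is a minimal sketch built on the standard-library `difflib` module; the helper name `suggestRegisteredNames` is hypothetical, and the short list of available names is a stand-in for the return of getListAvailableAudioAspects().

```python
import difflib


def suggestRegisteredNames(listRequestedNames, listAvailableNames):
    """Map each unregistered requested name to up to three similar registered names."""
    setAvailable = set(listAvailableNames)
    dictionarySuggestions = {}
    for requestedName in listRequestedNames:
        if requestedName not in setAvailable:
            dictionarySuggestions[requestedName] = difflib.get_close_matches(
                requestedName, listAvailableNames, n=3, cutoff=0.6
            )
    return dictionarySuggestions


# Stand-in list; in real use, pass getListAvailableAudioAspects().
listAvailableNames = ['LUFS integrated', 'Spectral Flatness mean', 'Zero-crossing rate mean']
dictionarySuggestions = suggestRegisteredNames(
    ['LUFS Integrated', 'Spectral Flatness mean'],
    listAvailableNames,
)
print(dictionarySuggestions)
```

This only flags names that are not registered at all. It cannot tell which of two registered look-alike names was intended, so the distinction between entries such as Spectral Flatness mean and Spectral flatness mean still has to be checked by hand.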

Use analyzeAudioListPathFilenames to measure one or more aspects for many audio files

from analyzeAudio import analyzeAudioListPathFilenames, dataTabularTOpathFilenameDelimited

listAspectNames = ['LUFS integrated', 'Spectral Flatness mean']
rowsListFilenameAspectValues = analyzeAudioListPathFilenames(listPathFilenames, listAspectNames)

dataTabularTOpathFilenameDelimited(
    pathFilenameOutput,
    rowsListFilenameAspectValues,
    ['pathFilename', *listAspectNames],
)

Each returned row starts with the file path converted to POSIX text, followed by the requested values. The rows are returned in worker-completion order rather than the original input order.
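If the original input order matters, for example when aligning rows with another table, the rows can be re-sorted afterwards. The sketch below assumes the first cell of each row equals `Path(pathFilename).as_posix()` for the corresponding input path; the helper name `restoreInputOrder` and the sample rows are hypothetical.

```python
from pathlib import Path


def restoreInputOrder(listPathFilenames, rowsListFilenameAspectValues):
    """Sort rows to follow listPathFilenames; rows for unlisted paths go last."""
    dictionaryPosition = {}
    for index, pathFilename in enumerate(listPathFilenames):
        # Assumption: the first cell of each row is the POSIX text of the input path.
        dictionaryPosition.setdefault(Path(pathFilename).as_posix(), index)
    return sorted(
        rowsListFilenameAspectValues,
        key=lambda row: dictionaryPosition.get(row[0], len(listPathFilenames)),
    )


# Stand-in rows in an arbitrary completion order; the values are invented.
listPathFilenames = ['audio/first.wav', 'audio/second.wav', 'audio/third.wav']
rowsCompletionOrder = [
    ['audio/third.wav', -20.1],
    ['audio/first.wav', -14.2],
    ['audio/second.wav', -16.8],
]
rowsInputOrder = restoreInputOrder(listPathFilenames, rowsCompletionOrder)
print(rowsInputOrder)
```

Because the batch function returns one row per completed file, the sorted result may contain fewer rows than the input list; the helper does not insert placeholders for absent files.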

Use getListAvailableAudioAspects and audioAspects to inspect the registry or call an analyzer directly

from analyzeAudio import audioAspects, getListAvailableAudioAspects

print(getListAvailableAudioAspects())
print(audioAspects['Chromagram mean']['analyzerParameters'])

SI_SDR_channelsMean = audioAspects['SI-SDR mean']['analyzer'](
    pathFilenameAudioFile,
    pathFilenameDifferentAudioFile,
)

Use audioAspects[name]['analyzerParameters'] to see what inputs a registered analyzer expects. This is especially useful when a registered analyzer needs more than the single pathFilename accepted by analyzeAudioFile, such as a comparison between two files or a metric that expects tensors.
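To get an overview of which inputs the registered analyzers expect, the registry can be grouped by parameter names. This is a sketch under two assumptions: that each registry entry is a mapping with an 'analyzerParameters' key, as in the example above, and that its value is an iterable of parameter-name strings. The helper name `groupAspectsByParameters`, the stand-in registry, and every aspect and parameter name in it are hypothetical.

```python
from collections import defaultdict


def groupAspectsByParameters(registry):
    """Group aspect names by the tuple of parameter names their analyzer expects."""
    dictionaryGroups = defaultdict(list)
    for aspectName, entry in registry.items():
        dictionaryGroups[tuple(entry['analyzerParameters'])].append(aspectName)
    return {parameters: sorted(listNames) for parameters, listNames in dictionaryGroups.items()}


# Stand-in registry with invented names; in real use, pass audioAspects.
registryExample = {
    'hypothetical aspect A': {'analyzer': None, 'analyzerParameters': ['pathFilename']},
    'hypothetical aspect B': {'analyzer': None, 'analyzerParameters': ['pathFilename']},
    'hypothetical comparison': {
        'analyzer': None,
        'analyzerParameters': ['pathFilenameAlpha', 'pathFilenameBeta'],
    },
}
dictionaryGroups = groupAspectsByParameters(registryExample)
for parameters, listNames in dictionaryGroups.items():
    print(parameters, listNames)
```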

Use the lower-level analyzer modules when you want data arrays or tensors instead of one float

Most registered names ending in mean are scalar summaries. If you want the full feature array or tensor instead, import the lower-level analyzer function directly:

  • analyzeAudio.analyzersUseFilename
    • filename-based scalar analyzers, including comparisons such as SI-SDR mean
  • analyzeAudio.analyzersUseWaveform
    • analyzeTempogram -> full tempogram array
    • analyzeRMS -> framewise RMS-in-dB array
    • analyzeTempo -> tempo array
    • analyzeZeroCrossingRate -> framewise zero-crossing-rate array
  • analyzeAudio.analyzersUseSpectrogram
    • analyzeChromagram -> chromagram matrix
    • analyzeSpectralContrast -> spectral-contrast array
    • analyzeSpectralBandwidth -> spectral-bandwidth array
    • analyzeSpectralCentroid -> spectral-centroid array
    • analyzeSpectralFlatness -> spectral-flatness-in-dB array
  • analyzeAudio.analyzersUseTensor
    • analyzeSRMR -> torch.Tensor of SRMR values

The matching ...Mean functions return one float summary instead. getListAvailableAudioAspects() lists the registered public aspect names, not every lower-level helper function.

import numpy
import soundfile

from analyzeAudio.analyzersUseWaveform import analyzeTempogram

with soundfile.SoundFile(pathFilename) as readSoundFile:
    sampleRate = readSoundFile.samplerate
    # soundfile returns multichannel audio as (frames, channels); transpose to (channels, frames).
    waveform = readSoundFile.read(dtype='float32').astype(numpy.float32).T

tempogram = analyzeTempogram(waveform, sampleRate)

Use whatMeasurements to list registered measurements from the command line

whatMeasurements

This prints the same sorted registry names returned by getListAvailableAudioAspects().

Reference materials

A Spectral-Flatness Measure for Studying the Autocorrelation Method of Linear Prediction of Speech Analysis

Perceptual Effects of Spectral Modifications on Musical Timbres

Robust Entropy-Based Endpoint Detection for Speech Recognition in Noisy Environments

Realtime Chord Recognition of Musical Sound: A System Using Common Lisp Music

A Robust Audio Classification and Segmentation Method

Music Type Classification by Spectral Contrast Feature

A Speech/Music Discriminator Based on RMS and Zero-Crossings

Performance Measurement in Blind Audio Source Separation

Automatic Chord Recognition from Audio Using a HMM with Supervised Learning

Cyclic Tempogram: A Mid-Level Tempo Representation for Music Signals

A Non-Intrusive Quality and Intelligibility Measure of Reverberant and Dereverberated Speech

Signal Processing for Music Analysis

The Timbre Toolbox: Extracting Audio Descriptors from Musical Signals

Blind Audio Watermarking Technique Based on Two Dimensional Cellular Automata

SDR - Half-Baked or Well Done?

Loudness Range: A Measure to Supplement EBU R 128 Loudness Normalisation

Algorithms to Measure Audio Programme Loudness and True-Peak Audio Level

An Overview on Sound Features in Time and Frequency Domain

Packages and documentation

My recovery


CC-BY-NC-4.0
