Sonosthesia tools for baking audio analysis data

sonosthesia-audio-pipeline

Python-based tooling that analyses audio files and writes the results to file for use in realtime visualization apps. Results can be written as MessagePack for efficient (de)serialization or as JSON for human-readable output. Readers are provided for the Unity timeline, to be used alongside the original audio files.

Installation

Installation requires Python (versions 3.9 to 3.12 are supported). Once Python is installed, run:

pip install sonosthesia-audio-pipeline --upgrade

Quick start

Depending on your setup, you might need to use the python3 command rather than python. File paths may be absolute or relative to the working directory. Supported audio files are .mp3 and .wav.

The --input argument (shorthand -i) can also be a directory, in which case files with supported extensions are processed in order.

Use --help (shorthand -h) to get the argument list for each subcommand. For example:

python -m sonosthesia_audio_pipeline analysis -h

Analysis

Generates an .xaa analysis file alongside the audio file.

python -m sonosthesia_audio_pipeline analysis -i File.mp3

Separation

Creates separated files in a directory named after the audio file (without extension), nested by Demucs separation model type (see the Demucs documentation for the available models). The default model is mdx_extra.

python -m sonosthesia_audio_pipeline separation -i File.mp3

To specify another model, use --model (shorthand -n):

python -m sonosthesia_audio_pipeline separation -i File.mp3 -n mdx

Pipeline

Pipeline runs source separation on an audio file, then runs analysis on the original audio file as well as on the separated ones.

python -m sonosthesia_audio_pipeline pipeline -i File.mp3
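
To drive the subcommands from a script rather than a shell, the command line can be assembled as a list and handed to subprocess. The following is a minimal sketch that uses only the subcommands and flags shown above (-i and -n). The helper names build_command and run are hypothetical and are not part of the package.

```python
import subprocess
import sys


def build_command(subcommand, input_path, model=None):
    """Assemble the argument list for one of the documented subcommands.

    Illustrative helper only; it is not provided by the package.
    """
    command = [
        sys.executable, "-m", "sonosthesia_audio_pipeline",
        subcommand, "-i", input_path,
    ]
    if model is not None:
        # -n is documented for the separation subcommand; whether other
        # subcommands accept it is an assumption to check with -h.
        command += ["-n", model]
    return command


def run(subcommand, input_path, model=None):
    """Run the subcommand, raising CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(subcommand, input_path, model), check=True)
```

Calling run("pipeline", "File.mp3") would then be equivalent to the command above, using the same interpreter that runs the script.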

Python Pipeline

Source Separation

Currently using Demucs because it seems to score better on overall SDR and is a lot easier to install with pip than Spleeter.

Sound Analysis

This is a high-level description; for output file schemas, see the Output file specification section.

Librosa is used to extract audio features which are of particular interest for driving reactive visuals, notably:

Beats and tempo
RMS magnitude
Energy in low, mid and high frequency bands
Onsets
Spectral centroid and bandwidth

The analysis contains various kinds of data

Continuous

Provided for each analysis frame, with a hop size of 512 samples

{
    "time": 0.0,
    "rms": 0.0,
    "lows": 0.0,
    "mids": 0.0,
    "highs": 0.0,
    "centroid": 0.0
}
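
The relationship between frame index, hop size and the time field can be sketched as follows. This is an illustration only: the sample rate used by the analysis is not stated here, so it is passed in as a parameter, and the function names are hypothetical.

```python
HOP_SIZE = 512  # samples between consecutive analysis frames (from the text)


def frame_time(frame_index, sample_rate):
    """Time in seconds at which the given analysis frame starts."""
    return frame_index * HOP_SIZE / sample_rate


def frames_per_second(sample_rate):
    """Number of continuous data points produced per second of audio."""
    return sample_rate / HOP_SIZE


# ASSUMPTION: 44100 Hz is used purely as an example sample rate.
print(frame_time(1, 44100))      # spacing between two consecutive frames
print(frames_per_second(44100))  # roughly 86 data points per second
```

At an example rate of 44100 Hz this gives about 86 continuous data points per second, which is comfortably above typical display refresh rates.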

Peak

Discrete events, each describing a detected peak

{
    "channel": 0,
    "start": 0.0,
    "duration": 0.0,
    "magnitude": 0.0,
    "strength": 0.0
}

channel is 0 (main), 1 (lows), 2 (mids) or 3 (highs)
start is the peak start time in seconds
duration is the peak duration in seconds
magnitude is the maximum peak magnitude in dB
strength is the maximum of the onset envelope (normalized)
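
As an illustration of how these fields might be consumed, the sketch below filters a list of peak records by channel and finds the peaks active at a given time. It assumes the peaks have already been loaded as a list of dictionaries with the fields above; the helper names and the sample data are hypothetical.

```python
# Channel numbering as described above.
CHANNEL_INDEX = {"main": 0, "lows": 1, "mids": 2, "highs": 3}


def peaks_for_channel(peaks, channel_name):
    """Return the peaks belonging to the named channel."""
    index = CHANNEL_INDEX[channel_name]
    return [peak for peak in peaks if peak["channel"] == index]


def active_peaks(peaks, time):
    """Return the peaks whose [start, start + duration) interval contains time."""
    return [
        peak for peak in peaks
        if peak["start"] <= time < peak["start"] + peak["duration"]
    ]


# Made-up sample data, for illustration only.
sample = [
    {"channel": 0, "start": 1.0, "duration": 0.5, "magnitude": -12.0, "strength": 0.8},
    {"channel": 1, "start": 1.2, "duration": 0.2, "magnitude": -20.0, "strength": 0.4},
]
print(len(peaks_for_channel(sample, "lows")))
print(len(active_peaks(sample, 1.3)))
```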

Info

The info field contains metadata and statistics about the analysis for easy retrieval

{
    "info": {
      "duration": 277.0721088435374,
      "main": {
        "band": {
          "lower": 20.0,
          "upper": 8000.0
        },
        "magnitude": {
          "lower": -87.18254089355469,
          "upper": -7.182541847229004
        },
        "peaks": 699
      },
      "lows": {
        "band": {
          "lower": 30.0,
          "upper": 100.0
        },
        "magnitude": {
          "lower": -86.90585327148438,
          "upper": -6.905849456787109
        },
        "peaks": 1823
      },
      "mids": {
        "band": {
          "lower": 500.0,
          "upper": 2000.0
        },
        "magnitude": {
          "lower": -94.8998794555664,
          "upper": -14.899882316589355
        },
        "peaks": 669
      },
      "highs": {
        "band": {
          "lower": 4000.0,
          "upper": 16000.0
        },
        "magnitude": {
          "lower": -95.01151275634766,
          "upper": -15.01151180267334
        },
        "peaks": 691
      },
      "centroid": {
        "lower": 0.0,
        "upper": 8195.748929100935
      }
    }
}
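
One possible use of these bounds is to rescale a magnitude in dB to the 0 to 1 range before driving a visual parameter. The sketch below assumes the info object has been loaded as a dictionary shaped like the example above; the function names are hypothetical and the clamping behaviour is a design choice, not something the package prescribes.

```python
def normalize(value, lower, upper):
    """Linearly map value from [lower, upper] to [0, 1], clamping the result."""
    if upper <= lower:
        return 0.0  # degenerate range, e.g. a silent channel
    scaled = (value - lower) / (upper - lower)
    return min(1.0, max(0.0, scaled))


def normalized_magnitude(info, channel, magnitude_db):
    """Rescale a dB magnitude using the bounds stored for the given channel."""
    bounds = info[channel]["magnitude"]
    return normalize(magnitude_db, bounds["lower"], bounds["upper"])


# Made-up bounds, for illustration only.
example_info = {"main": {"magnitude": {"lower": -80.0, "upper": 0.0}}}
print(normalized_magnitude(example_info, "main", -40.0))
```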

Planned

Look into using Essentia, which seems to be good for higher-level musical descriptors.

Readers

Unity Timeline

A Unity Timeline reader for analysis files is provided. A demo application is available here.

Planned

Readers for Unreal Engine and Swift are planned.

Output file specification

Binary MsgPack (.xaa)

The file header consists of three 32-bit integers, used to determine the version and reserved for future use. The rest of the file is MessagePack data, with the JSON-equivalent schema available here.
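
For anyone writing a custom reader, splitting the header from the payload might look like the sketch below. The byte order and signedness of the header integers are not stated here, so little-endian signed integers are an assumption to verify against the package source. The function name split_xaa is hypothetical.

```python
import struct

# ASSUMPTION: three little-endian signed 32-bit integers. The byte order and
# signedness are not documented here and should be checked before relying on this.
HEADER_FORMAT = "<3i"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 bytes


def split_xaa(data):
    """Split raw .xaa bytes into (header integers, MessagePack payload)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("data is too short to contain the header")
    header = struct.unpack_from(HEADER_FORMAT, data)
    payload = data[HEADER_SIZE:]
    return header, payload
```

The payload can then be passed to any MessagePack decoder, for example unpackb from the msgpack package.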

Human readable JSON (.json)

Primarily used for investigation and debugging purposes. JSON schema available here. To write JSON analysis files, specify the -j argument.

Converting between .xaa and .json

You can convert between formats using:

python -m sonosthesia_audio_pipeline conversion -i File.xaa

python -m sonosthesia_audio_pipeline conversion -i File.json

Release history

2.1.0: Sep 27, 2024
2.0.0: Sep 26, 2024 (this version)
0.0.10: Sep 26, 2024
0.0.9: Sep 26, 2024
0.0.8: Sep 26, 2024
0.0.7: Sep 26, 2024
