AudioIC

This repository is the official implementation of the ICASSP 2025 paper "Estimating Musical Surprisal in Audio", retrained on open data. Below is the abstract of the paper:

Abstract
In modeling musical surprisal expectancy with computational methods, it has been proposed to use the information content (IC) of one-step predictions from an autoregressive model as a proxy for surprisal in symbolic music. With an appropriately chosen model, the IC of musical events has been shown to correlate with human perception of surprise and complexity aspects, including tonal and rhythmic complexity. This work investigates whether an analogous methodology can be applied to music audio. We train an autoregressive Transformer model to predict compressed latent audio representations of a pretrained autoencoder network. We verify learning effects by estimating the decrease in IC with repetitions. We investigate the mean IC of musical segment types (e.g., A or B) and find that segment types appearing later in a piece have a higher IC than earlier ones on average. We investigate the IC's relation to audio and musical features and find it correlated with timbral variations and loudness and, to a lesser extent, dissonance, rhythmic complexity, and onset density related to audio and musical features. Finally, we investigate if the IC can predict EEG responses to songs and thus model humans' surprisal in music.

AudioIC provides tools for calculating the information content (IC) of music audio as a proxy for the surprise a listener experiences. It includes a command-line tool and Python classes for calculating IC.
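To make the quantity concrete, here is a minimal sketch of the generic definition of information content: the negative log-probability that a predictive model assigns to the event that actually occurred. This is only an illustration and not the package's code. The function name information_content and the probability values are made up for the example; AudioIC itself obtains its values from an autoregressive model's predictions of latent audio representations, not from a hand-written list.

```python
import math


def information_content(prob: float) -> float:
    """Return the information content, in bits, of an event with probability `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in the interval (0, 1]")
    return -math.log2(prob)


# Hypothetical one-step predictive probabilities of the events that occurred.
# Less probable events carry more information, i.e. they are more "surprising".
probs = [0.5, 0.25, 0.125]
ics = [information_content(p) for p in probs]
print(ics)
```

The base of the logarithm only changes the unit (bits for base 2, nats for the natural logarithm).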

Installation

You can install the package using pip with or without the extra dependencies required for the demo.

Install the package for general use:

pip install audioic

Install the package with demo dependencies:

git clone https://github.com/sonycslparis/audioic.git
cd audioic
pip install ".[demo]"

Usage

Running the audio_ic Command-Line Tool

The audio_ic command-line tool allows you to compute the information content (IC) of audio files. To use it, specify the audio files you want to process and provide an output directory where the results will be saved as CSV files:

python -m audio_ic --audio_files "['<audio-file1>', '<audio-file2>', ...]" --output_dir <output-dir> --device "cpu"

Replace <audio-file1>, <audio-file2>, etc., with the paths to your audio files, and <output-dir> with the directory where you want the output files to be stored.
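When many files need processing, the command can also be assembled from Python. The sketch below only mirrors the command shown above and assumes, as that command suggests, that --audio_files expects a single string formatted like a Python list. The helper name build_audio_ic_command and the file names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_audio_ic_command(audio_files, output_dir, device="cpu"):
    """Return the argument list for `python -m audio_ic`, mirroring the command above."""
    # Assumption: the file list is passed as one string that looks like a Python list.
    file_list = repr([str(Path(f)) for f in audio_files])
    return [
        sys.executable, "-m", "audio_ic",
        "--audio_files", file_list,
        "--output_dir", str(output_dir),
        "--device", device,
    ]


# Placeholder file names and output directory.
cmd = build_audio_ic_command(["song1.wav", "song2.wav"], "ic_results")
print(cmd)

# To actually run the tool (requires the package to be installed), uncomment:
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces or quotes.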

To run the tool on a GPU (the default), set the --device argument to "cuda":

CUDA_VISIBLE_DEVICES=<device-id> python -m audio_ic --audio_files "['<audio-file1>', '<audio-file2>', ...]" --output_dir <output-dir> --device "cuda"

Replace <device-id> with the ID of the CUDA device you want to use.
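If the tool is launched from a Python script rather than a shell, the same device selection can be done by setting CUDA_VISIBLE_DEVICES in the environment passed to the child process. The following is a minimal sketch; the helper name gpu_env is hypothetical, and the base environment in the demo is a placeholder.

```python
import os


def gpu_env(device_id, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set to `device_id`."""
    env = dict(os.environ if base_env is None else base_env)
    # Environment variable values must be strings.
    env["CUDA_VISIBLE_DEVICES"] = str(device_id)
    return env


# Demo with a small placeholder environment instead of the real one.
env = gpu_env(0, base_env={"PATH": "/usr/bin"})
print(env["CUDA_VISIBLE_DEVICES"])
```

The returned dictionary can be passed as the env argument of subprocess.run when invoking python -m audio_ic with --device "cuda".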

Using AudioIC programmatically

The demo.ipynb notebook demonstrates how to use the library programmatically to calculate and visualize the IC of audio files.
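The library's Python classes are documented in the notebook and are not reproduced here. Independently of that API, the CSV files written by the command-line tool can be inspected with the standard library. The sketch below makes no assumption about the column layout: it only reads the first row of each CSV file in a directory. The helper name read_csv_headers is hypothetical, and the demo file with columns col_a and col_b is made up purely to exercise the function.

```python
import csv
import tempfile
from pathlib import Path


def read_csv_headers(output_dir):
    """Map each CSV file name in `output_dir` to the fields of its first row."""
    headers = {}
    for csv_path in sorted(Path(output_dir).glob("*.csv")):
        with csv_path.open(newline="") as f:
            # An empty file yields an empty list instead of raising StopIteration.
            headers[csv_path.name] = next(csv.reader(f), [])
    return headers


# Demo on a throwaway directory with a made-up CSV file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "example.csv").write_text("col_a,col_b\n1,2\n")
    demo = read_csv_headers(tmp)
print(demo)
```

Pointing read_csv_headers at the directory given as --output_dir shows which columns the tool actually produced before any further analysis.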

Citation

If you use this project in your research, please cite the following paper:

@INPROCEEDINGS{10890619,
    author={Bjare, Mathias Rose and Cantisani, Giorgia and Lattner, Stefan and Widmer, Gerhard},
    booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
    title={Estimating Musical Surprisal in Audio}, 
    year={2025},
    volume={},
    number={},
    pages={1-5},
    keywords={Computational modeling;Music;Predictive models;Signal processing;Brain modeling;Transformers;Electroencephalography;Complexity theory;Integrated circuit modeling;Speech processing;Music information retrieval;Musical surprisal;Perceptual models;Neural networks},
    doi={10.1109/ICASSP49660.2025.10890619}}

License

This project is licensed under the CC BY-NC 4.0 License.
