AudioIC

This repository is the official implementation of the ICASSP 2025 paper "Estimating Musical Surprisal in Audio", retrained on open data. Below is the abstract of the paper:

Abstract
In modeling musical surprisal expectancy with computational methods, it has been proposed to use the information content (IC) of one-step predictions from an autoregressive model as a proxy for surprisal in symbolic music. With an appropriately chosen model, the IC of musical events has been shown to correlate with human perception of surprise and complexity aspects, including tonal and rhythmic complexity. This work investigates whether an analogous methodology can be applied to music audio. We train an autoregressive Transformer model to predict compressed latent audio representations of a pretrained autoencoder network. We verify learning effects by estimating the decrease in IC with repetitions. We investigate the mean IC of musical segment types (e.g., A or B) and find that segment types appearing later in a piece have a higher IC than earlier ones on average. We investigate the IC's relation to audio and musical features and find it correlated with timbral variations and loudness and, to a lesser extent, dissonance, rhythmic complexity, and onset density related to audio and musical features. Finally, we investigate if the IC can predict EEG responses to songs and thus model humans' surprisal in music.

AudioIC provides tools for calculating the information content (IC) of music audio as a proxy for the surprise a human listener experiences. It includes a command-line tool and Python classes for calculating IC.
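
As a rough illustration of the quantity involved, the information content of an event is the negative logarithm of the probability a model assigned to it, so unlikely events get a high IC. The sketch below applies that definition to made-up probabilities. It is not the package's implementation: the abstract describes a model that predicts compressed latent audio representations, and the function name, the toy values, and the choice of base 2 (bits) are assumptions made only for this example.

```python
import math


def information_content(probability: float) -> float:
    """Return the information content -log2(p) of an event, in bits."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


# Hypothetical probabilities that a model assigned to three observed events.
probabilities = [0.5, 0.25, 0.03125]

# The less likely the event, the higher its information content.
ic_values = [information_content(p) for p in probabilities]
print(ic_values)
```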

Installation

You can install the package using pip with or without the extra dependencies required for the demo.

Install the package for general use:

pip install audioic

Install the package with demo dependencies:

git clone https://github.com/sonycslparis/audioic.git
cd audioic
pip install ".[demo]"

Usage

Running the audioic Command-Line Tool

The audioic command-line tool allows you to compute the information content (IC) of audio files. To use it, specify the audio files you want to process and provide an output directory where the results will be saved as CSV files:

python -m audioic.audioic --audio_files "['<audio-file1>', '<audio-file2>', ...]" --output_dir <output-dir> --device "cpu"

Replace <audio-file1>, <audio-file2>, etc., with the paths to your audio files, and <output-dir> with the directory where you want the output files to be stored.
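
Because --audio_files takes a quoted list, writing the argument by hand gets error-prone with many files. The following is a minimal sketch of building the same command from Python. The helper names build_audioic_command and run_audioic and the example file names are hypothetical, and the module path audioic.audioic is copied from the command above; adjust it if your installation is invoked differently.

```python
import subprocess
import sys
from typing import List, Sequence


def build_audioic_command(
    audio_files: Sequence[str], output_dir: str, device: str = "cpu"
) -> List[str]:
    """Assemble the argument vector for the audioic command-line tool."""
    # repr() of a list of strings produces a list literal such as
    # "['a.wav', 'b.wav']", the format shown in the command above.
    files_arg = repr([str(path) for path in audio_files])
    return [
        sys.executable, "-m", "audioic.audioic",
        "--audio_files", files_arg,
        "--output_dir", output_dir,
        "--device", device,
    ]


def run_audioic(audio_files: Sequence[str], output_dir: str, device: str = "cpu") -> None:
    """Run the tool as a subprocess; raises CalledProcessError if it fails."""
    subprocess.run(build_audioic_command(audio_files, output_dir, device), check=True)


# Hypothetical input files; nothing is executed here, the command is only printed.
command = build_audioic_command(["song1.wav", "song2.wav"], "ic_output")
print(command)
```

Passing the arguments as a list to subprocess.run avoids shell quoting issues, since the list literal is handed to the tool as a single argument.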

The tool runs on a GPU by default. To select the device explicitly, set the --device argument to "cuda":

CUDA_VISIBLE_DEVICES=<device-id> python -m audioic --audio_files "['<audio-file1>', '<audio-file2>', ...]" --output_dir <output-dir> --device "cuda"

Replace <device-id> with the ID of the CUDA device you want to use.
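
Once the tool has written its CSV files to the output directory, they can be read back with the Python standard library. The sketch below makes no assumption about column names or about how the output files are named, since neither is documented here: it simply loads every .csv file in a directory as raw rows. The function name load_csv_results is hypothetical.

```python
import csv
from pathlib import Path
from typing import Dict, List


def load_csv_results(output_dir: str) -> Dict[str, List[List[str]]]:
    """Read every CSV file in output_dir and return its rows keyed by file name."""
    results: Dict[str, List[List[str]]] = {}
    for csv_path in sorted(Path(output_dir).glob("*.csv")):
        # newline="" is the setting the csv module recommends for file objects.
        with csv_path.open(newline="") as handle:
            results[csv_path.name] = list(csv.reader(handle))
    return results
```

For example, load_csv_results("ic_output") returns a dictionary with one entry per CSV file found, each holding the file's rows as lists of strings.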

Using AudioIC Programmatically

The demo.ipynb notebook demonstrates how to use the library programmatically to calculate and visualize the IC of audio files.

Citation

If you use this project in your research, please cite the following paper:

@INPROCEEDINGS{10890619,
    author={Bjare, Mathias Rose and Cantisani, Giorgia and Lattner, Stefan and Widmer, Gerhard},
    booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
    title={Estimating Musical Surprisal in Audio}, 
    year={2025},
    volume={},
    number={},
    pages={1-5},
    keywords={Computational modeling;Music;Predictive models;Signal processing;Brain modeling;Transformers;Electroencephalography;Complexity theory;Integrated circuit modeling;Speech processing;Music information retrieval;Musical surprisal;Perceptual models;Neural networks},
    doi={10.1109/ICASSP49660.2025.10890619}}

License

This project is licensed under the CC BY-NC 4.0 License.
