AudioIC
This repository is the official implementation of the ICASSP 2025 paper "Estimating Musical Surprisal in Audio", retrained on open data. Below is the abstract of the paper:
Abstract
In modeling musical surprisal expectancy with computational methods, it has been proposed to use the information content (IC) of one-step predictions from an autoregressive model as a proxy for surprisal in symbolic music. With an appropriately chosen model, the IC of musical events has been shown to correlate with human perception of surprise and complexity aspects, including tonal and rhythmic complexity. This work investigates whether an analogous methodology can be applied to music audio. We train an autoregressive Transformer model to predict compressed latent audio representations of a pretrained autoencoder network. We verify learning effects by estimating the decrease in IC with repetitions. We investigate the mean IC of musical segment types (e.g., A or B) and find that segment types appearing later in a piece have a higher IC than earlier ones on average. We investigate the IC's relation to audio and musical features and find it correlated with timbral variations and loudness and, to a lesser extent, dissonance, rhythmic complexity, and onset density related to audio and musical features. Finally, we investigate if the IC can predict EEG responses to songs and thus model humans' surprisal in music.
AudioIC provides tools for calculating the information content (IC) of music audio as a proxy for the surprise a listener experiences. It includes a command-line tool and Python classes for calculating IC.
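As a reminder of what IC measures: the information content of an event is the negative log-probability that a predictive model assigned to it, so unlikely events score high. The sketch below is a toy illustration of that definition only. The probabilities are made up, and the function name `information_content` is hypothetical and not part of this package, which estimates IC from a trained autoregressive model rather than from hand-written numbers.

```python
import math


def information_content(probabilities):
    """Return the IC in bits, -log2(p), for each probability.

    Each p is the probability a model assigned to the event that
    actually occurred at that step (toy values, not model output).
    """
    return [-math.log2(p) for p in probabilities]


# Hypothetical one-step predictive probabilities for four observed events.
observed_probs = [0.5, 0.25, 0.9, 0.05]

ic = information_content(observed_probs)
mean_ic = sum(ic) / len(ic)

for p, value in zip(observed_probs, ic):
    print(f"p = {p:.2f}  ->  IC = {value:.3f} bits")
print(f"mean IC = {mean_ic:.3f} bits")
```

The least expected event (p = 0.05) receives the highest IC, which is the property that makes IC usable as a surprisal proxy.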
Installation
You can install the package using pip with or without the extra dependencies required for the demo.
Install the package for general use:
pip install git+https://github.com/sonycslparis/audioic.git
Install the package with demo dependencies:
git clone https://github.com/sonycslparis/audioic.git
cd audioic
pip install ".[demo]"
Usage
Running the audio_ic Command-Line Tool
The audio_ic command-line tool allows you to compute the information content (IC) of audio files. To use it, specify the audio files you want to process and provide an output directory where the results will be saved as CSV files:
python -m audio_ic --audio_files "['<audio-file1>', '<audio-file2>', ...]" --output_dir <output-dir> --device "cpu"
Replace <audio-file1>, <audio-file2>, etc., with the paths to your audio files, and <output-dir> with the directory where you want the output files to be stored.
The tool runs on a GPU by default. To select the GPU explicitly, set the --device argument to "cuda" and choose the card with CUDA_VISIBLE_DEVICES:
CUDA_VISIBLE_DEVICES=<device-id> python -m audio_ic --audio_files "['<audio-file1>', '<audio-file2>', ...]" --output_dir <output-dir> --device "cuda"
Replace <device-id> with a CUDA device ID.
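When calling the tool from a script, the list-literal format of --audio_files is easy to get wrong by hand. The following is a minimal sketch that builds the same command line from a Python list. It assumes only the flags shown above (--audio_files, --output_dir, --device); the helper names `build_audio_ic_command` and `build_env` and the file names are hypothetical and not part of the package.

```python
import os
import sys


def build_audio_ic_command(audio_files, output_dir, device="cuda"):
    """Build the argument list for `python -m audio_ic` (hypothetical helper)."""
    # The README passes all files as one Python-style list literal string.
    files_literal = "[" + ", ".join(repr(str(f)) for f in audio_files) + "]"
    return [
        sys.executable, "-m", "audio_ic",
        "--audio_files", files_literal,
        "--output_dir", str(output_dir),
        "--device", device,
    ]


def build_env(cuda_device_id=None):
    """Copy the current environment, optionally pinning one CUDA device."""
    env = dict(os.environ)
    if cuda_device_id is not None:
        env["CUDA_VISIBLE_DEVICES"] = str(cuda_device_id)
    return env


# Placeholder file names, for illustration only.
cmd = build_audio_ic_command(["song1.wav", "song2.wav"], "ic_out", device="cpu")
print(cmd)

# To actually run the tool (requires the package to be installed):
# import subprocess
# subprocess.run(cmd, env=build_env(cuda_device_id=0), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the nested quotes in the --audio_files value.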
Using AudioIC programmatically
The demo.ipynb notebook demonstrates how to use the library programmatically to calculate and visualize the IC of audio files.
Citation
If you use this project in your research, please cite the following paper:
@INPROCEEDINGS{10890619,
author={Bjare, Mathias Rose and Cantisani, Giorgia and Lattner, Stefan and Widmer, Gerhard},
booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={Estimating Musical Surprisal in Audio},
year={2025},
volume={},
number={},
pages={1-5},
keywords={Computational modeling;Music;Predictive models;Signal processing;Brain modeling;Transformers;Electroencephalography;Complexity theory;Integrated circuit modeling;Speech processing;Music information retrieval;Musical surprisal;Perceptual models;Neural networks},
doi={10.1109/ICASSP49660.2025.10890619}}
License
This project is licensed under the CC BY-NC 4.0 License.