Representation Learning for the Automatic Indexing of Sound Effects Libraries (ISMIR 2022): Deep audio embeddings pre-trained on UCS & Non-UCS-compliant datasets.
aiSFX
This work was inspired by the creation of the Universal Category System (UCS), an industry-proposed public domain initiative started by Tim Nielsen, Justin Drury, Kai Paquin, and others. Launched in the fall of 2020, UCS offers a standardized framework for sound effects library metadata, designed by and for sound designers and editors.
How To Use
Please refer to this package's documentation for installation instructions and tutorials on how to extract embeddings.
Visualizations of UCS Classes
These visualizations show the coarse-level "Category" UCS classes in Pro Sound Effects (PSE), Soundly (SDLY), and UCS Mixed (UMIX).
Cite This Work
This paper was accepted at the 23rd International Society for Music Information Retrieval Conference (ISMIR) in Bengaluru, India (December 4-8, 2022). If you use this work, please cite the paper below.
[1] Representation Learning for the Automatic Indexing of Sound Effects Libraries
@inproceedings{ismir_aisfx,
title={Representation Learning for the Automatic Indexing of Sound Effects Libraries},
author={Ma, Alison Bernice and Lerch, Alexander},
booktitle={Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR)},
year={2022},
pages={866--875}
}
Acknowledgements
We would like to thank those who provided the data required to conduct this research as well as those who took the time to share their insights and software licenses for tools regarding sound search, query, and retrieval.
Universal Category System (UCS) • Alex Lane • All You Can Eat Audio • Articulated Sounds • Audio Shade • aXLSound • Big Sound Bank • BaseHead • Bonson • BOOM Library • Frick & Traa • Hzandbits • InspectorJ • Kai Paquin • KEDR Audio • Krotos Audio • Nikola Simikic • Penguin Grenade • Pro Sound Effects • Rick Allen Creative • Sononym • Sound Ideas • Soundly • Soundminer • Storyblocks • Tim Nielsen • Thomas Rex Beverly • ZapSplat
License: Pre-trained Model & Paper
This pre-trained model and paper [1] are made available under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
File details
Details for the file aisfx-0.1.2.tar.gz.
File metadata
- Download URL: aisfx-0.1.2.tar.gz
- Upload date:
- Size: 23.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | f1beb34ab0502e97a4dea18ea09e189abaf7fc6480b160cb3ddb73750827cc7d
MD5 | f2f9be96c966e55a37394a1d2588cb04
BLAKE2b-256 | f3f91986170d9937f5f1cb2ae1b821a01fa5f42e8f0bbe640194fd7724a52d24
File details
Details for the file aisfx-0.1.2-py2.py3-none-any.whl.
File metadata
- Download URL: aisfx-0.1.2-py2.py3-none-any.whl
- Upload date:
- Size: 23.1 MB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | bfe219950715af179e9a1653e7117cc1852baef267cd6596c42cad74afe1d194
MD5 | 74d1e85462c0e051fdd1c03facbffef6
BLAKE2b-256 | fddb25d653f8beeae1277c77da6fbfbc8965ad599afd4218e901406d69cf5551
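The published digests can be used to check that a downloaded file is intact. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `sha256_of` and `matches_published` are hypothetical and not part of the aisfx package, and the sketch assumes the file was saved locally under its original file name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" tables above.
PUBLISHED_SHA256 = {
    "aisfx-0.1.2.tar.gz":
        "f1beb34ab0502e97a4dea18ea09e189abaf7fc6480b160cb3ddb73750827cc7d",
    "aisfx-0.1.2-py2.py3-none-any.whl":
        "bfe219950715af179e9a1653e7117cc1852baef267cd6596c42cad74afe1d194",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in fixed-size chunks so the ~23 MB file is not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """True only if the file name is known and its digest matches."""
    expected = PUBLISHED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published("aisfx-0.1.2.tar.gz")` should return `True` for an unmodified copy of the source distribution. A file with an unrecognized name is treated as unverified rather than raising an error.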