Audio embeddings based on sparse or UST specialized (SEA) Look, Listen, and Learn (L3) models for the Edge

edgel3

The audio subnetwork of Look, Listen, and Learn (L3) [4] produces generic audio representations that can be used for myriad downstream tasks. However, L3-Net Audio requires 18 MB of static memory and 12 MB of dynamic memory, making it infeasible for small edge devices with a single microcontroller. EdgeL3 [2] is competitive with L3 Audio while being 95.45% sparse, but it still has a high activation memory requirement.
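To make the sparsity figure concrete, the sketch below counts the zero entries in a list of weights and estimates how much of an 18 MB dense model would remain nonzero at 95.45% sparsity. It assumes that static memory scales linearly with the number of nonzero weights and ignores the index overhead of any sparse storage format; neither assumption comes from the EdgeL3 paper, and both helper names are hypothetical.

```python
def sparsity_percent(weights):
    """Percentage of entries in `weights` that are exactly zero."""
    weights = list(weights)
    zeros = sum(1 for w in weights if w == 0)
    return 100.0 * zeros / len(weights)


def nonzero_megabytes(dense_mb, sparsity_pct):
    """Rough size of the nonzero weights alone.

    Assumption: memory scales linearly with the nonzero count and the
    sparse format adds no index overhead.
    """
    return dense_mb * (1.0 - sparsity_pct / 100.0)


# Toy example: three of the four weights are zero.
print(sparsity_percent([0.0, 0.0, 0.5, 0.0]))  # 75.0

# 18 MB dense model at 95.45% sparsity: roughly 0.82 MB of nonzero weights.
print(round(nonzero_megabytes(18.0, 95.45), 2))
```

The estimate covers static (weight) memory only. Pruning weights does not shrink the activations, which is why the dynamic memory requirement stays high.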

To handle static and dynamic memory jointly, we introduce Specialized Embedding Approximation (SEA) [1], a teacher-student learning paradigm in which the student audio embedding model is trained to approximate only the part of the teacher's embedding manifold that is relevant to the target data domain. Note the difference between a data domain and a dataset: restricting the specialization to a particular downstream dataset would compromise intra-domain generalizability.
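The teacher-student idea can be illustrated with a toy example. This is a minimal sketch, not the SEA training pipeline from [1]: the teacher is a stand-in random nonlinear map, the student is a single linear layer fitted by least squares, and "in-domain" simply means inputs drawn from a narrow region. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "teacher": a fixed nonlinear map from 8-D inputs to 4-D embeddings.
W_teacher = rng.normal(size=(8, 4))


def teacher(x):
    return np.tanh(x @ W_teacher)


# Target-domain data only: inputs concentrated in a small region.
x_domain = 0.1 * rng.normal(size=(256, 8))

# Much simpler "student": one linear layer, fitted so that its outputs
# match the teacher's embeddings on the target-domain inputs.
W_student, *_ = np.linalg.lstsq(x_domain, teacher(x_domain), rcond=None)


def approximation_error(x):
    """Mean squared error between student and teacher embeddings on x."""
    return float(np.mean((x @ W_student - teacher(x)) ** 2))


# Fresh in-domain samples versus samples from far outside the domain.
err_in = approximation_error(0.1 * rng.normal(size=(256, 8)))
err_out = approximation_error(3.0 * rng.normal(size=(256, 8)))
print("in-domain error:", err_in)
print("out-of-domain error:", err_out)
```

In this toy setting the student tracks the teacher closely on the target domain and poorly elsewhere. That is the intended trade-off: capacity is spent only on the region of the embedding space the deployment actually needs.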

edgel3 is an open-source Python library for downloading the smaller versions of L3 models and computing deep audio embeddings from such models.

The 'sea' models are specialized for the SONYC-UST [5] data domain. Training pipelines can be found [here].
The 'sparse' models have been re-trained using two different mechanisms: fine-tuning ('ft') and knowledge distillation ('kd'). Training pipelines can be found [here].

For the non-compressed L3-Net, please refer to OpenL3 [3].

Installing edgel3

Dependencies

TensorFlow

edgel3 has been tested with TensorFlow 2.0 and Keras 2.3.1.

pip install tensorflow==2.0.0
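After installing, it can help to confirm which versions are actually importable in the current environment. The helper below is a small sketch (the name `module_version` is hypothetical) that imports a module by name and reads its `__version__` attribute, returning None when the module is missing.

```python
import importlib


def module_version(name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    # Compare the printed versions with the tested ones listed above.
    for name in ("tensorflow", "keras"):
        print(name, module_version(name))
```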

libsndfile

edgel3 depends on the pysoundfile module to load audio files, which in turn depends on the non-Python library libsndfile. On Windows and macOS, libsndfile is installed automatically via pip, so you can skip this step. On Linux, it must be installed manually via your platform's package manager. For Debian-based distributions (such as Ubuntu), this can be done by running

apt-get install libsndfile1

For more detailed information, please consult the pysoundfile installation documentation.
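A quick way to check that libsndfile is visible from Python is to import `soundfile` and read the library version it reports. This is a sketch with a hypothetical helper name. It relies on `soundfile` raising `OSError` at import time when the shared library is missing, and on the `__libsndfile_version__` attribute, which is read defensively in case a given release lacks it.

```python
def libsndfile_status():
    """Describe whether the soundfile module can locate libsndfile."""
    try:
        import soundfile
    except ImportError:
        return "soundfile is not installed"
    except OSError as err:
        # Raised when the libsndfile shared library cannot be found.
        return "libsndfile not found: {}".format(err)
    version = getattr(soundfile, "__libsndfile_version__", "unknown")
    return "libsndfile version: {}".format(version)


if __name__ == "__main__":
    print(libsndfile_status())
```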

Installing edgel3

The simplest way to install edgel3 is by using pip, which will also install the additional required dependencies if needed. To install edgel3 using pip, simply run

pip install edgel3

To install the latest version of edgel3 from source:

Clone or pull the latest version:

 git clone https://github.com/ksangeeta2429/edgel3.git

Install using pip to handle Python dependencies:

 cd edgel3
 pip install -e .

Getting started with edgel3

Load a SONYC-UST specialized L3 audio model (reduced input representation and reduced architecture) that outputs an embedding of length 128

import edgel3
model = edgel3.models.load_embedding_model(model_type='sea', emb_dim=128)

Load a 95.45% sparse L3 audio model re-trained with fine-tuning

model = edgel3.models.load_embedding_model(model_type='sparse', retrain_type='ft', sparsity=95.45)

Load an 87.0% sparse L3 audio model re-trained with knowledge distillation

model = edgel3.models.load_embedding_model(model_type='sparse', retrain_type='kd', sparsity=87.0)

For more examples, please see the tutorial and module usage.
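The three configurations above can be collected in one place so that a script can switch between them by name. The sketch below uses hypothetical short names and helper functions. The `edgel3.get_embedding(audio, sr, model=...)` call is an assumption modeled on the OpenL3 interface, so check the module usage documentation before relying on it.

```python
# Keyword arguments copied from the three examples above; the dictionary
# keys are hypothetical labels, not names defined by edgel3.
CONFIGS = {
    "sea_128": {"model_type": "sea", "emb_dim": 128},
    "sparse_ft_95": {"model_type": "sparse", "retrain_type": "ft", "sparsity": 95.45},
    "sparse_kd_87": {"model_type": "sparse", "retrain_type": "kd", "sparsity": 87.0},
}


def load_model(name):
    """Load one of the configurations above (edgel3 is imported lazily)."""
    import edgel3  # requires edgel3 and TensorFlow to be installed

    return edgel3.models.load_embedding_model(**CONFIGS[name])


def embed_file(path, name="sea_128"):
    """Compute embeddings for one audio file.

    Assumption: edgel3.get_embedding accepts (audio, sr, model=...) in the
    style of OpenL3; verify against the edgel3 documentation.
    """
    import edgel3
    import soundfile as sf

    audio, sr = sf.read(path)
    return edgel3.get_embedding(audio, sr, model=load_model(name))
```

Importing edgel3 inside the functions keeps the configuration table usable (for example in tests or argument parsers) on machines where TensorFlow is not installed.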

References

If you use the SEA/EdgeL3 GitHub repos or the pre-trained models, please cite the relevant work:

[1] Specialized Embedding Approximation for Edge Intelligence: A case study in Urban Sound Classification
Sangeeta Srivastava, Dhrubojyoti Roy, Mark Cartwright, Juan Pablo Bello, and Anish Arora.
To be published in IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Toronto, Canada, June 2021.

[2] EdgeL3: Compressing L3-Net for Mote-Scale Urban Noise Monitoring
Sangeeta Kumari, Dhrubojyoti Roy, Mark Cartwright, Juan Pablo Bello, and Anish Arora.
Parallel AI and Systems for the Edge (PAISE), Rio de Janeiro, Brazil, May 2019.

[3] Look, Listen and Learn More: Design Choices for Deep Audio Embeddings
Jason Cramer, Ho-Hsiang Wu, Justin Salamon, and Juan Pablo Bello.
IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pages 3852–3856, Brighton, UK, May 2019.

[4] Look, Listen and Learn
Relja Arandjelović and Andrew Zisserman.
IEEE International Conference on Computer Vision (ICCV), Venice, Italy, Oct. 2017.

[5] SONYC Urban Sound Tagging (SONYC-UST): a multilabel dataset from an urban acoustic sensor network
Mark Cartwright, Ana Elisa Mendez Mendez, Graham Dove, Jason Cramer et al. 2019.

Release history

0.2.1 (this version): Apr 19, 2021
0.1.0: May 16, 2019

Download files


Source Distribution

edgel3-0.2.1.tar.gz (17.6 kB)

Uploaded Apr 19, 2021 Source

File details

Details for the file edgel3-0.2.1.tar.gz.

File metadata

Download URL: edgel3-0.2.1.tar.gz
Upload date: Apr 19, 2021
Size: 17.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.4.1 importlib_metadata/4.0.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.7.7

File hashes

Hashes for edgel3-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`104182954b7476b6e3eea179a2736f97addec502ecc1331d7d8ad28f15ffd88d`
MD5	`7fd67f55a9b52c4baeb1b1be28a77582`
BLAKE2b-256	`81b39dcb9b08880016ce02a95b7d8619a1a33d23f6ff11c449613e07bbfc13da`

