VGGish in Keras.
VGGish: A VGG-like audio classification model
This repository provides a VGGish model implemented in Keras with the TensorFlow backend
(tf.slim is deprecated, so I think we should have an up-to-date interface). It is
based on the model released for AudioSet.
For more details, please see the slim version.
Install
pip install vggish-keras
Weights will be automatically downloaded when installing via pip.
Currently, this relies on a pending change to pumpp
(https://github.com/bmcfee/pumpp/pull/123). To get those changes, run:
pip install git+https://github.com/beasteers/pumpp@tf_keras
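After installing, a quick way to confirm that both packages can be found is sketched below. It assumes the import names `vggish_keras` (as used in the Usage section) and `pumpp`; the helper `check_installed` is made up for illustration and is not part of either package.

```python
import importlib.util


def check_installed(module_names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


if __name__ == "__main__":
    # Assumed import names: vggish_keras (from Usage) and pumpp.
    for name, found in check_installed(["vggish_keras", "pumpp"]).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

This only checks that the modules are discoverable; it does not verify that the installed pumpp includes the pending changes.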
Usage
import librosa
import numpy as np
import vggish_keras as vgk
# define the model
pump = vgk.get_pump()
model = vgk.VGGish(pump)
# transform audio into VGGish embeddings (without the fully connected layers)
X = pump.transform(librosa.util.example_audio_file())[vgk.params.PUMP_INPUT]
X = np.concatenate([X]*5)  # repeat the input 5 times to form a batch
Z = model.predict(X)
# calculate timestamps
op = pump['mel']
ts = np.arange(len(Z)) / op.sr * op.hop_length
assert Z.shape == (5, 512)
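The `ts` line converts each output index into seconds by multiplying by the hop length (in samples) and dividing by the sample rate. A minimal NumPy-only sketch of that arithmetic follows; the function name `frame_timestamps` and the example values (16 kHz sample rate, 160-sample hop) are assumptions for illustration, not values read from vgk.params.

```python
import numpy as np


def frame_timestamps(n_frames, sr, hop_length):
    """Start time in seconds of each of n_frames frames spaced hop_length samples apart."""
    return np.arange(n_frames) * hop_length / sr


# Illustrative values only (assumed, not taken from vgk.params).
ts = frame_timestamps(n_frames=5, sr=16000, hop_length=160)
print(ts)
```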
References:
- Gemmeke, J. et al., AudioSet: An ontology and human-labelled dataset for audio events, ICASSP 2017
- Hershey, S. et al., CNN Architectures for Large-Scale Audio Classification, ICASSP 2017
- Model with the top fully connected layers
- Model without the top fully connected layers
TODO
- add fully connected layers
- add PCA postprocessing (needs fully connected layers and to add PCA params to model)
- currently, parameters (sample rate, hop size, etc.) can be changed globally via vgk.params
- I'd like to allow for parameter overrides to be passed to vgk.VGGish
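One way the per-model override described in the last item could work is to merge keyword arguments over a copy of the global defaults. The sketch below is hypothetical: `DEFAULTS`, `resolve_params`, and the parameter names and values are placeholders, not the package's current API or settings.

```python
# Hypothetical global defaults; names and values are placeholders.
DEFAULTS = {"sample_rate": 16000, "hop_duration": 0.96}


def resolve_params(**overrides):
    """Return a new dict of the defaults updated with any per-call overrides."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unknown parameter(s): {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


params = resolve_params(sample_rate=22050)
print(params)
```

Rejecting unknown keys keeps a misspelled parameter from being silently ignored, and building a new dict leaves the global defaults untouched.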
Source Distribution
File details
Details for the file vggish-keras-0.0.1.tar.gz.
File metadata
- Download URL: vggish-keras-0.0.1.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | acfb9e5ab07e99b31884874f5105115bca3b158b8fbce92d7e9fe716f86efabc
MD5 | 84bd15cf920f4d76624b3bfb90b8145f
BLAKE2b-256 | 1cf101c514640e2c1b1f9dced37dcf134fc668b3ce68df3c49cf52cb2aa7805f