
VGGish in Keras.


VGGish: A VGG-like audio classification model

This repository provides a VGGish model implemented in Keras with the TensorFlow backend. Since tf.slim is deprecated, I think we should have an up-to-date interface. The implementation is based on the VGGish model released for AudioSet. For more details, please visit the slim version.


pip install vggish-keras

Weights are downloaded the first time they are requested. You can also download them ahead of time by running python -m vggish_keras.download_helpers.download_weights.

NOTE: Currently, this relies on a pending change to pumpp. To get those changes, you need:

pip install git+


import librosa
import numpy as np
import vggish_keras as vgk

# define the model
pump = vgk.get_pump()
model = vgk.VGGish(pump)

# transform audio into VGGish embeddings without fc layers
X = pump.transform(librosa.util.example_audio_file())[vgk.params.PUMP_INPUT]
X = np.concatenate([X]*5)
Z = model.predict(X)

# calculate timestamps (in seconds) from the frame index, sample rate, and hop length
op = pump['mel']
ts = np.arange(len(Z)) / op.sr * op.hop_length
assert Z.shape == (5, 512)
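
The timestamp step in the example converts a frame index to seconds by multiplying by the hop length (in samples) and dividing by the sample rate. Here is a minimal standalone sketch of that arithmetic in plain Python, so it can be checked without loading the model. The helper name `frame_timestamps` and the values `sr=16000` and `hop_length=8000` are assumptions for illustration, not values read from `vgk.params`.

```python
def frame_timestamps(n_frames, sr, hop_length):
    """Return the start time, in seconds, of each of n_frames frames.

    Frame i starts at sample i * hop_length, which is i * hop_length / sr seconds.
    """
    return [i * hop_length / sr for i in range(n_frames)]


# Assumed values for illustration only: 16 kHz audio, hop of 8000 samples.
ts = frame_timestamps(5, sr=16000, hop_length=8000)
print(ts)
```

With these assumed values each frame starts half a second after the previous one. In the example above, the real values would come from the pump's 'mel' operator instead.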


I include a weight conversion script in download_helpers/ that shows how I converted the weights from .ckpt to .h5, for those who are interested.


  • Currently, parameters (sample rate, hop size, etc.) can be changed globally via vgk.params. I'd like to allow parameter overrides to be passed to vgk.VGGish.
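
Until per-model overrides exist, one way to limit the reach of a global parameter change is to wrap it in a context manager that restores the previous values on exit. The sketch below is not part of vggish_keras. The `override` helper is hypothetical, and it operates on a stand-in namespace with made-up attribute names rather than on the real `vgk.params` module.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for a module-level parameter namespace such as vgk.params.
# The attribute names and values here are hypothetical.
params = SimpleNamespace(SAMPLE_RATE=16000, HOP_SECONDS=0.96)


@contextmanager
def override(namespace, **overrides):
    """Temporarily set attributes on `namespace`, restoring them on exit."""
    # Reading the current values first raises AttributeError for unknown
    # names before anything has been modified.
    saved = {name: getattr(namespace, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(namespace, name, value)
        yield namespace
    finally:
        # Restore the original values even if the body raised.
        for name, value in saved.items():
            setattr(namespace, name, value)


with override(params, SAMPLE_RATE=22050):
    inside = params.SAMPLE_RATE  # overridden value while the block runs
after = params.SAMPLE_RATE       # original value restored afterwards
```

Anything built inside the `with` block sees the overridden values, while code that runs afterwards sees the originals again. This only helps if the parameters are read when the model or pump is constructed, which is an assumption about the library rather than something confirmed here.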


Source Distribution

vggish-keras-0.0.19.tar.gz (7.7 kB view hashes)
