automated annotation of vocalizations for everybody
automated annotation of animal vocalizations
vak is a library for researchers studying animal vocalizations.
It automates annotation of vocalizations, using artificial neural networks.
By annotation, we mean something like the example of annotated birdsong shown below:
You give vak training data in the form of audio or spectrogram files with annotations,
and vak helps you train neural network models
and use the trained models to predict annotations for new files.
$ pip install vak
For more detailed installation instructions, please see: https://vak.readthedocs.io/en/latest/get_started/installation.html
Training models to segment and label vocalizations
Currently the easiest way to work with vak is through the command line.
You run it with config.toml files, using one of a handful of commands.
For more details, please see the "autoannotate" tutorial here:
Data and folder structures
To train models, you provide training data in the form of audio or spectrogram files, and annotations for those files.
Spectrograms and labels
The package can generate spectrograms from .wav files or .cbin audio files.
It can also accept spectrograms in the form of Matlab .mat or NumPy files.
The locations of these files are specified in the config.toml file.
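As a rough illustration of the input types described above, the sketch below sorts a list of file paths into audio files and spectrogram files by extension. It is not part of vak: the function name `group_by_input_type` and the grouping keys are hypothetical, and only the extensions named in the text (.wav, .cbin, .mat) are included.

```python
from pathlib import Path

# Extensions taken from the text above. The extension for NumPy spectrogram
# files is not stated here, so it is deliberately left out.
AUDIO_EXTENSIONS = {".wav", ".cbin"}
SPECTROGRAM_EXTENSIONS = {".mat"}


def group_by_input_type(paths):
    """Group paths into 'audio', 'spectrogram', and 'other' by file extension.

    Hypothetical helper for illustration only; vak does its own file handling.
    """
    groups = {"audio": [], "spectrogram": [], "other": []}
    for path in map(Path, paths):
        extension = path.suffix.lower()
        if extension in AUDIO_EXTENSIONS:
            groups["audio"].append(path)
        elif extension in SPECTROGRAM_EXTENSIONS:
            groups["spectrogram"].append(path)
        else:
            groups["other"].append(path)
    return groups


if __name__ == "__main__":
    example = ["song_01.wav", "song_02.cbin", "spect_01.mat", "notes.txt"]
    for kind, files in group_by_input_type(example).items():
        print(kind, [f.name for f in files])
```

A check like this can be useful before preparing a dataset, to confirm that a data folder contains only the kind of input you intend to train on.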
The annotations are parsed by a separate library, which
aims to handle common formats like Praat
TextGrid files, and to enable
researchers to easily work with formats they may have developed in their
own labs. For more information please see:
Preparing training files
It is possible to train on any manually annotated data, but some guidelines are useful:
- Use as many examples as possible - More examples give better results. In particular, the code will not correctly label syllables it did not encounter during training; it will most likely assign them the nearest known label or ignore them.
- Use noise examples - This makes the code much better at ignoring noise.
- Examples of syllables on noise are important - It is good practice to start with clean recordings. The code will not perform miracles and is most likely to fail if the audio is too corrupted or masked by noise. Still, training with examples of syllables against a background of cage noise is beneficial.
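The first guideline above can be checked before training with a small script. The sketch below counts how often each syllable label occurs in the training annotations and lists labels that appear in new data but never in training. It is independent of vak: the function name `label_coverage` is hypothetical, and labels are assumed to be plain strings already extracted from your annotation files.

```python
from collections import Counter


def label_coverage(training_labels, new_labels):
    """Summarize label coverage of a training set.

    Returns a tuple of:
    - a Counter mapping each training label to its number of occurrences
    - a sorted list of labels found in new_labels but absent from training

    Hypothetical helper for illustration; labels are assumed to be strings.
    """
    training_counts = Counter(training_labels)
    unseen = sorted(set(new_labels) - set(training_counts))
    return training_counts, unseen


if __name__ == "__main__":
    training = ["a", "a", "b", "b", "c"]
    new = ["a", "b", "d"]
    counts, unseen = label_coverage(training, new)
    print("occurrences per training label:", dict(counts))
    print("labels never seen in training:", unseen)
```

Labels with very few occurrences, or labels reported as unseen, are candidates for additional manual annotation before training.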
Predicting annotations for audio
You can predict annotations for audio files by creating a config.toml file with a [PREDICT] section.
For more details, please see the "autoannotate" tutorial here: https://vak.readthedocs.io/en/latest/tutorial/autoannotate.html
Support / Contributing
Currently we are handling support through the issue tracker on GitHub:
Please raise an issue there if you run into trouble.
That would be a great place to start if you are interested in contributing, as well.
"Why this name, vak?"
Does your library have any poems?