pollyglot

Small example datasets of annotated vocalizations.

Useful if you:

  • need some example vocalizations that are quick to download
  • are building a tool that works with different annotation formats and want to test it

usage

There are two components to pollyglot:

  1. a command-line tool that creates the small example datasets from larger publicly-available datasets
  2. a package that fetches the small example datasets, which can be a dependency for your library

To use (1), invoke the command-line tool pollymake.

Cloning this repository, installing it for development (see below), and then calling

$ pollymake all

will re-make the dataset within the repository.

pollymake creates an archive from each repository. These are then uploaded to a Figshare dataset repository: https://figshare.com/articles/pollyglot/9929549
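As a rough illustration of how a small archive like these might be fetched and unpacked, here is a minimal sketch using only the Python standard library. It is not pollyglot's own API: the function name fetch_archive, the local file name archive.tar.gz, and the assumption that the archive is a gzipped tarball are all hypothetical, and you would need to supply the actual download URL for the archive you want.

```python
import tarfile
import urllib.request
from pathlib import Path


def fetch_archive(url, dest_dir):
    """Download a .tar.gz archive from `url` and extract it into `dest_dir`.

    Hypothetical helper, not part of pollyglot. Assumes a gzipped tarball.
    Returns the sorted list of member names found in the archive.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive_path = dest_dir / "archive.tar.gz"
    # urlretrieve accepts http(s):// URLs as well as file:// URLs
    urllib.request.urlretrieve(url, archive_path)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = sorted(tar.getnames())
        # only extract archives from sources you trust
        tar.extractall(dest_dir)
    return names
```

Because the helper also accepts file:// URLs, a tool's test suite can exercise it against a locally built archive without network access.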

The goal of this package is to share code that automates the process of creating a data repository on Figshare, and to make this source open for collaboration. The formats in this repository can be parsed by the crowsetta package. Development and tutorials for crowsetta make use of the small, quick-to-download archives of each format on Figshare that are generated from the source in this repository.

crowsetta provides tools for anyone who wants to write clean code when working with these annotation formats (or their own format). To learn more, please visit https://github.com/NickleDave/crowsetta

formats + references

Below are the formats included and references for the sources.

Praat textgrid

Textgrids output by the Praat program.
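To give a feel for what these annotations contain, the sketch below pulls onset, offset, and label out of each interval in a TextGrid saved in Praat's long text format. It is a simplified illustration, not how crowsetta parses the format: the names INTERVAL_RE, read_intervals, and EXAMPLE are hypothetical, the example text is made up, intervals from all tiers are returned in one flat list, and the input is assumed to be already decoded to a Python string.

```python
import re

# Matches one interval entry in Praat's long text TextGrid format.
# Inside a label, Praat escapes a double quote by doubling it.
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = (?P<xmin>[\d.eE+-]+)\s*'
    r'xmax = (?P<xmax>[\d.eE+-]+)\s*'
    r'text = "(?P<text>(?:[^"]|"")*)"'
)


def read_intervals(textgrid_text):
    """Return (onset_s, offset_s, label) tuples for every interval found."""
    return [
        (float(m["xmin"]), float(m["xmax"]), m["text"].replace('""', '"'))
        for m in INTERVAL_RE.finditer(textgrid_text)
    ]


# Made-up fragment of a TextGrid with a single interval tier
EXAMPLE = '''
    item [1]:
        class = "IntervalTier"
        name = "syllables"
        xmin = 0
        xmax = 1.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.75
            text = "a"
        intervals [2]:
            xmin = 0.75
            xmax = 1.5
            text = ""
'''

for onset, offset, label in read_intervals(EXAMPLE):
    print(onset, offset, repr(label))
```

A regular expression keeps the sketch short, but it ignores tier names and point tiers, so a real tool would want a proper parser.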

Songs with Praat textgrid format are from the Birdsong Database provided by the Taylor lab at UCLA (http://taylor0.biology.ucla.edu/birdDBQuery/), as presented in this paper: https://www.sciencedirect.com/science/article/pii/S1574954115000151

The .xls file containing links to songs from the Taylor lab birdsong database was created by Tim Sainburg to train generative networks for animal vocalizations (https://github.com/timsainb/AVGN) and is adapted under the MIT license.

.not.mat

.not.mat files are output by the evsonganaly GUI created by Evren Tumer in the Brainard lab. The audio file format .cbin is output by the LabVIEW program EvTAF.

Another repository of Bengalese finch song annotated in this format is here: https://figshare.com/articles/Bengalese_Finch_song_repository/4805749

BirdsongRecognition

A specific .xml format for a repository of labeled Bengalese Finch song. The repository is here: https://figshare.com/articles/BirdsongRecognition/3470165. The repository provides data for testing a convolutional neural network to segment and label vocalizations, as shared in the repository https://github.com/takuya-koumura/birdsong-recognition and discussed in the paper "Automatic recognition of element classes and boundaries in the birdsong with variable sequences" by Takuya Koumura and Kazuo Okanoya (http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0159188).

Project information

License

Pollyglot (c) by David Nicholson, 2018-2019.

Code is shared under BSD-3 License.

Where applicable, data in the vocal-annotations-formats-dataset is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. (The Figshare repositories are shared under CC-BY-4.0.) Where the authors have not made their intentions clear with a license, citations to papers and links to the original source are included. Please raise an issue on this repository if there are any concerns about this.

You should have received a copy of the license along with this work. If not, see http://creativecommons.org/licenses/by-sa/4.0/.

