crowsetta

A tool to work with any format for annotating vocalizations

crowsetta is a tool to work with any format for annotating vocalizations: speech, birdsong, mouse ultrasonic calls (insert your favorite animal vocalization here). The goal of crowsetta is to make sure that your ability to work with a dataset of vocalizations does not depend on your ability to work with any given format for annotating that dataset. What crowsetta gives you is not yet another format for annotation (I promise!); instead you get some nice data types that make it easy to work with any format: namely, Sequences made up of Segments.

    >>> from crowsetta import Segment, Sequence
    >>> a_segment = Segment.from_keyword(
    ...     label='a',
    ...     onset_ind=16000,
    ...     offset_ind=32000,
    ...     file='bird21.wav'
    ...     )
    >>> list_of_segments = [a_segment] * 3
    >>> seq = Sequence(segments=list_of_segments)
    >>> print(seq)
    Sequence(segments=[Segment(label='a', onset_s=None, offset_s=None, onset_ind=16000,
    offset_ind=32000, file='bird21.wav'), Segment(label='a', onset_s=None, offset_s=None,
    onset_ind=16000, offset_ind=32000, file='bird21.wav'), Segment(label='a', onset_s=None,
    offset_s=None, onset_ind=16000, offset_ind=32000, file='bird21.wav')])

You can load annotation from your format of choice into Sequences of Segments (most conveniently with the Transcriber, as explained below) and then use the Sequences however you need to in your program.

For example, if you want to loop through the Segments of each Sequence to pull syllables out of a spectrogram, you can do something like this, very Pythonically:

    >>> syllables_from_sequences = []
    >>> for a_seq in sequences:  # sequences: a list of Sequence objects
    ...     seq_dict = a_seq.to_dict()  # convert to dict
    ...     spect = some_spectrogram_making_function(seq_dict['file'])
    ...     syllables = []
    ...     for seg in a_seq.segments:
    ...         # spectrogram is a 2d numpy array; assumes onsets and offsets index its columns
    ...         syllable = spect[:, seg.onset_ind:seg.offset_ind]
    ...         syllables.append(syllable)
    ...     syllables_from_sequences.append(syllables)
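
The slicing above only works if segment onsets and offsets are expressed in spectrogram columns. When they are audio sample indices instead, they need to be converted first. Here is a minimal sketch of that conversion, assuming spectrogram column k starts at sample k * hop_length with no padding or centering. `Seg`, `sample_to_column`, `slice_syllable`, and `hop_length` are hypothetical names used for illustration, not part of crowsetta, and a list of lists stands in for the numpy array.

```python
from typing import List, NamedTuple


class Seg(NamedTuple):
    """Hypothetical stand-in for a segment; not crowsetta's Segment class."""
    label: str
    onset_ind: int   # onset, in audio samples
    offset_ind: int  # offset, in audio samples


def sample_to_column(sample_ind: int, hop_length: int) -> int:
    """Map an audio sample index to a spectrogram column index.

    Assumes column k starts at sample k * hop_length (no padding or centering).
    """
    return sample_ind // hop_length


def slice_syllable(spect: List[List[float]], seg: Seg, hop_length: int) -> List[List[float]]:
    """Return the columns of spect (rows are frequency bins) spanned by seg."""
    start = sample_to_column(seg.onset_ind, hop_length)
    stop = sample_to_column(seg.offset_ind, hop_length)
    return [row[start:stop] for row in spect]


# toy spectrogram: 2 frequency bins x 10 time bins
spect = [list(range(10)), list(range(10, 20))]
seg = Seg(label='a', onset_ind=16000, offset_ind=32000)
syllable = slice_syllable(spect, seg, hop_length=8000)
print(syllable)  # [[2, 3], [12, 13]]
```

With a real spectrogram, the right conversion depends on how the spectrogram function windows and pads the audio, so check its documentation before reusing the integer division shown here.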

As mentioned above, crowsetta provides you with a Transcriber that comes equipped with convenience functions to do the work of converting for you.

    from crowsetta import Transcriber
    scribe = Transcriber()
    seq = scribe.to_seq(file=notmat_files, format='notmat')

You can even easily adapt the Transcriber to use your own in-house format, like so:

    from crowsetta import Transcriber
    scribe = Transcriber(user_config=your_config)
    scribe.to_csv(file='your_annotation_file.mat',
                  csv_filename='your_annotation.csv')
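
Supporting an in-house format usually comes down to writing one function that maps the format's fields onto segments. The sketch below shows what such a function might look like for a made-up format that stores onsets, offsets, and labels as parallel fields. The function name, the field names, and the dict layout are all assumptions for illustration; this is not the schema that `user_config` expects, so see the documentation for that.

```python
def annot_to_segments(annot, audio_file):
    """Convert a hypothetical in-house annotation record to a list of segment dicts.

    annot is assumed to have three parallel fields:
    'onsets' and 'offsets' (sample indices) and 'labels' (one character per segment).
    """
    onsets, offsets, labels = annot['onsets'], annot['offsets'], annot['labels']
    if not (len(onsets) == len(offsets) == len(labels)):
        raise ValueError('onsets, offsets, and labels must have the same length')
    return [
        {'label': label, 'onset_ind': onset, 'offset_ind': offset, 'file': audio_file}
        for label, onset, offset in zip(labels, onsets, offsets)
    ]


# made-up annotation for one audio file
annot = {'onsets': [16000, 36000], 'offsets': [32000, 48000], 'labels': 'ab'}
segments = annot_to_segments(annot, audio_file='bird21.wav')
print(segments[0])
```

Each dict carries the same keyword arguments used with `Segment.from_keyword` earlier in this description, so records like these could be passed on to it.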

Features

convert annotation formats to Sequence objects that can be easily used in a Python program
convert Sequence objects to comma-separated value text files that can be read on any system
load comma-separated values files back into Python and convert to other formats
easily use with your own annotation format
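
To illustrate why comma-separated values work well as an interchange format, here is a minimal round-trip sketch that uses only Python's standard `csv` module. The column names and helper functions are assumptions made for this example; they are not crowsetta's actual csv layout or API.

```python
import csv
import io

FIELDNAMES = ['label', 'onset_ind', 'offset_ind', 'file']  # assumed column names


def segments_to_csv(segments, csv_file):
    """Write a list of segment dicts to an open text file as csv."""
    writer = csv.DictWriter(csv_file, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(segments)


def csv_to_segments(csv_file):
    """Read segment dicts back from an open csv text file."""
    segments = []
    for row in csv.DictReader(csv_file):
        row = dict(row)
        # csv stores everything as text, so convert the indices back to int
        row['onset_ind'] = int(row['onset_ind'])
        row['offset_ind'] = int(row['offset_ind'])
        segments.append(row)
    return segments


segments = [
    {'label': 'a', 'onset_ind': 16000, 'offset_ind': 32000, 'file': 'bird21.wav'},
    {'label': 'b', 'onset_ind': 36000, 'offset_ind': 48000, 'file': 'bird21.wav'},
]
buffer = io.StringIO()  # in-memory stand-in for a .csv file on disk
segments_to_csv(segments, buffer)
buffer.seek(0)
round_tripped = csv_to_segments(buffer)
print(round_tripped == segments)  # True
```

Because the file is plain text, anyone can open it in a spreadsheet or parse it in another language without installing anything.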

You might find it useful in any situation where you want to share audio files of song and some associated annotations, but you don't want to require the user to install a large application in order to work with the annotation files.

Getting Started

Installation

with `pip`

$ pip install crowsetta

with `conda`

$ conda install crowsetta -c conda-forge

Usage

To learn how to use crowsetta, please see the documentation at:
https://crowsetta.readthedocs.io/en/latest/index.html

Development Installation

Currently crowsetta is developed with conda. To set up a development environment:

$ conda create -n crowsetta-dev python=3.6 numpy scipy attrs
$ conda activate crowsetta-dev
$ pip install evfuncs koumura
$ git clone https://github.com/NickleDave/crowsetta.git
$ cd crowsetta
$ pip install -e .

Project Information

Background

crowsetta was developed for two libraries:

hybrid-vocal-classifier https://github.com/NickleDave/hybrid-vocal-classifier
vak https://github.com/NickleDave/vak

Testing relies on the Vocalization Annotation Formats Dataset, which you may find useful if you need small samples of different audio files and associated annotation formats:

on Figshare: https://figshare.com/articles/Vocalization_Annotation_Formats_Dataset/8046920
built from this GitHub repository: https://github.com/NickleDave/vocal-annotation-formats

Support

If you are having issues, please let us know.

Issue Tracker: https://github.com/NickleDave/crowsetta/issues

Contribute

Issue Tracker: https://github.com/NickleDave/crowsetta/issues
Source Code: https://github.com/NickleDave/crowsetta

CHANGELOG

You can see project history and work in progress in the CHANGELOG.

License

The project is licensed under the BSD license.

Citation

If you use crowsetta, please cite the DOI:

Release history

Version                   Release date
5.1.3.post1               Dec 16, 2025
5.1.3                     Dec 15, 2025
5.1.2                     Jul 17, 2025
5.1.1                     Jul 10, 2025
5.1.0                     Oct 12, 2024
5.0.3                     Jul 2, 2024
5.0.2.post1               Feb 2, 2024
5.0.2                     Feb 2, 2024
5.0.1                     May 27, 2023
5.0.0                     Mar 29, 2023
5.0.0rc2 (pre-release)    Mar 6, 2023
5.0.0rc1 (pre-release)    Mar 1, 2023
4.0.0.post2               Jun 25, 2022
4.0.0.post1               Jun 25, 2022
4.0.0 (this version)      Jun 25, 2022
3.4.3                     May 20, 2023
3.4.2                     May 18, 2023
3.4.1                     May 14, 2022
3.4.0                     Mar 26, 2022
3.3.0                     Jan 3, 2022
3.2.0                     Dec 19, 2021
3.1.1.post1               Mar 5, 2021
3.1.1                     Mar 4, 2021
3.1.0                     Jan 12, 2021
3.0.1                     Jan 5, 2021
3.0.0                     Jan 4, 2021
2.3.0                     Jan 3, 2021
2.2.0                     Apr 20, 2020
2.1.0                     Dec 9, 2019
2.0.0                     Jul 17, 2019
1.1.1                     May 7, 2019
1.1.0                     May 7, 2019
1.0.0                     May 5, 2019
0.2.0a5 (pre-release)     Jan 7, 2019
0.2.0a4 (pre-release)     Dec 31, 2018
0.2.0a3 (pre-release)     Dec 26, 2018
0.2.0a2 (pre-release)     Dec 26, 2018
0.2.0a1 (pre-release)     Dec 23, 2018

Download files

Source Distribution

crowsetta-4.0.0.tar.gz (3.5 MB)

Uploaded Jun 25, 2022 Source

Built Distribution

crowsetta-4.0.0-py3-none-any.whl (125.5 kB)

Uploaded Jun 25, 2022 Python 3

File details

Details for the file crowsetta-4.0.0.tar.gz.

File metadata

Download URL: crowsetta-4.0.0.tar.gz
Upload date: Jun 25, 2022
Size: 3.5 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: python-requests/2.27.1

File hashes

Hashes for crowsetta-4.0.0.tar.gz
Algorithm	Hash digest
SHA256	`4a4de847708709fc5e6f2e5dc6680a0f29411c9fba9edfab891a7bf66cadb891`
MD5	`cb9a8404706e707e495c0b043e921a48`
BLAKE2b-256	`b63bb977d97cbc2c4f2d7b548d52c28907219609e2e2a827babf1dcfd5c86c9e`

File details

Details for the file crowsetta-4.0.0-py3-none-any.whl.

File metadata

Download URL: crowsetta-4.0.0-py3-none-any.whl
Upload date: Jun 25, 2022
Size: 125.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: python-requests/2.27.1

File hashes

Hashes for crowsetta-4.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f650d340975514710738ec6bf7be7ab36bde317d8c91a0e589fbb364d5fca4eb`
MD5	`67ba7728873f4f3496c0d57788dab039`
BLAKE2b-256	`c129062fd964934daa9d3e1d9e3c9514c4eec823e8888b96cb2b354bbd26a0ae`
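
The SHA256 digests listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using only Python's standard `hashlib` module. The helper names and the assumption that the file sits in the current directory are illustrative, not something PyPI or crowsetta provides.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """True if the file's SHA256 equals the published digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# digest copied from the table above for crowsetta-4.0.0.tar.gz
EXPECTED_SDIST_SHA256 = '4a4de847708709fc5e6f2e5dc6680a0f29411c9fba9edfab891a7bf66cadb891'

# usage, assuming the sdist was downloaded to the current directory:
# matches_published_hash('crowsetta-4.0.0.tar.gz', EXPECTED_SDIST_SHA256)
```

Reading in chunks keeps memory use flat, which matters little for a 3.5 MB archive but makes the helper safe to reuse on larger files.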
