Use bacpipe to streamline the process of generating embeddings and analysing your PAM datasets.

Welcome to bacpipe (BioAcoustic Collection Pipeline)

bacpipe makes using deep learning models for bioacoustics easy! Using bacpipe you can generate embeddings, classification predictions and clusters. All you need is your audio data and to customize the settings.

And the best part is, it comes with a GUI for you to explore the results.

bacpipe also ties in nicely with acodet, allowing you to generate heatmaps of species activity from your datasets based on predictions of deep learning models. Keep in mind that predictions are only as good as the models you are using: they can seem confident but still be very wrong. bacpipe's aim is to make model evaluations easier so that the models can be improved.

bacpipe is also available on pip: pip install bacpipe

import bacpipe

# This will execute the whole pipeline
# if nothing is specified it will generate embeddings on
# a set of audio test data using the models birdnet and perch

bacpipe.play()

A more detailed description of the API can be found under API

How it works

This repository aims to streamline the generation and evaluation of embeddings using a large variety of bioacoustic models.

The image below shows a comparison of UMAP embeddings based on 15 different bioacoustic models. The models are evaluated on a bird and frog dataset (more details in this conference paper).

bacpipe requires a dataset of audio files, runs them through a series of models, and generates embeddings. These embeddings can then be visualized and evaluated for various tasks such as clustering or classification.

By default the embeddings will be generated for the models specified in the config.yaml file.

Currently these bioacoustic models are supported (more details below):

available_models : [
    "audiomae",
    "audioprotopnet",
    "avesecho_passt",
    "aves_especies",
    "beats",
    "birdaves_especies",
    "biolingual",
    "birdnet",
    "birdmae",
    "hbdet",
    "insect66",
    "insect459",
    "mix2",
    "naturebeats",
    "perch_bird",
    "protoclr",
    "rcl_fs_bsed",
    "surfperch",
    "google_whale",
    "vggish"
  ]

Once the embeddings are generated, 2D reduced embeddings will be created using the dimensionality reduction model specified in the config.yaml file. The following dimensionality reduction models are supported:

available_reduction_models: [
  "pca",
  "sparse_pca",
  "t_sne",
  "umap"
]

Furthermore, the embeddings can be evaluated using different metrics. The evaluation is done using the evaluate.py script, which takes the generated embeddings and computes various metrics such as clustering performance and classification performance. The evaluation results are saved in the bacpipe/results directory.

available_evaluation_tasks: [
  "classification",
  "clustering"
]

The repository also includes a panel dashboard for visualizing the generated embeddings. To enable the dashboard, simply set the dashboard variable to True in the config.yaml file. The dashboard will automatically open in your browser (at http://localhost:8050) after running the run_dashboard.py script.

The pipeline is designed to be modular, so you can easily add or remove models as needed. The models are organized into pipelines, which are defined in the bacpipe/embedding_generation_pipelines/feature_extractors directory. If you want to add a different dimensionality reduction model, you can do so by adding a new pipeline to the bacpipe/embedding_generation_pipelines/dimensionality_reduction directory.

Using annotations for evaluation

If you have annotations for your dataset, you can use them to evaluate the generated embeddings. The labels will be used to compute the clustering and classification performance of the embeddings.

To use the annotations for evaluation, create a file called annotations.csv in the directory specified in the audio_dir variable in the config.yaml file. The file should contain the following columns:

audiofilename,start,end,label

Where audiofilename is the name of the audio file, start and end are the start and end times of the annotation in seconds, and label is the label of the annotation.

For reference see the example annotations file.

If this file exists, the evaluation script will automatically use the annotations to compute the clustering and classification performance of the embeddings. The labels will also be used to color the points in the dashboard visualization showing the embeddings.
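As a minimal sketch of the file format described above, the following snippet writes an annotations.csv with the four expected columns and reads it back. The file names, times and labels are made up for illustration, and a temporary directory stands in for the audio_dir from config.yaml.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical annotations: (audio file name, start in s, end in s, label)
annotations = [
    ("recording_01.wav", 0.0, 2.5, "blackbird"),
    ("recording_01.wav", 4.0, 6.0, "robin"),
    ("recording_02.wav", 1.2, 3.4, "blackbird"),
]


def write_annotations(audio_dir, rows):
    """Write rows to <audio_dir>/annotations.csv with the expected header."""
    path = Path(audio_dir) / "annotations.csv"
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["audiofilename", "start", "end", "label"])
        writer.writerows(rows)
    return path


# A temporary directory stands in for the audio_dir from config.yaml
audio_dir = tempfile.mkdtemp()
csv_path = write_annotations(audio_dir, annotations)

# Read the file back to check that it has the expected structure
with open(csv_path, newline="") as f:
    loaded = list(csv.DictReader(f))
```

Each row in loaded is a dictionary keyed by the column names, which makes it easy to check the file before running the pipeline.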

API

bacpipe can be used as a package and installed from pip

pip install bacpipe

import bacpipe

# This will execute the whole pipeline
# if nothing is specified it will generate embeddings on
# a set of audio test data using the models birdnet and perch

bacpipe.play()

# to modify configurations and settings, you can simply access them
# as attributes

# to see available settings and configs run the following commands
bacpipe.config
bacpipe.settings

# to modify the audio data path for example, do
bacpipe.config.audio_dir = '/path/to/your/audio/dir'

# to modify the models you want to run, do
bacpipe.config.models = ['birdnet', 'birdmae', 'naturebeats']
# keep in mind some models require checkpoints, to find out which ones, run
bacpipe.models_needing_checkpoint
# links to checkpoints are to be found in this readme file, 
# location of the checkpoints is specified under 
bacpipe.settings.model_base_path = '/path/to/model_checkpoints'
# On first execution the birdnet checkpoint is downloaded and the
# model_checkpoints folder is created from the current working directory

# If you just want to run models and get embeddings and don't want 
# the dashboard and all of that, define an embedder object and pass it
# the model name, and the settings you modified
# To ensure birdnet is downloaded, run the following (this is done by default
# if you run bacpipe.play(), but without that, you need to call this explicitly)
bacpipe.ensure_std_models(bacpipe.settings.model_base_path)

em = bacpipe.Embedder('birdnet', **vars(bacpipe.settings)) 
# the vars part is important!

audio_file = '/path/to/all/the/audio/file'
embeddings = em.get_embeddings_from_model(audio_file)

# if the model has a built in classifier, like birdnet
# you can make sure the class predictions are also saved 
# by setting 
bacpipe.settings.run_pretrained_classifier = True
# the generating of embeddings above will then let you access 
# the class predictions using
em.model.classifier_outputs

# If you want to produce embeddings for various models, bacpipe will always store
# them to keep your memory from overfilling. Still you can use the package to easily
# access the embeddings and all the metadata
loader = bacpipe.model_specific_embedding_creation(
  **vars(bacpipe.config), **vars(bacpipe.settings)
  )

# this call will initiate the embedding generation process, it will check if embeddings
# already exist for the combination of each model and the dataset and if so it will
# be ready to load them. The loader keys will be the model name and the values will
# be the loader objects for each model. Each object contains all the information
# on the generated embeddings. To access them:
loader['birdnet'].embedding_dict() 
# this will give you a dictionary with the keys corresponding to embedding files
# and the values corresponding to the embeddings as numpy arrays

loader['birdnet'].metadata_dict
# This will give you a dictionary overview of:
# - where the audio data came from,
# - where the embeddings were saved
# - all the audio files, 
# - the embedding size of the model, 
# - the audio file lengths,
# - the number of embeddings for each audio file
# - the sample rate
# - the number of samples per window
# - and the total length of the processed dataset in seconds
# This dictionary is also saved as a yaml file in the directory of the embeddings

Dashboard visualization

bacpipe includes a dashboard visualization by default allowing you to easily explore the generated embeddings

Once embeddings are generated, they can be easily visualized using a dashboard (built using panel) by simply setting the dashboard setting in the config.yaml file to True.

Below you can see a gif showing the basic usage of the dashboard.

The dashboard has 3 main sections:

  1. Single model
  2. Two models
  3. All models

In the single model section, you can select a model and visualize the embeddings generated by that model. The embeddings can be colored by:

  • metadata extracted from the files (date and time information, and file and parent directory)
  • the labels specified in the annotations.csv file
  • the cluster labels generated by the clustering algorithm (kmeans)

In the dashboard sidebar you can select the model, the attribute by which to label the embeddings, whether to remove noise, and the type of classification task to show the results for.

The noise removal is done by removing the embeddings that do not correspond to annotated sections of the audio files. This is useful if you want to focus on the annotated sections of the audio files and disregard the rest of the data.
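The idea behind this noise removal can be sketched as an interval overlap check. This is only an illustration under assumptions: the function name keep_annotated is hypothetical, the windows are assumed to be consecutive and of equal length, and the exact overlap rule bacpipe applies is not described here.

```python
def keep_annotated(window_starts, window_length, annotations):
    """Return indices of windows that overlap at least one annotation.

    window_starts: start time (s) of each embedding window
    window_length: window duration in seconds
    annotations: list of (start, end) tuples in seconds
    """
    kept = []
    for i, w_start in enumerate(window_starts):
        w_end = w_start + window_length
        # Two intervals overlap if each one starts before the other ends
        if any(w_start < a_end and a_start < w_end
               for a_start, a_end in annotations):
            kept.append(i)
    return kept


# Four consecutive 3 s windows (0-3, 3-6, 6-9, 9-12 s) and two annotations
window_starts = [0.0, 3.0, 6.0, 9.0]
annotations = [(1.0, 2.0), (8.5, 9.5)]
kept = keep_annotated(window_starts, 3.0, annotations)
```

Here the second window (3-6 s) contains no annotated audio and would be treated as noise, while the annotation from 8.5 s to 9.5 s keeps both of the last two windows.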

The visualizations can be saved as png files by clicking the save button in the bottom right corner of the plot.

Try it out and please feel free to give feedback, ask questions or suggest improvements. If something does not work, please raise an issue.


Installation

Install uv (recommended) or poetry

It is recommended to use python 3.11 for this repository, as some of the models require it.

For speed and stability it is recommended to use uv. To install uv, use the following command (you can also work without uv: simply drop the uv prefix, as all the commands are also pip commands):

pip install uv

(for windows use /c/Users/$USERNAME/AppData/Local/Programs/Python/Python311/python.exe -m pip install uv)

If you prefer to use poetry, you can install it using:

pipx install poetry

Create a virtual environment

python3.11 -m uv venv .env_bacpipe

(for windows use /c/Users/$USERNAME/AppData/Local/Programs/Python/Python311/python.exe -m uv venv .env_bacpipe)

(alternatively for poetry use poetry env use 3.11)

activate the environment

source .env_bacpipe/bin/activate (for windows use source .env_bacpipe\Scripts\activate)

Clone the repository

git clone https://github.com/bioacoustic-ai/bacpipe.git

cd into the bacpipe directory (cd bacpipe)

Install the dependencies once the prerequisites are satisfied.

uv pip install -r pyproject.toml

  • this will automatically install requirements based on your os, so windows should also work fine. However, gpu support is not available on windows

For poetry:

poetry lock
poetry install

Alternatively:

uv sync

If for some reason you would prefer a requirements file, use this one for windows:

uv pip install -r requirements_windows.txt

If you do not have admin rights and encounter a permission denied error when using pip install, use python -m pip install ... instead.

OPTIONAL: Add other model checkpoints that are not included by default.

Download the ones that are available from here and create directories corresponding to the pipeline-names and place the checkpoints within them.

Test the installation was successful

By doing so you will also ensure that the directory structure for the model checkpoints will be created.

pytest -v --disable-warnings bacpipe/tests/test_embedding_creation.py

The tests could take a while, so to run a small test, you can also pass the model you would like to test:

pytest -v --disable-warnings bacpipe/tests/test_embedding_creation.py --models=birdnet,perch

(keep in mind you have to have the checkpoints locally for the models that require it)

In case of a permission denied error, run python -m pytest -v --disable-warnings bacpipe/tests/test_embedding_creation.py

If everything passes then you've successfully installed bacpipe and can now proceed to use it.

Usage

Configurations and settings

To see the capabilities of bacpipe, go ahead and run the run_pipeline.py script. This will run the pipeline with the default settings and configurations on a small set of test data.

To use bacpipe on your own data, you will need to modify the configuration files.

The only two files that need to be modified are the config.yaml and settings.yaml files. The config.yaml is used for the standard configurations:

  • path to audio files
  • models to run
  • dimensionality reduction model
  • evaluation tasks
  • whether to run the dashboard or not

The settings.yaml file is used for more advanced configurations and does not need to be modified unless you have specific preferences. It includes settings such as to run on a cpu or a cuda (gpu) (by default cpu), the paths where results are saved, configurations for the evaluation tasks and more.

Modify the config.yaml file in the root directory to specify the path to your dataset. Define what models to run by specifying the strings in the models list (copy and paste as needed, I usually just comment out the models I don't want to run).

If you have already computed embeddings on the dataset specified in audio_dir, and you want to do the dimensionality reduction and evaluation for the models you have already run, you can set the already_computed variable to True. This will only select the models that have already been computed.

In either case if you have already computed embeddings with a model, bacpipe will skip the model and use the already computed embeddings (if they are still located in the same directory). Even if overwrite is set to True, bacpipe will not overwrite the embeddings if they already exist. It will recompute clusterings and label generation.
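The skipping behaviour can be pictured with a small sketch. The function models_to_compute and the one-folder-per-model layout are assumptions made for illustration only; they are not bacpipe's actual implementation or directory structure.

```python
import tempfile
from pathlib import Path


def models_to_compute(models, embeddings_root):
    """Return the models that have no non-empty embedding folder yet."""
    todo = []
    for model in models:
        model_dir = Path(embeddings_root) / model
        # A model counts as computed if its folder exists and is not empty
        if not (model_dir.is_dir() and any(model_dir.iterdir())):
            todo.append(model)
    return todo


# Hypothetical results folder in which birdnet embeddings already exist
root = Path(tempfile.mkdtemp())
(root / "birdnet").mkdir()
(root / "birdnet" / "file_01.npy").write_bytes(b"")  # stand-in embedding file

todo = models_to_compute(["birdnet", "birdmae"], root)
```

In this sketch only birdmae is left to compute, mirroring the idea that existing embeddings are reused rather than overwritten.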

Running the pipeline

Once the configuration is complete, execute the run_pipeline.py file (make sure the environment is activated):

python run_pipeline.py

While the scripts are executed, directories will be created corresponding to the main_results_dir setting. Embeddings will be saved in main_results_dir/YOUR_DATASET/embeddings (see here for more info) and if selected, reduced dimensionality embeddings will be saved in main_results_dir/evaluation/dim_reduced_embeddings (see here for more info).

Model selection

Select the models you want to run in the config.yaml file. The models are specified in this ReadMe and in the test_file. You can select the models you want to run by adding them to the models list in the config.yaml file.

Dimensionality reduction

Different dimensionality reduction models can be selected in the config.yaml file. The available models are specified in the section Dimensionality reduction models. Insert the name of the selected model in the dim_reduction_model variable in the config.yaml file. The default is umap, but you can also select pca, sparse_pca or t_sne.

Dashboard

The dashboard is a panel application that allows you to visualize the generated embeddings. To enable the dashboard, set the dashboard variable in the config.yaml file to True. The dashboard will automatically open in your browser (at http://localhost:8050) after running the run_dashboard.py script.

Evaluation

You can use bacpipe to evaluate the generated embeddings using different metrics. To evaluate the embeddings, you need annotations for your dataset. The annotations should be in a file called annotations.csv in the directory specified in the audio_dir variable in the config.yaml file or the results directory of your dataset main_results_dir/YOUR_DATASET. The file should contain the following columns:

audiofilename,start,end,label:species

Where audiofilename is the name of the audio file, start and end are the start and end times of the annotation in seconds, and label is the label of the annotation.

species is a placeholder here and can be replaced with any label description. So if you have labelled call types, change it to label:call_type. But it's important that there are no spaces and that it contains label:. By doing this you will be able to visualize your data based on all of these label columns.
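To illustrate the label: naming convention, the sketch below parses a header with two label columns. The helper label_columns and the example rows are hypothetical; the snippet only checks the rules stated above (the label: prefix and no spaces).

```python
import csv
import io

# Hypothetical annotations file with two label columns
csv_text = """audiofilename,start,end,label:species,label:call_type
rec_01.wav,0.0,2.5,blackbird,song
rec_01.wav,4.0,6.0,robin,alarm
"""


def label_columns(header):
    """Map each 'label:<name>' column to its <name> part."""
    cols = {}
    for col in header:
        if col.startswith("label:"):
            name = col[len("label:"):]
            # The description must be non-empty and contain no spaces
            if not name or " " in name:
                raise ValueError(f"invalid label column: {col!r}")
            cols[col] = name
    return cols


reader = csv.DictReader(io.StringIO(csv_text))
columns = label_columns(reader.fieldnames)
rows = list(reader)
species = [row["label:species"] for row in rows]
```

A column such as "label: call type" would be rejected by this check because of the spaces.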

The labels can then be used to perform clustering and classification evaluation. This can be done only in regard to one label, so specify the main label column in the label_column variable in settings.yaml. This defaults to species. Only labels that exceed the min_label_occurances value will be used. This is to make sure you have enough data to train linear classifiers and do meaningful evaluations. If you have enough labeled data, feel free to increase this.
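The effect of the minimum occurrence threshold can be sketched as follows. The function filter_rare_labels is hypothetical, and "exceed" is assumed here to mean strictly greater than the threshold.

```python
from collections import Counter


def filter_rare_labels(labels, min_occurrences):
    """Return the set of labels that occur more than min_occurrences times."""
    counts = Counter(labels)
    return {label for label, n in counts.items() if n > min_occurrences}


# Hypothetical label column: 5 blackbirds, 2 robins, 3 wrens
labels = ["blackbird"] * 5 + ["robin"] * 2 + ["wren"] * 3
usable = filter_rare_labels(labels, min_occurrences=2)

# Keep only the annotations whose label is frequent enough
kept_labels = [label for label in labels if label in usable]
```

With a threshold of 2, the two robin annotations are dropped and eight annotations remain for evaluation.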

See the file annotations.csv for an example of what the annotations file should look like.

Once the annotations file is created, add either classification or clustering (or both) to the evaluation_task variable in the config.yaml file (use double quotes: "classification" or "clustering"). You can run the evaluation using the normal python run_pipeline.py command. The evaluation script will automatically use the annotations to compute the clustering and classification performance of the embeddings. The results will be saved in the bacpipe/results/YOUR_DATASET/evaluation directory.

If you selected classification, a linear classifier will be trained and saved in the classification subdirectory of the evaluation folder. This .pt file can be used to generate class predictions with a model that wasn't originally trained on these classes. A tutorial will be available shortly explaining this in more detail. The .pt file can be used in the repository acodet to generate class predictions with the combination of a feature extractor and the trained linear classifier.

Models with classifiers

The following models already contain classification heads:

  • AudioProtoPNet
  • BirdNET
  • Perch_bird
  • SurfPerch
  • google_whale

With all of these models, you only need to set run_pretrained_classifier to True and then the model will save the classification outputs in the classification/original_classifier_outputs folder. Only predictions exceeding the classifier_threshold value will be saved. A csv file in the shape of the annotations.csv file is also saved corresponding to the class predictions. The dashboard will also contain an extra label_by option default_classifier.
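The role of the threshold can be sketched with plain Python. The function filter_predictions, the class names and the scores are hypothetical; the sketch assumes one score per class and window and that "exceeding" means strictly greater than the threshold.

```python
def filter_predictions(scores, class_names, threshold):
    """Return (window_index, class_name, score) for scores above threshold."""
    kept = []
    for i, window_scores in enumerate(scores):
        for name, score in zip(class_names, window_scores):
            if score > threshold:
                kept.append((i, name, score))
    return kept


# Hypothetical classifier outputs for three windows and two classes
class_names = ["blackbird", "robin"]
scores = [
    [0.9, 0.1],
    [0.4, 0.6],
    [0.2, 0.3],
]
kept = filter_predictions(scores, class_names, threshold=0.5)
```

The third window has no score above the threshold, so it contributes no prediction at all.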

Available models

The models all have their model specific code to ensure inference runs smoothly. More info on the models and their pipelines can be found here.

Models currently include:

Name | ref paper | ref code | sampling rate | input length | embedding dimension
--- | --- | --- | --- | --- | ---
AudioMAE | paper | code | 16 kHz | 10 s | 768
AudioProtoPNet | paper | code | 32 kHz | 5 s | 1024
AvesEcho_PaSST | paper | code | 32 kHz | 3 s | 768
AVES_ESpecies | paper | code | 16 kHz | 1 s | 768
BEATs | paper | code | 16 kHz | 10 s | 768
BioLingual | paper | code | 48 kHz | 10 s | 512
BirdAVES_ESpecies | paper | code | 16 kHz | 1 s | 1024
BirdMAE | paper | code | 32 kHz | 10 s | 1280
BirdNET | paper | code | 48 kHz | 3 s | 1024
Google_Whale | paper | code | 24 kHz | 5 s | 1280
HumpbackNET | paper | code | 2 kHz | 3.9124 s | 2048
Insect66NET | paper | code | 44.1 kHz | 5.5 s | 1280
Insect459NET | paper | pending | 44.1 kHz | 5.5 s | 1280
Mix2 | paper | code | 16 kHz | 3 s | 960
NatureBEATs | paper | code | 16 kHz | 10 s | 768
Perch_Bird | paper | code | 32 kHz | 5 s | 1280
ProtoCLR | paper | code | 16 kHz | 6 s | 384
RCL_FS_BSED | paper | code | 22.05 kHz | 0.2 s | 2048
SurfPerch | paper | code | 32 kHz | 5 s | 1280
VGGish | paper | code | 16 kHz | 0.96 s | 128

More details on the models:

Name | paper | code | training | CNN/Trafo | architecture | checkpoint link
--- | --- | --- | --- | --- | --- | ---
AudioMAE | paper | code | ssl + ft | trafo | ViT | weights
AudioProtoPNet | paper | code | sup l | CNN | ConvNext | included
AvesEcho_PaSST | paper | code | sup l | trafo | PaSST | weights
AVES_ESpecies | paper | code | ssl | trafo | HuBERT | weights
BEATs | paper | code | ssl | trafo | ViT | weights
BioLingual | paper | code | ssl | trafo | CLAP | included
BirdAVES_ESpecies | paper | code | ssl | trafo | HuBERT | weights
BirdMAE | paper | code | ssl | trafo | ViT | included
BirdNET | paper | code | sup l | CNN | EffNetB0 | weights
Google_Whale | paper | code | sup l | CNN | EffNetb0 | included
HumpbackNET | paper | code | sup l | CNN | ResNet50 | weights
Insect66NET | paper | code | sup l | CNN | EffNetv2s | weights
Insect459NET | paper | pending | sup l | CNN | EffNetv2s | pending
Mix2 | paper | code | sup l | CNN | MobNetv3 | release pending
NatureBEATs | paper | code | ssl | trafo | BEATs | weights
Perch_Bird | paper | code | sup l | CNN | EffNetb0 | included
ProtoCLR | paper | code | sup cl | trafo | CvT-13 | weights
RCL_FS_BSED | paper | code | sup cl | CNN | ResNet9 | weights
SurfPerch | paper | code | sup l | CNN | EffNetb0 | included
VGGish | paper | code | sup l | CNN | VGG | weights
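The sampling rate and input length in the tables above determine how many audio samples each model sees per window. The following sketch works through that arithmetic. The helper names are hypothetical, and the window count assumes non-overlapping windows with a padded final window, which may differ from how bacpipe segments files.

```python
import math


def samples_per_window(sample_rate_hz, input_length_s):
    """Number of audio samples in one model input window."""
    return round(sample_rate_hz * input_length_s)


def num_windows(file_length_s, input_length_s):
    """Windows needed to cover a file, assuming no overlap and a padded tail."""
    return math.ceil(file_length_s / input_length_s)


# BirdNET: 48 kHz sampling rate, 3 s input length
birdnet_samples = samples_per_window(48_000, 3)

# VGGish: 16 kHz sampling rate, 0.96 s input length
vggish_samples = samples_per_window(16_000, 0.96)

# A 60 s recording split into 3 s windows, and a 10 s recording likewise
windows_60s = num_windows(60, 3)
windows_10s = num_windows(10, 3)
```

Under these assumptions a 60 s file yields 20 BirdNET windows, and each window would produce one embedding of the dimension listed in the table.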

Brief description of models

All information is extracted from the respective repositories and manuscripts. Please refer to them for more details

AudioMAE

  • spectrogram input
  • self-supervised pretrained model, fine-tuned
  • vision transformer
  • trained on general audio

AudioMAE from the facebook research group is a vision transformer pretrained on AudioSet-2M data and fine-tuned on AudioSet-20K.

AudioProtoPNet

  • spectrogram input
  • supervised learning, trained using asymmetric loss
  • ConvNext architecture
  • trained on the xeno-canto large section of BirdSet

This CNN is trained in two phases. The main contribution of this model is its interpretability. It learned prototypes during its second training phase which can be used during inference time to visualize sections of the spectrogram that were most important for classification. It also reaches competitive performance on bird classification tasks. The included classifier can distinguish 9736 classes.

AvesEcho_PaSST

  • transformer
  • supervised pretrained model, fine-tuned
  • pretrained on general audio and bird song data

AvesEcho_PaSST is a vision transformer trained on AudioSet and (deep) fine-tuned on xeno-canto. The model is based on the PaSST framework.

AVES_ESpecies

  • transformer
  • self-supervised pretrained model
  • trained on general audio

AVES_ESpecies is short for Animal Vocalization Encoder based on Self-Supervision by the Earth Species Project. The model is based on the HuBERT-base architecture. The model is pretrained on unannotated audio datasets AudioSet-20K, FSD50K and the animal sounds from AudioSet and VGGSound.

BEATs

  • trafo
  • self-supervised learning
  • trained on AudioSet

BEATs is Microsoft's SotA audio model based on audio pre-training with acoustic tokenizers. The model reaches competitive results with many bioacoustic models in benchmarks for linear and attentive probing, and is therefore also included in bacpipe as a general audio baseline model.

BioLingual

  • transformer
  • spectrogram input
  • contrastive-learning
  • self-supervised pretrained model
  • trained on animal sound data (primarily bird song)

BioLingual is a language-audio model trained on captioning bioacoustic datasets including xeno-canto and iNaturalist. The model architecture is based on the CLAP model architecture.

BirdAVES_ESpecies

  • transformer
  • self-supervised pretrained model
  • trained on general audio and bird song data

BirdAVES_ESpecies is short for Bird Animal Vocalization Encoder based on Self-Supervision by the Earth Species Project. The model is based on the HuBERT-large architecture. The model is pretrained on unannotated audio datasets AudioSet-20K, FSD50K and the animal sounds from AudioSet and VGGSound as well as bird vocalizations from xeno-canto.

BirdMAE

  • trafo (ViT)
  • self-supervised model
  • trained on XCM

BirdMAE is a masked autoencoder inspired by Meta's AudioMAE; however, the model was heavily adapted for the bioacoustic domain. The model was trained on the xeno-canto M dataset (1.7 million samples) from BirdSet and evaluated on various soundscape datasets, where it outperformed all competing models (including SotA bioacoustic models).

BirdNET

  • CNN
  • supervised training model
  • trained on bird song data

BirdNET (v2.4) is based on an EfficientNet (B0) architecture. The model is trained on a large amount of bird vocalizations from the xeno-canto database alongside other bird song databases.

Google_Whale

  • CNN
  • supervised training model
  • trained on 7 whale species

Google_Whale (multispecies_whale) is an EfficientNet B0 model trained on whale vocalizations and other marine sounds.

HumpbackNET

  • CNN
  • supervised training model
  • trained on humpback whale song

HumpbackNET is a binary classifier based on a ResNet-50 model trained on humpback whale data from different parts in the North Atlantic.

Insect66NET

  • CNN
  • supervised training model
  • trained on insect sounds

Insect66NET is an EfficientNet v2 s model trained on the Insect66 dataset, which includes sounds of grasshoppers, crickets and cicadas, developed by the winning team of the Capgemini Global Data Science Challenge 2023.

Insect459NET

  • CNN
  • supervised training model
  • trained on insect sounds

Insect459NET is an EfficientNet v2 s model trained on the Insect459 dataset (publication pending).

Mix2

  • CNN
  • supervised training model
  • trained on frog sounds

Mix2 is a MobileNet v3 model trained on the AnuranSet which includes sounds of 42 different species of frogs from different regions in Brazil. The model was trained using a mixture of Mixup augmentations to handle the class imbalance of the data.

NatureBEATs

  • trafo
  • self-supervised training model
  • trained on diverse set of bioacoustics, general sound, music, human speech

NatureLM-Audio is a very ambitious foundational model specifically for bioacoustics. It uses Microsoft's BEATs backbone as an audio encoder along with Meta's Llama-3.1-8B large language model capabilities. In the implementation used here in bacpipe, only the support for BEATs audio-encoder with NatureLM-Audio's weights, referred to here as NatureBEATs, is provided.

RCL_FS_BSED

  • CNN
  • supervised contrastive learning
  • trained on dcase 2023 task 5 dataset link

RCL_FS_BSED stands for Regularized Contrastive Learning for Few-shot Bioacoustic Sound Event Detection and features a model based on a ResNet model. The model was originally created for the DCASE bioacoustic few shot challenge (task 5) and later improved.

ProtoCLR

  • transformer
  • supervised contrastive learning
  • trained on bird song data

ProtoCLR stands for Prototypical Contrastive Learning for robust representation learning. The architecture is a CvT-13 (Convolutional vision transformer) with 20M parameters. ProtoCLR has been validated on transfer learning tasks for bird sound classification, showing strong domain-invariance in few-shot scenarios. The model was trained on the xeno-canto dataset.

Perch_Bird

  • CNN
  • supervised training model
  • trained on bird song data

Perch_Bird is an EfficientNet B1 model trained on the entire Xeno-canto database.

SurfPerch

  • CNN
  • supervised training model
  • trained on bird song, fine-tuned on tropical reef data

SurfPerch is an EfficientNet B1 model trained on the entire Xeno-canto database and fine-tuned on coral reef and unrelated sounds.

VGGish

  • CNN
  • supervised training model
  • trained on general audio

VGGish is a model based on the VGG architecture. The model is trained on audio from YouTube videos (YouTube-8M).

Dimensionality reduction models

To evaluate the generated embeddings a number of dimensionality reduction models are included in this repository:

name reference code reference linear
UMAP paper code No
t-SNE paper code No
PCA paper code Yes
Sparse_PCA paper code Yes

Add a new model

To add a new model, simply add a pipeline with the name of your model. Make sure your model meets the following criteria:

  • define the model specific sampling rate
  • define the model specific input segment length
  • define a class called "Model" which inherits the ModelBaseClass from bacpipe.utils
  • define the __init__, preprocess, and __call__ methods so that the model can be called
  • if necessary save the checkpoint in the bacpipe.model_checkpoints dir with the name corresponding to the name of the model
  • if you need to import code where your specific model class is defined, create a directory in bacpipe.model_specific_utils corresponding to your model name "newmodel" and add all the necessary code in there

Here is an example:

import torch

from bacpipe.model_specific_utils.newmodel.module import MyClass
from ..utils import ModelBaseClass

SAMPLE_RATE = 12345  # model specific sampling rate in Hz
LENGTH_IN_SAMPLES = int(10 * SAMPLE_RATE)  # input segment length (here 10 s)

class Model(ModelBaseClass):
    def __init__(self, **kwargs):
        super().__init__(sr=SAMPLE_RATE, segment_length=LENGTH_IN_SAMPLES, **kwargs)
        self.model = MyClass()
        # load the checkpoint from the model checkpoint directory
        state_dict = torch.load(
            self.model_base_path + "/newmodel/checkpoint_path.pth",
            weights_only=True,
        )
        self.model.load_state_dict(state_dict)

    def preprocess(self, audio):  # audio is a torch.Tensor object
        # insert your preprocessing steps here;
        # this placeholder passes the audio through unchanged
        processed_audio = audio
        return processed_audio

    def __call__(self, x):
        # by default the model will be called with .eval() mode
        return self.model(x)

Most of the models are based on pytorch. For tensorflow models, see birdnet, hbdet or vggish.


Contribute

This repository is intended to be a collaborative project for people working in the field of bioacoustics. If you think there is some improvement that could be useful, please raise an issue, submit a PR or get in touch.

There are two main intentions for this repository that should always be considered when contributing:

  1. Only add new requirements if truly necessary

Given the large number of different models, there are already a lot of requirements. To ensure that the repository is stable, and installation errors are kept minimal, please only add code with new requirements if truly necessary.

  2. The main purpose of bacpipe is quickly generating embeddings from models

There should always be a baseline minimal use case, where embeddings are created from different feature extractors and everything else is an add-on.

Known issues

Given that this repository compiles a large number of very different deep learning models with different requirements, some issues have been noted.

Please raise issues if there are questions or bugs.

Previous versions of bacpipe included models like animal2vec, but the requirements conflicts led me to remove them. In the future I hope there will be an updated version of those models and then they will be included again.

Citation

A lot of work has gone into creating these bioacoustic models, both by data collectors and by machine learning practitioners. Please cite the authors of the respective models (all models are referenced in the table above).

This work is first described in a conference paper. If you use bacpipe for your research, please include the following reference:

@misc{kather2025clusteringnovelclassrecognition,
      title={Clustering and novel class recognition: evaluating bioacoustic deep learning feature extractors}, 
      author={Vincent S. Kather and Burooj Ghani and Dan Stowell},
      year={2025},
      eprint={2504.06710},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2504.06710}, 
}

Newsletter and Q&A sessions

Reading from the traffic on the repository, there seems to be an interest in bacpipe. I have set up a newsletter under this link: https://buttondown.com/vskode. Once more than 30 people have signed up for the newsletter, I will schedule a Q&A session and post the link in the newsletter. Hopefully I can then help answer questions and address issues that people are running into.
