Canapy is a user-friendly automatic annotator for animal vocalizations, built on reservoir computing.

Canapy

Automatic audio annotation tools for animal vocalizations


Canapy trains automatic annotators for animal vocalizations using Reservoir Computing (Echo State Networks). It comes with an interactive dashboard to guide you through the full pipeline: dataset preparation, model training, evaluation, and annotation.

For the full reference documentation, see README_extended.md.

Installation

git clone git@github.com:birds-canopy/canapy.git
pip install -e canapy/.

Quick start

1. Prepare your dataset

You need hand-labeled audio recordings. Annotations should be .csv files in marron1csv format (columns: wave, start, end, syll). Audio must be mono WAV files.
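
Before loading a dataset, it can help to check these two requirements up front. The following is a minimal sketch using only the Python standard library, not part of Canapy itself: the function names are hypothetical, and the standard `wave` module only reads PCM-encoded WAV files, so other encodings would raise an error here even if Canapy accepts them.

```python
import csv
import wave

# Column names taken from the format description above.
REQUIRED_COLUMNS = {"wave", "start", "end", "syll"}


def missing_columns(csv_path):
    """Return the required columns absent from the CSV header (empty set if none)."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    return REQUIRED_COLUMNS - {name.strip() for name in header}


def is_mono_wav(wav_path):
    """Return True if the (PCM) WAV file has exactly one channel."""
    with wave.open(str(wav_path), "rb") as wav_file:
        return wav_file.getnchannels() == 1
```

Running both checks on every file pair before launching the dashboard surfaces format problems early.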

Recommended structure:

song_dataset/
├── annotations/
│   ├── song1.csv
│   └── song2.csv
└── audio/
    ├── song1.wav
    └── song2.wav

or

song_dataset/
├── song1.wav
├── song1.csv
├── song2.wav
└── song2.csv

Aim for 30 min–1 hour of annotated data (10 min can already give good results on canary songs).
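
One rough way to see how much annotated data you have is to sum the labeled intervals in the annotation files. The sketch below assumes that `start` and `end` are expressed in seconds, which is not stated above, and that silence, if annotated, uses a label you pass in yourself (`"SIL"` in the usage note is a hypothetical label, not a Canapy convention).

```python
import csv
from pathlib import Path


def annotated_seconds(annotation_dir, silence_labels=()):
    """Sum (end - start) over every row of every CSV in annotation_dir.

    Rows whose label is in silence_labels are skipped.
    Assumes start/end are in seconds.
    """
    total = 0.0
    for csv_path in sorted(Path(annotation_dir).glob("*.csv")):
        with open(csv_path, newline="") as handle:
            for row in csv.DictReader(handle):
                if row["syll"] in silence_labels:
                    continue
                total += float(row["end"]) - float(row["start"])
    return total
```

For example, `annotated_seconds("song_dataset/annotations", silence_labels={"SIL"}) / 60` gives the labeled duration in minutes under those assumptions.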

2. Launch the dashboard

canapy dash

The dashboard opens automatically at localhost:9321.

You can also pass paths directly as arguments:

canapy dash -a song_dataset/annotations -s song_dataset/audio -o output

Using the dashboard

The dashboard is organized around a sidebar giving access to all pages. The typical workflow for training a new annotator is:

Load data → Preprocess → Train → Eval → (iterate) → Export → Annotate

Home

The Home page gives an overview of Canapy and its pipelines, with a FAQ section and links to the GitHub repository and the Mnemosyne INRIA team.

Load data

On the Load data page, specify where your data lives:

  • Source selection: choose combined folders if audio and annotations are in the same directory, or separate folders otherwise.
  • Model directory: if you want to annotate unlabeled data, point to a folder containing already-trained models (exported after using the training pipeline).
  • Output directory: where models and annotations will be saved (defaults to output/).
  • Annotation format and audio extension: set these to match your files.
  • The sampling rate is auto-detected from your audio files. Enable Downsample if you want audio resampled to that rate at load time.

Preprocess (Edit dataset)

Use this page to clean your dataset before training. It has two sections, Class merge and Sample correction, both described below.

Two collapsible modules are always accessible at the top of the page, regardless of the active section:

  • Class spectrograms — displays one representative mel spectrogram per class (the sample closest to the median duration). Useful for quickly spotting acoustically similar classes before merging.
  • Trim silences (Preprocess only) — balances the proportion of silence in the dataset. Set the target silence ratio (e.g. 20%) and Canapy center-crops silence segments that exceed it. Trimmed files are saved to output/audio_trimmed/ and output/annots_trimmed/.
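
To make the silence-trimming idea concrete, here is a minimal sketch of one way it could work. It is an illustration, not Canapy's actual implementation. It assumes that "center-crop" means keeping the central part of an overly long silence, and that a single duration cap is applied to all silence segments so that the total silence matches the target ratio. The function names are hypothetical.

```python
def silence_cap(silence_durations, voiced_total, target_ratio):
    """Return the longest silence duration to keep so that
    silence / (silence + voiced) equals target_ratio.

    Returns None when the dataset is already at or below the target.
    """
    if not 0 < target_ratio < 1:
        raise ValueError("target_ratio must be strictly between 0 and 1")
    # Total silence allowed for the requested ratio.
    budget = target_ratio * voiced_total / (1.0 - target_ratio)
    if sum(silence_durations) <= budget:
        return None
    remaining = budget
    ordered = sorted(silence_durations)
    for index, duration in enumerate(ordered):
        left = len(ordered) - index
        # If capping every remaining segment at `duration` reaches the
        # budget, the cap is the remaining budget shared equally.
        if duration * left >= remaining:
            return remaining / left
        remaining -= duration  # short segment, kept whole
    return ordered[-1]  # only reachable through rounding error


def center_crop(start, end, max_duration):
    """Shrink the interval [start, end] around its midpoint to max_duration."""
    if end - start <= max_duration:
        return start, end
    middle = (start + end) / 2.0
    return middle - max_duration / 2.0, middle + max_duration / 2.0
```

With silences of 1 s, 4 s and 10 s, 8 s of vocalization and a 50% target, the cap is 3.5 s: the 1 s silence is kept whole and the other two are cropped to 3.5 s each.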

Class merge — listen to each annotation class, compare them side by side, and rename or merge classes that are acoustically too similar. Type the new label in the text field and click Apply.

Sample correction — review individual samples per class. Select a class to listen to its samples one by one, view their spectrogram, and correct any mislabeled ones via the text input. Click Save all when done.

You can export corrected annotations at any time using the Export Annotations button at the top of the page.

Train

Hyperparameter search (optional): Before training, you can run an automatic HP search to find better ESN parameters for your dataset. It uses TPE (sequential) or parallel random search. Key settings (on the Settings page): opt_max_evals, opt_n_jobs, opt_max_percentage. Optimized parameters are preserved across training iterations — no need to re-run the search each time.
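
For readers unfamiliar with random search, the sketch below shows the general idea in plain Python. It is not Canapy's search code: the sampling ranges are illustrative guesses, `objective` is a placeholder for "train an ESN with these parameters and return a validation error", and `max_evals` only mirrors the role that opt_max_evals appears to play. The version shown is sequential; a parallel variant would evaluate the sampled configurations concurrently.

```python
import math
import random


def log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def random_search(objective, max_evals, seed=0):
    """Evaluate max_evals random configurations.

    Returns (best_loss, best_params), or None if max_evals is 0.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(max_evals):
        # Ranges below are assumptions chosen for illustration only.
        params = {
            "sr": log_uniform(rng, 0.1, 2.0),      # spectral radius
            "leak": log_uniform(rng, 1e-3, 1.0),   # leak rate
            "ridge": log_uniform(rng, 1e-8, 1e-1), # regularization
        }
        loss = objective(params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best
```

A toy call such as `random_search(lambda p: abs(p["sr"] - 0.5), max_evals=50)` returns the sampled configuration whose spectral radius is closest to 0.5.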

Click Start Training. Three ESN-based models are trained:

Model      Description
syn        Trained on complete songs in order; uses sequential/syntactic context
nsyn       Trained on randomly shuffled, class-balanced samples; context-free
ensemble   Combines syn and nsyn by majority vote
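
As an illustration of majority voting, the sketch below pools label predictions from several sources and returns the most frequent one. How Canapy actually combines the two models is not detailed here. This example assumes each model emits one label per time frame and that the vote is taken over all frames of a segment from both models. The function name is hypothetical.

```python
from collections import Counter


def majority_vote(*label_sequences):
    """Return the most frequent label across all given sequences.

    Ties go to the label that was encountered first.
    """
    counts = Counter()
    for labels in label_sequences:
        counts.update(labels)
    if not counts:
        raise ValueError("no labels to vote on")
    return counts.most_common(1)[0][0]
```

For one segment where syn predicts frames `["A", "A", "B"]` and nsyn predicts `["B", "B", "B"]`, the pooled vote is `"B"`.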

Eval

After training, the Eval page shows:

  • Confusion matrix — which classes are being confused with each other.
  • Per-class metrics table — precision, recall, F1-score for each class (1.0 = perfect).
  • Class merge / sample correction — same tools as Preprocessing, but focused on misclassified samples from the model's output.
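
The per-class metrics follow directly from the confusion matrix. The sketch below uses the standard definitions and assumes rows are true classes and columns are predicted classes. The orientation used by the dashboard is not stated here, so check it before comparing numbers.

```python
def per_class_metrics(confusion, labels):
    """Compute precision, recall and F1 for each class.

    confusion[i][j] counts samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    metrics = {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        predicted = sum(row[k] for row in confusion)  # column total
        actual = sum(confusion[k])                    # row total
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics
```

A class with low precision is being over-predicted (other classes are mistaken for it), while low recall means its own samples are being assigned elsewhere. Both patterns point to candidates for merging or relabeling.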

If results are unsatisfactory, correct samples or merge classes, then retrain. 3–4 iterations of train → eval are typically enough to converge. All corrections are preserved across iterations.

Export

When you are satisfied with the model's performance, go to the Export page to save the trained models to the output folder. These exported models are then used in the Annotate pipeline. You can also export the current configuration (ESN and species parameters) from this page — or from the Settings page — to reuse it as a personal preset in future sessions.

Annotate unlabeled data

Load your unlabeled audio and trained models via the Load data page, or pass them directly at launch:

canapy dash -d song_dataset/audio -c output/model -o output/annotations

Select which model(s) to use (Syn-ESN, NSyn-ESN, Ensemble), click Start annotation, then Export annotation when done. The export folder is named with a timestamp (YYYY-MM-DD_HHhMMminSS).
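
If you post-process exports in a script, the timestamped folder names can be parsed to find the most recent one. The sketch below assumes the folder name consists of the timestamp alone and that the pattern maps to the `strftime` format shown (the second `MM` being minutes). Both are assumptions, and the helper names are hypothetical.

```python
from datetime import datetime

# Assumed strftime equivalent of YYYY-MM-DD_HHhMMminSS.
EXPORT_STAMP = "%Y-%m-%d_%Hh%Mmin%S"


def parse_export_stamp(folder_name):
    """Parse a folder name such as 2024-01-02_03h04min05 into a datetime."""
    return datetime.strptime(folder_name, EXPORT_STAMP)


def latest_export(folder_names):
    """Return the most recent folder name, ignoring names that do not match."""
    dated = []
    for name in folder_names:
        try:
            dated.append((parse_export_stamp(name), name))
        except ValueError:
            continue  # not a timestamped export folder
    return max(dated)[1] if dated else None
```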

If you trained with multiple iterations, you can pick a specific one: load output/model/3 instead of output/model.
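
To pick the latest iteration from a script, a small helper can scan the model directory. This sketch assumes iteration folders are named with plain integers, as `output/model/3` suggests. The helper name is hypothetical.

```python
from pathlib import Path


def latest_iteration(model_dir):
    """Return the numerically highest sub-folder of model_dir, or None."""
    numbered = [
        path for path in Path(model_dir).iterdir()
        if path.is_dir() and path.name.isdigit()
    ]
    # Compare as integers so that 10 sorts after 9.
    return max(numbered, key=lambda path: int(path.name), default=None)
```

The returned path can then be passed as the model directory, for example as the `-c` argument at launch.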

Settings

The Settings page lets you configure the parameters used by Canapy:

  • Species parameters: fmin, fmax (frequency range), win_length, hop_length, n_fft (spectrogram).
  • ESN parameters: sr (spectral radius), leak (leak rate), iss, isd, isd2 (input scalings), ridge (regularization).
  • HP search parameters: opt_max_evals, opt_n_jobs, opt_max_percentage.
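
To get a feel for how the spectrogram parameters affect time resolution, the arithmetic can be sketched as follows. Whether Canapy expresses win_length and hop_length in seconds or in samples is not stated here. The sketch converts from seconds and assumes no padding at the signal edges; skip the conversion if your values are already in samples. Note that `sr` in the Settings page means spectral radius, so the sampling rate is spelled out as `sampling_rate` below.

```python
def to_samples(seconds, sampling_rate):
    """Convert a duration in seconds to a whole number of samples."""
    return int(round(seconds * sampling_rate))


def frame_count(n_samples, win_samples, hop_samples):
    """Number of full analysis windows that fit in the signal (no padding)."""
    if n_samples < win_samples:
        return 0
    return 1 + (n_samples - win_samples) // hop_samples
```

For one second of audio at 44100 Hz with a 20 ms window and a 10 ms hop, this gives windows of 882 samples, hops of 441 samples, and 99 frames. A shorter hop yields more frames per second, which means finer temporal resolution at a higher computational cost.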

Click Validate to apply changes to the current session.

Presets — the Settings page provides pre-configured profiles for specific species (canary, Bengalese finch, zebra finch, mouse, infant marmoset). The canary preset is loaded by default. You can also load your own configuration file via Load config. If you have tuned Canapy for a new species, feel free to send us your configuration so we can add it to the preset library.


Quick commands (no dashboard)

# Train once and export models
canapy train -d song_dataset/data -o output

# Annotate unlabeled audio with trained models
canapy annotate -d song_dataset/audio -c models_folder -o output
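
To annotate several audio folders in one go, the command above can be driven from Python. The sketch below only uses the `-d`, `-c` and `-o` flags shown in this section. The helper names and the one-output-folder-per-input layout are assumptions, and `canapy` must be available on your PATH.

```python
import subprocess
from pathlib import Path


def annotate_command(audio_dir, model_dir, output_dir):
    """Build the argument list for `canapy annotate`."""
    return [
        "canapy", "annotate",
        "-d", str(audio_dir),
        "-c", str(model_dir),
        "-o", str(output_dir),
    ]


def annotate_many(audio_dirs, model_dir, output_root):
    """Run `canapy annotate` once per audio folder, stopping on the first failure."""
    for audio_dir in audio_dirs:
        output_dir = Path(output_root) / Path(audio_dir).name
        subprocess.run(annotate_command(audio_dir, model_dir, output_dir), check=True)
```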

Support

Contact Axel Arnaud or Xavier Hinaut at Inria Mnemosyne: axel.arnaud@inria.fr, xavier.hinaut@inria.fr
