JupyterLab bioacoustic audio review plugin

JupyterBioacoustic

A JupyterLab plugin for reviewing and annotating bioacoustic audio clips.

JupyterBioacoustic Plugin

Documentation

Table of Contents

Install
Quick Start
Features
Documentation
Usage
BioacousticAnnotator Parameters
Demo
License

Install

pip install jupyter-bioacoustic

See the Development wiki for building from source.

Quick Start

from jupyter_bioacoustic import BioacousticAnnotator

BioacousticAnnotator(
    data='detections-test.csv',
    audio='test.flac',
    ident_column='common_name',
    form_config='form-review.yaml',
    output='reviews.csv',
).open()

See the Quick Start guide for test files and more examples.

Features

Player / Visualizer — browse audio clips with interactive spectrograms and other visualizations (linear, mel, log-frequency, or custom)
Configurable Forms — YAML-driven annotation and review forms with selects, textboxes, checkboxes, and conditional sections
Annotation Tools — time markers, start/end lines, bounding boxes, and multibox (multiple labeled regions per clip)
Built-In & Custom Visualizations — built-in linear and mel spectrograms, log-frequency, bandpass and waveform visualizations. Easily integrate third-party libraries (OpenSoundscape, librosa, SciPy, ...) or write your own
Zoom and Capture — keyboard/mouse zoom, drag-to-pan, zoom-to-selection box, and PNG export
Flexible data sources — CSV, Parquet, SQL (DuckDB), API endpoints, S3 byte-range reads
Syncing — sync local annotation files to remote storage such as S3 and GCS

Documentation

Full documentation is on the wiki:

Quick Start — Installation and first usage
Configuration — All parameters, config files, capture, S3, kwargs
Configurable Forms — YAML form layout reference
Annotation Tools — Spectrogram interaction tools
Data Schema — Input and output formats
API Reference — BioacousticAnnotator class, properties, methods
Audio IO — jupyter_bioacoustic.audio module reference
Demo — Running the demo notebooks
Development — Project structure, build tasks, architecture

Usage

The BioacousticAnnotator class has a deliberately small interface: two methods, .open(inline=True) and .output(force=False), and one property, .source.

from jupyter_bioacoustic import BioacousticAnnotator

# Create an instance
ja = BioacousticAnnotator(data='path_to_data.parquet', ...)

# Open the interface
ja.open()

# Get a dataframe with all the submitted data
# Note: this data is lazily loaded. It is re-read from file
#       after each submission and cached between
#       submissions.
result_df = ja.output()
result_df = ja.output(force=True)  # force re-read from disk

# Dataframe access to the source data (here 'path_to_data.parquet')
ja.source
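
Because the output is written to a local file (reviews.csv in the Quick Start), submitted rows can also be inspected outside the widget. The sketch below tallies one column of such a file using only the standard library. It assumes the file is a CSV with a header row; the `is_valid` column name, the helper `tally_column`, and the sample rows are illustrative and not part of the package.

```python
import csv
import tempfile
from collections import Counter
from pathlib import Path


def tally_column(csv_path, column):
    """Count how often each value appears in one column of a CSV file."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        return Counter(row[column] for row in reader)


# Illustrative stand-in for an output file; real rows come from the widget.
sample_path = Path(tempfile.mkdtemp()) / "reviews.csv"
with open(sample_path, "w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["common_name", "is_valid"])  # assumed column names
    writer.writerows([
        ["Barred Owl", "yes"],
        ["Barred Owl", "no"],
        ["Wood Thrush", "yes"],
    ])

counts = tally_column(sample_path, "is_valid")
print(counts["yes"], counts["no"])
```

Inside a notebook, ja.output() already returns this data as a dataframe, so a helper like this is mainly useful for scripts that run without the widget.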

The parameters for BioacousticAnnotator are listed below. One special parameter, config, can be used instead of providing the parameter values directly in the notebook. This helps with reproducibility and organization, and keeps notebooks from becoming bloated.

Consider the example above:

BioacousticAnnotator(
    data='detections-test.csv',
    audio='test.flac',
    ident_column='common_name',
    form_config='form-review.yaml',
    output='reviews.csv',
)

The same annotator can instead be created this way:

BioacousticAnnotator(
    data='detections-test.csv',
    config='config/review-configuration.yaml',
).open()

# config/review-configuration.yaml
audio: 'test.flac'
ident_column: 'common_name'
form_config: 'form-review.yaml'
output: 'reviews.csv'

For this simple example, this might not seem helpful, but for more advanced configurations it is quite useful. Moreover, in the example above the review form has its own configuration file, form-review.yaml. When using config, the form can be included directly.
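
The parameter table describes config as a path to a YAML or JSON file, so the same settings could presumably be generated from Python as JSON. The following is a minimal sketch using only the standard library. The file name review-configuration.json is illustrative, and it is an assumption that JSON keys mirror the YAML keys one-to-one.

```python
import json
import tempfile
from pathlib import Path

# Same keys and values as config/review-configuration.yaml.
settings = {
    "audio": "test.flac",
    "ident_column": "common_name",
    "form_config": "form-review.yaml",
    "output": "reviews.csv",
}

# A temporary directory keeps the sketch self-contained; in a project this
# would be the config/ directory next to the notebook.
config_dir = Path(tempfile.mkdtemp()) / "config"
config_dir.mkdir(parents=True, exist_ok=True)
config_path = config_dir / "review-configuration.json"
config_path.write_text(json.dumps(settings, indent=2))

# Round-trip check: the file parses back to the same mapping.
loaded = json.loads(config_path.read_text())
print(loaded == settings)
```

The resulting path would then be passed as the config argument in place of the YAML path.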

See Configuration for full details. Here is an advanced example:

# BioacousticAnnotator Args
audio: "audio_path"    # column name — auto-detected (no slashes or dots)
data_columns: ["common_name", "confidence", "start_time", "county", "audio_path"]
ident_column: 'common_name'
display_columns: ["confidence", "county", "start_time", "audio_path"]
capture: 'Save Spectrogram'
capture_dir: 'spectrograms'


# Form
form_config:
    title:
      value: 'REVIEW CLIP'
      progress_tracker: true
    select:
      label: Is Valid
      column: is_valid
      required: true
      items:
        - label: 'yes'
          value: 'yes'
        - label: 'no'
          value: 'no'
          form: correction_form
    textbox:
      label: notes
      column: notes
    annotation:
      start_time:
        label: Start
        column: start_time
        source_value: start_time
      end_time:
        label: End
        column: end_time
        source_value: end_time
      tools: start_end_time_select
    correction_form:
      - select:
          label: verified name
          column: verified_common_name
          required: true
          items:
            path: data/categories.csv
            value: common_name
      - select:
          label: verif. confidence
          column: verification_confidence
          items:
            - low
            - medium
            - high
    submission_buttons:
      line: true
      next:
        label: Skip
      submit:
        label: Verify

BioacousticAnnotator Parameters

Parameter	Type	Default	Description
`data`	DataFrame / str / dict	required*	Input data. String: file path, URL, `api::url`, or SQL (`SELECT ...`). Dict: `{path|url|uri|api|sql, secrets, columns}`.
`data_path`	str	`None`	Explicit file path for data (overrides `data` source).
`data_url`	str	`None`	Explicit URL for data (overrides `data` source).
`data_sql`	str	`None`	Explicit SQL query for data (overrides `data` source).
`data_api`	str	`None`	Explicit API endpoint for data (overrides `data` source).
`data_secrets`	dict, list, or `false`	`None`	Auth for data loading. `{key, value}` pairs. Value: `env:VAR`, `dialog`, or literal. Set to `false` to opt out of global `secrets` fallback.
`data_columns`	list	`[]`	Columns for the clip table.
`audio`	str or dict	required*	Audio source. String: local path, URL/URI, or column name (auto-detected). Dict: `{path|url|uri|column|sql|api|src, prefix, suffix, fallback, secrets, property, response_index}`.
`audio_src`	str	`None`	Audio source string (auto-detected as path, URL, or column name). Same as passing a bare string to `audio`.
`audio_path`	str	`None`	Explicit local file path for audio (overrides `audio` source).
`audio_url`	str	`None`	Explicit URL for audio (overrides `audio` source).
`audio_uri`	str	`None`	Alias for `audio_url`.
`audio_column`	str	`None`	Explicit column name for per-row audio (overrides `audio` source).
`audio_prefix`	str	`''`	Prefix joined with `/` to audio paths.
`audio_suffix`	str	`''`	Suffix joined with `/` to audio paths.
`audio_fallback`	str	`''`	Fallback when `audio` is a column and the row value is empty.
`audio_secrets`	dict, list, or `false`	`None`	Auth for audio loading (same format as `data_secrets`). Set to `false` to opt out of global `secrets` fallback.
`audio_sql`	str	`None`	SQL query to resolve audio path. Requires `audio_property`.
`audio_api`	str	`None`	API URL to resolve audio path. Requires `audio_property`.
`audio_property`	str	`None`	Field/column to extract from SQL/API response as the audio path.
`audio_response_index`	int	`1`	1-based row index for SQL/API response (1 = first row).
`secrets`	dict or list	`None`	Global auth — fallback for `data_secrets`, `audio_secrets`, and `output_secrets`. Each can opt out by setting to `false`.
`output`	str or dict	`''`	Output file path or sync config dict. String: local path. Dict: `{path, uri/url, sync_button, recursive, secrets}`. See Output & Sync.
`output_path`	str	`None`	Explicit local output file path (overrides `output` string or `output.path` dict key).
`output_url`	str	`None`	Remote sync destination (overrides `output.uri`/`output.url` dict key).
`output_uri`	str	`None`	Alias for `output_url`.
`output_sync_button`	bool / str	`None`	Show a sync button (`true` = "Sync", string = custom label).
`output_recursive`	bool	`None`	Passed to `io.write()` for uploading directories.
`output_secrets`	dict, list, or `false`	`None`	Auth for sync uploads (same format as `data_secrets`). Set to `false` to opt out of global `secrets` fallback.
`form_config`	dict / str	`None`	Form layout — YAML file, dict, or `None` for no form.
`ident_column`	str	`''`	Identifying column — shown first (without label) in the info card and capture filenames.
`app_title`	str	`'Jupyter Bioacoustic'`	Custom title shown in the widget header and tab.
`display_columns`	list	`[]`	Extra columns in the info card.
`duplicate_entries`	bool	`False`	Allow multiple submissions per row
`default_buffer`	int / float	`3`	Default buffer time in seconds around each clip
`capture`	bool / str	`True`	Capture button (`False` to hide, string for custom label)
`capture_dir`	str	`''`	Directory prefix for captures
`capture_height`	int	`None`	Capture image height in pixels. Defaults to `player_height` if not set.
`spectrogram_resolution`	int / list	`[1000, 2000, 4000]`	Spectrogram width in pixels. List for a dropdown selector, single value for fixed. Prefix an item with `selected::` to set the default (e.g. `[1000, 'selected::2000', 4000]`).
`visualizations`	list	`['linear', 'mel']`	Visualization types for the dropdown. Built-in strings (`'linear'`, `'mel'`, `'log_frequency'`, `'bandpass'`, `'waveform'`) or custom callables. See Custom Visualizations.
`partial_download`	bool	`True`	Use byte-range downloads for remote audio (requires ffmpeg/pydub). Set to `False` to always download and cache the full file.
`width`	str	`'100%'`	Inline widget width.
`height`	int	`900`	Inline widget height.
`config`	str	`None`	Path to YAML/JSON config file
`**kwargs`			Fixed columns in every output row

* data is not required if data_path, data_url, data_sql, or data_api is provided. audio is not required if audio_src, audio_path, audio_url, audio_uri, audio_column, audio_sql, or audio_api is provided.
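
The `selected::` convention for spectrogram_resolution can be illustrated with a small parser. This is a sketch of the convention as described in the table, not the plugin's own code. The function name `split_resolutions` is hypothetical, and because the table does not say which item is used when none is marked, the sketch returns None for the default in that case.

```python
PREFIX = "selected::"


def split_resolutions(spec):
    """Return (options, default) for a spectrogram_resolution-style value.

    `spec` is a single int or a list whose items are ints or strings;
    one item may carry the 'selected::' prefix to mark the default.
    """
    items = spec if isinstance(spec, list) else [spec]
    options, default = [], None
    for item in items:
        text = str(item)
        marked = text.startswith(PREFIX)
        value = int(text[len(PREFIX):] if marked else text)
        options.append(value)
        if marked:
            default = value
    return options, default


print(split_resolutions([1000, "selected::2000", 4000]))
print(split_resolutions(1000))
```

With the table's own example, the first call yields the three widths with 2000 as the default; the second shows a single fixed value with no marked default.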

Demo

Example notebooks are included in the demo/ directory. They require additional dependencies (ipyleaflet, shapely, seaborn, requests).

1. Install with demo dependencies

With pixi:

cd demo
pixi run -e demo jba lab

This launches JupyterLab inside the demo pixi environment with the data-rate limit pre-configured.

With pip:

pip install -e ".[demo]"
jupyter lab --ServerApp.iopub_data_rate_limit=1e10

2. Download audio files (one-time)

Audio files are not included in this repository (they are large FLAC files, ~50-100 MB each).

Each file will likely take several minutes to download. For demo purposes, you can replace them with any FLAC audio file; the spectrograms will look different, but the plugin works the same way.

With AWS CLI (faster):

cd demo
mkdir -p audio
aws s3 cp s3://dse-soundhub/public/audio/dev/20230522_200000.flac audio/test-default.flac --no-sign-request &
aws s3 cp s3://dse-soundhub/public/audio/dev/20230524_200000.flac audio/test1.flac --no-sign-request &
aws s3 cp s3://dse-soundhub/public/audio/dev/20230525_200000.flac audio/test2.flac --no-sign-request &
aws s3 cp s3://dse-soundhub/public/audio/dev/20230526_000000.flac audio/test3.flac --no-sign-request &
wait

With curl:

cd demo
mkdir -p audio
curl -o audio/test-default.flac https://dse-soundhub.s3.us-west-2.amazonaws.com/public/audio/dev/20230522_200000.flac
curl -o audio/test1.flac https://dse-soundhub.s3.us-west-2.amazonaws.com/public/audio/dev/20230524_200000.flac
curl -o audio/test2.flac https://dse-soundhub.s3.us-west-2.amazonaws.com/public/audio/dev/20230525_200000.flac
curl -o audio/test3.flac https://dse-soundhub.s3.us-west-2.amazonaws.com/public/audio/dev/20230526_000000.flac

3. Open a demo notebook

Open simple-example.ipynb from the JupyterLab file browser. The notebook demonstrates both review and annotation workflows.

License

BSD 3-Clause

jupyter-bioacoustic 0.7.0
