An opinionated segmentation library for (machine learning) audio

Project description

The `libsegmenter` audio segmentation library

A small library intended to provide helper functions for block-based processing in Python.

Find out more by exploring the code or reading the docs.

About

The main idea is to help the user choose a combination of window function and hop size that satisfies the constant-overlap-add (COLA) condition. That is, if the processing does not modify the blocks, segmenting and then un-segmenting the input audio should reconstruct it perfectly (up to any latency introduced by the system).
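To make the COLA condition concrete, the following sketch overlap-adds shifted copies of a window and measures how far the sum deviates from a constant. It uses plain NumPy rather than `libsegmenter`; the helper name `cola_deviation` is hypothetical and only serves this illustration.

```python
import numpy as np


def cola_deviation(window: np.ndarray, hop_size: int) -> float:
    """Return the largest deviation of the overlap-added window from its mean."""
    frame_size = len(window)
    num_frames = 4 * frame_size // hop_size  # enough frames to reach steady state
    total = np.zeros((num_frames - 1) * hop_size + frame_size)
    for i in range(num_frames):
        total[i * hop_size : i * hop_size + frame_size] += window
    # Ignore the first and last frame_size samples, where fewer windows overlap.
    steady = total[frame_size:-frame_size]
    return float(np.max(np.abs(steady - steady.mean())))


frame_size = 1024
n = np.arange(frame_size)
hann = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_size)  # periodic Hann window

print(cola_deviation(hann, 256))  # 75% overlap: close to zero, COLA holds
print(cola_deviation(hann, 300))  # arbitrary hop: clearly non-zero, COLA fails
```

A deviation near zero means the shifted windows sum to a constant, so unmodified blocks add back up to a scaled copy of the input.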

The library currently supports three modes of operation:

Overlap-Add (ola), where a rectangular window is applied to the input frames, and the specified window is applied to the output frames prior to reconstruction. This mode is intended for block-based processing in the time domain, where the purpose of the overlapping windows is to interpolate across the discontinuities between adjacent frames prior to reconstruction.
Weighted Overlap-Add (wola), where a square-root (COLA) window is applied to both the input frames and the output frames. This mode is intended for processing in the frequency domain, along the lines of Short-Time Fourier Transform (STFT) processing.
Analysis (analysis), where a window is applied to the input frames and no output frames are computed. This mode is useful for obtaining spectrograms.
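As an illustration of the wola mode, the sketch below applies a square-root Hann window on both the analysis and the synthesis side and passes the frames through an unmodified FFT round trip. It is written in plain NumPy under stated assumptions: the `segment` and `unsegment` helpers are hypothetical stand-ins, not the library's implementation, and a periodic Hann window with 50% overlap is assumed.

```python
import numpy as np


def segment(x: np.ndarray, window: np.ndarray, hop_size: int) -> np.ndarray:
    """Split x into overlapping frames and apply the analysis window."""
    frame_size = len(window)
    num_frames = 1 + (len(x) - frame_size) // hop_size
    frames = np.stack(
        [x[i * hop_size : i * hop_size + frame_size] for i in range(num_frames)]
    )
    return frames * window


def unsegment(frames: np.ndarray, window: np.ndarray, hop_size: int) -> np.ndarray:
    """Apply the synthesis window and overlap-add the frames."""
    num_frames, frame_size = frames.shape
    y = np.zeros((num_frames - 1) * hop_size + frame_size)
    for i in range(num_frames):
        y[i * hop_size : i * hop_size + frame_size] += frames[i] * window
    return y


frame_size, hop_size = 1024, 512
n = np.arange(frame_size)
hann = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_size)  # sums to 1 at 50% overlap
sqrt_hann = np.sqrt(hann)  # applied twice, it multiplies back to the Hann window

x = np.random.default_rng(0).standard_normal(8 * frame_size)

# WOLA-style round trip through the frequency domain with no modification.
spectra = np.fft.rfft(segment(x, sqrt_hann, hop_size), axis=-1)
frames = np.fft.irfft(spectra, n=frame_size, axis=-1)
y = unsegment(frames, sqrt_hann, hop_size)

interior = slice(frame_size, len(y) - frame_size)
print(np.max(np.abs(y[interior] - x[interior])))  # numerical round-off only
```

Replacing the analysis window with a rectangular one and the synthesis window with a COLA window would give the corresponding ola-style sketch.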

The primary use case for the library is machine learning, which has led to a number of options designed to ease training. The segmenter is implemented in both TensorFlow and PyTorch to support workflows in either framework.

Recently, we upgraded the library to version 1.0. This release deprecated the C++ backend, for now, to simplify development. That said, the general design has also been simplified, so implementing your own backend (and verifying it against our unit tests) should be feasible.

A word of caution

Note that segmentation is a destructive operation in the sense that we do not provide any pre- or post-windows. This means that, after segmenting and then unsegmenting, the samples at the very beginning and end of your audio will have been attenuated by the window and will therefore differ from what you started out with. Take this into account when training.
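The edge effect can be seen with a small NumPy sketch of an ola-style round trip (rectangular analysis window, Hann synthesis window) on a constant signal. The frame size, hop size and window are assumptions chosen for the illustration; the code does not use `libsegmenter` itself.

```python
import numpy as np

frame_size, hop_size = 512, 256
n = np.arange(frame_size)
window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_size)  # periodic Hann, COLA at 50% overlap

x = np.ones(4 * frame_size)  # constant signal makes the tapering easy to see
y = np.zeros_like(x)

# OLA-style round trip: rectangular analysis window, Hann synthesis window.
for start in range(0, len(x) - frame_size + 1, hop_size):
    frame = x[start : start + frame_size]            # segment
    y[start : start + frame_size] += frame * window  # unsegment

print(y[:4])  # starts at 0 and ramps up: the beginning of the signal is faded in
print(np.allclose(y[hop_size:-hop_size], 1.0))  # True: the interior is reconstructed
```

In this sketch only the first and last `hop_size` samples are covered by a single frame, which is why they keep the shape of the window instead of summing to one.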

Installation

Simply install from PyPI:

# base version
pip install libsegmenter

# with torch
pip install libsegmenter[torch]

# with tensorflow
pip install libsegmenter[tensorflow]

Example

To create a specific window:

import libsegmenter as seg
window = seg.WindowSelector("hann75", "ola", 1024)
window.analysis_window  # numpy ndarray containing the analysis window
window.synthesis_window # numpy ndarray containing the synthesis window

To make a segmenter with a specific window:

import libsegmenter as seg
segmenter = seg.Segmenter(seg.WindowSelector("hann75", "ola", 1024), backend="torch")

With an asymmetric window:

import libsegmenter as seg
segmenter = seg.Segmenter(seg.AsymmetricWindowSelector("ola", 1024, 128, 2048), backend="torch")

Use various supported transforms:

import libsegmenter as seg
segmenter = seg.Segmenter(seg.WindowSelector("hann75", "ola", 1024), backend="torch")
transform = seg.TransformSelector(transform="spectrogram", backend="torch")
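# x is the input audio as a torch tensor (defined elsewhere)
X = transform.forward(segmenter.segment(x))
x_hat = transform.inverse(X)  # invert the transformed frames X, not the raw input x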
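For intuition about what a spectrogram of segmented audio contains, here is a minimal NumPy sketch of analysis-only processing: each frame is windowed and its one-sided FFT magnitude is taken. This is not the implementation behind `TransformSelector`; the sample rate, test tone and window are assumptions made for the example.

```python
import numpy as np

sample_rate = 16000  # assumed sample rate for the example
frame_size, hop_size = 1024, 256
t = np.arange(sample_rate) / sample_rate
x = np.sin(2.0 * np.pi * 1000.0 * t)  # one second of a 1 kHz tone

n = np.arange(frame_size)
window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_size)  # periodic Hann

# Analysis only: window each frame, then take the one-sided FFT magnitude.
num_frames = 1 + (len(x) - frame_size) // hop_size
frames = np.stack(
    [x[i * hop_size : i * hop_size + frame_size] * window for i in range(num_frames)]
)
spectrogram = np.abs(np.fft.rfft(frames, axis=-1))  # (num_frames, frame_size // 2 + 1)

peak_bin = int(np.argmax(spectrogram.mean(axis=0)))
print(peak_bin * sample_rate / frame_size)  # 1000.0, the frequency of the test tone
```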

Development

Installing Python

Install uv (pip replacement):

# install for linux / mac
curl -LsSf https://astral.sh/uv/install.sh | sh

# install for windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Install the development packages:

uv venv
source .venv/bin/activate
uv sync --dev

Linting

We require everything to be fully typed. We enforce this by requiring pyright to pass without errors, alongside the ruff checks:

uv run pyright
uv run ruff check
uv run ruff format

Licenses

The project is licensed under MIT. Add license headers using the addlicense tool:

addlicense -c "Niels de Koeijer, Martin Bo Møller" -l mit -y 2025 -ignore *.m

Documentation

Docs are mostly generated automatically from docstrings. To host the docs locally, run:

mkdocs serve

They are automatically rebuilt on push to main.

Release history

1.0.10: Oct 27, 2025
1.0.9: Apr 29, 2025
1.0.8: Mar 26, 2025
1.0.7: Mar 24, 2025
1.0.6: Mar 11, 2025
1.0.5: Mar 11, 2025
1.0.4: Mar 11, 2025
1.0.3: Mar 11, 2025
1.0.2: Mar 11, 2025
1.0.1: Mar 11, 2025
0.11: Oct 6, 2024
0.10.0: May 2, 2024
0.9.3: May 2, 2024
0.9.2: May 2, 2024
0.9.1: Apr 23, 2024
0.9: Apr 22, 2024
0.8: Apr 18, 2024
0.7: Apr 15, 2024
0.5: Apr 10, 2024
0.3: Apr 10, 2024
0.2: Apr 10, 2024
0.1: Apr 9, 2024