A module for Audio/Acoustic Activity Detection

auditok is an Audio Activity Detection tool that processes online data (from an audio device or standard input) and audio files. It can be used via the command line or through its API.

Full documentation is available on Read the Docs.

Installation

auditok requires Python 3.7 or higher.

To install the latest stable version, use pip:

pip install auditok

To install the latest development version from GitHub:

pip install git+https://github.com/amsehili/auditok

Alternatively, clone the repository and install it manually:

git clone https://github.com/amsehili/auditok.git
cd auditok
pip install .

Basic example

Here’s a simple example of using auditok to detect audio events:

import auditok

# `split` returns a generator of AudioRegion objects
audio_events = auditok.split(
    "audio.wav",
    min_dur=0.2,     # Minimum duration of a valid audio event in seconds
    max_dur=4,       # Maximum duration of an event
    max_silence=0.3, # Maximum tolerated silence duration within an event
    energy_threshold=55 # Detection threshold
)

for i, r in enumerate(audio_events):
    # AudioRegions returned by `split` have defined 'start' and 'end' attributes
    print(f"Event {i}: {r.start:.3f}s -- {r.end:.3f}s")

    # Play the audio event
    r.play(progress_bar=True)

    # Save the event with start and end times in the filename
    filename = r.save("event_{start:.3f}-{end:.3f}.wav")
    print(f"Event saved as: {filename}")

Example output:

Event 0: 0.700s -- 1.400s
Event saved as: event_0.700-1.400.wav
Event 1: 3.800s -- 4.500s
Event saved as: event_3.800-4.500.wav
Event 2: 8.750s -- 9.950s
Event saved as: event_8.750-9.950.wav
Event 3: 11.700s -- 12.400s
Event saved as: event_11.700-12.400.wav
Event 4: 15.050s -- 15.850s
Event saved as: event_15.050-15.850.wav
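
To build intuition for how min_dur, max_dur and max_silence interact, here is a simplified sketch that groups per-frame activity flags into events. It is not auditok's implementation: the function group_frames, the fixed frame duration and the choice to drop trailing silence are all assumptions made for illustration, and auditok's own tokenizer has more options and may treat edge cases differently.

```python
def group_frames(active, frame_dur, min_dur, max_dur, max_silence):
    """Group per-frame activity flags into (start, end) events in seconds.

    Simplified illustration only (hypothetical helper, not part of auditok).
    """
    min_len = round(min_dur / frame_dur)      # shortest valid event, in frames
    max_len = round(max_dur / frame_dur)      # longest allowed event, in frames
    max_gap = round(max_silence / frame_dur)  # longest tolerated silent run, in frames

    events = []
    start = None        # index of the first frame of the current event
    last_active = None  # index of the most recent active frame in that event

    def emit(first, end):
        # `end` is exclusive; keep the event only if it is long enough
        if end - first >= min_len:
            events.append((round(first * frame_dur, 3), round(end * frame_dur, 3)))

    for i, flag in enumerate(active):
        if start is None:
            if not flag:
                continue  # still in silence, nothing to do
            start = i     # an active frame opens a new event
        if flag:
            last_active = i
        too_quiet = i - last_active > max_gap
        too_long = i - start + 1 >= max_len
        if too_quiet or too_long:
            emit(start, last_active + 1)  # trailing silence is dropped
            start = None

    # Flush an event that is still open when the input ends
    if start is not None:
        emit(start, last_active + 1)
    return events


# One flag per 0.1 s frame: 1 = energy above threshold, 0 = below
flags = [0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
events = group_frames(flags, frame_dur=0.1, min_dur=0.2, max_dur=4, max_silence=0.3)
print(events)  # [(0.1, 0.6)]
```

In this run the first burst survives a two-frame gap because the gap is shorter than max_silence, while the isolated active frame near the end is discarded because it is shorter than min_dur.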

Split and plot

Visualize the audio signal with detected events:

import auditok
region = auditok.load("audio.wav") # Returns an AudioRegion object
regions = region.split_and_plot(...) # Or simply use `region.splitp()`

Example output:

https://raw.githubusercontent.com/amsehili/auditok/f2e212068b6d5bfb7bf3932bc3a9cad01e03759d/doc/figures/example_1.png

Split an audio stream and re-join (glue) audio events with silence

The following code detects audio events within an audio stream, then inserts 1 second of silence between them to create an audio recording with pauses:

# Create a 1-second silent audio region
# Audio parameters must match the original stream
from auditok import split, make_silence
silence = make_silence(duration=1,
                       sampling_rate=16000,
                       sample_width=2,
                       channels=1)
events = split("audio.wav")
audio_with_pauses = silence.join(events)

Alternatively, use split_and_join_with_silence:

from auditok import split_and_join_with_silence
audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav")
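
The reason the silence parameters must match the stream becomes clearer on raw PCM data: the number of bytes that represents one second depends on the sampling rate, sample width and channel count. The sketch below illustrates this with plain bytes objects. The helpers silence_bytes and join_with_silence are hypothetical and not part of auditok, and the sketch assumes signed PCM, where zero-valued samples are silent.

```python
def silence_bytes(duration, sampling_rate, sample_width, channels):
    """Return `duration` seconds of silence as raw signed-PCM bytes.

    Hypothetical helper for illustration; not part of auditok.
    """
    n_frames = int(duration * sampling_rate)  # one frame = one sample per channel
    return b"\x00" * (n_frames * sample_width * channels)


def join_with_silence(chunks, silence):
    """Concatenate raw PCM chunks, placing `silence` between consecutive ones."""
    return silence.join(chunks)


# 1 second at 16 kHz, 16-bit (2 bytes per sample), mono
one_second = silence_bytes(1, sampling_rate=16000, sample_width=2, channels=1)
print(len(one_second))  # 32000

# The same byte count lasts only half as long in a stereo stream,
# which is why mismatched parameters produce pauses of the wrong length.
stereo_second = silence_bytes(1, sampling_rate=16000, sample_width=2, channels=2)
print(len(stereo_second))  # 64000
```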

Export an AudioRegion as a numpy array

from auditok import load, AudioRegion
audio = load("audio.wav") # or use `AudioRegion.load("audio.wav")`
x = audio.numpy()
assert x.shape[0] == audio.channels
assert x.shape[1] == len(audio)

Limitations

The detection algorithm is based on audio signal energy. While it performs well in low-noise environments (e.g., podcasts, language lessons, or quiet recordings), performance may drop in noisy settings. Additionally, the algorithm does not distinguish between speech and other sounds, so it is not suitable for Voice Activity Detection in multi-sound environments.
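
The limitation above follows directly from how an energy threshold works. The sketch below is a minimal illustration, not auditok's actual energy computation: the functions frame_energy_db and is_active, the mean-square formula and the floor value are assumptions. It shows that the decision depends only on how loud a frame is, not on what produced the sound.

```python
import math


def frame_energy_db(samples):
    """Return the log energy (in dB) of one frame of integer PCM samples.

    Hypothetical helper; auditok's own formula may differ.
    """
    if not samples:
        raise ValueError("frame must contain at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    # Floor the value so that an all-zero frame does not hit log10(0)
    return 10 * math.log10(max(mean_square, 1e-10))


def is_active(samples, energy_threshold=55):
    """A frame counts as active when its energy reaches the threshold."""
    return frame_energy_db(samples) >= energy_threshold


quiet = [10, -12, 8, -9] * 40            # low-amplitude frame (background hiss)
loud = [3000, -2800, 3100, -2900] * 40   # high-amplitude frame (speech, music, a door slam...)

print(is_active(quiet))  # False
print(is_active(loud))   # True
```

Any sufficiently loud frame passes the test, so background noise above the threshold is detected as activity just like speech.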

License

MIT.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

auditok-0.3.0.tar.gz (1.8 MB view details)

Uploaded Source

Built Distribution

auditok-0.3.0-py3-none-any.whl (1.5 MB view details)

Uploaded Python 3

File details

Details for the file auditok-0.3.0.tar.gz.

File metadata

  • Download URL: auditok-0.3.0.tar.gz
  • Upload date:
  • Size: 1.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.13.0

File hashes

Hashes for auditok-0.3.0.tar.gz
Algorithm Hash digest
SHA256 8565d6e7dfbecb7dbbe5c54fb5af66f8c1c827e06745c19df0e3fa468d0022a1
MD5 6249a8c159f9ab836cbdeb018542afb2
BLAKE2b-256 0e0557e6c498cc8b224dc3d057136ce40f983c55a02d1f279ffcf73c544ffdc0

See more details on using hashes here.

File details

Details for the file auditok-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: auditok-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 1.5 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.13.0

File hashes

Hashes for auditok-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 32a19d2fedcac5dac67127d6c622472ba87ac3b6cd4ebc6f8276340658b52ecc
MD5 470cee33ee24dc1d24c58b4b1c2ebe14
BLAKE2b-256 8f42644aef57467b6fd07d399bc38cabb120284fb86fa8989284bc5f8a1b34a6
