A module for Audio/Acoustic Activity Detection
auditok is an Audio Activity Detection tool that processes online data (from an audio device or standard input) and audio files. It can be used via the command line or through its API.
Full documentation is available on Read the Docs.
Installation
auditok requires Python 3.7 or higher.
To install the latest stable version, use pip:
pip install auditok
To install the latest development version from GitHub:
pip install git+https://github.com/amsehili/auditok
Alternatively, clone the repository and install it manually:
git clone https://github.com/amsehili/auditok.git
cd auditok
python setup.py install
Basic example
Here’s a simple example of using auditok to detect audio events:
import auditok

# `split` returns a generator of AudioRegion objects
audio_events = auditok.split(
    "audio.wav",
    min_dur=0.2,  # Minimum duration of a valid audio event in seconds
    max_dur=4,  # Maximum duration of an event
    max_silence=0.3,  # Maximum tolerated silence duration within an event
    energy_threshold=55,  # Detection threshold
)

for i, r in enumerate(audio_events):
    # AudioRegions returned by `split` have defined 'start' and 'end' attributes
    print(f"Event {i}: {r.start:.3f}s -- {r.end:.3f}s")

    # Play the audio event
    r.play(progress_bar=True)

    # Save the event with start and end times in the filename
    filename = r.save("event_{start:.3f}-{end:.3f}.wav")
    print(f"Event saved as: {filename}")
Example output:
Event 0: 0.700s -- 1.400s
Event saved as: event_0.700-1.400.wav
Event 1: 3.800s -- 4.500s
Event saved as: event_3.800-4.500.wav
Event 2: 8.750s -- 9.950s
Event saved as: event_8.750-9.950.wav
Event 3: 11.700s -- 12.400s
Event saved as: event_11.700-12.400.wav
Event 4: 15.050s -- 15.850s
Event saved as: event_15.050-15.850.wav
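To make the roles of min_dur, max_dur and max_silence more concrete, here is a minimal sketch of how per-frame activity decisions could be grouped into events. It is an illustration under stated assumptions, not auditok's implementation: the function name group_frames, the frame_dur parameter, and the choice to trim trailing silence from each event are all hypothetical, and auditok's own handling of edge cases may differ.

```python
def group_frames(active, frame_dur=0.05, min_dur=0.2, max_dur=4.0, max_silence=0.3):
    """Group per-frame activity flags into (start, end) events, in seconds.

    Hypothetical helper for illustration only; not part of auditok.
    """
    min_len = round(min_dur / frame_dur)      # shortest event kept, in frames
    max_len = round(max_dur / frame_dur)      # longest event allowed, in frames
    max_gap = round(max_silence / frame_dur)  # longest silence tolerated inside an event

    events = []
    start = None  # index of the first frame of the event being built
    gap = 0       # consecutive inactive frames at the end of that event

    # Trailing inactive frames act as a sentinel that flushes an open event
    flags = list(active) + [False] * (max_gap + 1)
    for i, flag in enumerate(flags):
        if start is None:
            if flag:
                start, gap = i, 0
            continue
        gap = 0 if flag else gap + 1
        length = i + 1 - start
        if gap > max_gap or length >= max_len:
            end = i + 1 - gap  # assumption: trailing silence is trimmed
            if end - start >= min_len:
                events.append((start * frame_dur, end * frame_dur))
            start = None
    return events


# 10 active frames, a short pause, 5 more active frames, then a 2-frame blip
flags = [False] * 2 + [True] * 10 + [False] * 3 + [True] * 5 + [False] * 10 + [True] * 2
print(group_frames(flags))
```

In this sketch the 3-frame pause is shorter than max_silence, so the two bursts merge into one event, while the final 2-frame blip is shorter than min_dur and is discarded.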
Split and plot
Visualize the audio signal with detected events:
import auditok
region = auditok.load("audio.wav") # Returns an AudioRegion object
regions = region.split_and_plot(...) # Or simply use `region.splitp()`
Split an audio stream and re-join (glue) audio events with silence
The following code detects audio events within an audio stream, then inserts 1 second of silence between them to create an audio recording with pauses:
from auditok import split, make_silence

# Create a 1-second silent audio region
# Audio parameters must match the original stream
silence = make_silence(
    duration=1,
    sampling_rate=16000,
    sample_width=2,
    channels=1,
)
events = split("audio.wav")
audio_with_pauses = silence.join(events)
Alternatively, use split_and_join_with_silence:
from auditok import split_and_join_with_silence
audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav")
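The requirement that the silence parameters match the original stream is easier to see at the byte level. The sketch below works on raw PCM bytes with the standard library only; make_silence_bytes and join_with_silence are hypothetical helpers written for this illustration, not auditok functions, and the sketch assumes signed PCM, where zero-valued samples represent silence.

```python
def make_silence_bytes(duration, sampling_rate=16000, sample_width=2, channels=1):
    """Return `duration` seconds of silence as raw signed PCM bytes.

    Hypothetical helper; assumes signed samples, so silence is all zero bytes.
    """
    n_samples = int(duration * sampling_rate)
    return b"\x00" * (n_samples * sample_width * channels)


def join_with_silence(events, silence):
    """Concatenate raw PCM chunks, inserting `silence` between consecutive chunks."""
    return silence.join(events)


# Two fake "events" of 16-bit mono audio (100 and 50 samples)
events = [b"\x01\x02" * 100, b"\x03\x04" * 50]
silence = make_silence_bytes(duration=1)
audio_with_pauses = join_with_silence(events, silence)
print(len(audio_with_pauses))
```

If the silence were built with a different sampling rate, sample width, or channel count than the events, the joined bytes would no longer decode to a pause of the intended length.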
Export an AudioRegion as a numpy array
from auditok import load, AudioRegion
audio = load("audio.wav") # or use `AudioRegion.load("audio.wav")`
x = audio.numpy()
assert x.shape[0] == audio.channels
assert x.shape[1] == len(audio)
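The (channels, samples) layout asserted above can be pictured as de-interleaving the raw sample stream. The following standard-library sketch is only an illustration of that layout: deinterleave is a hypothetical helper, not an auditok function, and it assumes 16-bit samples stored in the machine's native byte order.

```python
from array import array


def deinterleave(data: bytes, channels: int):
    """Split interleaved 16-bit PCM bytes into one list of samples per channel.

    Hypothetical helper; assumes native byte order and 2-byte samples.
    """
    samples = array("h")
    samples.frombytes(data)
    # Channel c owns samples c, c + channels, c + 2 * channels, ...
    return [list(samples[c::channels]) for c in range(channels)]


# Interleaved stereo frames: (left, right), (left, right), ...
stereo = array("h", [1, -1, 2, -2, 3, -3]).tobytes()
rows = deinterleave(stereo, channels=2)
print(rows)
```

The result has one row per channel and one column per sample, which mirrors the shape checked by the two assertions above.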
Limitations
The detection algorithm is based on audio signal energy. While it performs well in low-noise environments (e.g., podcasts, language lessons, or quiet recordings), performance may drop in noisy settings. Additionally, the algorithm does not distinguish between speech and other sounds, so it is not suitable for Voice Activity Detection in multi-sound environments.
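As a rough illustration of what an energy-based decision looks like, the sketch below computes a log energy for one frame of 16-bit mono PCM and compares it with a threshold. The function names and the exact formula (10 times log10 of the mean squared sample value) are assumptions made for this example; auditok's own computation may differ in detail.

```python
import math
import struct


def frame_energy_db(frame: bytes) -> float:
    """Log energy of a frame of signed 16-bit little-endian mono PCM.

    Hypothetical helper for illustration; not auditok's implementation.
    """
    n = len(frame) // 2
    if n == 0:
        return float("-inf")
    samples = struct.unpack(f"<{n}h", frame[: n * 2])
    mean_square = sum(s * s for s in samples) / n
    # Clamp to avoid log10(0) on pure digital silence
    return 10 * math.log10(max(mean_square, 1e-10))


def is_active(frame: bytes, energy_threshold: float = 55) -> bool:
    """A frame counts as activity when its energy reaches the threshold."""
    return frame_energy_db(frame) >= energy_threshold


silent = struct.pack("<160h", *([0] * 160))
loud = struct.pack("<160h", *([10000, -10000] * 80))
print(is_active(silent), is_active(loud))
```

Because the decision depends only on energy, any sufficiently loud sound passes the threshold whether or not it is speech, which is the limitation described above.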
License
MIT.