Python interface to the Google WebRTC Voice Activity Detector (VAD)
py-webrtcvad
This is a Python interface to the WebRTC Voice Activity Detector (VAD). It is compatible with Python 2 and Python 3.
A VAD classifies a piece of audio data as being voiced or unvoiced. It can be useful for telephony and speech recognition.
The VAD that Google developed for the WebRTC project is reportedly one of the best available, being fast, modern and free.
How to use it
Install the webrtcvad module:
pip install webrtcvad
Create a Vad object:
import webrtcvad
vad = webrtcvad.Vad()
Optionally, set its aggressiveness mode, which is an integer between 0 and 3. 0 is the least aggressive about filtering out non-speech, 3 is the most aggressive. (You can also set the mode when you create the VAD, e.g. vad = webrtcvad.Vad(3)):
vad.set_mode(1)
Give it a short segment (“frame”) of audio. The WebRTC VAD only accepts 16-bit mono PCM audio, sampled at 8000, 16000, 32000 or 48000 Hz. A frame must be either 10, 20, or 30 ms in duration:
# Run the VAD on 10 ms of silence. The result should be False.
sample_rate = 16000
frame_duration = 10  # ms
# 16-bit mono PCM: two zero bytes per sample.
frame = b'\x00\x00' * int(sample_rate * frame_duration / 1000)
print('Contains speech: %s' % (vad.is_speech(frame, sample_rate)))
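The frame constraints above translate into fixed byte lengths: 16-bit mono PCM uses 2 bytes per sample, so a 10 ms frame at 16000 Hz is 320 bytes. The following is a minimal sketch of slicing a longer PCM buffer into frames of a valid size. The helper names `frame_byte_length` and `split_frames` are hypothetical and are not part of webrtcvad.

```python
VALID_SAMPLE_RATES = (8000, 16000, 32000, 48000)  # Hz, as listed above
VALID_FRAME_DURATIONS = (10, 20, 30)              # ms, as listed above


def frame_byte_length(sample_rate, frame_duration_ms):
    """Return the size in bytes of one 16-bit mono PCM frame."""
    if sample_rate not in VALID_SAMPLE_RATES:
        raise ValueError("unsupported sample rate: %r" % (sample_rate,))
    if frame_duration_ms not in VALID_FRAME_DURATIONS:
        raise ValueError("unsupported frame duration: %r" % (frame_duration_ms,))
    samples_per_frame = sample_rate * frame_duration_ms // 1000
    return samples_per_frame * 2  # 2 bytes per 16-bit sample


def split_frames(pcm, sample_rate, frame_duration_ms):
    """Yield consecutive full-size frames; a trailing partial frame is dropped."""
    size = frame_byte_length(sample_rate, frame_duration_ms)
    for offset in range(0, len(pcm) - size + 1, size):
        yield pcm[offset:offset + size]


# One second of silence at 16000 Hz, cut into 30 ms frames.
one_second_of_silence = b'\x00\x00' * 16000
frames = list(split_frames(one_second_of_silence, 16000, 30))
print(len(frames), len(frames[0]))
```

Each yielded frame could then be passed to vad.is_speech(frame, sample_rate) as in the example above. Dropping the trailing partial frame is one possible choice; padding it with silence would be another.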
See example.py for a more detailed example that will process a .wav file, find the voiced segments, and write each one as a separate .wav.
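The idea of finding voiced segments can be sketched without any audio I/O. Given one boolean per frame (for instance, the results of vad.is_speech), consecutive voiced frames are merged into (start_ms, end_ms) ranges. This is a simplified illustration and not necessarily the approach example.py takes; the function name `voiced_segments` is hypothetical.

```python
def voiced_segments(flags, frame_duration_ms):
    """Merge runs of voiced frames into (start_ms, end_ms) tuples.

    flags is a sequence with one boolean per frame, in order.
    """
    segments = []
    start = None  # index of the first frame in the current voiced run
    for index, voiced in enumerate(flags):
        if voiced and start is None:
            start = index
        elif not voiced and start is not None:
            segments.append((start * frame_duration_ms, index * frame_duration_ms))
            start = None
    if start is not None:
        # The audio ended while still inside a voiced run.
        segments.append((start * frame_duration_ms, len(flags) * frame_duration_ms))
    return segments


# Hypothetical per-frame decisions for six 30 ms frames.
flags = [False, True, True, False, False, True]
print(voiced_segments(flags, 30))
```

A practical detector would usually smooth the per-frame decisions first, since a single misclassified frame would otherwise split or create a segment.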
How to run unit tests
To run unit tests:
pip install -e ".[dev]"
python setup.py test
History
2.0.10
Fixed memory leak. Thank you, bond005!
2.0.9
Improved example code. Added WebRTC license.
2.0.8
Fixed Windows compilation errors. Thank you, xiongyihui!