Python interface to the Google WebRTC Voice Activity Detector (VAD)

py-webrtcvad

This is a Python interface to the WebRTC Voice Activity Detector (VAD). It is compatible with Python 2 and Python 3.

A VAD classifies a piece of audio data as being voiced or unvoiced. It can be useful for telephony and speech recognition.

The VAD that Google developed for the WebRTC project is reportedly one of the best available, being fast, modern and free.

How to use it

  1. Create a Vad object:

    import webrtcvad
    vad = webrtcvad.Vad()
  2. Optionally, set its aggressiveness mode, which is an integer between 0 and 3. 0 is the least aggressive about filtering out non-speech, 3 is the most aggressive. (You can also set the mode when you create the VAD, e.g. vad = webrtcvad.Vad(3)):

    vad.set_mode(1)
  3. Give it a short segment (“frame”) of audio. The WebRTC VAD only accepts 16-bit mono PCM audio, sampled at 8000, 16000, or 32000 Hz. A frame must be 10, 20, or 30 ms in duration:

    # Run the VAD on 10 ms of silence. The result should be False.
    sample_rate = 16000
    frame_duration = 10  # ms
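    # Integer division keeps the repeat count an int on Python 3 as well.
    frame = b'\x00\x00' * (sample_rate * frame_duration // 1000)
    # Parenthesized print works on both Python 2 and Python 3.
    print('Contains speech: %s' % vad.is_voiced(frame, sample_rate))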
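
The frame-size arithmetic in step 3 can be wrapped in a small helper. The following is a minimal sketch, not part of webrtcvad: the names `frame_size_bytes` and `split_into_frames` are hypothetical, and the accepted sample rates and durations are simply the ones listed above.

```python
# Hypothetical helpers (not part of webrtcvad) for cutting raw PCM into frames.
VALID_SAMPLE_RATES = (8000, 16000, 32000)  # Hz, as listed in step 3
VALID_FRAME_DURATIONS = (10, 20, 30)       # ms, as listed in step 3
BYTES_PER_SAMPLE = 2                       # 16-bit mono PCM


def frame_size_bytes(sample_rate, frame_duration_ms):
    """Return the number of bytes in one frame of 16-bit mono PCM."""
    if sample_rate not in VALID_SAMPLE_RATES:
        raise ValueError('unsupported sample rate: %r' % (sample_rate,))
    if frame_duration_ms not in VALID_FRAME_DURATIONS:
        raise ValueError('unsupported frame duration: %r' % (frame_duration_ms,))
    samples = sample_rate * frame_duration_ms // 1000
    return samples * BYTES_PER_SAMPLE


def split_into_frames(pcm, sample_rate, frame_duration_ms):
    """Yield consecutive whole frames; a trailing partial frame is dropped."""
    size = frame_size_bytes(sample_rate, frame_duration_ms)
    for start in range(0, len(pcm) - size + 1, size):
        yield pcm[start:start + size]
```

Each frame produced this way has a valid length for the chosen sample rate and duration, so it can be passed to the VAD exactly as in step 3.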

See example.py for a more detailed example that will process a .wav file, find the voiced segments, and write each one as a separate .wav.
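
As a rough illustration of that workflow (this is not the contents of example.py, which may differ, for instance by smoothing decisions across neighbouring frames), the sketch below reads a 16-bit mono .wav with the standard `wave` module and groups consecutive voiced frames into segments. The function names and the `classify` parameter are hypothetical; `classify` is assumed to be any callable that takes `(frame, sample_rate)` and returns a boolean.

```python
import wave


def read_pcm(path):
    """Read a 16-bit mono .wav file and return (pcm_bytes, sample_rate)."""
    wf = wave.open(path, 'rb')
    try:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError('expected 16-bit mono PCM')
        return wf.readframes(wf.getnframes()), wf.getframerate()
    finally:
        wf.close()


def write_pcm(path, pcm, sample_rate):
    """Write raw 16-bit mono PCM bytes to a .wav file."""
    wf = wave.open(path, 'wb')
    try:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)
    finally:
        wf.close()


def voiced_segments(pcm, sample_rate, classify, frame_duration_ms=30):
    """Group runs of consecutive voiced frames into byte strings.

    `classify` is a hypothetical callable: classify(frame, sample_rate) -> bool.
    """
    size = sample_rate * frame_duration_ms // 1000 * 2  # 2 bytes per sample
    segments, current = [], []
    for start in range(0, len(pcm) - size + 1, size):
        frame = pcm[start:start + size]
        if classify(frame, sample_rate):
            current.append(frame)
        elif current:
            # An unvoiced frame closes the segment in progress.
            segments.append(b''.join(current))
            current = []
    if current:
        segments.append(b''.join(current))
    return segments
```

In practice the classifier would be the Vad object's frame test from step 3, and each returned segment could be saved with `write_pcm`. Because this sketch switches state on every single frame, a brief misclassification splits a segment in two; some smoothing over several frames is usually worth adding.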

Download files

Source Distribution

webrtcvad-1.0.2.tar.gz (64.3 kB)

Uploaded Source

File details

Details for the file webrtcvad-1.0.2.tar.gz.

File metadata

  • Download URL: webrtcvad-1.0.2.tar.gz
  • Upload date:
  • Size: 64.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for webrtcvad-1.0.2.tar.gz
Algorithm Hash digest
SHA256 1a80e1cb5b5d6b0da3696385d94a2b41adac7c5358c27ba5b8a111c948240ee1
MD5 c74e19e016fb7ed291aa2b5f626b1072
BLAKE2b-256 f09fcc44d790711e4fae1a41337c9a4cdb3f74f59baca640d11aeb7aab716850
