rVADfast - a fast and robust unsupervised VAD

rVADfast

The Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as presented in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method, Computer Speech and Language, 2020. More info on the rVAD GitHub page.

The rVAD method consists of two passes of denoising followed by a VAD stage. It has been applied as a preprocessor in a wide range of applications, including speech recognition, speaker identification, language identification, age and gender identification, self-supervised learning, human-robot interaction, and audio archive segmentation; further examples are listed among the citing works on Google Scholar.

The method is unsupervised to make it applicable to a broad range of acoustic environments, and it is optimized considering both noisy and clean conditions.

Out of the box, rVAD ranked 4th out of 27 supervised and unsupervised systems in a Fearless Steps Speech Activity Detection Challenge.

As of 2023, the rVAD paper is among the most cited articles published in Computer Speech and Language since 2018 (6th place).

Usage

The rVADfast method is available as a Python package, installable via pip install rVADfast. After installation, you can import the VAD class with from rVADfast import rVADfast and instantiate it, e.g. as vad = rVADfast().
The package also contains functionality to process folders of audio files, either to generate VAD labels or to trim non-speech segments from audio files. This is done by importing the rVADfast.process module, which provides two functions for processing audio files, namely process.rVADfast_single_process and process.rVADfast_multi_process, with the latter utilizing multiple CPUs for processing. Additionally, a processing script can be called from the command line by executing rVADfast_process.

A concrete example of how to use the rVADfast package can be found in /notebooks.
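To make the two processing modes described above (generating VAD labels and trimming non-speech) more concrete, here is a minimal pure-Python sketch of what can be done with frame-level labels once they are available. It does not call the rVADfast package. The helper names labels_to_segments and trim_non_speech are hypothetical, and both the label format (one 0/1 value per frame) and the 10 ms default frame shift are assumptions made for illustration, so check the package output before relying on them.

```python
from typing import List, Sequence, Tuple


def labels_to_segments(
    labels: Sequence[int], frame_shift_s: float = 0.01
) -> List[Tuple[float, float]]:
    """Merge runs of speech frames (label 1) into (start, end) times in seconds.

    Assumes one binary label per frame and a fixed frame shift (hypothetical
    default of 10 ms).
    """
    segments = []
    start = None  # index of the first frame of the current speech run
    for i, label in enumerate(labels):
        if label and start is None:
            start = i  # a speech run begins
        elif not label and start is not None:
            segments.append((start * frame_shift_s, i * frame_shift_s))
            start = None  # the speech run ended at frame i
    if start is not None:
        # the signal ended while still in speech
        segments.append((start * frame_shift_s, len(labels) * frame_shift_s))
    return segments


def trim_non_speech(
    samples: Sequence[float], labels: Sequence[int], samples_per_frame: int
) -> List[float]:
    """Keep only the samples that belong to frames labelled as speech."""
    kept = []
    for i, label in enumerate(labels):
        if label:
            kept.extend(samples[i * samples_per_frame:(i + 1) * samples_per_frame])
    return kept
```

For a 16 kHz signal with a 10 ms frame shift, samples_per_frame would be 160. The segment list is a convenient form for writing label files, while trim_non_speech yields a shortened waveform that could be written back to disk.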

References

  1. Z.-H. Tan, A. K. Sarkar and N. Dehak, "rVAD: An unsupervised segment-based robust voice activity detection method," Computer Speech and Language, vol. 59, pp. 1-21, 2020.
  2. Z.-H. Tan and B. Lindberg, "Low-complexity variable frame rate analysis for speech recognition and voice activity detection," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, pp. 798-807, 2010.
