Fully open-source and state-of-the-art Voice Activity Detection (VAD) models for academic research and commercial applications.

Fully Open-Source Voice Activity Detection (VAD) for Real-Time Speech Applications

Voice Activity Detection (VAD) is a critical first step in most applications that involve speech recognition: it decides which parts of an audio stream contain speech before anything is passed to a recognizer. However, while exploring real-time voice chat agents, I found that many state-of-the-art (SoTA) models are not truly open-source. They provide only open weights, which limits transparency and hinders research and development.
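To make the task concrete, here is a minimal sketch of what a VAD outputs: one speech/non-speech decision per audio frame. It uses a simple energy threshold, which is a classic baseline and is not the Silero VAD model or this package's API. The function names, the 160-sample frame length, and the 0.1 threshold are all illustrative assumptions.

```python
import math


def frame_energies(samples, frame_len):
    """Split samples into non-overlapping frames and return each frame's RMS energy."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def energy_vad(samples, frame_len=160, threshold=0.1):
    """Return one boolean per frame: True where the RMS energy exceeds the threshold.

    This is a toy baseline for illustration only, not the Silero VAD model.
    """
    return [energy > threshold for energy in frame_energies(samples, frame_len)]


# Synthetic 16 kHz signal: two silent frames, two frames of a 440 Hz tone, one silent frame.
sample_rate = 16000
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(320)]
signal = silence * 2 + tone + silence

decisions = energy_vad(signal)
print(decisions)  # [False, False, True, True, False]
```

A neural model such as Silero VAD replaces the fixed threshold with a learned per-frame speech probability, which is what makes it robust to noise that would fool an energy detector.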

This repository aims to change that by providing a fully open and research-friendly implementation of the Silero VAD model. The goal is to advance the state of the art in VAD through open experimentation, training, and integration.

Status

As of May 27, 2025, this repository includes:

✅ A complete implementation of the Silero VAD model for research use

Roadmap

In the near future, I plan to add the following:

🧠 Code to train Silero VAD from scratch on custom datasets

📊 Evaluation scripts for standard VAD benchmarks

🔧 Support for LoRA fine-tuning to extend or adapt Silero VAD

🔌 Example integrations with Python, client-side web applications, and Unity
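The evaluation scripts are not part of the repository yet, so the following is only a sketch of the kind of frame-level scoring such a script might perform. It assumes predictions and reference labels are given as equal-length lists of per-frame booleans; the function name and the choice of precision, recall, and F1 are my assumptions, not a description of any planned API or benchmark protocol.

```python
def frame_metrics(predicted, reference):
    """Compute frame-level precision, recall, and F1 for speech detection.

    Both arguments are equal-length sequences of booleans (True = speech).
    Hypothetical helper for illustration; not part of this package.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    true_pos = sum(1 for p, r in zip(predicted, reference) if p and r)
    false_pos = sum(1 for p, r in zip(predicted, reference) if p and not r)
    false_neg = sum(1 for p, r in zip(predicted, reference) if not p and r)

    # Guard against division by zero when a class is never predicted or never present.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# The prediction starts one frame early and ends one frame early.
predicted = [False, True, True, True, False, False]
reference = [False, False, True, True, True, False]
scores = frame_metrics(predicted, reference)
print(scores)
```

In this example there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all come out to 2/3.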

Instructions

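From the root of a local clone of this repository, install the package in editable mode so that changes to the source take effect without reinstalling: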

pip install --editable .
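One way to confirm that the install succeeded is to ask Python's packaging metadata for the installed version. The sketch below uses only the standard library. The distribution name is taken from the published file names, and the helper function is a hypothetical convenience, not part of this package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution name as it appears in the published file names.
status = installed_version("open_voice_activity_detection")
if status is None:
    print("open_voice_activity_detection is not installed")
else:
    print(f"open_voice_activity_detection {status} is installed")
```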

License

This project is released under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0). The license permits both academic and commercial use, provided that attribution is given and that derivative works are shared under the same license.

Download files

Source Distribution

open_voice_activity_detection-0.0.1.tar.gz (9.3 kB)

Built Distribution

open_voice_activity_detection-0.0.1-py3-none-any.whl (12.2 kB, Python 3)

File hashes

Hashes for open_voice_activity_detection-0.0.1.tar.gz

Algorithm    Hash digest
SHA256       bbc1604a463067e76828009bc4bd814549b18d0cf864009c1bed13e0ca98614a
MD5          e9bdb692f236c6d46a5af3102d5e7b8c
BLAKE2b-256  a444ac621405764e6a49ab08322e8274cfa27a91514deeef463ece66489a0fd7

Hashes for open_voice_activity_detection-0.0.1-py3-none-any.whl

Algorithm    Hash digest
SHA256       919ee61f1b1849069cf943bb52f57da6c72743466f87029fd16a129d63ec1c12
MD5          94e710e9167a38f1f886d35782eaf601
BLAKE2b-256  d415d508ff4df50e121e4d23db7141361565b6e51d6eb03c77dddd459f24c204
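The SHA256 digests listed above can be used to check that a downloaded file was not corrupted or altered. The following is a minimal sketch using Python's standard hashlib module. The helper names are hypothetical, and the file path in the usage comment assumes the archive was downloaded into the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (assumes the source distribution is in the current directory):
# matches_digest(
#     "open_voice_activity_detection-0.0.1.tar.gz",
#     "bbc1604a463067e76828009bc4bd814549b18d0cf864009c1bed13e0ca98614a",
# )
```

pip can also enforce digests at install time through its hash-checking mode, which avoids a separate manual step.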
