Fully open-source and state-of-the-art Voice Activity Detection (VAD) models for academic research and commercial applications.

Fully Open-Source Voice Activity Detection (VAD) for Real-Time Speech Applications

Voice Activity Detection (VAD) is a critical first step in any application involving speech recognition. However, while exploring real-time voice chat agents, I found that many state-of-the-art (SoTA) models are not truly open source: they release only open weights, which limits transparency and hinders research and development.

This repository aims to change that by providing a fully open and research-friendly implementation of the Silero VAD model. The goal is to advance the state of the art in VAD through open experimentation, training, and integration.

Status

As of May 27, 2025, this repository includes:

✅ A complete implementation of the Silero VAD model for research use

Roadmap

In the near future, I plan to add the following:

🧠 Code to train Silero VAD from scratch on custom datasets

📊 Evaluation scripts for standard VAD benchmarks

🔧 Support for LoRA fine-tuning to extend or adapt Silero VAD

🔌 Example integrations with Python, client-side web applications, and Unity

Instructions

From the root of a local clone of this repository, install the package in editable mode:

pip install --editable .
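
To confirm that the install succeeded, one option is to query the installed distribution's metadata from Python. The following is a minimal sketch, not part of this project: the helper `installed_version` is hypothetical, and the distribution name `open_voice_activity_detection` is an assumption based on the name of the published package files.

```python
from importlib import metadata
from typing import Optional

# Assumption: the distribution name matches the published package file name.
DIST_NAME = "open_voice_activity_detection"


def installed_version(dist_name: str = DIST_NAME) -> Optional[str]:
    """Return the installed version of dist_name, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print(f"{DIST_NAME} is not installed in this environment")
    else:
        print(f"{DIST_NAME} {version} is installed")
```

Because the check reads package metadata rather than importing a module, it does not rely on knowing the project's import name.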

License

This project is released under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0). The license permits both academic research and commercial use, provided that attribution is given and derivative works are shared under the same terms.
