fast-vad
Extremely fast voice activity detection in Rust with Python bindings and streaming mode support. Significantly faster than WebRTC VAD and orders of magnitude faster than Silero ONNX. See benchmark comparisons.
Supports 16 kHz and 8 kHz sample rates.
Architecture
Audio is split into non-overlapping 32 ms frames (512 samples at 16 kHz, 256 at 8 kHz), Hann-windowed, FFT'd, and collapsed into 8 log-energy bands covering roughly 94-4000 Hz.
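The frame sizes above follow directly from the 32 ms frame length. The sketch below is plain arithmetic rather than code from the crate, and `frame_layout` is a hypothetical helper name used only for illustration.

```python
FRAME_MS = 32  # frame length stated in the text above


def frame_layout(sample_rate: int) -> dict:
    """Frame size and FFT bin spacing for one non-overlapping 32 ms frame."""
    frame_size = sample_rate * FRAME_MS // 1000
    bin_hz = sample_rate / frame_size  # width of one FFT bin in Hz
    return {
        "frame_size": frame_size,
        "bin_hz": bin_hz,
        "nyquist_hz": sample_rate / 2,
    }


for sr in (16000, 8000):
    print(sr, frame_layout(sr))
```

Both sample rates give the same 31.25 Hz bin spacing, so 94 Hz sits close to the third bin (93.75 Hz) and 4000 Hz falls on bin 128. How the crate groups bins into its 8 bands is not shown here.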
Per frame, the detector builds 32 features: 8 raw log-energies, 8 noise-normalised values (raw minus a running noise floor), and their first and second order deltas. A logistic regression model with weights compiled into the crate scores these features and compares the result to a mode-specific threshold. The noise floor is a per-band exponential moving average that only updates on silence frames, so it adapts to background noise without being contaminated by speech.
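To make the noise-floor and scoring steps concrete, here is a minimal NumPy sketch. It is an illustration under assumptions, not the crate's implementation: the smoothing factor `ALPHA`, the function names, and the zero-initialised floor are all hypothetical, and the real weights are compiled into the crate.

```python
import numpy as np

ALPHA = 0.05  # hypothetical smoothing factor, not taken from the crate


def update_noise_floor(floor, log_energy, is_speech, alpha=ALPHA):
    """Move the per-band floor toward the current frame, but only on silence."""
    if is_speech:
        return floor  # frozen during speech so the floor is not contaminated
    return (1.0 - alpha) * floor + alpha * log_energy


def score(features, weights, bias):
    """Logistic regression: probability that the frame contains speech."""
    return 1.0 / (1.0 + np.exp(-(features @ weights + bias)))


floor = np.zeros(8)                      # one value per band
quiet = np.full(8, 1.0)                  # log-energies of a silent frame
floor = update_noise_floor(floor, quiet, is_speech=False)        # floor adapts
floor = update_noise_floor(floor, quiet + 5.0, is_speech=True)   # floor unchanged
normalised = (quiet + 5.0) - floor       # raw minus running noise floor
```

A frame would then be labelled speech when `score(...)` exceeds the mode-specific threshold.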
Raw frame labels are then post-processed: short speech bursts below min_speech_ms are dropped, short silence gaps below min_silence_ms are filled, and voiced regions are extended by hangover_ms to avoid clipping word endings.
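The three post-processing rules can be sketched on a plain array of frame labels. This is a simplified illustration: the order of the steps, the handling of gaps at the start and end of the audio, and the rounding of `hangover_ms` down to whole frames are assumptions, and `runs` and `postprocess` are hypothetical names.

```python
import numpy as np


def runs(labels):
    """Yield (start, end, value) for each run of equal labels; end is exclusive."""
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            yield start, i, bool(labels[start])
            start = i


def postprocess(labels, frame_ms=32, min_speech_ms=100,
                min_silence_ms=300, hangover_ms=100):
    out = np.asarray(labels, dtype=bool).copy()
    # 1. Drop speech bursts shorter than min_speech_ms.
    for s, e, is_speech in list(runs(out)):
        if is_speech and (e - s) * frame_ms < min_speech_ms:
            out[s:e] = False
    # 2. Fill silence gaps shorter than min_silence_ms that sit between speech.
    for s, e, is_speech in list(runs(out)):
        inner = s > 0 and e < len(out)
        if not is_speech and inner and (e - s) * frame_ms < min_silence_ms:
            out[s:e] = True
    # 3. Extend each voiced region by the hangover, in whole frames.
    hang = hangover_ms // frame_ms
    for s, e, is_speech in list(runs(out)):
        if is_speech:
            out[e:e + hang] = True
    return out


raw = [0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
smoothed = postprocess(raw)
```

In this example the isolated one-frame burst is removed, the two-frame gap between the longer bursts is filled, and the merged region is extended by three frames.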
VAD processes all frames in parallel with rayon. VadStateful processes one frame at a time with reused FFT scratch buffers for low-latency streaming. Hot loops are SIMD-accelerated via the wide crate.
Install
Python
pip install fast-vad
Or with uv:
uv add fast-vad
Rust
cargo add fast-vad
Build from source
Python
Requires a Rust toolchain and maturin.
git clone https://github.com/AtharvBhat/fast-vad
cd fast-vad
maturin develop --release
Rust
cargo build --release
Python usage
fast-vad ships with a few built-in modes. VAD() and VadStateful() default to fast_vad.mode.normal for offline and streaming use respectively. To customize parameters, use with_mode, or with_config for finer control.
import numpy as np
import soundfile as sf
import fast_vad
audio, sr = sf.read("audio.wav", dtype="float32")
assert sr in (8000, 16000)
# Default (Normal mode)
vad = fast_vad.VAD(sr)
# Explicit mode
vad = fast_vad.VAD.with_mode(sr, fast_vad.mode.aggressive) # choose permissive, normal or aggressive
# Custom parameters
vad = fast_vad.VAD.with_config(
    sr,
    threshold_probability=0.7,
    min_speech_ms=100,
    min_silence_ms=300,
    hangover_ms=100,
)
# Per-sample labels
labels = vad.detect(audio)
# Per-frame labels
frame_labels = vad.detect_frames(audio)
# Speech segments as an (N, 2) uint64 NumPy array of [start, end] sample indices
segments = vad.detect_segments(audio)
for start, end in segments:
    print(f"speech: {start/sr:.2f}s – {end/sr:.2f}s")
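A common next step is to keep only the voiced audio. The helper below is a small sketch that works on any (N, 2) array of sample indices. `keep_speech` is a hypothetical name, and the code assumes the end index is exclusive, like a Python slice, which the text above does not confirm.

```python
import numpy as np


def keep_speech(audio: np.ndarray, segments: np.ndarray) -> np.ndarray:
    """Concatenate the speech regions of `audio`, dropping everything else."""
    if len(segments) == 0:
        return audio[:0]  # empty array with the same dtype
    # int() converts the uint64 indices to plain Python ints for slicing.
    return np.concatenate([audio[int(s):int(e)] for s, e in segments])


# Synthetic data standing in for `audio` and `vad.detect_segments(audio)`.
demo_audio = np.arange(10, dtype=np.float32)
demo_segments = np.array([[2, 4], [6, 9]], dtype=np.uint64)
speech_only = keep_speech(demo_audio, demo_segments)
```

If the end index turns out to be inclusive, the slice would need `int(e) + 1` instead.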
Streaming
# Default (Normal mode)
vad = fast_vad.VadStateful(sr)
# Explicit mode
vad = fast_vad.VadStateful.with_mode(sr, fast_vad.mode.normal)
# Custom parameters
vad = fast_vad.VadStateful.with_config(sr, 0.7, 100, 300, 100)
frame_size = vad.frame_size # 512 at 16 kHz, 256 at 8 kHz
for i in range(0, len(audio) - frame_size + 1, frame_size):
    is_speech = vad.detect_frame(audio[i : i + frame_size])
    print(f"frame {i // frame_size}: {'speech' if is_speech else 'silence'}")
vad.reset_state() # reuse for another stream
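Live audio rarely arrives in chunks of exactly frame_size samples. One way to bridge that, sketched here with a hypothetical `FrameBuffer` class that is not part of fast-vad, is to accumulate incoming samples and emit only whole frames.

```python
import numpy as np


class FrameBuffer:
    """Accumulates arbitrary-length chunks and hands back whole frames."""

    def __init__(self, frame_size: int):
        self.frame_size = frame_size
        self._pending = np.zeros(0, dtype=np.float32)

    def push(self, chunk):
        """Add samples; return the complete frames and keep the remainder."""
        data = np.concatenate(
            [self._pending, np.asarray(chunk, dtype=np.float32)]
        )
        cut = (len(data) // self.frame_size) * self.frame_size
        self._pending = data[cut:]
        return [data[i:i + self.frame_size]
                for i in range(0, cut, self.frame_size)]


# Intended use with the streaming detector shown above:
#   buf = FrameBuffer(vad.frame_size)
#   for chunk in incoming_chunks:        # hypothetical audio source
#       for frame in buf.push(chunk):
#           is_speech = vad.detect_frame(frame)
```

Leftover samples stay in the buffer until the next chunk completes a frame, so no audio is dropped between calls.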
Feature extraction
fast-vad can also be used on its own as a feature extractor.
fe = fast_vad.FeatureExtractor(sr)
# 8 log-energy band features per frame
features = fe.extract_features(audio) # shape: (num_frames, 8)
# 24-dimensional features per frame: raw bands + first- and second-order deltas
features = fe.feature_engineer(audio) # shape: (num_frames, 24)
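For intuition about the 24-dimensional layout, the sketch below stacks an (num_frames, 8) array with simple backward differences. The library's exact delta formula and column order are not documented here, so treat `add_deltas` as a hypothetical approximation rather than a reimplementation of feature_engineer.

```python
import numpy as np


def add_deltas(bands: np.ndarray) -> np.ndarray:
    """Stack raw bands with first- and second-order frame-to-frame differences."""
    # prepend repeats the first row, so the first frame's delta is zero.
    delta = np.diff(bands, axis=0, prepend=bands[:1])
    delta2 = np.diff(delta, axis=0, prepend=delta[:1])
    return np.concatenate([bands, delta, delta2], axis=1)


demo_bands = np.arange(24, dtype=np.float32).reshape(3, 8)
demo_features = add_deltas(demo_bands)  # shape (3, 24)
```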
Modes
| Constant | Description |
|---|---|
| fast_vad.mode.permissive | Low false-negative rate; more speech accepted |
| fast_vad.mode.normal | Balanced, general-purpose |
| fast_vad.mode.aggressive | Low false-positive rate; stricter |
The built-in modes were tuned against LibriVAD, so they work best on read speech. For other domains (phone calls, meetings, noisy environments, etc.) you'll likely get better results tuning with_config() against your own data.
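Tuning against your own data needs some way to compare predictions with reference labels. A minimal sketch, assuming you have per-frame ground-truth labels of your own (the `frame_metrics` helper and the reference array are hypothetical, not part of fast-vad):

```python
import numpy as np


def frame_metrics(predicted, reference):
    """Frame-level precision, recall and F1 for boolean speech labels."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = int(np.sum(predicted & reference))    # speech correctly detected
    fp = int(np.sum(predicted & ~reference))   # silence labelled as speech
    fn = int(np.sum(~predicted & reference))   # speech that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Possible tuning loop, using the with_config parameters shown earlier:
#   for threshold in (0.3, 0.5, 0.7):
#       vad = fast_vad.VAD.with_config(sr, threshold_probability=threshold,
#                                      min_speech_ms=100, min_silence_ms=300,
#                                      hangover_ms=100)
#       print(threshold, frame_metrics(vad.detect_frames(audio), my_labels))
```

Which metric to optimise depends on the application: favour recall if missed speech is costly, precision if false triggers are.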
Rust usage
Config is set at construction. VAD::new and VadStateful::new default to Normal
mode; use with_mode or with_config to customise.
use fast_vad::vad::detector::{VAD, VADModes, VadConfig};
fn main() -> Result<(), fast_vad::VadError> {
    let audio = vec![0.0f32; 16000]; // 1 second of silence
    // Default (Normal mode)
    let vad = VAD::new(16000)?;
    // Explicit mode
    let vad = VAD::with_mode(16000, VADModes::Aggressive)?;
    // Custom parameters
    let vad = VAD::with_config(16000, VadConfig {
        threshold_probability: 0.7,
        min_speech_ms: 100,
        min_silence_ms: 300,
        hangover_ms: 100,
    })?;
    let labels = vad.detect(&audio); // one bool per sample
    let frame_labels = vad.detect_frames(&audio); // one bool per frame
    let segments = vad.detect_segments(&audio); // Vec<[start, end]>
    Ok(())
}
Streaming
use fast_vad::vad::detector::{VadStateful, VADModes, VadConfig};
fn main() -> Result<(), fast_vad::VadError> {
    let audio = vec![0.0f32; 16000];
    // Default (Normal mode)
    let mut vad = VadStateful::new(16000)?;
    // Explicit mode
    let mut vad = VadStateful::with_mode(16000, VADModes::Normal)?;
    // Custom parameters
    let mut vad = VadStateful::with_config(16000, VadConfig {
        threshold_probability: 0.7,
        min_speech_ms: 100,
        min_silence_ms: 300,
        hangover_ms: 100,
    })?;
    let frame_size = vad.frame_size();
    for frame in audio.chunks_exact(frame_size) {
        let is_speech = vad.detect_frame(frame)?;
        println!("{is_speech}");
    }
    vad.reset_state(); // reuse for another stream
    Ok(())
}
Benchmarking
cargo bench --manifest-path bench_rs/Cargo.toml
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT license (LICENSE-MIT)
at your option.