fast-vad
Extremely fast voice activity detection in Rust with Python bindings and streaming mode support.
Supports 16 kHz and 8 kHz audio. The frame width is fixed at 32 ms (512 samples at 16 kHz, 256 samples at 8 kHz).
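The frame sizes follow directly from the 32 ms frame width. A minimal sketch of that arithmetic is below; `frame_size_for` and `complete_frames` are hypothetical helpers written for illustration and are not part of the fast_vad API.

```python
FRAME_MS = 32  # fixed frame width stated in this README


def frame_size_for(sample_rate: int) -> int:
    """Samples per 32 ms frame (hypothetical helper, not a fast_vad function)."""
    if sample_rate not in (8000, 16000):
        raise ValueError("only 8 kHz and 16 kHz are supported")
    return sample_rate * FRAME_MS // 1000


def complete_frames(num_samples: int, sample_rate: int) -> int:
    """Number of complete frames in a buffer; this helper ignores a trailing partial frame."""
    return num_samples // frame_size_for(sample_rate)


print(frame_size_for(16000))          # 512
print(frame_size_for(8000))           # 256
print(complete_frames(16000, 16000))  # 31
```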
If you are interested in benchmark comparisons, see docs/README.md.
Benchmarking
Python benchmarks live in bench_py/ and Rust benchmarks live in bench_rs/.
uv run pytest bench_py/bench_vad.py bench_py/bench_feature_extractor.py --benchmark-sort=mean --benchmark-group-by=group
cargo bench --manifest-path bench_rs/Cargo.toml
Architecture
fast_vad is a small fixed-frame DSP pipeline with a hardcoded lightweight classifier.
audio
  -> 32 ms frames
       - 16 kHz: 512 samples
       - 8 kHz: 256 samples
  -> Hann window
  -> real FFT
  -> 8 log-energy bands
  -> feature engineering
       - raw bands (8)
       - noise-normalized (8)
       - first deltas (8)
       - second deltas (8)
       = 32 total features
  -> hardcoded logistic regression
  -> threshold + smoothing
  -> speech / silence labels
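To make the first stages of the pipeline concrete, here is a rough pure-Python sketch of "Hann window -> spectrum -> 8 log-energy bands". It rests on several assumptions that are not taken from the crate: the bands are equal-width, the DC bin is skipped, the log floor is `1e-10`, and a naive O(n^2) DFT stands in for the real FFT to keep the example dependency-free. The function names are hypothetical.

```python
import cmath
import math

NUM_BANDS = 8


def hann(n: int) -> list[float]:
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def power_spectrum(frame: list[float]) -> list[float]:
    """Power of the non-negative-frequency bins via a naive O(n^2) DFT."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hann(n))]
    # Precompute exp(-2*pi*i*m/n) for m in [0, n) and index it modulo n.
    twiddles = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    power = []
    for k in range(n // 2 + 1):
        acc = sum(x * twiddles[(k * t) % n] for t, x in enumerate(windowed))
        power.append(abs(acc) ** 2)
    return power


def log_band_energies(frame: list[float]) -> list[float]:
    """Sum power into 8 equal-width bands (an assumed layout) and take the log."""
    power = power_spectrum(frame)[1:]  # skip the DC bin (assumption)
    width = len(power) // NUM_BANDS
    bands = []
    for b in range(NUM_BANDS):
        energy = sum(power[b * width:(b + 1) * width])
        bands.append(math.log(energy + 1e-10))  # floor avoids log(0)
    return bands
```

With a 512-sample frame this yields 256 non-DC bins, so each assumed band covers 32 bins; a 256-sample frame gives 16 bins per band.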
At a glance:
- VAD (offline / batch) splits audio into 32 ms frames and uses rayon to process complete frames in parallel while extracting the 8-band features.
- VadStateful (streaming) runs the same per-frame pipeline one frame at a time and reuses scratch buffers instead of paying thread-pool overhead.
- The detector keeps a running 8-band noise floor, then derives 32 total features from each frame: raw band energies, noise-normalized energies, first-order deltas, and second-order deltas.
- Classification is a tiny hardcoded logistic-regression-style model with fixed weights and bias compiled into the crate.
- The final decision is shaped by simple temporal rules: thresholding, minimum speech length, minimum silence length, and hangover.
- Hot loops are SIMD-accelerated with the wide crate for windowing, spectral power computation, band-energy math, and detector feature math.
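The temporal rules in the list above can be read in more than one way. The sketch below shows one plausible interpretation, not the crate's actual logic: `smooth_decisions` is a hypothetical function, all durations are expressed in frames rather than milliseconds, and the order in which the rules are applied is an assumption.

```python
def smooth_decisions(probs, threshold, min_speech, min_silence, hangover):
    """Turn per-frame probabilities into labels; all durations are in frames."""
    raw = [p >= threshold for p in probs]

    # Hangover: keep reporting speech for a few frames after the last hit.
    labels, remaining = [], 0
    for hit in raw:
        if hit:
            remaining = hangover
            labels.append(True)
        elif remaining > 0:
            remaining -= 1
            labels.append(True)
        else:
            labels.append(False)

    def runs(values):
        """Yield (value, start, end) for each run of equal values; end is exclusive."""
        start = 0
        for i in range(1, len(values) + 1):
            if i == len(values) or values[i] != values[start]:
                yield values[start], start, i
                start = i

    # Fill silence gaps shorter than min_silence that sit between two speech runs.
    for value, start, end in list(runs(labels)):
        if not value and 0 < start and end < len(labels) and end - start < min_silence:
            labels[start:end] = [True] * (end - start)

    # Drop speech runs shorter than min_speech.
    for value, start, end in list(runs(labels)):
        if value and end - start < min_speech:
            labels[start:end] = [False] * (end - start)

    return labels
```

At 32 ms per frame, a 100 ms setting corresponds to roughly 3 frames, so a call such as `smooth_decisions(probs, 0.7, 3, 9, 3)` approximates the custom configuration shown in the usage sections.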
frame features (8 bands)
    | raw
    | raw - noise_floor
    | delta
    | delta2
    v
32 engineered features
    v
linear score + bias
    v
speech / silence
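The diagram above can be sketched in a few lines of Python. This is an illustration under assumptions, not the crate's code: `FeatureState`, `engineer_features`, and `speech_probability` are hypothetical names, the noise floor is taken as an input because its update rule is not described here, and the weights and bias are placeholders rather than the values compiled into the crate.

```python
import math

NUM_BANDS = 8


class FeatureState:
    """Per-stream memory needed to compute deltas (hypothetical structure)."""

    def __init__(self):
        self.prev_bands = [0.0] * NUM_BANDS
        self.prev_delta = [0.0] * NUM_BANDS


def engineer_features(bands, noise_floor, state):
    """Build 32 features: raw, raw - noise_floor, delta, delta2 (8 each)."""
    normalized = [b - n for b, n in zip(bands, noise_floor)]
    delta = [b - p for b, p in zip(bands, state.prev_bands)]
    delta2 = [d - p for d, p in zip(delta, state.prev_delta)]
    state.prev_bands = list(bands)
    state.prev_delta = delta
    return list(bands) + normalized + delta + delta2


def speech_probability(features, weights, bias):
    """Logistic regression: sigmoid of the linear score plus bias."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    # Numerically stable sigmoid that avoids overflow for large |score|.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)
```

The resulting probability is what a threshold such as `threshold_probability` would be compared against before the temporal rules are applied.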
Build from source
Python (with uv)
Requires uv and a Rust toolchain.
git clone https://github.com/AtharvBhat/fast-vad
cd fast-vad
uv venv
uv pip install maturin
uv run maturin develop --release
The package is then importable inside the virtual environment.
Rust
cargo build --release
Add as a dependency in another crate:
[dependencies]
fast-vad = { path = "/path/to/fast-vad" }
Python usage
Config is set at construction time. VAD() and VadStateful() default to Normal
mode; use with_mode or with_config to customise.
import numpy as np
import soundfile as sf
import fast_vad
audio, sr = sf.read("audio.wav", dtype="float32")
assert sr in (8000, 16000)
# Default (Normal mode)
vad = fast_vad.VAD(sr)
# Explicit mode
vad = fast_vad.VAD.with_mode(sr, fast_vad.mode.aggressive)
# Custom parameters
vad = fast_vad.VAD.with_config(
    sr,
    threshold_probability=0.7,
    min_speech_ms=100,
    min_silence_ms=300,
    hangover_ms=100,
)
# Per-sample labels
labels = vad.detect(audio)
# Per-frame labels
frame_labels = vad.detect_frames(audio)
# Speech segments as a (N, 2) uint64 numpy array of [start, end] sample indices
segments = vad.detect_segments(audio)
for start, end in segments:
    print(f"speech: {start/sr:.2f}s – {end/sr:.2f}s")
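To illustrate how per-frame labels relate to sample-index segments, the sketch below merges consecutive speech frames into ranges. `frames_to_segments` is a hypothetical helper, not a fast_vad function, and it assumes the end index is exclusive; this README does not state whether `detect_segments` follows the same convention.

```python
def frames_to_segments(frame_labels, frame_size):
    """Merge consecutive speech frames into (start, end) sample ranges; end is exclusive."""
    segments = []
    start = None
    for i, is_speech in enumerate(frame_labels):
        if is_speech and start is None:
            start = i * frame_size  # a speech run begins at this frame
        elif not is_speech and start is not None:
            segments.append((start, i * frame_size))  # the run ended before frame i
            start = None
    if start is not None:
        # Speech continues to the end of the labelled audio.
        segments.append((start, len(frame_labels) * frame_size))
    return segments


print(frames_to_segments([False, True, True, False, True], 512))
# [(512, 1536), (2048, 2560)]
```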
Streaming
# Default (Normal mode)
vad = fast_vad.VadStateful(sr)
# Explicit mode
vad = fast_vad.VadStateful.with_mode(sr, fast_vad.mode.normal)
# Custom parameters
vad = fast_vad.VadStateful.with_config(sr, 0.7, 100, 300, 100)
frame_size = vad.frame_size # 512 at 16 kHz, 256 at 8 kHz
for i in range(0, len(audio) - frame_size + 1, frame_size):
    is_speech = vad.detect_frame(audio[i : i + frame_size])
    print(f"frame {i // frame_size}: {'speech' if is_speech else 'silence'}")
vad.reset_state() # reuse for another stream
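In a live stream, audio rarely arrives in chunks that match the frame size exactly. One way to bridge that gap is a small buffer that accumulates incoming samples and emits fixed-size frames. `FrameBuffer` below is a hypothetical helper, not part of fast_vad, and it uses plain Python lists to stay dependency-free.

```python
class FrameBuffer:
    """Accumulate arbitrary-length chunks and emit fixed-size frames."""

    def __init__(self, frame_size):
        self.frame_size = frame_size
        self._pending = []

    def push(self, chunk):
        """Add samples and return the list of complete frames now available."""
        self._pending.extend(chunk)
        frames = []
        while len(self._pending) >= self.frame_size:
            frames.append(self._pending[:self.frame_size])
            del self._pending[:self.frame_size]
        return frames

    def clear(self):
        """Discard leftover samples, for example when starting a new stream."""
        self._pending.clear()


buf = FrameBuffer(4)
print(buf.push([1, 2, 3]))           # []
print(buf.push([4, 5, 6, 7, 8, 9]))  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

Each emitted frame could then be converted to a float32 NumPy array and passed to `vad.detect_frame`, with `clear()` called alongside `vad.reset_state()` when switching streams.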
Feature extraction
fe = fast_vad.FeatureExtractor(sr)
features = fe.extract_features(audio) # shape: (num_frames, 8)
Modes
| Constant | Description |
|---|---|
| fast_vad.mode.permissive | Low false-negative rate; more speech accepted |
| fast_vad.mode.normal | Balanced, general-purpose |
| fast_vad.mode.aggressive | Low false-positive rate; stricter |
Rust usage
Config is set at construction. VAD::new and VadStateful::new default to Normal
mode; use with_mode or with_config to customise.
use fast_vad::vad::detector::{VAD, VADModes, VadConfig};
fn main() -> Result<(), fast_vad::VadError> {
    let audio = vec![0.0f32; 16000]; // 1 second of silence
    // Default (Normal mode)
    let vad = VAD::new(16000)?;
    // Explicit mode
    let vad = VAD::with_mode(16000, VADModes::Aggressive)?;
    // Custom parameters
    let vad = VAD::with_config(16000, VadConfig {
        threshold_probability: 0.7,
        min_speech_ms: 100,
        min_silence_ms: 300,
        hangover_ms: 100,
    })?;
    let labels = vad.detect(&audio); // one bool per sample
    let frame_labels = vad.detect_frames(&audio); // one bool per frame
    let segments = vad.detect_segments(&audio); // Vec<[start, end]>
    Ok(())
}
Streaming
use fast_vad::vad::detector::{VadStateful, VADModes, VadConfig};
fn main() -> Result<(), fast_vad::VadError> {
    let audio = vec![0.0f32; 16000];
    // Default (Normal mode)
    let mut vad = VadStateful::new(16000)?;
    // Explicit mode
    let mut vad = VadStateful::with_mode(16000, VADModes::Normal)?;
    // Custom parameters
    let mut vad = VadStateful::with_config(16000, VadConfig {
        threshold_probability: 0.7,
        min_speech_ms: 100,
        min_silence_ms: 300,
        hangover_ms: 100,
    })?;
    let frame_size = vad.frame_size();
    for frame in audio.chunks_exact(frame_size) {
        let is_speech = vad.detect_frame(frame)?;
        println!("{is_speech}");
    }
    vad.reset_state(); // reuse for another stream
    Ok(())
}
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT license (LICENSE-MIT)
at your option.