Load arbitrary Voice Activity Detection (VAD) models behind a unified ONNX API

vadonnx

Load arbitrary Voice Activity Detection models behind a single, unified API — every model runs through ONNX Runtime. One streaming/batch interface, one audio-format story, pluggable models.

from vadonnx import load_vad

vad = load_vad("silero")                              # bundled, works fully offline
# pcm_bytes: one chunk of raw int16 PCM; audio: a whole buffer (int16 bytes or numpy array)
prob = vad.process_chunk(pcm_bytes)                   # streaming → float in [0, 1]
segments = vad.get_speech_segments(audio, sample_rate=16000)
# -> [SpeechSegment(start=0.32, end=2.27), SpeechSegment(start=3.27, end=4.45), ...]
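To make the link between per-frame probabilities and speech segments concrete, here is a minimal sketch of one common approach: threshold each frame and merge runs of consecutive speech frames. It is not taken from vadonnx's source. The `Segment` class, the `probs_to_segments` function, the 0.5 threshold and the frame duration are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for a speech segment, times in seconds."""
    start: float
    end: float


def probs_to_segments(probs, frame_s=0.032, threshold=0.5):
    """Merge consecutive frames whose probability is >= threshold."""
    segments = []
    start = None  # index of the first frame of the currently open segment
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # speech begins
        elif p < threshold and start is not None:
            segments.append(Segment(start * frame_s, i * frame_s))
            start = None  # speech ends
    if start is not None:
        # The buffer ended while speech was still active.
        segments.append(Segment(start * frame_s, len(probs) * frame_s))
    return segments


frame_probs = [0.1, 0.9, 0.8, 0.2, 0.7, 0.6]
segments = probs_to_segments(frame_probs, frame_s=0.5)
```

Real detectors usually add hysteresis, minimum durations and padding on top of this; the sketch leaves those out to keep the core idea visible.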

Why

Every VAD ships its own loader, audio format, feature pipeline and state handling. vadonnx hides that behind one VADModel interface: feed it audio (raw int16 bytes, numpy arrays, any sample rate) and get back per-frame speech probabilities or ready-made speech segments. Models are described declaratively by an IOSignature, so a single generic engine drives most of them and you can point the same API at any custom .onnx file.
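As an illustration of what accepting raw int16 bytes involves, the sketch below decodes PCM bytes into floats in [-1.0, 1.0). It uses only the standard library and is not vadonnx's own conversion code. The helper name `int16_bytes_to_floats` and the little-endian assumption are hypothetical.

```python
import array
import struct
import sys


def int16_bytes_to_floats(pcm_bytes):
    """Decode int16 PCM bytes (assumed little-endian) into floats in [-1.0, 1.0)."""
    samples = array.array("h")      # "h" = signed 16-bit integers
    samples.frombytes(pcm_bytes)    # raises ValueError on an odd byte count
    if sys.byteorder == "big":
        samples.byteswap()          # the payload is little-endian by assumption
    return [s / 32768.0 for s in samples]


# Three samples: silence, half scale, and the most negative value.
pcm = struct.pack("<3h", 0, 16384, -32768)
floats = int16_bytes_to_floats(pcm)
```

A library that also accepts numpy arrays and arbitrary sample rates would need a resampling step as well, which is outside the scope of this sketch.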

Lightweight runtime — only numpy, onnxruntime, huggingface_hub.
Offline by default — a small Silero model is bundled in the wheel.
Streaming and batch — process_chunk() for live audio, get_speech_segments() / probabilities() for whole buffers.
Bring your own model — load any ONNX VAD by path/URL with a signature.
Extensible — third parties register backends/models via entry points.
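Entry-point based extension, as mentioned in the last item above, generally works by having the host package enumerate plugins that other distributions declare in their packaging metadata. The sketch below shows that discovery step with the standard library. The group name `vadonnx.models` is a placeholder, since the real group name is not stated here, and `discover_plugins` is a hypothetical helper.

```python
from importlib.metadata import entry_points

# Hypothetical group name, used only for illustration.
GROUP = "vadonnx.models"


def discover_plugins(group=GROUP):
    """Return {name: EntryPoint} for every plugin registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        selected = eps.select(group=group)   # Python 3.10 and later
    else:
        selected = eps.get(group, [])        # older dict-style return value
    return {ep.name: ep for ep in selected}


plugins = discover_plugins()
```

Calling `.load()` on any returned entry point imports the object the plugin author registered. Consult the project's own documentation for the actual group names and the expected object type.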

Install

uv pip install vadonnx          # runtime (numpy + onnxruntime + huggingface_hub)
uv pip install "vadonnx[mic]"   # + microphone examples

Models

name	sample rate (Hz)	parity vs upstream	notes
`silero` / `silero-8k` / `silero-op15`	16k / 8k / 16k	MAE 0	bundled default, raw PCM
`marblenet` / `marblenet-int8`	16k	MAE 4e-4	NVIDIA NeMo Frame-VAD, multilingual (license)
`pyannote` / `pyannote-int8`	16k	MAE 0	pyannote segmentation-3.0, windowed
`fsmn` / `fsmn-quant`	16k	tracks upstream	FunASR FSMN-VAD; needs `vadonnx[fsmn]`
`speechbrain`	16k	MAE 0	SpeechBrain CRDNN, LibriParty-trained
`ten`	16k	—	feature extractor provided by TEN's native library

See docs/backends.md for per-model detail and parity notes, and the benchmark for measured comparisons across datasets, including WebRTC and energy baselines.

Models other than the bundled Silero are downloaded on first use from the TigreGotico HuggingFace org and cached under $XDG_DATA_HOME/vadonnx.
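To locate the cache directory mentioned above, a small sketch like the following may help. It resolves `$XDG_DATA_HOME/vadonnx` and falls back to `~/.local/share`, which is the default defined by the XDG Base Directory Specification. Whether vadonnx applies the same fallback is an assumption, and `vadonnx_data_dir` is a hypothetical helper, not part of the package.

```python
import os
from pathlib import Path


def vadonnx_data_dir(env=None):
    """Resolve $XDG_DATA_HOME/vadonnx, using the XDG default when unset or empty."""
    env = os.environ if env is None else env
    base = env.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(base) / "vadonnx"


# Passing a mapping makes the lookup easy to try without touching the real environment.
cache_dir = vadonnx_data_dir({"XDG_DATA_HOME": "/tmp/xdg"})
```

Calling `vadonnx_data_dir()` with no argument reads the real environment, which is useful when checking how much disk space downloaded models occupy.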

CLI

vadonnx list                       # list available models
vadonnx probe silero               # print a model's ONNX input/output signature
vadonnx segment speech.wav         # print detected speech segments of a WAV
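The `segment` command works on a WAV file. How the CLI reads that file internally is not described here, but a rough equivalent for preparing input to the Python API can be sketched with the standard `wave` module. The helper `read_wav_pcm` is hypothetical, and the demo writes its own silent file so that the snippet runs without any external audio.

```python
import os
import tempfile
import wave


def read_wav_pcm(path):
    """Return (pcm_bytes, sample_rate, channels, sample_width_bytes) for a WAV file."""
    with wave.open(path, "rb") as wav:
        pcm = wav.readframes(wav.getnframes())
        return pcm, wav.getframerate(), wav.getnchannels(), wav.getsampwidth()


# Demo: write one second of 16 kHz mono int16 silence, then read it back.
demo_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)       # 2 bytes per sample = int16
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 16000)

pcm_bytes, sample_rate, channels, width = read_wav_pcm(demo_path)
```

Checking `channels` and `width` before handing the bytes to a detector is worthwhile, since stereo or 24-bit files would need conversion first.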

License

Apache-2.0. Bundled/downloaded model weights retain their upstream licenses — see docs/licensing.md.

Release history

0.1.0: Jun 16, 2026
0.1.0a2 (pre-release, this version): Jun 16, 2026

Download files

Source Distribution: vadonnx-0.1.0a2.tar.gz (2.3 MB), uploaded Jun 16, 2026, Source
Built Distribution: vadonnx-0.1.0a2-py3-none-any.whl (2.0 MB), uploaded Jun 16, 2026, Python 3
File details

Details for the file vadonnx-0.1.0a2.tar.gz.

File metadata

Download URL: vadonnx-0.1.0a2.tar.gz
Upload date: Jun 16, 2026
Size: 2.3 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vadonnx-0.1.0a2.tar.gz
Algorithm	Hash digest
SHA256	`4bd68b7354ddc21d3eaaae20f5bb888db675e2d39d3e620d50a06eac292f93a9`
MD5	`d74fa1d8b2288cada3c22419e2e648c5`
BLAKE2b-256	`23a44bd2a53abe6a386d4e384b423454d3b21292f49d0be6651af1ed7988a67f`

File details

Details for the file vadonnx-0.1.0a2-py3-none-any.whl.

File metadata

Download URL: vadonnx-0.1.0a2-py3-none-any.whl
Upload date: Jun 16, 2026
Size: 2.0 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vadonnx-0.1.0a2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`efba50933fec70d7c6c4ebdf0779d97ce7dbba34e69c3e382ee22d1879f6cd44`
MD5	`f88bff5cd25967d15baed28d604a4507`
BLAKE2b-256	`9e8be4f50671c4f0717e504e15cf4f1d47b5218190b9f4c84fd8814261ed2f58`

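The digests listed above can be checked against a downloaded file. The following sketch streams a file through SHA-256 with the standard `hashlib` module and compares the result to a published value. The helper names `sha256_of` and `matches` are hypothetical, and the local file path is whatever the reader saved the download as.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size blocks until read() returns an empty bytes object.
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA-256 against a published digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("vadonnx-0.1.0a2-py3-none-any.whl", "efba50933fec70d7c6c4ebdf0779d97ce7dbba34e69c3e382ee22d1879f6cd44")` should return `True` for an intact copy of the wheel listed above.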