Silero VAD model for fasr

fasr-vad-silero

Streaming Silero VAD for fasr. The model is loaded through torch.hub and wrapped as a fasr streaming VADModel. It emits AudioChunk objects whose vad_state is segment_start, segment_mid, or segment_end.

Install

pip install fasr-vad-silero

Registered Model

| Registry name | Class | Best for |
|---|---|---|
| stream_silero | SileroStreamVAD | Lightweight streaming VAD |

Streaming Usage

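from fasr.config import registry

# Look up the registered class by name, then construct it with tuning parameters.
vad = registry.vad_models.get("stream_silero")(
    threshold=0.5,
    silence_duration_ms=400,
)

# audio_chunks is an iterable of input audio chunks supplied by the caller (not defined here).
for input_chunk in audio_chunks:
    # Iterate over whatever chunks the VAD emits for this input.
    for speech_chunk in vad.push_chunk(input_chunk):
        print(speech_chunk.vad_state, speech_chunk.start_ms, speech_chunk.end_ms)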
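
The emitted states can be folded into whole speech segments. The sketch below is illustrative only: FakeChunk is a hypothetical stand-in for fasr's AudioChunk that carries just the three fields printed above, and it assumes vad_state compares equal to the plain strings segment_start, segment_mid, and segment_end (the real type may be an enum).

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple


@dataclass
class FakeChunk:
    """Hypothetical stand-in for fasr's AudioChunk (illustration only)."""
    vad_state: str
    start_ms: int
    end_ms: int


def collect_segments(chunks: Iterable[FakeChunk]) -> Iterator[Tuple[int, int]]:
    """Yield (start_ms, end_ms) once for each completed speech segment."""
    seg_start: Optional[int] = None
    for chunk in chunks:
        if chunk.vad_state == "segment_start":
            seg_start = chunk.start_ms
        elif chunk.vad_state == "segment_end" and seg_start is not None:
            yield seg_start, chunk.end_ms
            seg_start = None
        # segment_mid chunks belong to the open segment; nothing to record.


demo = [
    FakeChunk("segment_start", 320, 352),
    FakeChunk("segment_mid", 352, 384),
    FakeChunk("segment_end", 384, 416),
]
segments = list(collect_segments(demo))
print(segments)
```

In real use, the chunks returned by vad.push_chunk would take the place of demo, provided they expose the same three attributes.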

Quick choices:

| Goal | Use | Result |
|---|---|---|
| Reduce noise triggers | threshold=0.65 | Requires higher speech probability |
| Keep quiet speech | threshold=0.35 | More sensitive, with more false-positive risk |
| End speech sooner | silence_duration_ms=200 | Lower endpoint latency |
| Avoid chopping pauses | silence_duration_ms=700 | More tolerant of short pauses |

Confection Config

[vad_model]
@vad_models = "stream_silero"
threshold = 0.5
silence_duration_ms = 400
sample_rate = 16000
chunk_size_ms = 32
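
To see what these values mean in samples and chunks, the arithmetic can be worked out as below. This is a sketch, not the package's code: the helper names are hypothetical, and it assumes silence is counted in whole chunks, which the package may or may not do.

```python
import math

# Values copied from the config above.
config = {"sample_rate": 16000, "chunk_size_ms": 32, "silence_duration_ms": 400}


def samples_per_chunk(sample_rate: int, chunk_size_ms: int) -> int:
    """Number of audio samples in one input chunk."""
    return sample_rate * chunk_size_ms // 1000


def silent_chunks_to_end(silence_duration_ms: int, chunk_size_ms: int) -> int:
    """Whole chunks of silence needed to cover silence_duration_ms (assumed rounding up)."""
    return math.ceil(silence_duration_ms / chunk_size_ms)


n_samples = samples_per_chunk(config["sample_rate"], config["chunk_size_ms"])
n_silent = silent_chunks_to_end(config["silence_duration_ms"], config["chunk_size_ms"])
print(n_samples, n_silent, n_silent * config["chunk_size_ms"])
```

With the defaults, one chunk holds 512 samples, and 13 silent chunks (416 ms) are the first whole-chunk count that covers the 400 ms silence window.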

Parameters

| Parameter | Type / range | Default | Higher value | Lower value | Change when |
|---|---|---|---|---|---|
| threshold | float, 0.0 to 1.0 | 0.5 | More conservative; fewer noise-triggered starts | More sensitive; catches more weak speech | Noise triggers speech starts, or quiet speech is missed |
| silence_duration_ms | int >= 0 | 400 | Tolerates longer pauses before ending speech | Faster endpoint | Speech is chopped, or the endpoint comes late |
| sample_rate | int | 16000 | Keep at 16 kHz for Silero | Keep at 16 kHz for Silero | Usually do not change |
| chunk_size_ms | int | 32 | Larger input chunks, fewer calls | Smaller chunks, lower latency | Realtime scheduling needs tuning |
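
How threshold and silence_duration_ms interact can be illustrated with a small endpointing state machine over per-chunk speech probabilities. This is a minimal sketch under stated assumptions, not SileroStreamVAD's actual logic: the endpoint function is hypothetical, a chunk counts as speech when its probability is at or above the threshold, and silence is measured in whole chunks.

```python
import math
from typing import Iterable, Iterator, Tuple


def endpoint(
    probs: Iterable[float],
    threshold: float = 0.5,
    silence_duration_ms: int = 400,
    chunk_size_ms: int = 32,
) -> Iterator[Tuple[str, int, int]]:
    """Yield (state, start_ms, end_ms) for each chunk inside a speech segment."""
    # Assumed rule: end the segment after this many consecutive silent chunks.
    silence_limit = max(1, math.ceil(silence_duration_ms / chunk_size_ms))
    in_speech = False
    silent_run = 0
    for i, p in enumerate(probs):
        start_ms, end_ms = i * chunk_size_ms, (i + 1) * chunk_size_ms
        is_speech = p >= threshold
        if not in_speech:
            if is_speech:
                in_speech, silent_run = True, 0
                yield "segment_start", start_ms, end_ms
            continue
        silent_run = 0 if is_speech else silent_run + 1
        if silent_run >= silence_limit:
            in_speech = False
            yield "segment_end", start_ms, end_ms
        else:
            yield "segment_mid", start_ms, end_ms


# Synthetic probabilities: speech, a two-chunk pause, then more speech.
probs = [0.1, 0.9, 0.8, 0.2, 0.2, 0.9, 0.1, 0.1, 0.1]
short = [s for s, _, _ in endpoint(probs, silence_duration_ms=64)]
longer = [s for s, _, _ in endpoint(probs, silence_duration_ms=96)]
print(short.count("segment_start"), longer.count("segment_start"))
```

On this synthetic input, the shorter silence window splits the audio into two segments at the pause, while the longer window bridges the pause and keeps a single segment.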

Tuning Guide

| Symptom | Try first |
|---|---|
| Background noise starts speech | Raise threshold to 0.6 or 0.7 |
| Quiet speech is missed | Lower threshold to 0.35 or 0.4 |
| Speech ends too late | Lower silence_duration_ms to 200 or 300 |
| Speech is split during pauses | Raise silence_duration_ms to 600 or 800 |

Dependencies

  • fasr
  • numpy >= 1.24
  • torch >= 2.0.0
  • Python 3.10-3.12
