
FSMN VAD model for fasr

Project description

fasr-vad-fsmn

Chinese documentation

FSMN voice activity detection for fasr. The offline fsmn model delegates feature extraction and ONNX inference to funasr_onnx; the plugin also provides fsmn_online for streaming VAD.

Install

pip install fasr-vad-fsmn

Registered Models

| Registry name | Class | Best for |
| --- | --- | --- |
| `fsmn` | `FSMNVad` | Offline VAD, segmenting complete audio into speech spans |
| `fsmn_online` | `FSMNVadOnline` | Streaming VAD, emitting speech chunks as audio arrives |

Pipeline Usage

Any keyword argument other than component, model, batch_size, and the other pipe options is forwarded to the detector's model. Put FSMN parameters directly on the detector pipe:

from fasr import AudioPipeline

pipeline = (
    AudioPipeline()
    .add_pipe(
        "detector",
        model="fsmn",
        max_end_silence_time=600,
        speech_noise_thres=0.55,
        num_threads=4,
    )
    .add_pipe("recognizer", model="paraformer")
    .add_pipe("sentencizer", model="ct_transformer")
)
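The forwarding rule above can be illustrated with a toy sketch (this is not the fasr implementation; the option names are taken from the example, and `split_pipe_kwargs` is a hypothetical helper): known pipe options stay on the pipe, everything else goes to the model.

```python
# Toy illustration of the kwargs-forwarding rule: keyword arguments that
# are not recognized pipe options are passed through to the model.
PIPE_OPTIONS = {"component", "model", "batch_size", "batch_timeout"}

def split_pipe_kwargs(**kwargs):
    pipe_opts = {k: v for k, v in kwargs.items() if k in PIPE_OPTIONS}
    model_opts = {k: v for k, v in kwargs.items() if k not in PIPE_OPTIONS}
    return pipe_opts, model_opts

pipe_opts, model_opts = split_pipe_kwargs(
    model="fsmn", batch_size=4, max_end_silence_time=600, num_threads=4
)
# pipe_opts  -> {"model": "fsmn", "batch_size": 4}
# model_opts -> {"max_end_silence_time": 600, "num_threads": 4}
```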

Quick choices:

| Goal | Use | Result |
| --- | --- | --- |
| Keep long sentences together | `max_end_silence_time=1000` | Short pauses inside a sentence are less likely to split the segment |
| Lower endpoint latency | `max_end_silence_time=300` | Segments end sooner, but sentences may be split more often |
| Suppress noisy backgrounds | `speech_noise_thres=0.7` | Fewer noise false positives, with higher risk of missing quiet speech |
| Keep quiet or far-field speech | `speech_noise_thres=0.45` | More sensitive detection, with higher risk of including noise |
| Increase CPU throughput | `num_threads=4` or `num_threads=8` | More ONNX Runtime CPU parallelism, with higher CPU usage |
| Use GPU | `device_id=0` | Uses GPU 0 through ONNX Runtime, after installing onnxruntime-gpu |

Confection Config

fasr config files use Confection's TOML-style syntax, not YAML.

To configure only the VAD model:

[vad_model]
@vad_models = "fsmn"
max_end_silence_time = 600
speech_noise_thres = 0.55
num_threads = 4

Inside a pipeline, model parameters live under pipeline.pipes.detector.component.model:

[pipeline]
@pipelines = "AudioPipeline.v1"
pipe_order = ["detector"]

[pipeline.pipes]

[pipeline.pipes.detector]
@pipes = "thread_pipe"
batch_size = 4
batch_timeout = 0.1

[pipeline.pipes.detector.component]
@components = "detector"
num_threads = 2
max_segment_duration = 30.0

[pipeline.pipes.detector.component.model]
@vad_models = "fsmn"
max_end_silence_time = 600
speech_noise_thres = 0.55
num_threads = 4

Direct Model Usage

Model construction automatically downloads and loads the checkpoint.

from fasr.config import registry
from fasr.data import AudioSpan, Waveform

model = registry.vad_models.get("fsmn")(
    max_end_silence_time=600,
    speech_noise_thres=0.55,
)

audio = AudioSpan(waveform=Waveform.from_file("example.wav"), start_ms=0)
segments = model.detect(audio)
for segment in segments:
    print(f"{segment.start_ms}ms - {segment.end_ms}ms")
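To slice the waveform by a detected span, the millisecond boundaries map onto sample indices. `span_to_sample_range` is a hypothetical helper, not part of the fasr API, and it assumes the model's 16 kHz input rate.

```python
# Hypothetical helper: convert a span's millisecond boundaries into
# sample indices for a 16 kHz waveform (16 samples per millisecond).
def span_to_sample_range(start_ms, end_ms, sample_rate=16000):
    def to_samples(ms):
        return int(ms * sample_rate / 1000)
    return to_samples(start_ms), to_samples(end_ms)

start, end = span_to_sample_range(120, 980)  # -> (1920, 15680)
```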

Use a local checkpoint directory when needed:

model.load_checkpoint("/path/to/fsmn-vad")

Parameters

Offline fsmn exposes only the parameters that still affect funasr_onnx inference. Generic checkpoint fields such as checkpoint, cache_dir, endpoint, revision, and force_download are inherited from the base model.

| Parameter | Type / range | Default | Higher value | Lower value | Change when |
| --- | --- | --- | --- | --- | --- |
| `sample_rate` | int, recommended 16000 | 16000 | Not recommended; adds resampling/inference cost | Not recommended; may lose speech detail | Usually never; keep model input at 16 kHz |
| `device_id` | `None`, `-1`, `"cpu"`, or a GPU id like `0` | `None` | A GPU id runs on that GPU | `None` / `-1` / `"cpu"` runs on CPU | You need lower latency or higher concurrency |
| `num_threads` | int >= 0 | 2 | Often faster on CPU, but uses more cores | Saves CPU, may slow inference | CPU deployment needs tuning |
| `max_end_silence_time` | int >= 0, milliseconds | 800 | More tolerant of pauses; longer, more complete segments; later endpoint | Faster endpoint; more fragmented segments | Sentences are split too often, or endpoint latency is too high |
| `speech_noise_thres` | float, 0.0 to 1.0 | 0.6 | More conservative; fewer noise false positives; may miss quiet speech | More sensitive; keeps weak speech; may include noise | Noise is detected as speech, or quiet speech is missed |
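The endpointing role of `max_end_silence_time` can be sketched as a loop over per-frame speech decisions. This is a toy illustration, not the FSMN implementation; the 10 ms frame size and the boolean frame labels are assumptions for the example.

```python
# Toy endpointer: close the segment once trailing silence after speech
# accumulates to max_end_silence_time milliseconds.
def find_endpoint(frame_is_speech, frame_ms=10, max_end_silence_time=800):
    """Return the frame index just past the last speech frame once the
    silence budget is exhausted, or None if no endpoint is reached."""
    silence_run = 0
    last_speech = None
    for i, is_speech in enumerate(frame_is_speech):
        if is_speech:
            silence_run = 0
            last_speech = i
        elif last_speech is not None:
            silence_run += frame_ms
            if silence_run >= max_end_silence_time:
                return last_speech + 1
    return None

frames = [True] * 5 + [False] * 100  # 50 ms of speech, then 1 s of silence
find_endpoint(frames, max_end_silence_time=800)   # endpoint after frame 5
find_endpoint(frames, max_end_silence_time=1200)  # silence budget not exhausted -> None
```

With a larger budget the same pause no longer triggers an endpoint, which is exactly why raising the value keeps sentences together at the cost of later endpoints.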

Tuning Guide

| Symptom | Try first |
| --- | --- |
| One sentence is split into many pieces | Raise `max_end_silence_time` to 1000 or 1200 |
| Speech end is detected too late | Lower `max_end_silence_time` to 300 to 500 |
| Background noise becomes speech | Raise `speech_noise_thres` to 0.7 or 0.8 |
| Quiet or far-field speech is missed | Lower `speech_noise_thres` to 0.45 or 0.5 |
| CPU usage is too high | Lower `num_threads` |
| CPU inference is too slow | Raise `num_threads`, or install onnxruntime-gpu and set `device_id=0` |

For fsmn_online, use device="cpu" or device="cuda" instead of device_id. It also exposes chunk_size_ms: smaller chunks improve realtime responsiveness but increase scheduling overhead; larger chunks improve throughput but delay output. The default 100 ms is a good starting point.
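The chunking trade-off can be made concrete with a minimal sketch. How chunks are fed to `fsmn_online` is not shown in this document; the helper below only illustrates the size arithmetic, where the 100 ms default maps to 1600 samples at 16 kHz.

```python
# Minimal sketch: split a waveform into fixed-size chunks for streaming.
# Smaller chunk_size_ms means more, smaller chunks (lower latency, more
# per-chunk overhead); larger means fewer, bigger chunks (higher throughput).
def chunk_waveform(samples, sample_rate=16000, chunk_size_ms=100):
    step = sample_rate * chunk_size_ms // 1000  # 1600 samples at 16 kHz / 100 ms
    return [samples[i:i + step] for i in range(0, len(samples), step)]

chunks = chunk_waveform([0.0] * 16000)  # one second of audio -> 10 chunks
```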

CPU / GPU

The default runtime is CPU ONNX Runtime. During model loading, the plugin logs whether CPU or GPU is being used.

For GPU inference:

uv pip install onnxruntime-gpu

model = registry.vad_models.get("fsmn")(device_id=0)
stream_model = registry.vad_models.get("fsmn_online")(device="cuda")

Dependencies

  • fasr
  • funasr-onnx
  • numpy >= 1.24
  • onnxruntime >= 1.16, < 1.24
  • Python 3.10-3.12
