FireRedVAD for fasr (bundled fireredvad inference)
fasr-vad-firered
FireRedVAD voice activity detection for fasr. This is an offline neural VAD
that loads FireRed's PyTorch checkpoint and returns AudioSpan speech
segments.
Install
pip install fasr-vad-firered
Registered Model
| Registry name | Class | Best for |
|---|---|---|
| firered | FireRedForVAD | Offline VAD with FireRed checkpoints |
The default checkpoint is FireRedTeam/FireRedVAD. A local checkpoint directory
must contain the upstream VAD files, typically cmvn.ark and model.pth.tar,
either at its top level or in a VAD/ subdirectory.
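Before pointing the model at a local directory, it can help to confirm that the expected files are present. The following is a minimal sketch of such a check, assuming the two typical file names listed above; find_vad_dir and REQUIRED_FILES are hypothetical names for illustration and are not part of fasr.

```python
from pathlib import Path

# Typical upstream file names, per the note above (an assumption for other releases).
REQUIRED_FILES = ("cmvn.ark", "model.pth.tar")


def find_vad_dir(checkpoint_dir):
    """Return the directory holding the VAD files, or None if they are missing.

    Looks at the top level first, then in a VAD/ subdirectory.
    """
    root = Path(checkpoint_dir)
    for candidate in (root, root / "VAD"):
        if all((candidate / name).is_file() for name in REQUIRED_FILES):
            return candidate
    return None
```

A None result suggests the directory is incomplete, so the check can run before any model code is involved.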
Pipeline Usage
from fasr import AudioPipeline

pipeline = (
    AudioPipeline()
    .add_pipe(
        "detector",
        model="firered",
        speech_threshold=0.4,
        use_gpu=False,
    )
    .add_pipe("recognizer", model="firered_aed")
    .add_pipe("sentencizer", model="ct_transformer")
)
Quick choices:
| Goal | Use | Result |
|---|---|---|
| Reduce noise false positives | speech_threshold=0.55 | Requires stronger speech posterior |
| Keep weak speech | speech_threshold=0.3 | More sensitive, but may include noise |
| Use GPU inference | use_gpu=True | Faster when CUDA is available |
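To make the effect of speech_threshold concrete, here is a generic thresholding sketch. It is not FireRedVAD's actual post-processing: the 10 ms frame shift, the >= comparison, and the frames_to_segments helper are all assumptions chosen for illustration, and the posterior values are made up.

```python
def frames_to_segments(posteriors, speech_threshold=0.4, frame_ms=10):
    """Group consecutive frames at or above the threshold into (start_ms, end_ms) spans."""
    segments = []
    start = None  # index of the first frame of the currently open segment
    for index, prob in enumerate(posteriors):
        is_speech = prob >= speech_threshold
        if is_speech and start is None:
            start = index
        elif not is_speech and start is not None:
            segments.append((start * frame_ms, index * frame_ms))
            start = None
    if start is not None:
        # Close a segment that runs to the end of the audio.
        segments.append((start * frame_ms, len(posteriors) * frame_ms))
    return segments


# Hypothetical per-frame speech posteriors.
posteriors = [0.1, 0.35, 0.5, 0.9, 0.45, 0.2, 0.58, 0.6, 0.1]
for threshold in (0.3, 0.4, 0.55):
    print(threshold, frames_to_segments(posteriors, threshold))
```

With these values, raising the threshold shrinks the first segment from 10-50 ms to 30-40 ms, which mirrors the trade-off in the table: weak frames at the segment edges are the first to be dropped.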
Confection Config
[vad_model]
@vad_models = "firered"
use_gpu = false
speech_threshold = 0.4
Inside a pipeline:
[pipeline]
@pipelines = "AudioPipeline.v1"
pipe_order = ["detector"]
[pipeline.pipes]
[pipeline.pipes.detector]
@pipes = "thread_pipe"
[pipeline.pipes.detector.component]
@components = "detector"
[pipeline.pipes.detector.component.model]
@vad_models = "firered"
use_gpu = false
speech_threshold = 0.4
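A config fragment like the ones above can be sanity-checked with the standard library before it reaches the pipeline. This sketch is not how fasr or Confection loads configs: it does not resolve registry references such as @vad_models, and it assumes every value is JSON-compatible, which holds for the fragment shown. read_vad_settings is a hypothetical helper.

```python
import configparser
import json

CONFIG_TEXT = """
[vad_model]
@vad_models = "firered"
use_gpu = false
speech_threshold = 0.4
"""


def read_vad_settings(text, section="vad_model"):
    """Parse one section and decode each value as JSON (assumed value format)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    settings = {key: json.loads(value) for key, value in parser[section].items()}
    threshold = settings.get("speech_threshold")
    # The parameter table documents a 0.0 to 1.0 range for speech_threshold.
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError(f"speech_threshold out of range: {threshold}")
    return settings


settings = read_vad_settings(CONFIG_TEXT)
print(settings)
```

Catching a mistyped threshold at this stage is cheaper than discovering it after the checkpoint has been downloaded and loaded.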
Direct Model Usage
from fasr.config import registry
from fasr.data import AudioSpan, Waveform

# Look up the registered VAD class by name and instantiate it.
model = registry.vad_models.get("firered")(
    speech_threshold=0.4,
    use_gpu=True,
)

# Wrap the waveform in an AudioSpan starting at 0 ms, then run detection.
audio = AudioSpan(waveform=Waveform.from_file("example.wav"), start_ms=0)
segments = model.detect(audio)
for segment in segments:
    print(f"{segment.start_ms}ms - {segment.end_ms}ms")
Use local weights:
model.load_checkpoint("/path/to/FireRedVAD")
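The segments returned by model.detect expose start_ms and end_ms, as the loop above shows, so simple statistics can be derived from them. The sketch below uses a stand-in Span dataclass instead of the real AudioSpan, so it runs without fasr installed; Span and speech_summary are hypothetical names, and the timings are invented.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Stand-in for a detected segment; only start_ms and end_ms are assumed."""
    start_ms: int
    end_ms: int


def speech_summary(segments, audio_duration_ms):
    """Return total speech time in ms and its share of the whole clip."""
    speech_ms = sum(s.end_ms - s.start_ms for s in segments)
    ratio = speech_ms / audio_duration_ms if audio_duration_ms > 0 else 0.0
    return speech_ms, ratio


# Invented segments for a 4-second clip.
segments = [Span(200, 1500), Span(2100, 3600)]
speech_ms, ratio = speech_summary(segments, audio_duration_ms=4000)
print(f"{speech_ms} ms of speech ({ratio:.0%} of the clip)")
```

Tracking the speech ratio across different speech_threshold values is one rough way to see how aggressively a setting trims the audio.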
Parameters
| Parameter | Type / range | Default | Higher value | Lower value | Change when |
|---|---|---|---|---|---|
| use_gpu | bool | False | Enables CUDA inference | Uses CPU | You have CUDA available and need speed |
| speech_threshold | float, 0.0 to 1.0 | 0.4 | More conservative; fewer false positives | More sensitive; more weak speech retained | Noise leaks in, or speech is missed |
Generic checkpoint fields such as checkpoint, cache_dir, endpoint,
revision, and force_download are inherited from the base model.
Tuning Guide
| Symptom | Try first |
|---|---|
| Noise is detected as speech | Raise speech_threshold to 0.5 or 0.6 |
| Quiet speech is missed | Lower speech_threshold to 0.3 |
| CPU inference is too slow | Set use_gpu=True on a CUDA machine |
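When the same script runs on machines with and without CUDA, the use_gpu flag can be chosen at runtime. This is a sketch under assumptions: the source does not say how the model behaves when use_gpu=True is set without CUDA, so the code simply avoids that case. cuda_is_available and detector_kwargs are hypothetical helpers; torch.cuda.is_available is the standard PyTorch check.

```python
def cuda_is_available():
    """Report whether PyTorch can see a CUDA device; False if torch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def detector_kwargs(prefer_gpu=True, speech_threshold=0.4):
    """Build keyword arguments for the detector, enabling the GPU only when usable."""
    return {
        "use_gpu": prefer_gpu and cuda_is_available(),
        "speech_threshold": speech_threshold,
    }


print(detector_kwargs())
```

The resulting dictionary can be unpacked into the detector's add_pipe call or the model constructor shown earlier, for example .add_pipe("detector", model="firered", **detector_kwargs()).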
Dependencies
- fasr
- torch >= 2.0.0
- soundfile >= 0.12.0
- kaldiio >= 2.18.0
- kaldi-native-fbank >= 1.19.0
- Python 3.10-3.12
File details
Details for the file fasr_vad_firered-0.5.2.tar.gz.
File metadata
- Download URL: fasr_vad_firered-0.5.2.tar.gz
- Upload date:
- Size: 14.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.11 {"installer":{"name":"uv","version":"0.10.11","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"22.04","id":"jammy","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3352af0ccc995583f2153a25f9089484f0c2936138db40b568c7214936307d10 |
| MD5 | 0a16ffe666fdb7a357cd6664146dbe55 |
| BLAKE2b-256 | fb4410f5de10fd588cdb2c2790da02dc6db55c68cfa1c7af24a6f53219a291c7 |
File details
Details for the file fasr_vad_firered-0.5.2-py3-none-any.whl.
File metadata
- Download URL: fasr_vad_firered-0.5.2-py3-none-any.whl
- Upload date:
- Size: 19.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.11 {"installer":{"name":"uv","version":"0.10.11","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"22.04","id":"jammy","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5d301ce14ebed21486d96b675782e17882fc23d5881c82d2b658a5f2da1fd173 |
| MD5 | 9751bdb1e7dd7bdd00e12050608c48e8 |
| BLAKE2b-256 | c0a6d695aaa877e7e7486f0f25170ae292761be25e05fafa96093e573fde259c |