FireRedVAD integration for Pipecat — SOTA streaming Voice Activity Detection supporting 100+ languages
pipecat-ai-fireredvad
A Pipecat integration for FireRedVAD — a SOTA industrial-grade streaming Voice Activity Detection model that supports 100+ languages and outperforms Silero-VAD, TEN-VAD, FunASR-VAD, and WebRTC-VAD on the FLEURS-VAD-102 benchmark.
| Metric | FireRedVAD | Silero-VAD | TEN-VAD | WebRTC-VAD |
|---|---|---|---|---|
| F1 Score ↑ | 97.57 | 95.95 | 95.19 | 52.30 |
| AUC-ROC ↑ | 99.60 | 97.99 | 97.81 | — |
| False Alarm ↓ | 2.69 | 9.41 | 15.47 | 2.83 |
Requirements
- Python 3.10+
- pipecat-ai >= 0.0.90
- fireredvad (installed manually from GitHub; see setup below)
- Audio: 16 kHz, 16-bit mono PCM
Installation
1. Install this package
pip install pipecat-firered-vad
2. Install FireRedVAD
fireredvad is not on PyPI. Clone and install it from GitHub:
git clone https://github.com/FireRedTeam/FireRedVAD.git
cd FireRedVAD
pip install -r requirements.txt
export PYTHONPATH=$PWD:$PYTHONPATH
3. Download model weights
# via Hugging Face
pip install -U "huggingface_hub[cli]"
huggingface-cli download FireRedTeam/FireRedVAD \
--local-dir ./pretrained_models/FireRedVAD
# or via ModelScope (recommended if you're in China)
pip install -U modelscope
modelscope download --model xukaituo/FireRedVAD \
--local_dir ./pretrained_models/FireRedVAD
4. Configure environment
cp .env.example .env
# Edit .env and set FIREREDVAD_MODEL_DIR to your downloaded weights path
Quick Start
import asyncio
import os

from dotenv import load_dotenv
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat_ai_fireredvad import FireVadAnalyzer

# Read FIREREDVAD_MODEL_DIR (and any other settings) from .env
load_dotenv()


async def main():
    vad = FireVadAnalyzer(
        model_dir=os.environ["FIREREDVAD_MODEL_DIR"],
        # Pipecat-level smoothing
        params=VADParams(
            confidence=0.7,
            start_secs=0.2,
            stop_secs=0.5,
        ),
        # FireRedVAD-level settings
        speech_threshold=0.4,
        smooth_window_size=5,
    )
    # Pass the analyzer to your transport, e.g. DailyTransport:
    # transport = DailyTransport(..., vad_analyzer=vad)


if __name__ == "__main__":
    asyncio.run(main())
Configuration Reference
FireVadAnalyzer constructor parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_dir | str | (required) | Path to the Stream-VAD model directory. |
| sample_rate | int | None | Must be 16000 if provided (enforced). |
| params | VADParams | None | Pipecat-level smoothing (confidence, start/stop secs). |
| use_gpu | bool | False | Run inference on GPU. |
| smooth_window_size | int | 5 | Frame-level confidence smoothing window inside FireRedVAD. |
| speech_threshold | float | 0.4 | Raw model threshold for speech vs silence. |
| pad_start_frame | int | 5 | Frames prepended at speech onset to avoid clipping. |
| min_speech_frame | int | 8 | Minimum consecutive frames before a segment is confirmed. |
| max_speech_frame | int | 2000 | Maximum frames in a single speech segment. |
| min_silence_frame | int | 20 | Silence frames required before a segment ends. |
| max_buffer_frames | int | 50 | Ring buffer capacity (oldest frames evicted on overflow). |
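How smooth_window_size and speech_threshold interact can be illustrated with a minimal sketch. It assumes the smoothing is a trailing moving average over raw per-frame probabilities, which is an assumption made for illustration and may differ from what FireRedVAD does internally. The function name smooth_and_threshold and the sample values are hypothetical. The last lines show the eviction behavior described for max_buffer_frames using a bounded deque, again as an illustration rather than the package's actual buffer.

```python
from collections import deque


def smooth_and_threshold(probs, window=5, threshold=0.4):
    """Return per-frame speech decisions from raw per-frame probabilities.

    ASSUMPTION: smoothing is modeled as a trailing moving average; this is
    an illustration, not FireRedVAD's confirmed internal algorithm.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` values
    decisions = []
    for p in probs:
        recent.append(p)
        smoothed = sum(recent) / len(recent)  # fewer values at the start
        decisions.append(smoothed >= threshold)
    return decisions


# Hypothetical probabilities: one isolated spike, then sustained speech.
raw = [0.0, 0.0, 1.0, 0.0, 0.0, 0.9, 0.9, 0.9, 0.9, 0.9]
decisions = smooth_and_threshold(raw)

# A bounded deque drops its oldest entries once it is full, which mirrors
# the "oldest frames evicted on overflow" behavior in the table above.
buffer = deque(maxlen=3)
for frame_id in range(5):
    buffer.append(frame_id)
kept = list(buffer)  # frames 0 and 1 were evicted
```

With these inputs the isolated spike at index 2 never lifts the smoothed value to 0.4, while the sustained run that starts at index 5 is accepted from index 6 onward. A larger window suppresses more short spikes at the cost of a slower reaction to real speech onsets.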
Audio requirements
FireRedVAD only accepts 16 kHz, 16-bit mono PCM. Convert other formats with:
ffmpeg -i input.wav -ar 16000 -ac 1 -acodec pcm_s16le output.wav
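Before feeding a file to the analyzer it can help to confirm that it already matches the required format. The following is a small sketch using only Python's standard wave module; the helper name check_wav_format is hypothetical and not part of this package.

```python
import wave


def check_wav_format(path):
    """Return a list of format problems; an empty list means the file is OK.

    Checks the three properties listed above: 16 kHz, 16-bit, mono.
    Note: the wave module only reads PCM WAV files and raises wave.Error
    for other encodings.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()  # bytes per sample; 2 bytes == 16 bits
        channels = wav.getnchannels()
    if rate != 16000:
        problems.append(f"sample rate is {rate} Hz, expected 16000 Hz")
    if width != 2:
        problems.append(f"sample width is {width * 8} bits, expected 16 bits")
    if channels != 1:
        problems.append(f"{channels} channels found, expected mono")
    return problems
```

Calling check_wav_format("output.wav") on a file produced by the ffmpeg command above should return an empty list; any returned messages indicate which property still needs converting.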
Environment Variables
| Variable | Description |
|---|---|
| FIREREDVAD_MODEL_DIR | Path to the downloaded Stream-VAD model directory. |
| FIREREDVAD_USE_GPU | Set to 1 to enable GPU inference (default: 0). |
See .env.example for a ready-to-copy template.
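One way to turn these variables into constructor arguments is sketched below. Whether the package reads the variables on its own is not stated here, so this assumes the application does the reading. The helper name load_vad_settings is hypothetical; the returned keys match the model_dir and use_gpu constructor parameters listed above.

```python
import os


def load_vad_settings(env=None):
    """Read FireRedVAD settings from an environment-style mapping.

    ASSUMPTION: the application, not the package, reads these variables.
    Returns a dict keyed by the documented constructor parameter names.
    """
    env = os.environ if env is None else env

    model_dir = env.get("FIREREDVAD_MODEL_DIR")
    if not model_dir:
        raise RuntimeError("FIREREDVAD_MODEL_DIR is not set")
    if not os.path.isdir(model_dir):
        raise FileNotFoundError(f"model directory not found: {model_dir}")

    # Only the literal value "1" enables the GPU; the default is "0".
    use_gpu = env.get("FIREREDVAD_USE_GPU", "0") == "1"
    return {"model_dir": model_dir, "use_gpu": use_gpu}
```

After load_dotenv() has run, the result can be unpacked into the analyzer, for example FireVadAnalyzer(**load_vad_settings()). Failing early on a missing directory gives a clearer error than a model-loading failure later on.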
Related packages
- pipecat-firered-vad: this package
- pipecat-ai-ten-vad: TEN-VAD integration (same author)
- FireRedVAD on GitHub
- FireRedVAD on Hugging Face
Contributing
Pull requests are welcome. For major changes, please open an issue first.
- Fork the repository
- Create a feature branch: git checkout -b feature/my-change
- Run linting: ruff check . && ruff format .
- Run tests: pytest
- Open a PR
License
Apache License 2.0 — see LICENSE for details.
FireRedVAD model weights are released under their own license. See the FireRedVAD repository for details.