voicelistener
Real-time voice recognition using Silero VAD and Whisper.
Structure
voicelistener/
├── __init__.py
├── __main__.py                # CLI entry point
├── voicelistener.py           # VoiceListener class (audio + VAD + threading)
├── requirements.txt
└── transcribers/
    ├── __init__.py
    └── whispertranscriber.py  # WhisperTranscriber class
Setup
pip install -r voicelistener/requirements.txt
CLI usage
python -m voicelistener
Listens to your microphone, detects speech, and prints transcriptions to stdout. Press Ctrl+C to stop.
Library usage
from voicelistener import VoiceListener, WhisperTranscriber
transcriber = WhisperTranscriber(model="base.en")
listener = VoiceListener(transcriber=transcriber)
for text in listener:
    print(text)
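Because the listener is consumed as a plain iterable of strings, it can be wrapped in an ordinary generator. The sketch below ends the loop when a spoken phrase is heard. `until_stop_phrase` and its default phrase are hypothetical names, not part of the package, and whether leaving the loop early also releases the microphone is an assumption this README does not confirm.

```python
def until_stop_phrase(texts, stop_phrase="stop listening"):
    """Yield transcriptions until one contains the stop phrase.

    `texts` can be any iterable of strings, for example a VoiceListener.
    This helper is illustrative and not part of voicelistener.
    """
    needle = stop_phrase.lower()
    for text in texts:
        if needle in text.lower():
            break  # stop consuming; the matching utterance is not yielded
        yield text


# Intended wiring (requires the package and a microphone):
#   for text in until_stop_phrase(listener):
#       print(text)
```

The match is a case-insensitive substring test, so "Please STOP listening now" ends the loop as well.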
Callback style
def handle(text):
    print(f"Heard: {text}")

listener = VoiceListener(
    transcriber=WhisperTranscriber(),
    on_transcription=handle,
)
listener.start()
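If the callback runs on a background thread (an assumption, suggested by the threading mentioned in the structure listing but not confirmed here), it is usually safer to keep the callback short and hand the text to the main thread. A minimal sketch using the standard-library `queue` module follows; `transcripts` and `drain` are hypothetical names.

```python
import queue

# Thread-safe hand-off point between the callback and the main thread.
transcripts = queue.Queue()


def handle(text):
    """Callback: enqueue non-empty text and return immediately."""
    text = text.strip()
    if text:
        transcripts.put(text)


def drain(q):
    """Return everything currently queued without blocking."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


# Intended wiring (requires the package and a microphone):
#   listener = VoiceListener(transcriber=WhisperTranscriber(),
#                            on_transcription=handle)
#   listener.start()
#   ... later, on the main thread: for text in drain(transcripts): ...
```

Keeping slow work (network calls, UI updates) out of the callback avoids holding up whichever thread delivers the transcription.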
VoiceListener options
| Parameter | Default | Description |
|---|---|---|
| `transcriber` | (required) | Object with a `transcribe(audio) -> str` method |
| `silence_timeout_ms` | `2000` | Silence duration (ms) that finalizes an utterance |
| `min_utterance_ms` | `250` | Minimum speech length (ms) required for transcription |
| `pre_buffer_ms` | `150` | Audio (ms) kept from before the VAD triggers |
| `vad_threshold` | `0.5` | Silero VAD confidence threshold |
| `on_transcription` | `None` | Callback invoked with each transcription |
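To make the interaction between these options concrete, here is a rough sketch of one way an utterance segmenter could apply them to a stream of per-frame speech probabilities. It is not the package's actual implementation: `segment_utterances` and `frame_ms` are hypothetical, and the exact semantics (for example, counting only speech frames toward `min_utterance_ms`) are assumptions.

```python
def segment_utterances(probs, frame_ms=32, vad_threshold=0.5,
                       silence_timeout_ms=2000, min_utterance_ms=250,
                       pre_buffer_ms=150):
    """Return (start_ms, end_ms) spans of finalized utterances.

    `probs` holds one VAD speech probability per audio frame.
    Illustrative only; the real VoiceListener may behave differently.
    """
    utterances = []
    start = None          # start of the open utterance, None while idle
    last_speech_end = 0   # end time of the most recent speech frame
    speech_ms = 0         # speech accumulated in the open utterance
    for i, p in enumerate(probs):
        t = i * frame_ms
        if p >= vad_threshold:
            if start is None:
                # Reach back so the onset of the first word is not clipped.
                start = max(0, t - pre_buffer_ms)
                speech_ms = 0
            speech_ms += frame_ms
            last_speech_end = t + frame_ms
        elif start is not None and t + frame_ms - last_speech_end >= silence_timeout_ms:
            # Enough trailing silence: finalize, or drop if too short.
            if speech_ms >= min_utterance_ms:
                utterances.append((start, last_speech_end))
            start = None
    return utterances  # an utterance still open at the end is not emitted
```

In this sketch a lower `silence_timeout_ms` makes transcriptions arrive sooner but splits speech at shorter pauses, while a higher `min_utterance_ms` discards brief noises such as coughs or clicks.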
Custom transcriber
Implement a class with a transcribe method:
class MyTranscriber:
    def transcribe(self, audio):
        # audio is a float32 numpy array at 16kHz
        return "transcribed text"
listener = VoiceListener(transcriber=MyTranscriber())
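Since the only requirement is a `transcribe(audio) -> str` method, transcribers can also be composed. The sketch below wraps another transcriber to tidy whitespace and record how long each call took. `TimedTranscriber` and `EchoTranscriber` are hypothetical classes written for illustration; neither ships with the package.

```python
import time


class TimedTranscriber:
    """Wrap any transcriber: collapse whitespace and time each call."""

    def __init__(self, inner):
        self.inner = inner
        self.last_seconds = None  # duration of the most recent call

    def transcribe(self, audio):
        started = time.perf_counter()
        text = self.inner.transcribe(audio)
        self.last_seconds = time.perf_counter() - started
        # Collapse runs of whitespace and trim both ends.
        return " ".join(text.split())


class EchoTranscriber:
    """Stand-in backend for exercising the pipeline without a speech model."""

    def transcribe(self, audio):
        return f"  {len(audio)} samples   received "


# Intended wiring (requires the package):
#   listener = VoiceListener(transcriber=TimedTranscriber(WhisperTranscriber()))
```

A stand-in such as `EchoTranscriber` can be handy for checking microphone capture and segmentation before downloading a Whisper model.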