A fast Voice Activity Detection and Transcription System

RealtimeSTT lets you choose the transcription and wake-word dependencies you want to install.

Recommended default local Whisper install:

pip install "realtimestt[faster-whisper]"

Core package only, without a transcription engine or wake-word backend:

pip install realtimestt

Install multiple extras by separating them with commas:

pip install "realtimestt[faster-whisper,porcupine]"
pip install "realtimestt[whisper-cpp,openwakeword]"

Available extras include:

faster-whisper: default CTranslate2 Whisper backend
whisper-cpp: whisper.cpp backend through pywhispercpp
openai-whisper: original OpenAI Whisper Python backend
sherpa-onnx: sherpa-onnx CPU backends
parakeet: NVIDIA NeMo Parakeet backend
transformers: shared Transformers dependency for Moonshine, Granite, and Cohere
moonshine, granite, cohere: aliases for the Transformers dependency set
qwen: Qwen ASR backend
qwen-vllm: Qwen ASR with vLLM extras
porcupine: Porcupine wake-word backend
openwakeword: OpenWakeWord wake-word backend
wakewords: both wake-word backends
recommended, default: aliases for the faster-whisper backend
all: all PyPI-installable optional backends

The WebRTC VAD and Silero VAD dependencies are still part of the core install because AudioToTextRecorder currently initializes both VAD paths.
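
To see which optional backends are present in the current environment, a small check like the one below can help. It is a minimal sketch: the mapping from extra name to importable module is an assumption made for illustration (it is not read from the RealtimeSTT package metadata), and only a few extras are listed.

```python
from importlib.util import find_spec

# Assumed mapping from install extra to the top-level module it provides.
# These import names are illustrative assumptions, not RealtimeSTT metadata.
EXTRA_MODULES = {
    "faster-whisper": "faster_whisper",
    "whisper-cpp": "pywhispercpp",
    "openai-whisper": "whisper",
    "sherpa-onnx": "sherpa_onnx",
    "porcupine": "pvporcupine",
    "openwakeword": "openwakeword",
}


def installed_extras(modules=EXTRA_MODULES):
    """Return {extra_name: bool}, True when the mapped module can be found."""
    return {
        extra: find_spec(module) is not None
        for extra, module in modules.items()
    }


def report():
    """Print one line per extra with its detected status."""
    for extra, present in installed_extras().items():
        print(f"{extra:15s} {'installed' if present else 'missing'}")
```

Calling report() after an install shows which extras resolved. A module that can be found may still fail at import time if its native dependencies are missing.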

RealtimeSTT

RealtimeSTT is a Python speech-to-text library for applications that need voice activity detection, fast transcription, optional realtime text updates, wake words, and direct access to audio streams. It is designed for assistants, dictation tools, browser streaming servers, and prototypes that need to turn speech into text with only a few lines of code.

The recommended default path uses faster_whisper. Other engines are available through install extras when their optional dependencies and models are present.

Install

pip install "RealtimeSTT[faster-whisper]"

On Linux (Debian/Ubuntu shown), install the PortAudio headers before installing the package:

sudo apt-get update
sudo apt-get install python3-dev portaudio19-dev

On macOS:

brew install portaudio

For CUDA, platform notes, and optional engine stacks, see docs/installation.md.

Microphone Example

This waits for speech, stops after the detected utterance, and prints the final transcript:

from RealtimeSTT import AudioToTextRecorder

if __name__ == "__main__":
    with AudioToTextRecorder() as recorder:
        print("Speak now")
        print(recorder.text())

Use the if __name__ == "__main__": guard when running scripts, especially on Windows, because RealtimeSTT uses multiprocessing for model work.

Automatic Recording Loop

For continuous dictation, pass a callback to text() so transcription work can complete asynchronously while your loop keeps listening:

from RealtimeSTT import AudioToTextRecorder


def process_text(text):
    print(text)


if __name__ == "__main__":
    recorder = AudioToTextRecorder()

    while True:
        recorder.text(process_text)
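
The loop above runs until the process is killed. A variant that can stop cleanly might look like the sketch below. The run_dictation helper and the stop_event argument are hypothetical names introduced here; only text(callback) and shutdown() are taken from the examples in this README.

```python
import threading


def run_dictation(recorder, on_text, stop_event):
    """Call recorder.text(on_text) until stop_event is set, then shut down.

    The stop flag is only checked between utterances, so the loop ends
    after the utterance that is in progress when the flag is set.
    """
    try:
        while not stop_event.is_set():
            recorder.text(on_text)
    except KeyboardInterrupt:
        # Ctrl+C ends the loop; cleanup happens in the finally block.
        pass
    finally:
        recorder.shutdown()


def main():
    # Imported lazily so run_dictation can be exercised without audio hardware.
    from RealtimeSTT import AudioToTextRecorder

    stop_event = threading.Event()
    run_dictation(AudioToTextRecorder(), print, stop_event)
```

Call main() from under the if __name__ == "__main__": guard, as in the other examples. Any callback or thread can end the loop by calling stop_event.set().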

External Audio

Set use_microphone=False when audio comes from a file, stream, websocket, or another process. Feed 16-bit mono PCM chunks at 16 kHz, or pass the original sample rate so RealtimeSTT can resample:

from RealtimeSTT import AudioToTextRecorder

if __name__ == "__main__":
    recorder = AudioToTextRecorder(use_microphone=False)

    with open("audio_chunk.pcm", "rb") as audio_file:
        recorder.feed_audio(audio_file.read(), original_sample_rate=16000)

    print(recorder.text())
    recorder.shutdown()
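
When the source is a WAV file instead of raw PCM, the standard-library wave module can supply both the chunks and the sample rate. This is a sketch under stated assumptions: iter_wav_chunks and transcribe_wav are hypothetical helpers, the chunk size of 1024 frames is arbitrary, and feed_audio is called exactly as in the example above.

```python
import wave


def iter_wav_chunks(path, frames_per_chunk=1024):
    """Yield (pcm_bytes, sample_rate) pairs from a 16-bit mono WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        rate = wav.getframerate()
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk, rate


def transcribe_wav(path):
    """Feed a WAV file to a recorder chunk by chunk and return the text."""
    # Imported lazily so the chunking helper works without RealtimeSTT.
    from RealtimeSTT import AudioToTextRecorder

    recorder = AudioToTextRecorder(use_microphone=False)
    try:
        for chunk, rate in iter_wav_chunks(path):
            # Passing the file's own rate lets the recorder resample.
            recorder.feed_audio(chunk, original_sample_rate=rate)
        return recorder.text()
    finally:
        recorder.shutdown()
```

Run transcribe_wav("speech.wav") from under the if __name__ == "__main__": guard. Feeding in chunks mirrors how audio would arrive from a stream or websocket.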

More examples are in docs/quick-start.md and docs/external-audio.md.

Configuration Reference

Every AudioToTextRecorder constructor parameter is documented in docs/configuration.md, including model/engine selection, realtime transcription, VAD timing, wake words, callbacks, external audio, logging, and executor injection.

Features

Voice activity detection with WebRTC VAD and Silero VAD.
Final and realtime transcription with selectable engines.
Optional wake word activation through Porcupine or OpenWakeWord.
Direct microphone input or application-fed audio chunks.
Event callbacks for recording, VAD, realtime text, transcription, and wake word state.
A FastAPI browser streaming server example with multi-user session isolation, shared inference resources, metrics, and health endpoints.
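
The event callbacks in the list above could be wired up roughly as follows. This is a hedged sketch: the parameter names (enable_realtime_transcription, on_recording_start, on_recording_stop, on_realtime_transcription_update) follow earlier RealtimeSTT releases and are assumptions here, so check docs/configuration.md for the current names and signatures. build_recorder_kwargs and listen_once are hypothetical helpers.

```python
def build_recorder_kwargs(events):
    """Return constructor kwargs whose callbacks append to the events list.

    Parameter names are assumed from earlier RealtimeSTT releases.
    """
    return {
        "enable_realtime_transcription": True,
        # *args keeps these tolerant of callbacks that receive arguments.
        "on_recording_start": lambda *args: events.append(("recording_start", None)),
        "on_recording_stop": lambda *args: events.append(("recording_stop", None)),
        "on_realtime_transcription_update": lambda text: events.append(("partial", text)),
    }


def listen_once():
    """Record one utterance and return (final_text, collected_events)."""
    # Imported lazily so the kwargs helper can be tested on its own.
    from RealtimeSTT import AudioToTextRecorder

    events = []
    with AudioToTextRecorder(**build_recorder_kwargs(events)) as recorder:
        final_text = recorder.text()
    return final_text, events
```

Collecting events in a list keeps the callbacks short, which matters if they are invoked from the recorder's worker threads.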

Documentation

Quick start: shortest demos and common recording patterns.
Installation: platform setup, CUDA notes, and optional dependencies.
Configuration: complete AudioToTextRecorder parameter reference.
Transcription engines: engine selection and setup links.
Wake words: Porcupine and OpenWakeWord setup.
External audio: feeding audio without a microphone.
Testing: the maintained unit tests and the opt-in golden test workflow.
Test scripts: demos, manual tests, regressions, and legacy experiments under tests/.
FastAPI server: browser server configuration, protocol, metrics, and deployment notes.
Troubleshooting: common install, audio, CUDA, model, dependency, and runtime errors.

Server Example

The browser server lives in example_fastapi_server:

python -m pip install -r example_fastapi_server/requirements.txt
python example_fastapi_server/server.py --host 0.0.0.0 --port 8010

Open http://localhost:8010. See docs/fastapi-server.md for engine recipes, websocket protocol details, health checks, and metrics.

Contributing

Focused tests and small changes are easiest to review. The project keeps fast unit tests separate from opt-in real-model tests; see docs/testing.md.

License

MIT

Author

Kolja Beigel

Release history

1.0.0 (this version): May 10, 2026
0.3.104: May 3, 2025
0.3.103: Apr 19, 2025
0.3.102 (yanked, reason: buggy): Apr 19, 2025
0.3.101: Apr 11, 2025
0.3.100: Mar 23, 2025
0.3.99: Mar 21, 2025
0.3.98: Mar 10, 2025
0.3.97: Mar 10, 2025
0.3.96: Mar 10, 2025
0.3.95: Feb 15, 2025
0.3.94: Jan 23, 2025
0.3.93: Dec 18, 2024
0.3.92: Dec 13, 2024
0.3.91: Dec 12, 2024
0.3.81: Nov 25, 2024
0.3.9: Dec 11, 2024
0.3.8 (yanked, reason: buggy): Nov 25, 2024
0.3.7: Nov 3, 2024
0.3.6: Nov 2, 2024
0.3.5: Oct 29, 2024
0.3.4: Oct 27, 2024
0.3.3 (yanked): Oct 27, 2024
0.3.2 (yanked): Oct 27, 2024
0.3.1: Oct 21, 2024
0.3.0: Oct 2, 2024
0.2.42: Sep 26, 2024
0.2.41: Aug 18, 2024
0.2.4 (yanked, reason: version is bugged due to type): Aug 17, 2024
0.2.3: Aug 16, 2024
0.2.2: Aug 7, 2024
0.2.1: Jul 19, 2024
0.2.0: Jun 28, 2024
0.1.28 (yanked, reason: New interface): Sep 5, 2023
0.1.16: Jun 2, 2024
0.1.15: Apr 14, 2024
0.1.14: Apr 10, 2024
0.1.13: Apr 8, 2024
0.1.12: Mar 30, 2024
0.1.11: Mar 16, 2024
0.1.9: Jan 29, 2024
0.1.8: Dec 15, 2023
0.1.7: Nov 9, 2023
0.1.6: Oct 17, 2023
0.1.5: Oct 4, 2023
0.1.4: Sep 10, 2023
0.1.3: Sep 6, 2023
0.1.2 (yanked, reason: New interface): Sep 5, 2023
Download files

Download the file for your platform.

Source Distribution

realtimestt-1.0.0.tar.gz (136.2 kB)

Uploaded May 10, 2026 Source

Built Distribution

realtimestt-1.0.0-py3-none-any.whl (142.3 kB)

Uploaded May 10, 2026 Python 3

File details

Details for the file realtimestt-1.0.0.tar.gz.

File metadata

Download URL: realtimestt-1.0.0.tar.gz
Upload date: May 10, 2026
Size: 136.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.5

File hashes

Hashes for realtimestt-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`a7e254169f0f7ca1b1f71b96e88f30269e61bc45ec3fd86fcf5b9ca57618d2f7`
MD5	`47f1a6bdc764cc170fb92878c51417a0`
BLAKE2b-256	`ee1ba721b4111734598d6636b4277352e43dea225e664cf9a6ad5fd21ba98743`
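
A downloaded archive can be checked against the SHA256 digest listed above with the standard-library hashlib module. This is a minimal sketch: sha256_of and verify are hypothetical helper names, and the expected digest is copied from the table for realtimestt-1.0.0.tar.gz.

```python
import hashlib

# SHA256 digest for realtimestt-1.0.0.tar.gz, copied from the table above.
EXPECTED_SHA256 = "a7e254169f0f7ca1b1f71b96e88f30269e61bc45ec3fd86fcf5b9ca57618d2f7"


def sha256_of(path, block_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True when the file's digest matches the expected hex string."""
    return sha256_of(path) == expected.lower()
```

verify("realtimestt-1.0.0.tar.gz") returns False when the file is truncated or altered. pip can enforce the same check at install time through a requirements file with hashes and the --require-hashes option.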

File details

Details for the file realtimestt-1.0.0-py3-none-any.whl.

File metadata

Download URL: realtimestt-1.0.0-py3-none-any.whl
Upload date: May 10, 2026
Size: 142.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.5

File hashes

Hashes for realtimestt-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d69f8a30149e3e714b2ab7ccdd6208354e90cd1bf7d8f3fc76eb6e0da4aecea2`
MD5	`3e54f8eacbd4c9eee58e2f2aab95b32e`
BLAKE2b-256	`1d3835b7502508b28d856403570fdcefdd6dc909203f712fb29d518b706618b4`
