Async Python client for OpenAI Realtime transcription with microphone input, streamed transcript deltas, speech boundary events, and WebSocket lifecycle management.

realtime-whisper

Small async Python client for OpenAI Realtime transcription with microphone input, streamed transcript deltas, completed transcript events, speech boundary events, and manual buffer flush support.

Requirements

Python 3.14 or newer
An OpenAI API key, or Azure OpenAI Realtime credentials
A working audio input device when using the default microphone input

Installation

With uv (recommended):

uv add realtime-whisper

# Include microphone support
uv add "realtime-whisper[audio]"

With pip:

pip install realtime-whisper

# Include microphone support
pip install "realtime-whisper[audio]"

From source (this repository):

uv sync --extra audio
# or
pip install -e ".[audio]"

The audio extra installs sounddevice, which is required by the default MicrophoneInput. If you provide your own audio input implementation, the base dependencies are enough.

Set your OpenAI API key before running the examples:

export OPENAI_API_KEY="your-api-key"

On PowerShell:

$env:OPENAI_API_KEY = "your-api-key"

Quick Start

import asyncio

from realtime_whisper import RealtimeTranscriber, TranscriptCompleted, TranscriptDelta


async def main() -> None:
    transcriber = RealtimeTranscriber(language="en")

    async for event in transcriber.stream():
        match event:
            case TranscriptDelta(delta=delta):
                print(delta, end="", flush=True)
            case TranscriptCompleted(transcript=transcript):
                print(f"\n>>> {transcript}\n")


asyncio.run(main())

Run the included examples:

# Continuous transcription
uv run python -m examples.transcribe_console

# Push-to-talk (press Enter to flush the buffer)
uv run python -m examples.transcribe_push_to_talk

API Overview

RealtimeTranscriber

Basic streaming — reads from your default microphone and prints every delta and completed transcript segment:

import asyncio

from realtime_whisper import (
    NoiseReduction,
    RealtimeTranscriber,
    TranscriptionDelay,
    TranscriptCompleted,
    TranscriptDelta,
)


async def main() -> None:
    transcriber = RealtimeTranscriber(
        language="en",                             # BCP-47 tag, or None for auto-detect
        delay=TranscriptionDelay.MEDIUM,           # latency vs. completeness trade-off
        noise_reduction=NoiseReduction.FAR_FIELD,  # FAR_FIELD or NEAR_FIELD
        include_logprobs=False,                    # True → per-token log-probabilities
    )

    async for event in transcriber.stream():
        match event:
            case TranscriptDelta(delta=delta):
                print(delta, end="", flush=True)
            case TranscriptCompleted(transcript=transcript):
                print(f"\n>>> {transcript}\n")


asyncio.run(main())

See examples/transcribe_console.py for the full runnable version of this pattern.

Push-to-talk — call flush() to commit the audio buffer and trigger transcription on demand (e.g. when the user releases a key):

import asyncio

from realtime_whisper import RealtimeTranscriber, TranscriptCompleted, TranscriptDelta


async def read_enter_loop(transcriber: RealtimeTranscriber) -> None:
    loop = asyncio.get_running_loop()
    while True:
        await loop.run_in_executor(None, input)  # blocks until Enter is pressed
        await transcriber.flush()


async def main() -> None:
    transcriber = RealtimeTranscriber(language="en")
    # Keep a reference: the event loop holds only a weak reference to tasks.
    flush_task = asyncio.create_task(read_enter_loop(transcriber))

    async for event in transcriber.stream():
        match event:
            case TranscriptDelta(delta=delta):
                print(delta, end="", flush=True)
            case TranscriptCompleted(transcript=transcript):
                print(f"\n>>> {transcript}\n")


asyncio.run(main())

See examples/transcribe_push_to_talk.py for the full runnable version of this pattern.

As an async context manager — stop() is called automatically on exit:

async with RealtimeTranscriber(language="en") as transcriber:
    async for event in transcriber.stream():
        ...
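
To make the exit behaviour concrete, here is a minimal sketch of the async context manager protocol. `FakeTranscriber` is a hypothetical stand-in written for this illustration, not part of realtime_whisper, and the sketch assumes RealtimeTranscriber wires `stop()` into `__aexit__` in a similar way; the package's actual implementation may differ.

```python
import asyncio


class FakeTranscriber:
    """Hypothetical stand-in; not the real RealtimeTranscriber."""

    def __init__(self) -> None:
        self.stopped = False

    async def stop(self) -> None:
        self.stopped = True

    async def __aenter__(self) -> "FakeTranscriber":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and when the body raises.
        # Returning None lets any exception propagate to the caller.
        await self.stop()


async def demo() -> bool:
    transcriber = FakeTranscriber()
    try:
        async with transcriber:
            raise RuntimeError("simulated failure inside the block")
    except RuntimeError:
        pass
    return transcriber.stopped


stopped_after_error = asyncio.run(demo())
```

The point of the pattern is that cleanup happens even when the body of the `async with` block fails, so the caller does not need a separate `try`/`finally`.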

### Events

The public event types are exported from `realtime_whisper`:

- `SessionConnected`
- `TranscriptDelta`
- `TranscriptCompleted`
- `SpeechStarted`
- `SpeechStopped`
- `TranscriberError`

### Options

Use `TranscriptionDelay` to control latency versus completeness:

- `TranscriptionDelay.MINIMAL`
- `TranscriptionDelay.LOW`
- `TranscriptionDelay.MEDIUM`
- `TranscriptionDelay.HIGH`
- `TranscriptionDelay.XHIGH`

Use `NoiseReduction` for input noise reduction:

- `NoiseReduction.NEAR_FIELD`
- `NoiseReduction.FAR_FIELD`

Example:

```python
from realtime_whisper import NoiseReduction, RealtimeTranscriber, TranscriptionDelay

transcriber = RealtimeTranscriber(
    language="de",
    delay=TranscriptionDelay.LOW,
    noise_reduction=NoiseReduction.NEAR_FIELD,
)
```

Providers

By default, RealtimeTranscriber uses OpenAIProvider and reads OPENAI_API_KEY from the environment. You can also pass api_key directly:

transcriber = RealtimeTranscriber(api_key="your-api-key")

For Azure OpenAI, pass an AzureOpenAIProvider:

from realtime_whisper import AzureOpenAIProvider, RealtimeTranscriber

provider = AzureOpenAIProvider(
    resource="my-resource",
    deployment="my-realtime-deployment",
    api_key="my-api-key",
)

transcriber = RealtimeTranscriber(provider=provider)

The Azure provider can also read these environment variables:

AZURE_OPENAI_RESOURCE
AZURE_OPENAI_DEPLOYMENT
AZURE_OPENAI_API_KEY

Custom Audio Input

Pass an object implementing AudioInputDevice to use a custom audio source. Audio chunks must be raw 24 kHz mono PCM bytes unless you also change the session settings in the package internals.

from collections.abc import AsyncIterator

from realtime_whisper.audio import AudioInputDevice


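class MyAudioInput(AudioInputDevice):
    async def start(self) -> None:
        # Open or start the underlying audio source.
        ...

    async def stop(self) -> None:
        # Release the audio source.
        ...

    async def stream_chunks(self) -> AsyncIterator[bytes]:
        # Provide raw 24 kHz mono PCM chunks.
        ...

    @property
    def is_active(self) -> bool:
        # Report whether the input is currently running.
        ...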
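
A custom source usually has to slice its audio into fixed-size chunks before yielding them. The sketch below shows the arithmetic for 24 kHz mono PCM using an in-memory buffer. It assumes 16-bit samples (the sample width is not stated above), and `chunk_size_bytes` and `iter_pcm_chunks` are hypothetical helpers, not part of realtime_whisper.

```python
import asyncio
from collections.abc import AsyncIterator

SAMPLE_RATE_HZ = 24_000  # from the requirement above
BYTES_PER_SAMPLE = 2     # assumption: 16-bit samples, one channel


def chunk_size_bytes(chunk_ms: int) -> int:
    """Number of bytes that hold chunk_ms milliseconds of audio."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000


async def iter_pcm_chunks(pcm: bytes, chunk_ms: int = 100) -> AsyncIterator[bytes]:
    """Yield consecutive slices of pcm, each at most chunk_ms long."""
    size = chunk_size_bytes(chunk_ms)
    for offset in range(0, len(pcm), size):
        yield pcm[offset : offset + size]
        await asyncio.sleep(0)  # let other tasks run between chunks


async def collect(pcm: bytes) -> list[bytes]:
    return [chunk async for chunk in iter_pcm_chunks(pcm)]


# One second of silence, split into 100 ms chunks.
one_second_of_silence = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
chunks = asyncio.run(collect(one_second_of_silence))
```

An async generator like `iter_pcm_chunks` could serve as the body of a `stream_chunks` implementation, with the in-memory buffer replaced by reads from a file or socket.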

Development

uv sync --extra audio --group dev
uv run ruff check .
uv run pytest

Project details

This version

0.1.0

May 23, 2026
