az-realtime-webrtc

Python SDK for Azure OpenAI Realtime API with WebSocket support, streaming, and tool calling

azure-realtime-webrtc
Python SDK for Azure OpenAI Realtime API — async streaming, tools, Flask & FastAPI

azure-realtime-webrtc is the Python companion to the npm package of the same name. It provides an async WebSocket client, streaming iterators, function calling, and server middleware for Flask and FastAPI, so you can build real-time AI voice and text applications in Python.

import asyncio
import os

from azure_realtime_webrtc import RealtimeClient
from azure_realtime_webrtc.types import ApiKeyAuth

client = RealtimeClient(
    resource="my-resource",
    deployment="gpt-4o-realtime-preview",
    auth=ApiKeyAuth(api_key=os.environ["AZURE_OPENAI_API_KEY"]),
)

async def main():
    # `async with` is only valid inside a coroutine
    async with client.connect() as session:
        session.send_text("Hello!")
        async for chunk in session.transcript_stream():
            print(chunk.text, end="", flush=True)

asyncio.run(main())

What's Inside

Module	Purpose
`azure_realtime_webrtc`	Async WebSocket client, token manager, typed events
`azure_realtime_webrtc.sdk`	High-level classes: TextChat, ToolAgent
`azure_realtime_webrtc.server`	Flask blueprint & FastAPI router for token server
`azure_realtime_webrtc.types`	All dataclass types with full type hints

Features

Feature	Details
Async WebSocket Client	Full duplex communication with Azure OpenAI Realtime API
Streaming Iterators	`async for chunk in session.transcript_stream()`
Function Calling	`register_tool()` with automatic call → execute → respond cycle
SDK: TextChat	Streaming text chat with message history
SDK: ToolAgent	Autonomous multi-step tool calling with execution trace
Flask Middleware	`create_flask_blueprint()` — drop-in token server
FastAPI Middleware	`create_fastapi_router()` — async token server with Swagger UI
Entra ID Auth	Microsoft Entra ID support via `azure-identity`
Fully Typed	`py.typed` marker, dataclasses, full type hints

Install

pip install azure-realtime-webrtc

With Flask:

pip install azure-realtime-webrtc[flask]

With FastAPI:

pip install azure-realtime-webrtc[fastapi]

With Entra ID:

pip install azure-realtime-webrtc[azure]

Everything:

pip install azure-realtime-webrtc[all]

Prerequisites

You need three values from Azure:

Value	Where to find it	Example
Resource name	Your Azure OpenAI resource URL: `https://<THIS>.openai.azure.com`	`my-openai-resource`
API Key	Azure Portal → Your OpenAI resource → Keys and Endpoint	`abc123...`
Deployment name	Azure AI Foundry → Deployments (must be a realtime model)	`gpt-4o-realtime-preview`

Deployment must be in East US 2 or Sweden Central.
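
As a minimal sketch of how these three values might be loaded and checked before a client is created: the environment variable names match the Quick Start examples, while `AzureSettings` and `load_settings` are hypothetical helpers and not part of the SDK.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AzureSettings:  # hypothetical helper, not part of the SDK
    resource: str
    deployment: str
    api_key: str

    @property
    def endpoint(self) -> str:
        # Resource URL pattern taken from the prerequisites table
        return f"https://{self.resource}.openai.azure.com"


def load_settings(env=None) -> AzureSettings:
    """Read the three prerequisite values, failing early if any is missing."""
    env = os.environ if env is None else env
    names = ("AZURE_RESOURCE", "AZURE_DEPLOYMENT", "AZURE_OPENAI_API_KEY")
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return AzureSettings(env[names[0]], env[names[1]], env[names[2]])
```

Failing at startup with a list of missing names tends to be easier to debug than a `KeyError` raised partway through client construction.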

Quick Start

1. WebSocket Client (Async Streaming)

import asyncio
import os
from azure_realtime_webrtc import RealtimeClient
from azure_realtime_webrtc.types import ApiKeyAuth, SessionConfig, AudioConfig, AudioOutputConfig

client = RealtimeClient(
    resource=os.environ["AZURE_RESOURCE"],
    deployment=os.environ["AZURE_DEPLOYMENT"],
    auth=ApiKeyAuth(api_key=os.environ["AZURE_OPENAI_API_KEY"]),
    session=SessionConfig(
        instructions="You are a helpful assistant. Be concise.",
        audio=AudioConfig(output=AudioOutputConfig(voice="alloy")),
    ),
)

async def main():
    async with client.connect() as session:
        session.send_text("What are three facts about WebRTC?")

        async for chunk in session.transcript_stream():
            if chunk.type == "delta":
                print(chunk.text, end="", flush=True)
            elif chunk.type == "done":
                print(f"\n\n[{chunk.role}] Complete.")
                break

asyncio.run(main())
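
If you want the full reply as one string rather than printing deltas, the same loop can accumulate them. The sketch below runs without an Azure connection: `Chunk` and `fake_stream` are stand-ins for the SDK's `TranscriptChunk` and `session.transcript_stream()`, and their exact fields are assumptions based on the example above.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Chunk:  # stand-in for the SDK's TranscriptChunk; fields are assumed
    type: str   # "delta" or "done"
    role: str
    text: str


async def fake_stream() -> AsyncIterator[Chunk]:
    # Simulates what session.transcript_stream() might yield
    for piece in ("Web", "RTC ", "is peer-to-peer."):
        yield Chunk("delta", "assistant", piece)
    yield Chunk("done", "assistant", "")


async def collect(stream: AsyncIterator[Chunk]) -> str:
    """Join delta chunks into one string, stopping at the first 'done'."""
    parts = []
    async for chunk in stream:
        if chunk.type == "done":
            break
        if chunk.type == "delta":
            parts.append(chunk.text)
    return "".join(parts)


reply = asyncio.run(collect(fake_stream()))
print(reply)
```

With a real session you would pass `session.transcript_stream()` to `collect` in place of `fake_stream()`.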

2. Function Calling (Tools)

import asyncio
import json
from azure_realtime_webrtc.types import ToolDefinition, ToolRegistration

# Reuses the `client` created in the previous example

def get_weather(args: dict) -> str:
    # Demo handler: returns fixed data as a JSON string
    city = args.get("city", "unknown")
    return json.dumps({"city": city, "temp": 72, "condition": "Sunny"})

client.register_tool(ToolRegistration(
    definition=ToolDefinition(
        name="get_weather",
        description="Get the current weather for a city",
        parameters={
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    ),
    handler=get_weather,
))

async def main():
    async with client.connect() as session:
        session.send_text("What's the weather in Tokyo?")
        # Tool calls are handled automatically!
        async for chunk in session.transcript_stream():
            if chunk.role == "assistant":
                print(chunk.text, end="", flush=True)
                if chunk.type == "done":
                    break

asyncio.run(main())
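
To make the automatic call, execute, respond cycle more concrete, here is a rough sketch of what such a dispatcher could look like. It is not the SDK's actual implementation. The event shapes (`response.function_call_arguments.done`, `conversation.item.create`, `function_call_output`) follow the Realtime API naming as commonly documented and should be treated as assumptions, as should the helper names `register` and `handle_function_call`.

```python
import json
from typing import Callable, Dict

Handler = Callable[[dict], str]
handlers: Dict[str, Handler] = {}


def register(name: str, handler: Handler) -> None:
    handlers[name] = handler


def handle_function_call(event: dict) -> dict:
    """Run the requested tool and build the event that returns its output."""
    handler = handlers.get(event["name"])
    if handler is None:
        output = json.dumps({"error": f"unknown tool {event['name']}"})
    else:
        try:
            output = handler(json.loads(event["arguments"] or "{}"))
        except Exception as exc:  # report the failure to the model instead of crashing
            output = json.dumps({"error": str(exc)})
    return {
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": event["call_id"],
            "output": output,
        },
    }


register("get_weather", lambda args: json.dumps({"city": args["city"], "temp": 72}))
reply = handle_function_call({
    "type": "response.function_call_arguments.done",
    "name": "get_weather",
    "call_id": "call_1",
    "arguments": '{"city": "Tokyo"}',
})
print(reply["item"]["output"])
```

In a hand-rolled loop, the returned event would be sent to the server and followed by a request for a new response so the model can use the tool output.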

3. Flask Token Server

import os
from flask import Flask
from azure_realtime_webrtc.server import create_flask_blueprint
from azure_realtime_webrtc.types import ApiKeyAuth

app = Flask(__name__)

bp = create_flask_blueprint(
    resource=os.environ["AZURE_RESOURCE"],
    deployment=os.environ["AZURE_DEPLOYMENT"],
    auth=ApiKeyAuth(api_key=os.environ["AZURE_OPENAI_API_KEY"]),
    session={
        "instructions": "You are a helpful assistant.",
        "audio": {"output": {"voice": "alloy"}},
    },
)

app.register_blueprint(bp)
app.run(port=3001)
# POST /api/realtime/token → {"token": "ek_..."}
# GET  /api/realtime/health → {"status": "ok"}
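
A client of this token server could be sketched with the standard library alone. The endpoint path and the `{"token": "ek_..."}` response shape come from the comments above. Sending an empty JSON body is an assumption, and `build_token_request`, `parse_token`, and `fetch_token` are hypothetical names.

```python
import json
import urllib.request


def build_token_request(base_url: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST to the token endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/realtime/token",
        data=b"{}",  # assumed: the endpoint accepts an empty JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_token(body: bytes) -> str:
    """Extract the ephemeral token from a response shaped like {"token": "ek_..."}."""
    payload = json.loads(body)
    token = payload.get("token") if isinstance(payload, dict) else None
    if not isinstance(token, str) or not token:
        raise ValueError("response did not contain a token")
    return token


def fetch_token(base_url: str, timeout: float = 10.0) -> str:
    """Request a token from a running token server."""
    with urllib.request.urlopen(build_token_request(base_url), timeout=timeout) as resp:
        return parse_token(resp.read())
```

With the Flask app above running, `fetch_token("http://localhost:3001")` would return the ephemeral token string.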

4. FastAPI Token Server

import os
from fastapi import FastAPI
from azure_realtime_webrtc.server import create_fastapi_router
from azure_realtime_webrtc.types import ApiKeyAuth

app = FastAPI(title="Realtime Token Server")

router = create_fastapi_router(
    resource=os.environ["AZURE_RESOURCE"],
    deployment=os.environ["AZURE_DEPLOYMENT"],
    auth=ApiKeyAuth(api_key=os.environ["AZURE_OPENAI_API_KEY"]),
)

app.include_router(router)
# Run: uvicorn server:app --port 3001
# Swagger UI at http://localhost:3001/docs

5. Tool Agent (Multi-Step Autonomous)

import asyncio
import json
from azure_realtime_webrtc.sdk import ToolAgent
from azure_realtime_webrtc.types import ApiKeyAuth, ToolDefinition, ToolRegistration

def search_web(query: str) -> list:
    # Placeholder: swap in a real search backend
    return [{"title": "Example result", "query": query}]

agent = ToolAgent(
    resource="my-resource",
    deployment="gpt-4o-realtime-preview",
    auth=ApiKeyAuth(api_key="..."),
    instructions="You are a research assistant. Use tools to find information.",
    max_tool_rounds=10,
)

agent.register_tool(ToolRegistration(
    definition=ToolDefinition(
        name="search", description="Search the web",
        parameters={"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]},
    ),
    handler=lambda args: json.dumps(search_web(args["query"])),
))

async def main():
    async with agent.connect().connect() as session:
        result = await agent.run(session, "Find the latest WebRTC news")
        print(f"Response: {result.response}")
        print(f"Tool calls: {result.tool_call_count}")
        for step in result.steps:
            print(f"  [{step.type}] {step.content[:100]}")

asyncio.run(main())
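
The idea behind `max_tool_rounds` and the execution trace can be illustrated with a small, self-contained loop. This is only a sketch of the general technique, not how `ToolAgent` is implemented: `run_agent`, `Step`, `RunResult`, and the scripted model are all hypothetical, and the step type names are assumed.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    type: str      # "tool_call", "tool_result" or "response" (assumed names)
    content: str


@dataclass
class RunResult:
    response: str
    steps: List[Step] = field(default_factory=list)

    @property
    def tool_call_count(self) -> int:
        return sum(1 for s in self.steps if s.type == "tool_call")


def run_agent(model: Callable[[List[str]], dict],
              tools: Dict[str, Callable[[dict], str]],
              task: str, max_tool_rounds: int = 10) -> RunResult:
    """Alternate between model turns and tool calls until the model answers."""
    history, steps = [task], []
    for _ in range(max_tool_rounds):
        turn = model(history)
        if "tool" not in turn:              # plain answer: we are done
            steps.append(Step("response", turn["text"]))
            return RunResult(turn["text"], steps)
        steps.append(Step("tool_call", json.dumps(turn)))
        output = tools[turn["tool"]](turn["args"])
        steps.append(Step("tool_result", output))
        history.append(output)
    return RunResult("Stopped: tool round limit reached.", steps)


def scripted_model(history: List[str]) -> dict:
    # Toy model: search once, then answer using the tool output
    if len(history) == 1:
        return {"tool": "search", "args": {"query": history[0]}}
    return {"text": "Summary of " + history[-1]}


result = run_agent(scripted_model, {"search": lambda a: "results for " + a["query"]},
                   "WebRTC news")
print(result.response, result.tool_call_count)
```

The round limit guards against a model that keeps requesting tools indefinitely, and the recorded steps make each run easy to inspect afterwards.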

6. Text Chat (Streaming)

import asyncio
from azure_realtime_webrtc.sdk import TextChat
from azure_realtime_webrtc.types import ApiKeyAuth

chat = TextChat(
    resource="my-resource",
    deployment="gpt-4o-realtime-preview",
    auth=ApiKeyAuth(api_key="..."),
    instructions="You are customer support.",
)

async def main():
    async with chat.connect().connect() as session:
        async for msg in chat.send_and_stream(session, "How do I reset my password?"):
            if msg.streaming:
                # Redraw the partial message on the same line
                print(f"\r{msg.content}", end="", flush=True)
            else:
                print(f"\n{msg.content}")

asyncio.run(main())

Streaming

Streaming is built into RealtimeSession; no extra SDK imports are needed:

import base64

async def stream_examples(client, save_audio):
    # `save_audio` is any callable you supply that accepts raw audio bytes
    async with client.connect() as session:
        session.send_text("Tell me a story.")

        # Transcript stream (user + assistant text, word by word)
        async for chunk in session.transcript_stream():
            if chunk.type == "delta":
                print(chunk.text, end="", flush=True)
            elif chunk.type == "done":
                print(f"\n[{chunk.role}] Complete")
                break

        # Audio stream (base64 audio data chunks)
        async for chunk in session.audio_stream():
            if not chunk.done:
                save_audio(base64.b64decode(chunk.data))

        # All events stream
        async for event in session.event_stream():
            print(f"Event: {event.type}")
            if event.type == "response.done":
                break
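
One possible way to persist the audio chunks is to decode them and wrap the result in a WAV container. This sketch assumes the chunks carry `pcm16` data as 16-bit little-endian mono samples at 24 kHz. The sample rate and channel count are assumptions you should check against your session configuration, and `chunks_to_wav` is a hypothetical helper.

```python
import base64
import io
import wave
from typing import Iterable


def chunks_to_wav(b64_chunks: Iterable[str], sample_rate: int = 24000) -> bytes:
    """Decode base64 PCM16 chunks and wrap them in a mono WAV container."""
    pcm = b"".join(base64.b64decode(c) for c in b64_chunks)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)        # assumed mono
        wav.setsampwidth(2)        # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


# Two fake chunks holding four silent samples in total
silence = base64.b64encode(b"\x00\x00" * 2).decode()
wav_bytes = chunks_to_wav([silence, silence])
```

In practice you would collect `chunk.data` values from `session.audio_stream()` into a list and pass that list to `chunks_to_wav`.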

Event Listener Pattern

async def listen(client):
    async with client.connect() as session:
        session.on("session.created", lambda e: print("Session ready!"))
        session.on("error", lambda e: print(f"Error: {e.data['error']['message']}"))
        session.on("*", lambda e: print(f"[{e.type}]"))  # wildcard

        session.send_text("Hello!")
        # Keep consuming events so the handlers above continue to fire
        async for event in session.event_stream():
            if event.type == "response.done":
                break
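
The listener pattern itself, including the `"*"` wildcard and the unsubscribe function that `on()` returns, can be sketched in a few lines. `Emitter` below is illustrative only and is not the SDK's implementation. The order in which exact-match and wildcard handlers run is an assumption.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Emitter:  # illustrative only; not the SDK's implementation
    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable]] = defaultdict(list)

    def on(self, event_type: str, handler: Callable) -> Callable[[], None]:
        """Register a handler and return a function that removes it again."""
        self._handlers[event_type].append(handler)

        def unsubscribe() -> None:
            if handler in self._handlers[event_type]:
                self._handlers[event_type].remove(handler)

        return unsubscribe

    def emit(self, event_type: str, payload: dict) -> None:
        # Exact-match handlers run first, then "*" wildcard handlers
        for handler in list(self._handlers[event_type]) + list(self._handlers["*"]):
            handler(payload)


seen = []
emitter = Emitter()
off = emitter.on("error", lambda e: seen.append(("error", e["message"])))
emitter.on("*", lambda e: seen.append(("any", e["message"])))

emitter.emit("error", {"message": "boom"})
off()                                   # stop the "error" handler
emitter.emit("error", {"message": "again"})
print(seen)
```

Returning the unsubscribe function from `on()` avoids having to keep a reference to the handler just to remove it later.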

API Reference

`RealtimeClient`

client = RealtimeClient(
    resource="my-resource",           # Azure resource name (required)
    deployment="gpt-4o-realtime",     # Model deployment name (required)
    auth=ApiKeyAuth(api_key="..."),   # Or EntraAuth(get_token=...)
    session=SessionConfig(...),        # Optional session config
    base_url="https://...",           # Optional URL override
    ephemeral_token="ek_...",         # Optional pre-fetched token
)
client.register_tool(ToolRegistration(...))

`RealtimeSession`

Method	Returns	Description
`send(event)`	—	Send any `ClientEvent`
`send_text(text)`	—	Send text + trigger response
`add_item(item)`	—	Add a conversation item
`create_response()`	—	Trigger model response
`update_session(**kwargs)`	—	Update session config
`transcript_stream()`	`AsyncIterator[TranscriptChunk]`	User + AI text stream
`audio_stream()`	`AsyncIterator[AudioChunk]`	Audio data stream
`event_stream()`	`AsyncIterator[ServerEvent]`	All events stream
`on(event, handler)`	`Callable[[], None]`	Subscribe; returns an unsubscribe function
`close()`	—	Close the session

Server Middleware

Function	Framework	Endpoints
`create_flask_blueprint(...)`	Flask	`POST /api/realtime/token` · `GET /api/realtime/health`
`create_fastapi_router(...)`	FastAPI	`POST /api/realtime/token` · `GET /api/realtime/health`

SDK Classes

Class	Method	Description
`TextChat`	`send_and_stream(session, text)`	`AsyncIterator[ChatMessage]` with streaming
`ToolAgent`	`run(session, task)`	`AgentRunResult` with full execution trace

Session Configuration

from azure_realtime_webrtc.types import (
    SessionConfig, AudioConfig, AudioInputConfig, AudioOutputConfig,
    TurnDetectionConfig, ToolDefinition,
)

session = SessionConfig(
    instructions="You are a helpful assistant.",
    audio=AudioConfig(
        output=AudioOutputConfig(voice="alloy", format="pcm16"),
        input=AudioInputConfig(
            format="pcm16",
            transcription={"model": "whisper-1"},
            turn_detection=TurnDetectionConfig(
                threshold=0.5,
                prefix_padding_ms=300,
                silence_duration_ms=200,
                create_response=True,
            ),
        ),
    ),
    modalities=["audio", "text"],
    temperature=0.8,
    max_response_output_tokens=4096,
    tools=[ToolDefinition(name="...", description="...", parameters={...})],
    tool_choice="auto",
)

Voices: alloy · ash · ballad · coral · echo · sage · shimmer · verse · marin

Supported Models

Model	Version
`gpt-4o-mini-realtime-preview`	2024-12-17
`gpt-4o-realtime-preview`	2024-12-17
`gpt-realtime`	2025-08-28
`gpt-realtime-mini`	2025-10-06, 2025-12-15
`gpt-realtime-1.5`	2026-02-23

Regions: East US 2 and Sweden Central only.

Security

Measure	Details
Token isolation	API keys stay server-side — only ephemeral tokens sent to clients
Security headers	`Cache-Control: no-store` · `X-Content-Type-Options: nosniff`
CORS	Enabled by default on Flask blueprint
No eval	All JSON parsed with `json.loads` — no `exec()` or `eval()`
Typed	`py.typed` marker for mypy / pyright static analysis

Troubleshooting

Issue	Solution
Token request 500	Use nested format: `audio.output.voice` not flat `voice`
No transcript	Listen for BOTH `response.audio_transcript.delta` AND `response.output_audio_transcript.delta` (SDK handles this automatically)
Import error	`pip install azure-realtime-webrtc[all]`
Async errors	Use `async with client.connect()` — the client is async-first
Flask blocking	Token generation uses `asyncio.run()` internally — works in sync Flask
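
For the first row, a small helper can migrate an older flat config to the nested shape before it is passed as `session=`. This is a sketch under the assumption that the session config is a plain dict, as in the Flask example. `nest_voice` is a hypothetical name.

```python
from copy import deepcopy


def nest_voice(session: dict) -> dict:
    """Return a copy with a top-level "voice" moved to audio.output.voice."""
    fixed = deepcopy(session)
    if "voice" in fixed:
        voice = fixed.pop("voice")
        output = fixed.setdefault("audio", {}).setdefault("output", {})
        output.setdefault("voice", voice)   # an existing nested value wins
    return fixed


flat = {"instructions": "You are a helpful assistant.", "voice": "alloy"}
print(nest_voice(flat))
```

The input dict is left untouched, so the helper is safe to call on shared configuration.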

npm Companion

This is the Python companion to the npm package. Use them together:

Package	Registry	Install	Adds
`azure-realtime-webrtc`	npm	`npm install azure-realtime-webrtc`	WebRTC browser client, VoiceAssistant, ReadableStreams, SSE, Express middleware
`azure-realtime-webrtc`	PyPI	`pip install azure-realtime-webrtc`	WebSocket client, Flask/FastAPI middleware, TextChat, ToolAgent

Author & Maintainer

Komal Vardhan Lolugu, Lead Product Engineer (Agentic AI & Generative Models)

Platform	Link
Portfolio	komalsrinivas.vercel.app
LinkedIn	linkedin.com/in/komalvardhanlolugu
GitHub	github.com/komalSrinivasan
Medium	komalvardhan.medium.com
Topmate	topmate.io/komal_vardhan_lolugu

For bugs, questions, or collaboration, reach out via LinkedIn or open an issue.

License

MIT

Version 0.2.0, uploaded Apr 6, 2026
