
Python SDK for Azure OpenAI Realtime API with WebSocket support, streaming, and tool calling

Project description

azure-realtime-webrtc
Python SDK for Azure OpenAI Realtime API — async streaming, tools, Flask & FastAPI



azure-realtime-webrtc is the Python companion to the npm package of the same name. It provides an async WebSocket client, streaming iterators, function calling, and token-server middleware for Flask and FastAPI, so you can build real-time AI voice and text applications in Python.

import asyncio
import os

from azure_realtime_webrtc import RealtimeClient
from azure_realtime_webrtc.types import ApiKeyAuth

client = RealtimeClient(
    resource="my-resource",
    deployment="gpt-4o-realtime-preview",
    auth=ApiKeyAuth(api_key=os.environ["AZURE_OPENAI_API_KEY"]),
)

async def main():
    # `async with` is only valid inside a coroutine
    async with client.connect() as session:
        session.send_text("Hello!")
        async for chunk in session.transcript_stream():
            print(chunk.text, end="", flush=True)

asyncio.run(main())

What's Inside

| Module | Purpose |
|---|---|
| azure_realtime_webrtc | Async WebSocket client, token manager, typed events |
| azure_realtime_webrtc.sdk | High-level classes: TextChat, ToolAgent |
| azure_realtime_webrtc.server | Flask blueprint & FastAPI router for token server |
| azure_realtime_webrtc.types | All dataclass types with full type hints |

Features

| Feature | Details |
|---|---|
| Async WebSocket Client | Full duplex communication with Azure OpenAI Realtime API |
| Streaming Iterators | async for chunk in session.transcript_stream() |
| Function Calling | register_tool() with automatic call → execute → respond cycle |
| SDK: TextChat | Streaming text chat with message history |
| SDK: ToolAgent | Autonomous multi-step tool calling with execution trace |
| Flask Middleware | create_flask_blueprint(): drop-in token server |
| FastAPI Middleware | create_fastapi_router(): async token server with Swagger UI |
| Entra ID Auth | Microsoft Entra ID support via azure-identity |
| Fully Typed | py.typed marker, dataclasses, full type hints |

Install

pip install azure-realtime-webrtc

With Flask:

pip install azure-realtime-webrtc[flask]

With FastAPI:

pip install azure-realtime-webrtc[fastapi]

With Entra ID:

pip install azure-realtime-webrtc[azure]

Everything:

pip install azure-realtime-webrtc[all]

Prerequisites

You need three things from the Azure Portal:

| Value | Where to find it | Example |
|---|---|---|
| Resource name | Your Azure OpenAI resource URL: https://<THIS>.openai.azure.com | my-openai-resource |
| API Key | Azure Portal → Your OpenAI resource → Keys and Endpoint | abc123... |
| Deployment name | Azure AI Foundry → Deployments (must be a realtime model) | gpt-4o-realtime-preview |

Deployment must be in East US 2 or Sweden Central.
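The Quick Start examples below read these three values from the environment variables AZURE_RESOURCE, AZURE_DEPLOYMENT, and AZURE_OPENAI_API_KEY. A minimal sketch of checking them up front, so a missing value fails with one clear message, might look like this. `AzureSettings` and `load_azure_settings` are hypothetical names for illustration and are not part of the SDK.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AzureSettings:
    """Hypothetical container for the three prerequisite values."""
    resource: str
    deployment: str
    api_key: str

    @property
    def endpoint(self) -> str:
        # URL pattern from the table above: https://<resource>.openai.azure.com
        return f"https://{self.resource}.openai.azure.com"


def load_azure_settings(env: Mapping[str, str] = os.environ) -> AzureSettings:
    """Read the required variables and report every missing name at once."""
    names = ("AZURE_RESOURCE", "AZURE_DEPLOYMENT", "AZURE_OPENAI_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return AzureSettings(
        resource=env["AZURE_RESOURCE"],
        deployment=env["AZURE_DEPLOYMENT"],
        api_key=env["AZURE_OPENAI_API_KEY"],
    )
```

Calling `load_azure_settings()` with no arguments reads `os.environ`; passing a plain dict makes the function easy to test.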


Quick Start

1. WebSocket Client (Async Streaming)

import asyncio
import os
from azure_realtime_webrtc import RealtimeClient
from azure_realtime_webrtc.types import ApiKeyAuth, SessionConfig, AudioConfig, AudioOutputConfig

client = RealtimeClient(
    resource=os.environ["AZURE_RESOURCE"],
    deployment=os.environ["AZURE_DEPLOYMENT"],
    auth=ApiKeyAuth(api_key=os.environ["AZURE_OPENAI_API_KEY"]),
    session=SessionConfig(
        instructions="You are a helpful assistant. Be concise.",
        audio=AudioConfig(output=AudioOutputConfig(voice="alloy")),
    ),
)

async def main():
    async with client.connect() as session:
        session.send_text("What are three facts about WebRTC?")

        async for chunk in session.transcript_stream():
            if chunk.type == "delta":
                print(chunk.text, end="", flush=True)
            elif chunk.type == "done":
                print(f"\n\n[{chunk.role}] Complete.")
                break

asyncio.run(main())

2. Function Calling (Tools)

import json
from azure_realtime_webrtc.types import ToolDefinition, ToolRegistration

def get_weather(args: dict) -> str:
    city = args.get("city", "unknown")
    return json.dumps({"city": city, "temp": 72, "condition": "Sunny"})

client.register_tool(ToolRegistration(
    definition=ToolDefinition(
        name="get_weather",
        description="Get the current weather for a city",
        parameters={
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    ),
    handler=get_weather,
))

# Reuses `client` from example 1; run this inside an async function.
async with client.connect() as session:
    session.send_text("What's the weather in Tokyo?")
    # Tool calls are handled automatically!
    async for chunk in session.transcript_stream():
        if chunk.role == "assistant":
            print(chunk.text, end="", flush=True)
            if chunk.type == "done":
                break
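To make the "call → execute → respond" cycle concrete, here is a rough, SDK-independent sketch of the dispatch step: look up a handler by name, parse the JSON arguments, run the handler, and return a string for the model. `ToolRegistry` is a hypothetical stand-in; the SDK's internals may differ.

```python
import json
from typing import Callable, Dict

# Hypothetical handler shape, matching get_weather above: dict in, JSON string out
ToolHandler = Callable[[dict], str]


class ToolRegistry:
    def __init__(self) -> None:
        self._handlers: Dict[str, ToolHandler] = {}

    def register(self, name: str, handler: ToolHandler) -> None:
        self._handlers[name] = handler

    def dispatch(self, name: str, arguments_json: str) -> str:
        """Execute one tool call and return the string to send back to the model."""
        handler = self._handlers.get(name)
        if handler is None:
            return json.dumps({"error": f"unknown tool: {name}"})
        try:
            args = json.loads(arguments_json or "{}")
            return handler(args)
        except Exception as exc:
            # Report failures to the model instead of crashing the session
            return json.dumps({"error": str(exc)})


def get_weather(args: dict) -> str:
    city = args.get("city", "unknown")
    return json.dumps({"city": city, "temp": 72, "condition": "Sunny"})


registry = ToolRegistry()
registry.register("get_weather", get_weather)
print(registry.dispatch("get_weather", '{"city": "Tokyo"}'))
```

Returning errors as JSON rather than raising is one possible design choice: it lets the model see the failure and recover in its next turn.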

3. Flask Token Server

import os
from flask import Flask
from azure_realtime_webrtc.server import create_flask_blueprint
from azure_realtime_webrtc.types import ApiKeyAuth

app = Flask(__name__)

bp = create_flask_blueprint(
    resource=os.environ["AZURE_RESOURCE"],
    deployment=os.environ["AZURE_DEPLOYMENT"],
    auth=ApiKeyAuth(api_key=os.environ["AZURE_OPENAI_API_KEY"]),
    session={
        "instructions": "You are a helpful assistant.",
        "audio": {"output": {"voice": "alloy"}},
    },
)

app.register_blueprint(bp)
app.run(port=3001)
# POST /api/realtime/token → {"token": "ek_..."}
# GET  /api/realtime/health → {"status": "ok"}
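A client of this token server only needs to POST to /api/realtime/token and read the "token" field. The sketch below uses only the standard library. The empty JSON request body and the `localhost:3001` default are assumptions based on the example above, and the helper names are hypothetical.

```python
import json
import urllib.request

TOKEN_PATH = "/api/realtime/token"  # endpoint path from the example above


def build_token_request(base_url: str) -> urllib.request.Request:
    """Build the POST request (the empty JSON body is an assumption)."""
    return urllib.request.Request(
        base_url.rstrip("/") + TOKEN_PATH,
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_token_response(body: bytes) -> str:
    """Extract the ephemeral token from a body shaped like {"token": "ek_..."}."""
    payload = json.loads(body)
    token = payload.get("token")
    if not isinstance(token, str) or not token:
        raise ValueError("response did not contain a token")
    return token


def fetch_token(base_url: str = "http://localhost:3001") -> str:
    """Perform the request; requires the token server to be running."""
    with urllib.request.urlopen(build_token_request(base_url), timeout=10) as resp:
        return parse_token_response(resp.read())
```

Keeping request building and response parsing separate from the network call makes both testable without a live server.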

4. FastAPI Token Server

import os
from fastapi import FastAPI
from azure_realtime_webrtc.server import create_fastapi_router
from azure_realtime_webrtc.types import ApiKeyAuth

app = FastAPI(title="Realtime Token Server")

router = create_fastapi_router(
    resource=os.environ["AZURE_RESOURCE"],
    deployment=os.environ["AZURE_DEPLOYMENT"],
    auth=ApiKeyAuth(api_key=os.environ["AZURE_OPENAI_API_KEY"]),
)

app.include_router(router)
# Run: uvicorn server:app --port 3001
# Swagger UI at http://localhost:3001/docs

5. Tool Agent (Multi-Step Autonomous)

import json
from azure_realtime_webrtc.sdk import ToolAgent
from azure_realtime_webrtc.types import ApiKeyAuth, ToolDefinition, ToolRegistration

agent = ToolAgent(
    resource="my-resource",
    deployment="gpt-4o-realtime-preview",
    auth=ApiKeyAuth(api_key="..."),
    instructions="You are a research assistant. Use tools to find information.",
    max_tool_rounds=10,
)

agent.register_tool(ToolRegistration(
    definition=ToolDefinition(
        name="search", description="Search the web",
        parameters={"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]},
    ),
    # search_web is a placeholder for your own search function
    handler=lambda args: json.dumps(search_web(args["query"])),
))

async with agent.connect().connect() as session:
    result = await agent.run(session, "Find the latest WebRTC news")
    print(f"Response: {result.response}")
    print(f"Tool calls: {result.tool_call_count}")
    for step in result.steps:
        print(f"  [{step.type}] {step.content[:100]}")
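The role of max_tool_rounds can be illustrated with a small, self-contained loop: ask the model, run any requested tool, record each step, and stop once the model answers or the round limit is hit. This is a sketch under assumptions, not ToolAgent's actual implementation; `run_agent`, `ModelTurn`, and `scripted_model` are hypothetical, and the model is replaced by a scripted function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    type: str      # "tool_call", "tool_result", or "response"
    content: str


@dataclass
class ModelTurn:
    """Hypothetical model output: either a tool request or a final answer."""
    tool_name: Optional[str] = None
    tool_args: dict = field(default_factory=dict)
    text: str = ""


def run_agent(
    ask_model: Callable[[List[Step]], ModelTurn],
    tools: Dict[str, Callable[[dict], str]],
    max_tool_rounds: int = 10,
) -> List[Step]:
    steps: List[Step] = []
    for _ in range(max_tool_rounds):
        turn = ask_model(steps)
        if turn.tool_name is None:
            # No tool requested: the model has produced its final answer
            steps.append(Step("response", turn.text))
            return steps
        steps.append(Step("tool_call", turn.tool_name))
        steps.append(Step("tool_result", tools[turn.tool_name](turn.tool_args)))
    # The cap prevents an endless call loop if the model never answers
    steps.append(Step("response", "Stopped: tool round limit reached."))
    return steps


def scripted_model(steps: List[Step]) -> ModelTurn:
    # Stand-in for the model: search once, then answer using the result
    results = [s for s in steps if s.type == "tool_result"]
    if not results:
        return ModelTurn(tool_name="search", tool_args={"query": "WebRTC news"})
    return ModelTurn(text=f"Summary: {results[-1].content}")


trace = run_agent(scripted_model, {"search": lambda args: f"results for {args['query']}"})
for step in trace:
    print(f"[{step.type}] {step.content}")
```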

6. Text Chat (Streaming)

from azure_realtime_webrtc.sdk import TextChat
from azure_realtime_webrtc.types import ApiKeyAuth

chat = TextChat(
    resource="my-resource",
    deployment="gpt-4o-realtime-preview",
    auth=ApiKeyAuth(api_key="..."),
    instructions="You are customer support.",
)

async with chat.connect().connect() as session:
    async for msg in chat.send_and_stream(session, "How do I reset my password?"):
        if msg.streaming:
            print(f"\r{msg.content}", end="", flush=True)
        else:
            print(f"\n{msg.content}")

Streaming

Streaming is built into RealtimeSession — no extra imports needed:

async with client.connect() as session:
    session.send_text("Tell me a story.")

    # Transcript stream (user + assistant text, word by word)
    async for chunk in session.transcript_stream():
        if chunk.type == "delta":
            print(chunk.text, end="", flush=True)
        elif chunk.type == "done":
            print(f"\n[{chunk.role}] Complete")
            break

    # Audio stream (base64 audio data chunks); needs `import base64`
    async for chunk in session.audio_stream():
        if chunk.done:
            break  # stop once the final chunk arrives
        save_audio(base64.b64decode(chunk.data))  # save_audio: your own function

    # All events stream
    async for event in session.event_stream():
        print(f"Event: {event.type}")
        if event.type == "response.done":
            break
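When you want the full reply as one string rather than printing it incrementally, the delta/done pattern above can be wrapped in a small collector. The sketch below runs without the SDK by using a fake stream. `TranscriptChunk` here is a stand-in with only the fields the examples read, and it assumes that only "delta" chunks carry text.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class TranscriptChunk:
    """Stand-in with the fields used in the examples above; not the SDK's class."""
    type: str   # "delta" or "done"
    text: str
    role: str


async def fake_transcript_stream() -> AsyncIterator[TranscriptChunk]:
    for word in ("Once ", "upon ", "a ", "time."):
        await asyncio.sleep(0)  # yield control, as a network stream would
        yield TranscriptChunk("delta", word, "assistant")
    yield TranscriptChunk("done", "", "assistant")


async def collect_transcript(stream: AsyncIterator[TranscriptChunk]) -> str:
    """Join delta chunks into one string, stopping at the first done chunk."""
    parts = []
    async for chunk in stream:
        if chunk.type == "delta":
            parts.append(chunk.text)
        elif chunk.type == "done":
            break
    return "".join(parts)


story = asyncio.run(collect_transcript(fake_transcript_stream()))
print(story)
```

With a real session you would pass `session.transcript_stream()` instead of the fake stream, and could wrap the call in `asyncio.wait_for(..., timeout=...)` to guard against a stream that never completes.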

Event Listener Pattern

async with client.connect() as session:
    session.on("session.created", lambda e: print("Session ready!"))
    session.on("error", lambda e: print(f"Error: {e.data['error']['message']}"))
    session.on("*", lambda e: print(f"[{e.type}]"))  # wildcard

    session.send_text("Hello!")
    async for event in session.event_stream():
        if event.type == "response.done":
            break
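For readers unfamiliar with the pattern, the following self-contained sketch shows how an on() method with a "*" wildcard and an unsubscribe function can work. `Emitter` and `Event` are hypothetical illustrations, not the SDK's classes, and the SDK's dispatch order may differ.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, DefaultDict, List


@dataclass
class Event:
    type: str
    data: dict = field(default_factory=dict)


Handler = Callable[[Event], None]


class Emitter:
    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event_type: str, handler: Handler) -> Callable[[], None]:
        """Subscribe a handler and return a function that removes it again."""
        self._handlers[event_type].append(handler)

        def unsubscribe() -> None:
            if handler in self._handlers[event_type]:
                self._handlers[event_type].remove(handler)

        return unsubscribe

    def emit(self, event: Event) -> None:
        # Exact-match handlers run first, then wildcard handlers.
        # The lists are copied so handlers may unsubscribe while being called.
        for handler in list(self._handlers[event.type]) + list(self._handlers["*"]):
            handler(event)


seen: List[str] = []
emitter = Emitter()
emitter.on("*", lambda e: seen.append(f"any:{e.type}"))
off = emitter.on("session.created", lambda e: seen.append("ready"))

emitter.emit(Event("session.created"))  # both handlers fire
off()                                   # remove the specific handler
emitter.emit(Event("session.created"))  # only the wildcard handler fires
print(seen)
```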

API Reference

RealtimeClient

client = RealtimeClient(
    resource="my-resource",           # Azure resource name (required)
    deployment="gpt-4o-realtime",     # Model deployment name (required)
    auth=ApiKeyAuth(api_key="..."),   # Or EntraAuth(get_token=...)
    session=SessionConfig(...),        # Optional session config
    base_url="https://...",           # Optional URL override
    ephemeral_token="ek_...",         # Optional pre-fetched token
)
client.register_tool(ToolRegistration(...))

RealtimeSession

| Method | Returns | Description |
|---|---|---|
| send(event) | | Send any ClientEvent |
| send_text(text) | | Send text + trigger response |
| add_item(item) | | Add a conversation item |
| create_response() | | Trigger model response |
| update_session(**kwargs) | | Update session config |
| transcript_stream() | AsyncIterator[TranscriptChunk] | User + AI text stream |
| audio_stream() | AsyncIterator[AudioChunk] | Audio data stream |
| event_stream() | AsyncIterator[ServerEvent] | All events stream |
| on(event, handler) | () -> None | Subscribe (returns unsubscribe fn) |
| close() | | Close the session |
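The audio_stream() method yields base64-encoded audio data. One way to persist it is to decode each chunk and wrap the raw samples in a WAV container with the standard library. The sketch below assumes pcm16 output means mono, 16-bit little-endian samples, and the 24 kHz default sample rate is likewise an assumption to verify against your deployment. `pcm16_chunks_to_wav` is a hypothetical helper.

```python
import base64
import io
import wave
from typing import Iterable


def pcm16_chunks_to_wav(chunks_b64: Iterable[str], sample_rate: int = 24000) -> bytes:
    """Decode base64 PCM16 chunks and wrap them in a WAV container.

    Assumes mono, 16-bit samples; the default sample rate is an assumption.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        for chunk in chunks_b64:
            wav.writeframes(base64.b64decode(chunk))
    return buffer.getvalue()
```

In a session you would collect `chunk.data` from `session.audio_stream()` into a list, call this helper, and write the returned bytes to a `.wav` file.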

Server Middleware

| Function | Framework | Endpoints |
|---|---|---|
| create_flask_blueprint(...) | Flask | POST /api/realtime/token · GET /api/realtime/health |
| create_fastapi_router(...) | FastAPI | POST /api/realtime/token · GET /api/realtime/health |

SDK Classes

| Class | Method | Description |
|---|---|---|
| TextChat | send_and_stream(session, text) | AsyncIterator[ChatMessage] with streaming |
| ToolAgent | run(session, task) | AgentRunResult with full execution trace |

Session Configuration

from azure_realtime_webrtc.types import (
    SessionConfig, AudioConfig, AudioInputConfig, AudioOutputConfig,
    TurnDetectionConfig, ToolDefinition,
)

session = SessionConfig(
    instructions="You are a helpful assistant.",
    audio=AudioConfig(
        output=AudioOutputConfig(voice="alloy", format="pcm16"),
        input=AudioInputConfig(
            format="pcm16",
            transcription={"model": "whisper-1"},
            turn_detection=TurnDetectionConfig(
                threshold=0.5,
                prefix_padding_ms=300,
                silence_duration_ms=200,
                create_response=True,
            ),
        ),
    ),
    modalities=["audio", "text"],
    temperature=0.8,
    max_response_output_tokens=4096,
    tools=[ToolDefinition(name="...", description="...", parameters={...})],
    tool_choice="auto",
)

Voices: alloy · ash · ballad · coral · echo · sage · shimmer · verse · marin
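When the session is passed as a plain dict, as in the Flask example, the voice must sit under audio.output rather than at the top level. A small helper can enforce both the nesting and the voice list above. This is a sketch; `build_session_payload` is a hypothetical name, not an SDK function.

```python
# Voice names copied from the list above
VOICES = ("alloy", "ash", "ballad", "coral", "echo", "sage", "shimmer", "verse", "marin")


def build_session_payload(instructions: str, voice: str = "alloy") -> dict:
    """Return a session dict with the voice in the nested audio.output position."""
    if voice not in VOICES:
        raise ValueError(f"unsupported voice {voice!r}; expected one of {', '.join(VOICES)}")
    return {
        "instructions": instructions,
        "audio": {"output": {"voice": voice}},  # nested, not a top-level "voice" key
    }


print(build_session_payload("You are a helpful assistant.", voice="coral"))
```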

Supported Models

| Model | Version |
|---|---|
| gpt-4o-mini-realtime-preview | 2024-12-17 |
| gpt-4o-realtime-preview | 2024-12-17 |
| gpt-realtime | 2025-08-28 |
| gpt-realtime-mini | 2025-10-06, 2025-12-15 |
| gpt-realtime-1.5 | 2026-02-23 |

Regions: East US 2 and Sweden Central only.

Security

| Measure | Details |
|---|---|
| Token isolation | API keys stay server-side; only ephemeral tokens are sent to clients |
| Security headers | Cache-Control: no-store · X-Content-Type-Options: nosniff |
| CORS | Enabled by default on Flask blueprint |
| No eval | All JSON parsed with json.loads; no exec() or eval() |
| Typed | py.typed marker for mypy / pyright static analysis |

Troubleshooting

| Issue | Solution |
|---|---|
| Token request 500 | Use nested format: audio.output.voice, not flat voice |
| No transcript | Listen for BOTH response.audio_transcript.delta AND response.output_audio_transcript.delta (SDK handles this automatically) |
| Import error | pip install azure-realtime-webrtc[all] |
| Async errors | Use async with client.connect(); the client is async-first |
| Flask blocking | Token generation uses asyncio.run() internally, so it works in sync Flask |
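The last row relies on a general asyncio technique: a synchronous function can drive a coroutine to completion with asyncio.run(). The sketch below shows the idea with a stand-in coroutine. `request_token` and `token_view` are hypothetical, and the real blueprint's internals are not shown here.

```python
import asyncio


async def request_token() -> str:
    """Stand-in for an async token request; a real one would call Azure."""
    await asyncio.sleep(0)
    return "ek_example"


def token_view() -> dict:
    """Synchronous handler, shaped like a Flask view function."""
    # asyncio.run() creates an event loop, runs the coroutine to completion,
    # and closes the loop, so no loop needs to exist beforehand.
    return {"token": asyncio.run(request_token())}


print(token_view())
```

Note that asyncio.run() raises RuntimeError when called from a thread that already has a running event loop, so this approach suits synchronous frameworks; in async code, await the coroutine directly instead.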

npm Companion

This is the Python companion to the npm package. Use them together:

| Package | Registry | Install | Adds |
|---|---|---|---|
| azure-realtime-webrtc | npm | npm install azure-realtime-webrtc | WebRTC browser client, VoiceAssistant, ReadableStreams, SSE, Express middleware |
| azure-realtime-webrtc | PyPI | pip install azure-realtime-webrtc | WebSocket client, Flask/FastAPI middleware, TextChat, ToolAgent |

Author & Maintainer

Komal Vardhan Lolugu, Lead Product Engineer, Agentic AI & Generative Models

| Platform | Link |
|---|---|
| Portfolio | komalsrinivas.vercel.app |
| LinkedIn | linkedin.com/in/komalvardhanlolugu |
| GitHub | github.com/komalSrinivasan |
| Medium | komalvardhan.medium.com |
| Topmate | topmate.io/komal_vardhan_lolugu |

For bugs, questions, or collaboration — reach out via LinkedIn or open an issue.

License

MIT

