

Streamlit Real-time Audio Component

A Streamlit custom component that enables real-time voice conversations with OpenAI's Realtime API, using WebRTC for low-latency audio streaming.

Features

  • 🎤 Real-time Audio: Low-latency audio streaming using WebRTC
  • 🤖 OpenAI Integration: Direct integration with OpenAI's Realtime API
  • 💬 Live Transcription: Real-time transcription of both user and AI speech
  • ⏸️ Conversation Control: Start, pause, resume, and stop conversations
  • 🎛️ Configurable: Customizable voice, instructions, and AI parameters
  • 📝 Full Transcript: Complete conversation history with timestamps

Installation

Prerequisites

  • Python 3.9+
  • Node.js and npm (for development)
  • OpenAI API key

From Source

  1. Clone or download this component

  2. Install Python dependencies:

    cd realtime_audio
    pip install -e .
    
  3. Install and build frontend dependencies:

    cd frontend
    npm install
    npm run build
    

Usage

Basic Example

import streamlit as st
from realtime_audio import realtime_audio_conversation

st.title("AI Voice Assistant")

# Get API key from user
api_key = st.text_input("OpenAI API Key", type="password")

if api_key:
    # Create the real-time audio conversation
    result = realtime_audio_conversation(
        api_key=api_key,
        instructions="You are a helpful AI assistant. Keep responses concise.",
        voice="alloy",
        temperature=0.8
    )
    
    # Display conversation status
    st.write(f"Status: {result['status']}")
    
    # Show any errors
    if result['error']:
        st.error(result['error'])
    
    # Display transcript
    for message in result['transcript']:
        if message['type'] == 'user':
            st.chat_message("user").write(message['content'])
        else:
            st.chat_message("assistant").write(message['content'])

Advanced Example

import streamlit as st
from realtime_audio import realtime_audio_conversation

st.set_page_config(page_title="Advanced AI Assistant", layout="wide")

col1, col2 = st.columns([3, 1])

with col2:
    st.header("Settings")
    api_key = st.text_input("API Key", type="password")
    voice = st.selectbox("Voice", ["alloy", "echo", "fable", "onyx", "nova", "shimmer"])
    temperature = st.slider("Temperature", 0.0, 2.0, 0.8)
    instructions = st.text_area(
        "Instructions", 
        "You are a helpful AI assistant.",
        height=100
    )

with col1:
    st.header("Conversation")
    
    if api_key:
        conversation = realtime_audio_conversation(
            api_key=api_key,
            voice=voice,
            instructions=instructions,
            temperature=temperature,
            turn_detection_threshold=0.5,
            key="advanced_conversation"
        )
        
        # Handle the conversation result
        if conversation['error']:
            st.error(conversation['error'])
        
        # Display metrics
        col_a, col_b, col_c = st.columns(3)
        with col_a:
            st.metric("Status", conversation['status'])
        with col_b:
            st.metric("Messages", len(conversation['transcript']))
        with col_c:
            recording = "🔴" if conversation['is_recording'] else "⚪"
            st.metric("Recording", recording)

API Reference

realtime_audio_conversation()

Creates a real-time audio conversation component.

Parameters

  • api_key (str): OpenAI API key for authentication
  • voice (str, default="alloy"): Voice for TTS. Options: "alloy", "echo", "fable", "onyx", "nova", "shimmer"
  • instructions (str, default="You are a helpful AI assistant..."): System instructions for the AI
  • prompt (str, default=""): Initial prompt to start the conversation
  • auto_start (bool, default=False): Whether to automatically start the conversation
  • temperature (float, default=0.8): AI response randomness (0.0-2.0)
  • turn_detection_threshold (float, default=0.5): Voice activity detection sensitivity (0.0-1.0)
  • key (str, optional): Unique component key
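The numeric parameters have fixed valid ranges, so it can be useful to check them before invoking the component rather than surfacing an API error mid-conversation. A minimal, Streamlit-independent sketch (the `validate_params` helper and `VOICES` set are hypothetical illustrations, not part of the package):

```python
# Hypothetical pre-flight check mirroring the documented parameter ranges.
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}

def validate_params(voice="alloy", temperature=0.8, turn_detection_threshold=0.5):
    """Raise ValueError for parameters outside the documented ranges."""
    if voice not in VOICES:
        raise ValueError(f"voice must be one of {sorted(VOICES)}")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be within 0.0-2.0")
    if not 0.0 <= turn_detection_threshold <= 1.0:
        raise ValueError("turn_detection_threshold must be within 0.0-1.0")
```

Calling `validate_params(...)` with the same keyword arguments you intend to pass to `realtime_audio_conversation()` lets you show a friendly `st.error` instead of a failed session.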

Returns

Dictionary with the following structure:

{
    "transcript": [
        {
            "id": "unique_message_id",
            "type": "user" | "assistant", 
            "content": "Message content",
            "timestamp": 1640995200000,
            "status": "completed" | "in_progress"
        }
    ],
    "status": "idle" | "connecting" | "connected" | "recording" | "speaking" | "error",
    "error": "Error message" | None,
    "session_id": "unique_session_id" | None,
    "connection_state": "new" | "connecting" | "connected" | "disconnected" | "failed" | "closed",
    "is_recording": bool,
    "is_paused": bool
}
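Because the component returns a plain dictionary, the transcript can be post-processed without Streamlit at all, e.g. to save or export the conversation. A small sketch against the structure above (the `transcript_to_text` helper name is hypothetical):

```python
def transcript_to_text(result):
    """Render the component's return dict as a plain-text transcript."""
    lines = []
    for message in result["transcript"]:
        speaker = "You" if message["type"] == "user" else "AI"
        lines.append(f"{speaker}: {message['content']}")
    return "\n".join(lines)

# Example return value matching the documented structure
sample = {
    "transcript": [
        {"id": "m1", "type": "user", "content": "Hello",
         "timestamp": 1640995200000, "status": "completed"},
        {"id": "m2", "type": "assistant", "content": "Hi there!",
         "timestamp": 1640995201000, "status": "completed"},
    ],
    "status": "connected", "error": None, "session_id": "s1",
    "connection_state": "connected", "is_recording": False, "is_paused": False,
}
print(transcript_to_text(sample))
# → You: Hello
#   AI: Hi there!
```

The resulting string can be handed to `st.download_button` to let users export their conversation.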

Development

Setting up Development Environment

  1. Create a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  2. Install development dependencies:

    pip install streamlit
    pip install -e .
    
  3. Set up frontend development:

    cd frontend
    npm install
    npm run start  # Starts development server on port 3001
    
  4. Run the example:

    streamlit run example.py
    

Project Structure

realtime_audio/
├── __init__.py              # Python API
├── example.py               # Usage example
├── setup.py                 # Package setup
├── README.md               # Documentation
└── frontend/
    ├── package.json
    ├── tsconfig.json
    ├── vite.config.ts
    └── src/
        ├── index.tsx        # Entry point
        ├── RealtimeAudio.tsx # Main component
        ├── types/           # TypeScript definitions
        └── utils/           # Utility functions

Browser Requirements

  • HTTPS required: WebRTC requires a secure connection
  • Microphone access: Component will request microphone permissions
  • Supported browsers: Chrome, Firefox, Safari, Edge (latest versions)

Troubleshooting

Common Issues

Microphone Access Denied

  • Ensure you're using HTTPS
  • Check browser permissions
  • Try refreshing the page

Connection Failed

  • Verify your OpenAI API key
  • Check internet connection
  • Ensure firewall isn't blocking WebRTC

No Audio Playback

  • Check system audio settings
  • Try different browsers
  • Verify speakers/headphones are working

Debug Mode

Enable debug logging by opening browser developer tools (F12) and checking the console for detailed connection information.

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For issues and questions:

  1. Check the troubleshooting section
  2. Review browser console for errors
  3. Create an issue with detailed error information
