Streamlit component for real-time audio conversation with OpenAI using WebRTC

Streamlit Real-time Audio

A Streamlit custom component for real-time voice conversations with OpenAI's GPT models, using WebRTC for low-latency audio streaming.

Features

  • 🎤 Real-time Audio: Low-latency audio streaming using WebRTC
  • 🤖 OpenAI Integration: Direct integration with OpenAI's Realtime API
  • 💬 Live Transcription: Real-time transcription of both user and AI speech
  • ⏸️ Conversation Control: Start, pause, resume, and stop conversations
  • 🎛️ Configurable: Customizable voice, instructions, and AI parameters
  • 📝 Full Transcript: Complete conversation history with timestamps

Installation

pip install streamlit-realtime-audio

Quick Start

import streamlit as st
from st_realtime_audio import realtime_audio_conversation

st.title("AI Voice Assistant")

# Get API key from user
api_key = st.text_input("OpenAI API Key", type="password")

if api_key:
    # Create the real-time audio conversation
    result = realtime_audio_conversation(
        api_key=api_key,
        instructions="You are a helpful AI assistant. Keep responses concise.",
        voice="alloy"
    )
    
    # Display conversation status
    st.write(f"Status: {result['status']}")
    
    # Show any errors
    if result['error']:
        st.error(result['error'])
    
    # Display transcript
    for message in result['transcript']:
        if message['type'] == 'user':
            st.chat_message("user").write(message['content'])
        else:
            st.chat_message("assistant").write(message['content'])
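
The Quick Start asks for the key on every run. A minimal sketch of one alternative follows, assuming the key may also be supplied through the OPENAI_API_KEY environment variable. The helper name resolve_api_key is hypothetical and is not part of this package.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(typed_value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: prefer the key typed into the app, else the environment.

    Returns an empty string when no key is available, so the caller can keep
    the same `if api_key:` guard used in the Quick Start.
    """
    env = os.environ if env is None else env
    return typed_value.strip() or env.get("OPENAI_API_KEY", "").strip()
```

In the Quick Start this would wrap the text input, for example api_key = resolve_api_key(st.text_input("OpenAI API Key", type="password")).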

Usage

Basic Parameters

  • api_key (str, required): OpenAI API key for authentication
  • voice (str, default="alloy"): Voice for TTS. Options: "alloy", "ash", "ballad", "cedar", "coral", "echo", "marin", "sage", "shimmer", "verse"
  • instructions (str): System instructions for the AI
  • temperature (float, default=0.8): AI response randomness (0.0-2.0)
  • auto_start (bool, default=False): Whether to automatically start the conversation
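
The ranges and options listed above can be checked before the component is rendered. The following is a sketch only: build_conversation_kwargs is a hypothetical helper, not part of the package, and it relies solely on the voices and the temperature range documented in the list above.

```python
from typing import Any, Dict, Optional

# Voices copied from the parameter list above.
SUPPORTED_VOICES = (
    "alloy", "ash", "ballad", "cedar", "coral",
    "echo", "marin", "sage", "shimmer", "verse",
)


def build_conversation_kwargs(
    api_key: str,
    voice: str = "alloy",
    instructions: Optional[str] = None,
    temperature: float = 0.8,
    auto_start: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments before calling the component."""
    if not api_key:
        raise ValueError("api_key is required")
    if voice not in SUPPORTED_VOICES:
        raise ValueError(f"unsupported voice: {voice!r}")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")

    kwargs: Dict[str, Any] = {
        "api_key": api_key,
        "voice": voice,
        "temperature": temperature,
        "auto_start": auto_start,
    }
    # Forward instructions only when set, so the component's own default applies.
    if instructions is not None:
        kwargs["instructions"] = instructions
    return kwargs
```

The result can then be unpacked into the call, for example realtime_audio_conversation(**build_conversation_kwargs(api_key, voice="sage")).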

Advanced Example

import streamlit as st
from st_realtime_audio import realtime_audio_conversation

st.set_page_config(page_title="Advanced AI Assistant", layout="wide")

col1, col2 = st.columns([3, 1])

with col2:
    st.header("Settings")
    api_key = st.text_input("API Key", type="password")
    voice = st.selectbox("Voice", ["alloy", "ash", "ballad", "cedar", "coral", "echo", "marin", "sage", "shimmer", "verse"])
    temperature = st.slider("Temperature", 0.0, 2.0, 0.8)
    instructions = st.text_area(
        "Instructions", 
        "You are a helpful AI assistant.",
        height=100
    )

with col1:
    st.header("Conversation")
    
    if api_key:
        conversation = realtime_audio_conversation(
            api_key=api_key,
            voice=voice,
            instructions=instructions,
            temperature=temperature,
            turn_detection_threshold=0.5,
            key="advanced_conversation"
        )
        
        # Handle the conversation result
        if conversation['error']:
            st.error(conversation['error'])
        
        # Display metrics
        col_a, col_b, col_c = st.columns(3)
        with col_a:
            st.metric("Status", conversation['status'])
        with col_b:
            st.metric("Messages", len(conversation['transcript']))
        with col_c:
            recording = "🔴" if conversation['is_recording'] else "⚪"
            st.metric("Recording", recording)

Return Value

The component returns a dictionary with conversation state:

{
    "transcript": [
        {
            "id": "unique_message_id",
            "type": "user" | "assistant", 
            "content": "Message content",
            "timestamp": 1640995200000,
            "status": "completed" | "in_progress"
        }
    ],
    "status": "idle" | "connecting" | "connected" | "recording" | "speaking" | "error",
    "error": "Error message" | None,
    "session_id": "unique_session_id" | None,
    "connection_state": "new" | "connecting" | "connected" | "disconnected" | "failed" | "closed",
    "is_recording": bool,
    "is_paused": bool
}
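
The structure above can be written as type hints, which helps editors and type checkers when reading the result. This is a sketch under two assumptions: the class names are hypothetical (the package may not export any such types), and the timestamp is taken to be milliseconds since the Unix epoch, which the sample value suggests but the text does not state.

```python
from datetime import datetime, timezone
from typing import List, Optional, TypedDict


class TranscriptMessage(TypedDict):
    id: str
    type: str        # "user" or "assistant"
    content: str
    timestamp: int   # assumed: milliseconds since the Unix epoch
    status: str      # "completed" or "in_progress"


class ConversationState(TypedDict):
    transcript: List[TranscriptMessage]
    status: str
    error: Optional[str]
    session_id: Optional[str]
    connection_state: str
    is_recording: bool
    is_paused: bool


def format_transcript(state: ConversationState) -> List[str]:
    """Render completed messages as 'HH:MM:SS speaker: text' lines in UTC."""
    lines = []
    for message in state["transcript"]:
        if message["status"] != "completed":
            continue  # skip messages whose transcription is still in progress
        moment = datetime.fromtimestamp(message["timestamp"] / 1000, tz=timezone.utc)
        lines.append(f"{moment:%H:%M:%S} {message['type']}: {message['content']}")
    return lines
```

Skipping in_progress messages is a design choice for a stable log view; a live view would likely show them as well.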

Requirements

  • Python 3.9+
  • Streamlit >= 1.28.0
  • OpenAI API key
  • HTTPS connection (browsers only allow the microphone capture that WebRTC needs in a secure context; http://localhost also counts)
  • Microphone access
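
The two version requirements can be checked from Python before launching an app. The sketch below uses only the standard library; the function names are hypothetical, and the browser-side requirements (HTTPS, microphone) cannot be verified this way.

```python
import re
import sys
from importlib import metadata
from typing import List, Optional, Tuple

MIN_PYTHON = (3, 9)
MIN_STREAMLIT = (1, 28, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.28.0' (or '1.30.0rc1') into a tuple of at least three integers."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad short versions such as '1.28' so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def installed_streamlit_version() -> Optional[Tuple[int, ...]]:
    """Return the installed Streamlit version, or None if it is not installed."""
    try:
        return parse_version(metadata.version("streamlit"))
    except metadata.PackageNotFoundError:
        return None


def check_requirements() -> List[str]:
    """Collect human-readable problems; an empty list means both checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.9 or newer is required")
    streamlit_version = installed_streamlit_version()
    if streamlit_version is None:
        problems.append("Streamlit is not installed")
    elif streamlit_version < MIN_STREAMLIT:
        problems.append("Streamlit 1.28.0 or newer is required")
    return problems
```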

Browser Support

  • Chrome, Firefox, Safari, Edge (latest versions)
  • HTTPS required for WebRTC functionality
  • Microphone permissions required

Troubleshooting

Common Issues

Microphone Access Denied

  • Ensure you're using HTTPS (or http://localhost during development)
  • Check browser permissions
  • Try refreshing the page

Connection Failed

  • Verify your OpenAI API key
  • Check internet connection
  • Ensure firewall isn't blocking WebRTC

No Audio Playback

  • Check system audio settings
  • Try different browsers
  • Verify speakers/headphones are working
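
An app can surface these checklists itself by looking at the returned state. The sketch below is an assumption-heavy illustration: troubleshooting_hint is a hypothetical helper, and since the wording of error messages is not documented here, it inspects only the connection_state and status fields from the Return Value section.

```python
from typing import Any, Mapping, Optional

# Hints paraphrase the troubleshooting lists above.
CONNECTION_HINT = (
    "Verify your OpenAI API key, check your internet connection, "
    "and make sure no firewall is blocking WebRTC."
)
MICROPHONE_HINT = (
    "Serve the app over HTTPS, allow microphone access in the browser, "
    "then refresh the page."
)


def troubleshooting_hint(state: Mapping[str, Any]) -> Optional[str]:
    """Hypothetical helper: pick a hint from the component's returned state."""
    if state.get("connection_state") == "failed":
        return CONNECTION_HINT
    if state.get("status") == "error":
        # Error message wording is undocumented, so it is not parsed here;
        # suggesting the microphone checklist is only a guess at the next step.
        return MICROPHONE_HINT
    return None
```

A caller might show the hint with st.info(hint) next to the st.error(...) call whenever the helper returns a value.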

Development

To test changes locally instead of using the published PyPI version, clone the repository, create a virtual environment, and install the frontend dependencies:

git clone https://github.com/bronsonhill/streamlit-realtime-audio.git
cd streamlit-realtime-audio
python3 -m venv venv
source venv/bin/activate
npm install --prefix st_realtime_audio/frontend

After making changes to the Python or frontend code, run:

./rebuild.sh

This rebuilds the frontend and reinstalls the package locally. Then run the example app with:

streamlit run st_realtime_audio/example.py

License

MIT License - see LICENSE file for details.
