Streamlit component for real-time audio conversation with OpenAI using WebRTC
Streamlit Real-time Audio
A Streamlit custom component for real-time voice conversations with OpenAI's GPT using WebRTC for low-latency audio streaming.
Features
- 🎤 Real-time Audio: Low-latency audio streaming using WebRTC
- 🤖 OpenAI Integration: Direct integration with OpenAI's Real-time API
- 💬 Live Transcription: Real-time transcription of both user and AI speech
- ⏸️ Conversation Control: Start, pause, resume, and stop conversations
- 🎛️ Configurable: Customizable voice, instructions, and AI parameters
- 📝 Full Transcript: Complete conversation history with timestamps
Installation
pip install streamlit-realtime-audio
Quick Start
import streamlit as st
from st_realtime_audio import realtime_audio_conversation
st.title("AI Voice Assistant")
# Get API key from user
api_key = st.text_input("OpenAI API Key", type="password")
if api_key:
    # Create the real-time audio conversation
    result = realtime_audio_conversation(
        api_key=api_key,
        instructions="You are a helpful AI assistant. Keep responses concise.",
        voice="alloy"
    )
    # Display conversation status
    st.write(f"Status: {result['status']}")
    # Show any errors
    if result['error']:
        st.error(result['error'])
    # Display transcript
    for message in result['transcript']:
        if message['type'] == 'user':
            st.chat_message("user").write(message['content'])
        else:
            st.chat_message("assistant").write(message['content'])
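Typing the key into the page on every rerun gets tedious during development. A minimal sketch of a fallback follows, assuming the key may also be stored in an `OPENAI_API_KEY` environment variable. The helper name `resolve_api_key` is hypothetical and not part of this package.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(typed_key: Optional[str],
                    env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the key typed by the user, else OPENAI_API_KEY from env, else None."""
    env = os.environ if env is None else env
    # Prefer what the user typed; strip stray whitespace left by copy-paste.
    candidate = (typed_key or "").strip() or env.get("OPENAI_API_KEY", "").strip()
    return candidate or None
```

In the Quick Start, wrapping the `st.text_input(...)` value in `resolve_api_key(...)` would leave the `if api_key:` guard unchanged, since the helper returns `None` when no key is available.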
Usage
Basic Parameters
- api_key (str, required): OpenAI API key for authentication
- voice (str, default="alloy"): Voice for TTS. Options: "alloy", "ash", "ballad", "cedar", "coral", "echo", "marin", "sage", "shimmer", "verse"
- instructions (str): System instructions for the AI
- temperature (float, default=0.8): AI response randomness (0.0-2.0)
- auto_start (bool, default=False): Whether to automatically start the conversation
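The documented voice options and temperature range can be checked before the values reach the component. The sketch below is illustrative only: `check_conversation_params` is a hypothetical helper, and whether the component performs its own validation is not stated here.

```python
# Voice names copied from the parameter list above.
VOICES = ("alloy", "ash", "ballad", "cedar", "coral",
          "echo", "marin", "sage", "shimmer", "verse")


def check_conversation_params(voice: str = "alloy", temperature: float = 0.8) -> list:
    """Return human-readable problems; an empty list means the values look valid."""
    problems = []
    if voice not in VOICES:
        problems.append(f"unknown voice {voice!r}; expected one of {', '.join(VOICES)}")
    # The documented temperature range is 0.0-2.0, inclusive at both ends here.
    if not 0.0 <= temperature <= 2.0:
        problems.append(f"temperature {temperature} is outside 0.0-2.0")
    return problems
```

Each returned string could be passed to `st.warning` so that a misconfigured app reports the problem instead of starting a session with unexpected settings.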
Advanced Example
import streamlit as st
from st_realtime_audio import realtime_audio_conversation
st.set_page_config(page_title="Advanced AI Assistant", layout="wide")
col1, col2 = st.columns([3, 1])
with col2:
    st.header("Settings")
    api_key = st.text_input("API Key", type="password")
    voice = st.selectbox("Voice", ["alloy", "ash", "ballad", "cedar", "coral", "echo", "marin", "sage", "shimmer", "verse"])
    temperature = st.slider("Temperature", 0.0, 2.0, 0.8)
    instructions = st.text_area(
        "Instructions",
        "You are a helpful AI assistant.",
        height=100
    )
with col1:
    st.header("Conversation")
    if api_key:
        conversation = realtime_audio_conversation(
            api_key=api_key,
            voice=voice,
            instructions=instructions,
            temperature=temperature,
            turn_detection_threshold=0.5,
            key="advanced_conversation"
        )
        # Handle the conversation result
        if conversation['error']:
            st.error(conversation['error'])
        # Display metrics
        col_a, col_b, col_c = st.columns(3)
        with col_a:
            st.metric("Status", conversation['status'])
        with col_b:
            st.metric("Messages", len(conversation['transcript']))
        with col_c:
            recording = "🔴" if conversation['is_recording'] else "⚪"
            st.metric("Recording", recording)
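The "Messages" metric above counts every transcript entry. If a per-speaker breakdown is wanted, one possible sketch is shown here. `count_by_speaker` is a hypothetical helper that relies only on the `type` field of each transcript entry.

```python
from collections import Counter
from typing import Dict, List


def count_by_speaker(transcript: List[Dict]) -> Dict[str, int]:
    """Count transcript entries per speaker, always reporting both roles."""
    counts = Counter(message["type"] for message in transcript)
    # Report zero rather than omitting a role that has not spoken yet.
    return {"user": counts.get("user", 0), "assistant": counts.get("assistant", 0)}
```

The two values could then feed separate `st.metric` calls in place of the single message count.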
Return Value
The component returns a dictionary with conversation state:
{
    "transcript": [
        {
            "id": "unique_message_id",
            "type": "user" | "assistant",
            "content": "Message content",
            "timestamp": 1640995200000,
            "status": "completed" | "in_progress"
        }
    ],
    "status": "idle" | "connecting" | "connected" | "recording" | "speaking" | "error",
    "error": "Error message" | None,
    "session_id": "unique_session_id" | None,
    "connection_state": "new" | "connecting" | "connected" | "disconnected" | "failed" | "closed",
    "is_recording": bool,
    "is_paused": bool
}
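To turn the transcript into a plain-text log, the entries can be rendered one per line. This is a sketch under one stated assumption: `timestamp` is taken to be milliseconds since the Unix epoch, which the example value 1640995200000 is consistent with but which is not confirmed here. `format_transcript` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Dict, List


def format_transcript(transcript: List[Dict]) -> str:
    """Render transcript entries as '[time] speaker: text' lines."""
    lines = []
    for message in transcript:
        # Assumption: "timestamp" is milliseconds since the Unix epoch.
        moment = datetime.fromtimestamp(message["timestamp"] / 1000, tz=timezone.utc)
        speaker = "You" if message["type"] == "user" else "AI"
        # Mark entries that are still being transcribed.
        suffix = " ..." if message.get("status") == "in_progress" else ""
        lines.append(f"[{moment:%Y-%m-%d %H:%M:%S} UTC] {speaker}: {message['content']}{suffix}")
    return "\n".join(lines)
```

The resulting string could be offered through `st.download_button` so users can save the conversation.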
Requirements
- Python 3.9+
- Streamlit >= 1.28.0
- OpenAI API key
- HTTPS connection (required for WebRTC)
- Microphone access
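The Python and Streamlit version requirements can be checked from code before the app starts. The following is a rough sketch: `parse_release` and `environment_problems` are hypothetical helpers, and the parser handles only plain numeric releases (a suffix such as `rc1` is ignored).

```python
import sys
from importlib import metadata
from typing import List, Optional, Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn '1.28.0' into (1, 28, 0); stop at the first non-numeric part, pad to three."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def environment_problems(streamlit_version: Optional[str] = None,
                         python_version: Tuple[int, int] = sys.version_info[:2]) -> List[str]:
    """List unmet requirements; an empty list means both version checks passed."""
    problems = []
    if python_version < (3, 9):
        problems.append("Python 3.9+ is required")
    if streamlit_version is None:
        # Look up the installed version only when the caller did not supply one.
        try:
            streamlit_version = metadata.version("streamlit")
        except metadata.PackageNotFoundError:
            return problems + ["Streamlit is not installed"]
    if parse_release(streamlit_version) < (1, 28, 0):
        problems.append(f"Streamlit {streamlit_version} is older than 1.28.0")
    return problems
```

HTTPS and microphone access depend on the browser and cannot be verified from Python this way.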
Browser Support
- Chrome, Firefox, Safari, Edge (latest versions)
- HTTPS required for WebRTC functionality
- Microphone permissions required
Troubleshooting
Common Issues
Microphone Access Denied
- Ensure you're using HTTPS
- Check browser permissions
- Try refreshing the page
Connection Failed
- Verify your OpenAI API key
- Check internet connection
- Ensure firewall isn't blocking WebRTC
No Audio Playback
- Check system audio settings
- Try different browsers
- Verify speakers/headphones are working
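Some of these checks can be suggested automatically from the component's return value. The sketch below is a hedged illustration: `troubleshooting_hints` is a hypothetical helper, and the wording of the component's error messages is not documented here, so matching on "microphone" or "permission" is an assumption.

```python
from typing import Dict, List


def troubleshooting_hints(result: Dict) -> List[str]:
    """Suggest checks from the list above based on the returned conversation state."""
    hints = []
    # Assumption: microphone failures mention "microphone" or "permission" in the error text.
    error_text = (result.get("error") or "").lower()
    if "microphone" in error_text or "permission" in error_text:
        hints.append("Microphone: use HTTPS, check browser permissions, then refresh the page.")
    if result.get("connection_state") == "failed":
        hints.append("Connection: verify the OpenAI API key, the network, and firewall rules for WebRTC.")
    # Fall back to a generic pointer when the state is an error but nothing matched.
    if result.get("status") == "error" and not hints:
        hints.append("Unrecognized error: see the raw message in result['error'].")
    return hints
```

Audio playback problems are not visible in the returned state, so they still need to be checked by hand.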
Development
To test changes locally instead of using the published PyPI version, clone the repo, set up a virtual environment, and use the rebuild script:
git clone https://github.com/bronsonhill/streamlit-realtime-audio.git
cd streamlit-realtime-audio
python3 -m venv venv
source venv/bin/activate
npm install --prefix st_realtime_audio/frontend
After making changes to the Python or frontend code, run:
./rebuild.sh
This rebuilds the frontend and reinstalls the package locally. Then run the example app with:
streamlit run st_realtime_audio/example.py
License
MIT License - see LICENSE file for details.
File details
Details for the file streamlit_realtime_audio-0.0.8.tar.gz.
File metadata
- Download URL: streamlit_realtime_audio-0.0.8.tar.gz
- Upload date:
- Size: 106.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c8dc7b5ed62a0e44d9de8af4ffafab3663d6b5d23f44000e0597e16e9ffbd541 |
| MD5 | ad31ebeee9dd1053cd781372987d2dd6 |
| BLAKE2b-256 | 7cdeed39b0f57c4316c2ec6be4d39c59af469cddf363098ac7dc8e1f7c269c4f |
File details
Details for the file streamlit_realtime_audio-0.0.8-py3-none-any.whl.
File metadata
- Download URL: streamlit_realtime_audio-0.0.8-py3-none-any.whl
- Upload date:
- Size: 106.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e5ffb9073c78a6cb5f8e366574feed4e139346f480898c9b43a305ed9cb7a128 |
| MD5 | d450a76a965682ee6edb3b544cefcf2f |
| BLAKE2b-256 | 2b7b60a8166a973c90eb6c42ec6ca39df712d3d7dc528a40aa6d086988d55d13 |
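The SHA256 digests listed above can be compared against a manually downloaded file. This is a minimal sketch using only the standard library. The function names are hypothetical, and the file path passed in is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Union[str, Path], expected_hex: str) -> bool:
    """Return True when the file's SHA256 equals the published digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("streamlit_realtime_audio-0.0.8.tar.gz", "c8dc7b5e...")` with the full digest from the table would confirm that the source distribution arrived intact.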