Real-time voice call interface for AI assistants

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: MIT License

Olaf Voice 🎙️

Real-time voice call interface for AI assistants

Have natural voice conversations with your AI assistant through a sleek, browser-based interface. Think "phone call with your AI" — speak naturally, get instant voice responses, and enjoy ambient background music while you chat.

✨ Features

🎤 Real-time voice interaction — speak and hear responses instantly
🎨 Clean, dark UI — minimal, elegant interface designed for voice-first interaction
🎵 Ambient background music — lo-fi tones that auto-duck when Olaf speaks
🎙️ Dual input modes — Voice Activity Detection (VAD) or Push-to-Talk (PTT)
🏃 Running mode — simplified UI with larger controls for hands-free use
📊 Live waveform visualizer — see who's speaking in real-time
🔌 OpenAI-compatible APIs — works with OpenAI, OpenClaw proxy, or any compatible endpoint
⚙️ Fully configurable — customize voices, models, music, and behavior

📸 Screenshots

Main Interface

Clean, centered call button with waveform visualizer and status display.

Running Mode

Simplified interface with larger controls, perfect for hands-free use while exercising or driving.

🚀 Quick Start

Installation

# Install from PyPI
pip install olaf-voice

# Or with uv (recommended)
uv pip install olaf-voice

Basic Usage

# Set your OpenAI API key
export OLAF_VOICE_OPENAI_API_KEY="your-api-key"

# Start the server
olaf-voice

# Open browser to http://localhost:8765

That's it! Click the green call button and start talking.

🛠️ Configuration

Environment Variables

# Server settings
export OLAF_VOICE_HOST="0.0.0.0"
export OLAF_VOICE_PORT="8765"

# OpenAI API
export OLAF_VOICE_OPENAI_API_KEY="sk-..."
export OLAF_VOICE_OPENAI_BASE_URL="https://api.openai.com/v1"  # optional

# Models
export OLAF_VOICE_WHISPER_MODEL="whisper-1"
export OLAF_VOICE_TTS_MODEL="tts-1"
export OLAF_VOICE_TTS_VOICE="alloy"  # alloy, echo, fable, onyx, nova, shimmer
export OLAF_VOICE_AI_MODEL="gpt-4o"

# Audio preferences
export OLAF_VOICE_VAD_ENABLED="true"
export OLAF_VOICE_MUSIC_VOLUME="0.3"
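
As a rough illustration of how the variables above could be turned into typed settings, here is a minimal sketch. It is not the package's actual config.py; the function name `load_env_config`, the converter table, and the accepted spellings of "true" are assumptions made for this example.

```python
import os
from typing import Mapping

PREFIX = "OLAF_VOICE_"


def _to_bool(value: str) -> bool:
    # Assumed truthy spellings; the real package may accept a different set.
    return value.strip().lower() in {"1", "true", "yes", "on"}


# Variable suffix -> converter, covering only the variables listed above.
CONVERTERS = {
    "HOST": str,
    "PORT": int,
    "OPENAI_API_KEY": str,
    "OPENAI_BASE_URL": str,
    "WHISPER_MODEL": str,
    "TTS_MODEL": str,
    "TTS_VOICE": str,
    "AI_MODEL": str,
    "VAD_ENABLED": _to_bool,
    "MUSIC_VOLUME": float,
}


def load_env_config(environ: Mapping[str, str] = os.environ) -> dict:
    """Collect OLAF_VOICE_* variables into a dict keyed by lowercase setting name."""
    config = {}
    for suffix, convert in CONVERTERS.items():
        raw = environ.get(PREFIX + suffix)
        if raw is not None:
            config[suffix.lower()] = convert(raw)
    return config
```

Unset variables are simply left out of the result, so a caller can layer the dict over its own defaults.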

YAML Configuration

Generate an example config file:

olaf-voice --generate-config config.yaml

Edit config.yaml:

# Server
host: 0.0.0.0
port: 8765

# API Keys
openai_api_key: sk-...
openai_base_url: null  # Use default OpenAI, or set custom endpoint

# STT (Speech-to-Text)
whisper_model: whisper-1
whisper_language: null  # Auto-detect, or set 'en', 'es', etc.

# TTS (Text-to-Speech)
tts_model: tts-1
tts_voice: alloy
tts_speed: 1.0

# AI Model
ai_model: gpt-4o
ai_system_prompt: "You are Olaf, a helpful AI assistant..."
ai_max_tokens: 500
ai_temperature: 0.7

# Audio
vad_enabled: true
vad_sensitivity: 0.5
music_volume: 0.3
music_duck_volume: 0.1

Run with config:

olaf-voice --config config.yaml
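
Before starting the server it can help to sanity-check a few of the values from config.yaml. The sketch below is hypothetical: the `AudioSettings` dataclass, the `validate` helper, and the 0.0-1.0 range for the volume and sensitivity fields are assumptions for illustration, not documented behavior of olaf-voice. The voice names are the ones listed for OLAF_VOICE_TTS_VOICE.

```python
from dataclasses import dataclass

# Voice names taken from the OLAF_VOICE_TTS_VOICE comment in this README.
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}


@dataclass
class AudioSettings:
    # Defaults mirror the example config.yaml.
    port: int = 8765
    tts_voice: str = "alloy"
    vad_sensitivity: float = 0.5
    music_volume: float = 0.3
    music_duck_volume: float = 0.1


def validate(settings: AudioSettings) -> list[str]:
    """Return human-readable problems; an empty list means nothing looked wrong."""
    problems = []
    if not 1 <= settings.port <= 65535:
        problems.append(f"port {settings.port} is outside 1-65535")
    if settings.tts_voice not in VOICES:
        problems.append(f"unknown tts_voice {settings.tts_voice!r}")
    for name in ("vad_sensitivity", "music_volume", "music_duck_volume"):
        value = getattr(settings, name)
        if not 0.0 <= value <= 1.0:  # assumed range
            problems.append(f"{name}={value} is outside the assumed 0.0-1.0 range")
    if settings.music_duck_volume > settings.music_volume:
        problems.append("music_duck_volume exceeds music_volume, so ducking would get louder")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass.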

Command-Line Options

olaf-voice --help

# Start with custom host/port
olaf-voice --host 127.0.0.1 --port 9000

# Enable verbose logging
olaf-voice --verbose

# Use config file
olaf-voice --config /path/to/config.yaml
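
For readers who want to see how such a command line is typically wired up, here is a small argparse sketch covering only the flags shown in this README. It is an illustration, not the contents of the package's `__main__.py`; the help strings and the use of `None` defaults are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for the flags mentioned in this README (a sketch, not the real CLI)."""
    parser = argparse.ArgumentParser(prog="olaf-voice")
    # None defaults let a caller tell "flag not given" apart from an explicit value.
    parser.add_argument("--host", default=None, help="interface to bind")
    parser.add_argument("--port", type=int, default=None, help="TCP port to listen on")
    parser.add_argument("--config", metavar="PATH", help="YAML config file to load")
    parser.add_argument("--generate-config", metavar="PATH",
                        help="write an example config file to PATH")
    parser.add_argument("--verbose", action="store_true", help="enable verbose logging")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```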

🎯 Usage Guide

Starting a Call

Open http://localhost:8765 in your browser
Click the green Call button
Grant microphone permissions when prompted
Start talking!

Controls

Mute 🎤 — Temporarily mute your microphone
Music 🎵 — Toggle background ambient music
VAD/PTT 🎙️ — Switch between Voice Activity Detection and Push-to-Talk
- VAD mode (default): Automatically detects when you're speaking
- PTT mode: Hold SPACE to talk, release to send
Running 🏃 — Toggle simplified UI for hands-free use
Volume slider — Adjust output volume
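
The README does not say how VAD mode decides that you are speaking. One common approach is an energy threshold with a short "hangover" so brief pauses do not end the utterance; the sketch below shows that technique under stated assumptions. The function names, the mapping from `sensitivity` to an RMS threshold, and the threshold bounds are all hypothetical and are not taken from olaf-voice.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of samples in [-1.0, 1.0]."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def threshold_for(sensitivity: float, quiet: float = 0.01, loud: float = 0.2) -> float:
    """Map sensitivity in [0, 1] to an RMS threshold (higher sensitivity, lower threshold)."""
    return loud - sensitivity * (loud - quiet)


def detect_utterances(frames: Iterable[Sequence[float]],
                      sensitivity: float = 0.5,
                      hangover: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, for each detected utterance.

    An utterance ends once `hangover` consecutive frames fall below the threshold.
    """
    threshold = threshold_for(sensitivity)
    utterances = []
    start = None
    silent = 0
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            if start is None:
                start = i
            silent = 0
        elif start is not None:
            silent += 1
            if silent >= hangover:
                utterances.append((start, i - silent + 1))
                start, silent = None, 0
    if start is not None:
        # Stream ended mid-utterance: close it at the last voiced frame.
        utterances.append((start, i + 1 - silent))
    return utterances
```

In PTT mode none of this is needed, since holding and releasing SPACE marks the start and end of the utterance explicitly.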

Keyboard Shortcuts

SPACE — Push-to-talk (when in PTT mode)
All other features are toggled by clicking their controls.

Running Mode

Perfect for when you're exercising, driving, or want a simplified experience:

Click the Running 🏃 button
UI simplifies to just:
- Large call button
- Status display
- Transcript (shows conversation)

Press Running again to return to full interface.

🏗️ Architecture

┌─────────────┐      WebSocket      ┌──────────────┐
│   Browser   │ ◄─────────────────► │  FastAPI     │
│             │                      │  Server      │
│ - Mic input │   Audio (base64)    │              │
│ - Speaker   │ ──────────────────► │ - Whisper    │
│ - Visualizer│                      │ - AI Chat    │
│ - Controls  │ ◄────────────────── │ - TTS        │
└─────────────┘   Audio + Text      └──────────────┘

Flow

Capture: Browser captures microphone audio (WebM format)
Send: Audio sent via WebSocket as base64
Transcribe: Backend uses OpenAI Whisper API to convert speech to text
Think: Transcribed text is sent to the configured AI model (ai_model, e.g. gpt-4o) for a response
Synthesize: AI response converted to speech via TTS API
Play: Audio streamed back to browser and played
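
The flow above can be sketched as a single function that handles one conversational turn. This is a minimal sketch, not the server's actual code: the JSON message shape (`audio`, `transcript`, `text` keys) and the name `handle_turn` are assumptions, and the three stages are passed in as plain callables so the example runs without network access or an API key.

```python
import base64
import json
from typing import Callable


def handle_turn(message: str,
                transcribe: Callable[[bytes], str],
                think: Callable[[str], str],
                synthesize: Callable[[str], bytes]) -> str:
    """Process one incoming WebSocket text message and build the reply message."""
    payload = json.loads(message)
    audio = base64.b64decode(payload["audio"])   # Send: undo the base64 transport encoding
    user_text = transcribe(audio)                # Transcribe: speech to text
    reply_text = think(user_text)                # Think: text to the AI model
    reply_audio = synthesize(reply_text)         # Synthesize: reply text to speech
    # Play: the browser decodes this audio and plays it; text is included for the transcript.
    return json.dumps({
        "transcript": user_text,
        "text": reply_text,
        "audio": base64.b64encode(reply_audio).decode("ascii"),
    })
```

In a real deployment the three callables would wrap the Whisper, chat, and TTS API calls; injecting them keeps the turn logic easy to test with stubs.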

Background Music

Ambient lo-fi music is generated client-side using Web Audio API oscillators and filters — no audio files needed! The music automatically "ducks" (reduces volume) when Olaf is speaking, then returns to normal.
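
The ducking behavior comes down to picking a target gain and ramping toward it so the change is not abrupt. The project does this in the browser; the Python below only illustrates the arithmetic. The function names and the linear ramp are assumptions, and the default values are the music_volume and music_duck_volume from the example config.

```python
def target_gain(speaking: bool, music_volume: float = 0.3, duck_volume: float = 0.1) -> float:
    """Gain the music should settle at: ducked while Olaf speaks, normal otherwise."""
    return duck_volume if speaking else music_volume


def ramp(start: float, target: float, steps: int) -> list[float]:
    """Linear gain ramp from just after `start` up to and including `target`."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [start + (target - start) * k / steps for k in range(1, steps + 1)]


if __name__ == "__main__":
    # Olaf starts speaking: fade the music from normal to ducked in four steps.
    print(ramp(target_gain(False), target_gain(True), 4))
```

In the Web Audio API the same effect is usually achieved by scheduling a ramp on a gain node rather than computing each step by hand.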

🧪 Development

Setup

# Clone repository
git clone https://github.com/Olafs-World/olaf-voice.git
cd olaf-voice

# Install with dev dependencies
uv pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=olaf_voice --cov-report=html

Project Structure

olaf-voice/
├── src/olaf_voice/
│   ├── __init__.py          # Package exports
│   ├── __main__.py          # CLI entry point
│   ├── config.py            # Configuration management
│   ├── server.py            # FastAPI server + WebSocket
│   ├── transcribe.py        # Whisper STT integration
│   ├── tts.py              # OpenAI TTS integration
│   ├── ai.py               # AI chat integration
│   └── static/
│       ├── index.html       # Main UI
│       ├── style.css        # Styling
│       └── app.js          # Client-side logic
├── tests/                   # Pytest tests
├── pyproject.toml          # Package metadata
└── README.md

Running Tests

# Run all tests
pytest

# Run specific test file
pytest tests/test_config.py

# Run with coverage
pytest --cov=olaf_voice

# Verbose output
pytest -v

Code Quality

# Format code
black src/ tests/

# Lint
ruff check src/ tests/

# Type check
mypy src/

🐛 Troubleshooting

Microphone not working

Check browser permissions — look for 🎤 icon in address bar
Serve the page over HTTPS or open it via localhost (browsers only expose the microphone in a secure context)
Try a different browser (Chrome/Edge recommended)
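
The secure-context requirement matters most when the server is bound to 0.0.0.0 and opened from another device over plain HTTP. The helper below is a rough approximation of the browser rule, useful for a quick check of the URL you are about to open; the function name is hypothetical and real browsers apply additional conditions.

```python
import ipaddress
from urllib.parse import urlsplit


def mic_likely_allowed(url: str) -> bool:
    """Approximate whether a browser would treat `url` as a secure context."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return True
    host = parts.hostname or ""
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        # Loopback addresses such as 127.0.0.1 and ::1 are also treated as trustworthy.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a non-loopback hostname over plain HTTP
```

For example, `mic_likely_allowed("http://192.168.1.20:8765")` is false, which matches the symptom of the microphone prompt never appearing on a phone or second computer.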

WebSocket connection fails

Check firewall settings
Verify port 8765 (or custom port) is not in use
Check browser console for errors
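
To check whether the port is already taken, one option is to try binding to it from Python. This is a generic standard-library sketch, not a feature of olaf-voice; `port_is_free` is a name chosen for this example.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
        return True


if __name__ == "__main__":
    print("8765 free:", port_is_free(8765))
```

If the port is taken, start the server on another one with `olaf-voice --port 9000` or a similar value.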

No audio playback

Check volume slider in the app
Verify system volume is not muted
Check browser's autoplay policies (click UI first to enable audio)

API errors

Verify OLAF_VOICE_OPENAI_API_KEY is set correctly
Check API quota and billing status
Enable verbose logging: olaf-voice --verbose

High latency

Use tts-1 model (faster than tts-1-hd)
Reduce ai_max_tokens for shorter responses
Use a faster AI model (e.g., gpt-4o-mini)

🤝 Contributing

Contributions welcome! Please:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit changes (git commit -m 'Add amazing feature')
Push to branch (git push origin feature/amazing-feature)
Open a Pull Request

📝 License

MIT License - see LICENSE for details.

🙏 Credits

Built with:

FastAPI — Modern web framework
OpenAI APIs — Whisper, TTS, GPT
Web Audio API — Browser audio processing

📬 Support

Issues: GitHub Issues
Discussions: GitHub Discussions

Made with ❤️ for seamless AI conversations

Release history

This version

0.1.0

Feb 12, 2026

Download files

Source Distribution

olaf_voice-0.1.0.tar.gz (22.1 kB)

Uploaded Feb 12, 2026 Source

Built Distribution

olaf_voice-0.1.0-py3-none-any.whl (21.9 kB)

Uploaded Feb 12, 2026 Python 3

File details

Details for the file olaf_voice-0.1.0.tar.gz.

File metadata

Download URL: olaf_voice-0.1.0.tar.gz
Upload date: Feb 12, 2026
Size: 22.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for olaf_voice-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`241e3ddb522b69c27d1abbbf8aee7f882f8769ac3c2e4e77008b8275b96ac53b`
MD5	`64eddf89f92d650fc6d82a6681b54e68`
BLAKE2b-256	`5252068a1ca5a7e5397f605ef55cf333c042a13e5fa2983ad15a220e2d08cc9c`
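
A downloaded file can be checked against the SHA256 digest listed above with a few lines of standard-library Python. This is a generic sketch; the helper names are chosen for this example, and the file path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hex SHA256 digest of a file, read in chunks so large files stay cheap."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, with the digest copied from the table above:
# matches("olaf_voice-0.1.0.tar.gz",
#         "241e3ddb522b69c27d1abbbf8aee7f882f8769ac3c2e4e77008b8275b96ac53b")
```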

File details

Details for the file olaf_voice-0.1.0-py3-none-any.whl.

File metadata

Download URL: olaf_voice-0.1.0-py3-none-any.whl
Upload date: Feb 12, 2026
Size: 21.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for olaf_voice-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`72b7e78a16b7a37370e1e766e6c8e5c6ce434ed4947aa3b152c02d24047854f7`
MD5	`419e8a85ecac3f25b789aa71e8fd6278`
BLAKE2b-256	`c347f0012b22b8882fee1f9053d7a6896955f6eeaa16502258ce0400ac192685`
