Real-time full-duplex voice assistant (FastAPI backend)
Real-Time Full-Duplex Voice Assistant
Low-latency, interruptible, full-duplex (talk & listen at the same time) voice assistant with a web UI, streaming ASR, TTS, and LLM orchestration. Built for real conversations, barge-in, and hands-free control.
✨ Features
- Full-duplex audio: talk and listen simultaneously (barge-in / interruption supported).
- Streaming ASR: incremental transcripts while you speak.
- Streaming TTS: assistant responds with audio before text finishes.
- LLM orchestration: tool use/function calls and stateful dialog.
- Web UI: mic capture, waveforms, and live captions in-browser.
- Production-ready stack: Traefik reverse proxy + auto TLS, Nginx static hosting, FastAPI backend.
- Single-command startup: deploy with `docker compose up -d`.
🧭 Architecture
Application Flow
Browser (Web UI)
├─ Mic capture (WebAudio) → WebSocket → Assistant (FastAPI)
│                                            │
│                                   partial transcripts
│                                            ▼
├─ Live captions ← ASR (streaming via Assistant)
│                                            │
│                                            ▼
├─ TTS audio playback ← TTS (streaming chunks)
│                                            ▲
│                                            │
└─ Controls/Events → LLM Orchestrator
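The barge-in behaviour in this flow (TTS playback stops as soon as the user starts speaking) can be sketched with plain `asyncio`. This is a minimal illustration, not the project's actual code: `play_tts`, `converse`, and the timer that stands in for a speech-detection event are all hypothetical names and assumptions.

```python
import asyncio


async def play_tts(chunks, played):
    """Pretend to play TTS audio chunk by chunk (stand-in for real playback)."""
    for chunk in chunks:
        await asyncio.sleep(0.01)  # simulated time needed to play one chunk
        played.append(chunk)


async def converse(chunks, interrupt_after):
    """Play TTS, but stop as soon as the (simulated) user starts speaking."""
    played = []
    playback = asyncio.create_task(play_tts(chunks, played))
    # Assumption: a timer stands in for a "user started speaking" event.
    user_speech = asyncio.create_task(asyncio.sleep(interrupt_after))

    done, _ = await asyncio.wait(
        {playback, user_speech}, return_when=asyncio.FIRST_COMPLETED
    )
    if user_speech in done and not playback.done():
        playback.cancel()  # barge-in: drop the audio that was not played yet
        try:
            await playback
        except asyncio.CancelledError:
            pass
    else:
        user_speech.cancel()  # playback finished without interruption
    return played
```

Calling `asyncio.run(converse(list(range(1000)), 0.05))` returns only the chunks played before the interruption. Cancelling the playback task, rather than letting it drain, is what keeps the assistant from talking over the user.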
Docker Stack & Routing
       ┌───────────────────────────┐
       │         Internet          │
       └─────────────┬─────────────┘
                     │ :80 / :443
                     ▼
            ┌─────────────────┐
            │     Traefik     │
            │ (Reverse Proxy) │
            └────────┬────────┘
      ┌──────────────┼─────────────┐
      │              │             │
┌─────▼──────┐ ┌─────▼─────┐ ┌─────▼─────┐
│ /          │ │ /api      │ │ /ws       │
│ Web UI     │ │ Assistant │ │ Assistant │
│ (Nginx)    │ │ (FastAPI) │ │ (FastAPI) │
└────────────┘ └───────────┘ └───────────┘
Services in this repo
- traefik: reverse proxy, automatic HTTPS via Let's Encrypt.
- web: static frontend (served by Nginx).
- assistant: FastAPI backend (ASR, TTS, LLM orchestration, WebSockets).
- init_letsencrypt: bootstrap storage for ACME certificates.
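To make the path-based routing concrete, here is a small sketch of longest-prefix matching over the three prefixes described above (`/`, `/api`, `/ws`). It only illustrates the idea; the real routing is done by Traefik rules in the Compose stack, and the `ROUTES` table and `route` function are hypothetical.

```python
# Path prefixes and the service that handles them, as described in this README.
ROUTES = {
    "/api": "assistant",
    "/ws": "assistant",
    "/": "web",
}


def route(path: str) -> str:
    """Return the service whose prefix is the longest match for `path`."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    matches = [
        prefix
        for prefix in ROUTES
        # Match the prefix exactly or as a whole path segment ("/api/...").
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    ]
    return ROUTES[max(matches, key=len)]
```

For example, `route("/ws/asr")` resolves to `assistant`, while `route("/index.html")` falls through to `web`.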
Quick Start
1. Prerequisites
- Docker & Docker Compose
- A domain pointing to your server (`com-cloud.cloud` in this deployment), with DNS A/AAAA records configured
- API keys for ASR, TTS, and LLM providers
2. Configure Environment
Create `src/assistant/.env` with your secrets:
# LLM / Orchestrator
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
# ASR
ASR_PROVIDER=openai_realtime
ASR_API_KEY=...
# TTS
TTS_PROVIDER=openai_realtime
TTS_API_KEY=...
# CORS / ORIGINS
ALLOWED_ORIGINS=https://com-cloud.cloud
# Optional
LOG_LEVEL=info
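As an illustration of how the backend might consume these variables, the sketch below validates them at startup and fails fast when one is missing. It is an assumption, not the project's actual loader: the `Settings` class and `load_settings` function are hypothetical, `ALLOWED_ORIGINS` is assumed to be a comma-separated list, and the variables are assumed to be already present in the process environment (the sketch does not parse the `.env` file itself).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple

# Variables from the example .env above that the sketch treats as mandatory.
REQUIRED = (
    "LLM_PROVIDER", "OPENAI_API_KEY",
    "ASR_PROVIDER", "ASR_API_KEY",
    "TTS_PROVIDER", "TTS_API_KEY",
    "ALLOWED_ORIGINS",
)


@dataclass(frozen=True)
class Settings:
    llm_provider: str
    openai_api_key: str
    asr_provider: str
    asr_api_key: str
    tts_provider: str
    tts_api_key: str
    allowed_origins: Tuple[str, ...]
    log_level: str = "info"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on gaps."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Assumption: several origins may be given as a comma-separated list.
    origins = tuple(o.strip() for o in env["ALLOWED_ORIGINS"].split(",") if o.strip())
    return Settings(
        llm_provider=env["LLM_PROVIDER"],
        openai_api_key=env["OPENAI_API_KEY"],
        asr_provider=env["ASR_PROVIDER"],
        asr_api_key=env["ASR_API_KEY"],
        tts_provider=env["TTS_PROVIDER"],
        tts_api_key=env["TTS_API_KEY"],
        allowed_origins=origins,
        log_level=env.get("LOG_LEVEL", "info"),
    )
```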
3. 🖥️ Local Development
Run backend directly:
cd src/assistant
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
uvicorn assistant.app:app --reload --host 0.0.0.0 --port 8000
Run the frontend:
cd web
npm install
npm run dev
Using the Assistant
1. Open https://com-cloud.cloud
2. Click the orb to connect and establish a WebSocket session.
3. Speak naturally; you can interrupt the assistant mid-sentence.
4. Watch the live captions and hear real-time TTS playback.

Don't forget to close the tab when you are finished!
⚙️ Configuration
Key options:
ASR: model, language hints, VAD sensitivity.
TTS: voice, speed, sample rate.
LLM: model, temperature, tool schemas.
Traefik: TLS challenge type, timeouts, rate limits.
API
- `GET /healthz`: service health
- `WS /ws/asr`: audio in → transcript out
- `WS /ws/assistant`: dialog orchestration (events + responses)
- `WS /ws/tts`: text in → audio out
- `POST /api/tools/<name>`: trigger server-side tool functions
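The `POST /api/tools/<name>` endpoint implies some form of name-based dispatch on the server. One way this could work is a small registry keyed by tool name, sketched below. Everything here is hypothetical (the `tool` decorator, the `echo` tool, `call_tool`, and the status/result payload shape); the project's real handler and response format may differ.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(name: str):
    """Decorator that registers a function under the given tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("echo")
def echo(text: str) -> dict:
    """Example tool: return the text it was given."""
    return {"text": text}


def call_tool(name: str, payload: dict) -> dict:
    """Dispatch a request body to the named tool, HTTP-status style."""
    fn = TOOLS.get(name)
    if fn is None:
        return {"status": 404, "error": f"unknown tool: {name}"}
    try:
        return {"status": 200, "result": fn(**payload)}
    except TypeError as exc:
        # Wrong or missing arguments; note this also catches TypeErrors
        # raised inside the tool itself, which a real server may want to
        # report differently.
        return {"status": 422, "error": str(exc)}
```

A request to `/api/tools/echo` with body `{"text": "hi"}` would then map to `call_tool("echo", {"text": "hi"})`.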
Security
HTTPS enforced (TLS via Let's Encrypt + Traefik).
Strict CORS (limited to https://com-cloud.cloud).
API rate limiting enabled (/api).
Secrets kept in .env (not in frontend).
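For readers unfamiliar with rate limiting, the general idea can be shown with a token bucket: each request spends a token, and tokens refill at a fixed rate up to a burst size. In this stack the limiting is presumably configured in the proxy rather than in Python, so the `TokenBucket` class below is purely an illustrative assumption, with a clock that can be injected for testing.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = float(burst)  # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then try to spend one token."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `rate=1` and `burst=2`, two back-to-back requests pass, a third is rejected, and one more is allowed after a second has elapsed.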
📦 Deployment Notes
Reverse proxy: Traefik v3 with ACME TLS challenge.
Certificates stored in ./letsencrypt/acme.json.
Static frontend served by Nginx (web service).
Backend served via assistant (FastAPI) behind Traefik.
Scale with Docker Swarm / k8s if needed.
🗺️ Roadmap
Wake-word hotword detection
Speaker diarization
Plug-and-play tool registry
Persistent transcripts
Multi-voice TTS
🤝 Contributing
Fork this repo
Create a feature branch
Submit a PR, with screenshots or logs if the UI or backend is affected