Real-time full-duplex voice assistant (FastAPI backend)

Real-Time Full-Duplex Voice Assistant

🎙️ Live Demo: com-cloud.cloud

Low-latency, interruptible, full-duplex (talk & listen at the same time) voice assistant with a web UI, streaming ASR, TTS, and LLM orchestration. Built for real conversations, barge-in, and hands-free control.


✨ Features

  • Full-duplex audio: talk and listen simultaneously (barge-in / interruption supported).
  • Streaming ASR: incremental transcripts while you speak.
  • Streaming TTS: assistant responds with audio before text finishes.
  • LLM orchestration: tool use/function calls and stateful dialog.
  • Web UI: mic capture, waveforms, and live captions in-browser.
  • Production-ready stack: Traefik reverse proxy + auto TLS, Nginx static hosting, FastAPI backend.
  • Single-command deployment: `docker compose up -d`.

🧭 Architecture

Application Flow
Browser (Web UI)
├─ Mic capture (WebAudio) → WebSocket → Assistant (FastAPI)
│                                        │
│                               partial transcripts
│                                        ▼
├─ Live captions                      ← ASR (streaming via Assistant)
│                                        │
│                                        ▼
├─ TTS audio playback                 ← TTS (streaming chunks)
│                                        ▲
│                                        │
└─ Controls/Events                    → LLM Orchestrator
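
The flow above implies that TTS playback has to stop as soon as the user starts speaking (barge-in). The following is a minimal sketch of that idea using only `asyncio`, not the project's actual implementation. The names `speak` and `run_turn`, the chunk values, and the simulated timings are all hypothetical.

```python
import asyncio


async def speak(chunks, played):
    """Hypothetical stand-in for streaming TTS playback: 'plays' one chunk at a time."""
    for chunk in chunks:
        played.append(chunk)
        await asyncio.sleep(0.01)  # simulate the time taken to play one chunk


async def run_turn(chunks, interrupt_after=None):
    """Play chunks and cancel playback if the user interrupts.

    interrupt_after: seconds until simulated user speech, or None for no interruption.
    Returns (played_chunks, was_interrupted).
    """
    played = []
    playback = asyncio.create_task(speak(chunks, played))
    if interrupt_after is None:
        await playback
        return played, False
    # Wait until playback finishes or the simulated speech onset, whichever is first.
    done, _ = await asyncio.wait({playback}, timeout=interrupt_after)
    if playback in done:
        return played, False
    playback.cancel()  # barge-in: stop speaking immediately
    try:
        await playback
    except asyncio.CancelledError:
        pass
    return played, True
```

In a real session the timeout would be replaced by a voice-activity event from the ASR stream; cancelling the playback task is what keeps the assistant from talking over the user.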

🐋 Docker Stack & Routing

           ┌────────────────────────────┐
           │        Internet            │
           └────────────┬───────────────┘
                        │  :80 / :443
                        ▼
               ┌─────────────────┐
               │     Traefik     │
               │ (Reverse Proxy) │
               └───────┬─────────┘
         ┌─────────────┼─────────────┐
         │             │             │
┌────────▼───┐   ┌─────▼─────┐   ┌───▼───────┐
│   /        │   │   /api    │   │   /ws     │
│   Web UI   │   │ Assistant │   │ Assistant │
│ (Nginx)    │   │ (FastAPI) │   │ (FastAPI) │
└────────────┘   └───────────┘   └───────────┘
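
The routing in the diagram can be modelled as longest-prefix matching on the request path. This is a simplified sketch for illustration only: the `ROUTES` table and `route` function are hypothetical, and Traefik's own rule matching and priority semantics may differ in detail.

```python
# Hypothetical routing table mirroring the diagram: path prefix -> service name.
ROUTES = {
    "/api": "assistant",
    "/ws": "assistant",
    "/": "web",
}


def route(path: str) -> str:
    """Return the service for a request path using longest-prefix matching.

    Assumes `path` starts with "/". A prefix matches only on a whole path
    segment, so "/apix" falls through to the web UI.
    """
    best = ""
    for prefix in ROUTES:
        matches = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if matches and len(prefix) > len(best):
            best = prefix
    return ROUTES[best]
```

For example, `route("/ws/asr")` and `route("/api/tools/x")` both resolve to `assistant`, while `route("/index.html")` resolves to `web`.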

Services in this repo

  • traefik: reverse proxy, automatic HTTPS via Let's Encrypt.
  • web: static frontend (served by Nginx).
  • assistant: FastAPI backend (ASR, TTS, LLM orchestration, WebSockets).
  • init_letsencrypt: bootstrap storage for ACME certificates.

🚀 Quick Start

1. Prerequisites
  • Docker & Docker Compose
  • Domain pointing to your server: com-cloud.cloud
  • DNS A/AAAA records configured
  • API keys for ASR, TTS, and LLM providers
2. Configure Environment
Create `src/assistant/.env` with your secrets:

# LLM / Orchestrator
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...

# ASR
ASR_PROVIDER=openai_realtime
ASR_API_KEY=...

# TTS
TTS_PROVIDER=openai_realtime
TTS_API_KEY=...

# CORS / ORIGINS
ALLOWED_ORIGINS=https://com-cloud.cloud

# Optional
LOG_LEVEL=info
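
One way the backend could read these variables at startup is sketched below. It is an assumption-laden example, not the project's code: `Settings`, `load_settings`, and `REQUIRED_KEYS` are hypothetical names, the defaults mirror the sample values above, and the variables are assumed to be already exported into the process environment (for instance by Docker Compose) rather than parsed from the `.env` file here.

```python
import os
from dataclasses import dataclass

# Assumption: these three keys are always required; the real backend may
# only require the keys for the providers that are actually selected.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ASR_API_KEY", "TTS_API_KEY")


@dataclass(frozen=True)
class Settings:
    llm_provider: str
    asr_provider: str
    tts_provider: str
    allowed_origins: tuple
    log_level: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of variables (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    # Assumption: multiple origins are separated by commas.
    origins = tuple(
        o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()
    )
    return Settings(
        llm_provider=env.get("LLM_PROVIDER", "openai"),
        asr_provider=env.get("ASR_PROVIDER", "openai_realtime"),
        tts_provider=env.get("TTS_PROVIDER", "openai_realtime"),
        allowed_origins=origins,
        log_level=env.get("LOG_LEVEL", "info"),
    )
```

Failing fast on missing keys surfaces configuration mistakes at startup instead of on the first ASR or TTS request.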
3. 🖥️ Local Development
Run backend directly:
cd src/assistant
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
uvicorn assistant.app:app --reload --host 0.0.0.0 --port 8000
Run frontend:
cd web
npm install
npm run dev

🎙️ Using the Assistant

Open https://com-cloud.cloud

Click the orb to connect and establish a WebSocket session.

Speak naturally; you can interrupt the assistant mid-sentence.

Watch the live captions and hear real-time TTS playback.

Don't forget to close the tab when you are done!

⚙️ Configuration

Key options:

ASR: model, language hints, VAD sensitivity.

TTS: voice, speed, sample rate.

LLM: model, temperature, tool schemas.

Traefik: TLS challenge type, timeouts, rate limits.

🔌 API

GET /healthz – service health

WS /ws/asr – audio in ↔ transcript out

WS /ws/assistant – dialog orchestration (events + responses)

WS /ws/tts – text in ↔ audio out

POST /api/tools/<name> – trigger server-side tool functions
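
As a small usage example for the endpoints listed above, the sketch below polls `GET /healthz` with the standard library. The function name `is_healthy` and its `opener` parameter are hypothetical, and it assumes a healthy service answers with HTTP 200; the response body format is not documented here, so it is ignored.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 3.0, opener=urllib.request.urlopen) -> bool:
    """Return True if GET {base_url}/healthz answers with HTTP 200.

    `opener` is injectable so the check can be exercised without a network.
    """
    url = base_url.rstrip("/") + "/healthz"
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection failures and HTTP error statuses both count as unhealthy.
        return False
```

Calling `is_healthy("https://com-cloud.cloud")` would check the deployed instance; against a local backend started as described earlier, the base URL would be `http://localhost:8000`.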

🔐 Security

HTTPS enforced (TLS via Let's Encrypt + Traefik).

Strict CORS (limited to https://com-cloud.cloud).

API rate limiting enabled (/api).

Secrets kept in .env (not in frontend).
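
To illustrate what rate limiting on `/api` means in practice, here is a minimal token-bucket sketch. It is a generic illustration under stated assumptions, not this project's configuration: in this stack the limiting is presumably applied at the proxy, and `TokenBucket`, `rate`, and `burst` are hypothetical names chosen for the example.

```python
import time


class TokenBucket:
    """Allow up to `burst` requests at once, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)  # start with a full bucket
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A caller would keep one bucket per client (for example per source IP) and reject the request, typically with HTTP 429, whenever `allow()` returns `False`.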

📦 Deployment Notes

Reverse proxy: Traefik v3 with ACME TLS challenge.

Certificates stored in ./letsencrypt/acme.json.

Static frontend served by Nginx (web service).

Backend served via assistant (FastAPI) behind Traefik.

Scale with Docker Swarm / k8s if needed.

🗺️ Roadmap

 Wake-word hotword detection

 Speaker diarization

 Plug-and-play tool registry

 Persistent transcripts

 Multi-voice TTS

🤝 Contributing

Fork this repo

Create a feature branch

Submit a PR, with screenshots or logs if the UI or backend is affected
