Real-time full-duplex voice assistant (FastAPI backend)

Real-Time Full-Duplex Voice Assistant

Low-latency, interruptible, full-duplex (talk & listen at the same time) voice assistant with a web UI, streaming ASR, TTS, and LLM orchestration. Built for real conversations, barge-in, and hands-free control.

✨ Features

Full-duplex audio: talk and listen simultaneously (barge-in / interruption supported).
Streaming ASR: incremental transcripts while you speak.
Streaming TTS: assistant responds with audio before text finishes.
LLM orchestration: tool use/function calls and stateful dialog.
Web UI: mic capture, waveforms, and live captions in-browser.
Production-ready stack: Traefik reverse proxy + auto TLS, Nginx static hosting, FastAPI backend.
Single command up: deploy with docker compose up -d.

🧭 Architecture

Application Flow

Browser (Web UI)
├─ Mic capture (WebAudio) → WebSocket → Assistant (FastAPI)
│ │
│ partial transcripts
│ ▼
├─ Live captions ← ASR (streaming via Assistant)
│ │
│ ▼
├─ TTS audio playback ← TTS (streaming chunks)
│ ▲
│ │
└─ Controls/Events → LLM Orchestrator
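Barge-in is the part of this flow that is easiest to get wrong, so here is a minimal sketch of the idea using only `asyncio`. It is not the project's implementation: `speak`, `run_turn`, and the timing values are hypothetical stand-ins, with `asyncio.sleep` simulating both audio playback and the moment voice activity is detected.

```python
import asyncio


async def speak(chunks, played, delay=0.02):
    """Pretend to play TTS chunks one by one (hypothetical stand-in for real playback)."""
    for chunk in chunks:
        await asyncio.sleep(delay)  # simulates the time needed to play one chunk
        played.append(chunk)


async def run_turn(chunks, interrupt_after=None):
    """Play a reply; cancel playback early if the user starts talking (barge-in)."""
    played = []
    playback = asyncio.create_task(speak(chunks, played))
    if interrupt_after is not None:
        await asyncio.sleep(interrupt_after)  # simulates VAD detecting user speech
        playback.cancel()
    try:
        await playback
    except asyncio.CancelledError:
        pass  # playback was cut short; the assistant returns to listening
    return played


reply = [f"chunk-{i}" for i in range(10)]
uninterrupted = asyncio.run(run_turn(reply))
interrupted = asyncio.run(run_turn(reply, interrupt_after=0.05))
```

In the uninterrupted call every chunk is played; in the interrupted call only a prefix of the reply is played before the task is cancelled. A real backend would also need to flush any audio already buffered in the browser.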

🐋 Docker Stack & Routing

           ┌───────────────────────────┐
           │        Internet            │
           └────────────┬──────────────┘
                        │  :80 / :443
                        ▼
               ┌─────────────────┐
               │     Traefik     │
               │ (Reverse Proxy) │
               └───────┬─────────┘
         ┌─────────────┼─────────────┐
         │             │             │
┌────────▼───┐   ┌─────▼─────┐   ┌──▼────────┐
│   /        │   │   /api    │   │   /ws     │
│   Web UI   │   │  Assistant│   │ Assistant │
│ (Nginx)    │   │ (FastAPI) │   │ (FastAPI) │
└────────────┘   └───────────┘   └───────────┘

Services in this repo

traefik: reverse proxy, automatic HTTPS via Let’s Encrypt.
web: static frontend (served by Nginx).
assistant: FastAPI backend (ASR, TTS, LLM orchestration, WebSockets).
init_letsencrypt: bootstrap storage for ACME certificates.
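The routing in the diagram above can be modelled as a longest-prefix lookup. The sketch below is only a simplified illustration of that idea; the `ROUTES` table and `route` function are hypothetical, and Traefik's actual rule matching is configured through its own labels and may behave differently at the edges.

```python
# Hypothetical routing table mirroring the diagram: path prefix -> service name.
ROUTES = {"/api": "assistant", "/ws": "assistant", "/": "web"}


def route(path):
    """Return the service whose prefix is the longest match for the request path."""
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if prefix == "/":
            break  # the catch-all is handled by the fallback below
        if path == prefix or path.startswith(prefix + "/"):
            return ROUTES[prefix]
    return ROUTES["/"]
```

For example, `route("/ws/asr")` resolves to the `assistant` service, while `route("/index.html")` falls through to `web`.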

🚀 Quick Start

1. Prerequisites

Docker & Docker Compose
Domain pointing to your server: com-cloud.cloud
DNS A/AAAA records configured
API keys for ASR, TTS, and LLM providers

2. Configure Environment

Create `src/assistant/.env` with your secrets:

# LLM / Orchestrator
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...

# ASR
ASR_PROVIDER=openai_realtime
ASR_API_KEY=...

# TTS
TTS_PROVIDER=openai_realtime
TTS_API_KEY=...

# CORS / ORIGINS
ALLOWED_ORIGINS=https://com-cloud.cloud

# Optional
LOG_LEVEL=info
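As an illustration of how a backend might consume these variables, the sketch below reads them into a small settings object. The `Settings` class and `load_settings` function are hypothetical and not taken from the package, and treating `ALLOWED_ORIGINS` as a comma-separated list is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    """Hypothetical container for the variables shown in the .env example."""
    llm_provider: str
    asr_provider: str
    tts_provider: str
    allowed_origins: list
    log_level: str = "info"


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Assumption: several origins may be given, separated by commas.
    origins = [o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()]
    return Settings(
        llm_provider=env["LLM_PROVIDER"],  # raises KeyError if missing
        asr_provider=env["ASR_PROVIDER"],
        tts_provider=env["TTS_PROVIDER"],
        allowed_origins=origins,
        log_level=env.get("LOG_LEVEL", "info"),
    )


example = load_settings({
    "LLM_PROVIDER": "openai",
    "ASR_PROVIDER": "openai_realtime",
    "TTS_PROVIDER": "openai_realtime",
    "ALLOWED_ORIGINS": "https://com-cloud.cloud",
})
```

Failing fast on a missing provider variable makes a misconfigured deployment visible at startup rather than on the first request.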

3. 🖥️ Local Development

Run backend directly:

cd src/assistant
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
uvicorn assistant.app:app --reload --host 0.0.0.0 --port 8000

Frontend

cd web
npm install
npm run dev

🎙️ Using the Assistant

Open https://com-cloud.cloud

Click the orb to connect and establish a WebSocket session.

Speak naturally; interrupt the assistant mid-sentence.

Watch live captions, hear real-time TTS playback.

Don't forget to close the tab when you are done!

⚙️ Configuration

Key options:

ASR: model, language hints, VAD sensitivity.

TTS: voice, speed, sample rate.

LLM: model, temperature, tool schemas.

Traefik: TLS challenge type, timeouts, rate limits.

🔌 API

GET /healthz – service health

WS /ws/asr – audio in ↔ transcript out

WS /ws/assistant – dialog orchestration (events + responses)

WS /ws/tts – text in ↔ audio out

POST /api/tools/<name> – trigger server-side tool functions
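A client needs to pair each path above with the right scheme: `https` for the HTTP routes and `wss` for the WebSocket routes when TLS is in use. The helper below is a small sketch of that mapping; the `endpoint` function is hypothetical, and the hosts used are only examples taken from this README.

```python
from urllib.parse import urlunsplit


def endpoint(host, path, websocket=False, secure=True):
    """Build a full URL for one of the listed routes (hypothetical helper)."""
    if websocket:
        scheme = "wss" if secure else "ws"
    else:
        scheme = "https" if secure else "http"
    return urlunsplit((scheme, host, path, "", ""))


asr_url = endpoint("com-cloud.cloud", "/ws/asr", websocket=True)
local_health = endpoint("localhost:8000", "/healthz", secure=False)
```

Here `asr_url` is a `wss://` URL for the deployed domain, and `local_health` is a plain `http://` URL for a backend started locally with uvicorn on port 8000.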

🔐 Security

HTTPS enforced (TLS via Let’s Encrypt + Traefik).

Strict CORS (limited to https://com-cloud.cloud).

API rate limiting enabled (/api).

Secrets kept in .env (not in frontend).
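To make the rate-limiting point concrete, here is a generic token-bucket sketch. It only illustrates the general technique: the `TokenBucket` class is hypothetical, and nothing here describes how this project or its proxy configuration actually enforces limits on `/api`.

```python
class TokenBucket:
    """Generic token bucket: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = now

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) may proceed."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # spend one token for this request
            return True
        return False


bucket = TokenBucket(rate=1, burst=2)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

With a burst of two, the first two requests at time zero pass, the third is rejected, and one more is admitted after a second has elapsed. Passing the clock in explicitly keeps the sketch deterministic; a real limiter would read a monotonic clock.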

📦 Deployment Notes

Reverse proxy: Traefik v3 with ACME TLS challenge.

Certificates stored in ./letsencrypt/acme.json.

Static frontend served by Nginx (web service).

Backend served via assistant (FastAPI) behind Traefik.

Scale with Docker Swarm / k8s if needed.

🗺️ Roadmap

 Wake-word hotword detection

 Speaker diarization

 Plug-and-play tool registry

 Persistent transcripts

 Multi-voice TTS

🤝 Contributing

Fork this repo

Create a feature branch

Submit a PR, with screenshots or logs if the UI or backend is affected

Project details

Maintainer: leo007-htun

Version: 0.1.1, released Sep 11, 2025

Download files

Source Distribution

full_duplex_assistant-0.1.1.tar.gz (31.0 kB)

Uploaded Sep 11, 2025 Source

Built Distribution

full_duplex_assistant-0.1.1-py3-none-any.whl (19.0 kB)

Uploaded Sep 11, 2025 Python 3

File details

Details for the file full_duplex_assistant-0.1.1.tar.gz.

File metadata

Download URL: full_duplex_assistant-0.1.1.tar.gz
Upload date: Sep 11, 2025
Size: 31.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for full_duplex_assistant-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`bca3e1e3b7246bcc67cb29879369dca0048a96feecaceb1ff109db23dcd36e50`
MD5	`cfe8351a7b4bca73fab2f16868d3e4f1`
BLAKE2b-256	`aa67ecdc4614a58ea3ca9c360c8bc0f3d3e2f5c07362fdb1fea409812e912fa4`

Provenance

The following attestation bundles were made for full_duplex_assistant-0.1.1.tar.gz:

Publisher: release-py.yml on leo007-htun/full_duplex_assistant

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: full_duplex_assistant-0.1.1.tar.gz
- Subject digest: bca3e1e3b7246bcc67cb29879369dca0048a96feecaceb1ff109db23dcd36e50
- Sigstore transparency entry: 500876918
- Sigstore integration time: Sep 11, 2025
Source repository:
- Permalink: leo007-htun/full_duplex_assistant@58a0a849c3067f79c94f111554b530b6f52154e1
- Branch / Tag: refs/tags/py-v0.1.1
- Owner: https://github.com/leo007-htun
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-py.yml@58a0a849c3067f79c94f111554b530b6f52154e1
- Trigger Event: push

File details

Details for the file full_duplex_assistant-0.1.1-py3-none-any.whl.

File metadata

Download URL: full_duplex_assistant-0.1.1-py3-none-any.whl
Upload date: Sep 11, 2025
Size: 19.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for full_duplex_assistant-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`19ce6aacecf290271e52c67eec4a06d6a89c1342e526014028917bbc88d8b383`
MD5	`e7b3b3d720c5a66782c58e08506c221e`
BLAKE2b-256	`4179636bbc0f72c4c1279cc79e53ff9785d67fc94e8129753f6c520d342e8efd`

Provenance

The following attestation bundles were made for full_duplex_assistant-0.1.1-py3-none-any.whl:

Publisher: release-py.yml on leo007-htun/full_duplex_assistant

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: full_duplex_assistant-0.1.1-py3-none-any.whl
- Subject digest: 19ce6aacecf290271e52c67eec4a06d6a89c1342e526014028917bbc88d8b383
- Sigstore transparency entry: 500876946
- Sigstore integration time: Sep 11, 2025
Source repository:
- Permalink: leo007-htun/full_duplex_assistant@58a0a849c3067f79c94f111554b530b6f52154e1
- Branch / Tag: refs/tags/py-v0.1.1
- Owner: https://github.com/leo007-htun
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-py.yml@58a0a849c3067f79c94f111554b530b6f52154e1
- Trigger Event: push
