Open voice infrastructure for AI agents — bridging telephony and AI
Project description
Voxtra
Open voice infrastructure for AI agents.
Built by Rexplore Research Labs
Voxtra is a Python framework that bridges telephony infrastructure (Asterisk, FreeSWITCH, LiveKit) with AI voice agents (STT, LLM, TTS). It lets developers build AI-powered call centers without needing to understand telecom internals.
Architecture
graph LR
A[Cellular Provider] -->|SIP Trunk| B[Asterisk PBX]
B -->|ARI + Media| C[Voxtra]
C --> D[STT]
C --> E[LLM]
C --> F[TTS]
D -->|transcript| E
E -->|response| F
F -->|audio| C
style A fill:#4a90d9,stroke:#333,color:#fff
style B fill:#e67e22,stroke:#333,color:#fff
style C fill:#2ecc71,stroke:#333,color:#fff
style D fill:#9b59b6,stroke:#333,color:#fff
style E fill:#e74c3c,stroke:#333,color:#fff
style F fill:#1abc9c,stroke:#333,color:#fff
Layer Design
| Layer | Package | Responsibility |
|---|---|---|
| Core | voxtra.app, voxtra.router, voxtra.session |
App lifecycle, decorator-based routing, call sessions with say / listen / agent |
| Telephony | voxtra.telephony, voxtra.ari |
BaseTelephonyAdapter ABC; AsteriskAdapter wraps the async ARIClient |
| Audio | voxtra.audio |
AudioSocketServer — TCP audio I/O with Asterisk; μ-law / A-law / PCM codec helpers |
| Media | voxtra.media |
AudioFrame + BaseMediaTransport; CallSessionMediaTransport bridges sessions into the pipeline |
| AI | voxtra.ai |
STT, TTS, LLM, VAD provider abstractions; Registry plugin system |
| Pipeline | voxtra.core.pipeline |
Real-time STT → LLM → TTS orchestration; auto-wired per session when providers configured |
| Provisioning | voxtra.provisioning |
Per-tenant Asterisk pjsip / dialplan generation (optional, voxtra[provisioning]) |
Quick Start
Installation
From PyPI:
pip install voxtra
With provider extras (Asterisk is part of the core install — no extra needed):
pip install "voxtra[deepgram,openai,elevenlabs,cartesia]"
# or grab everything in one go
pip install "voxtra[all]"
Available extras: deepgram, openai, elevenlabs, cartesia, livekit, provisioning, all, dev.
From GitHub (latest development version):
pip install git+https://github.com/rexplore-ai/voxtra.git
From source (for development):
git clone https://github.com/rexplore-ai/voxtra.git
cd voxtra
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
Code-First Usage
from voxtra import VoxtraApp
app = VoxtraApp.from_yaml("voxtra.yaml")
@app.route(extension="1000")
async def support_call(session):
await session.answer()
await session.say("Hello, welcome to support. How can I help you?")
text = await session.listen()
reply = await session.agent.respond(text)
await session.say(reply.text)
await session.hangup()
app.run()
Config-First Usage
Create voxtra.yaml:
app_name: my-call-center
telephony:
provider: asterisk
asterisk:
base_url: http://localhost:8088
username: asterisk
password: secret
app_name: voxtra
media:
transport: websocket
codec: ulaw
sample_rate: 8000
ai:
stt:
provider: deepgram
api_key: ${DEEPGRAM_API_KEY}
model: nova-2
llm:
provider: openai
api_key: ${OPENAI_API_KEY}
model: gpt-4o
system_prompt: "You are a helpful voice assistant for a call center."
tts:
provider: elevenlabs
api_key: ${ELEVENLABS_API_KEY}
voice_id: your-voice-id
routes:
- extension: "1000"
agent: support_agent
Then run:
voxtra start
Asterisk Integration
Voxtra connects to Asterisk on two channels:
- ARI (Asterisk REST Interface) — control plane. HTTP for call operations, WebSocket for events.
- AudioSocket — media plane. A simple framed TCP protocol (1-byte type + 3-byte length + payload). Voxtra's
AudioSocketServeraccepts the connection Asterisk opens; no RTP/NAT/SDP to worry about.
Add this to your dialplan to route inbound calls into the Voxtra Stasis app:
[voxtra-inbound]
exten => _X.,1,Stasis(voxtra)
same => n,Hangup()
Voxtra opens AudioSocket connections on demand the first time a handler calls session.audio_stream(), session.say(), session.listen(), or any other audio I/O.
Supported Providers
Telephony
- Asterisk (ARI) — Production ready
- LiveKit (SIP) — Planned
- FreeSWITCH — Planned
Speech-to-Text
- Deepgram (streaming)
- More coming soon
LLM / Agents
- OpenAI (GPT-4o, streaming)
- LangGraph integration planned
Text-to-Speech
- ElevenLabs (streaming)
- Cartesia (streaming)
- More coming soon
Project Structure
src/voxtra/
├── app.py # VoxtraApp — entry point, lifecycle, from_yaml/from_config
├── session.py # CallSession + AgentClient (say/listen/agent)
├── router.py # Decorator-based call routing
├── registry.py # Provider plugin registry (STT/TTS/LLM/VAD/telephony/media)
├── events.py # VoxtraEvent + typed subclasses
├── config.py # Pydantic config models + VoxtraConfig.from_yaml
├── middleware.py # Event middleware
├── exceptions.py # Custom exceptions
├── types.py # AudioChunk, CallState, AudioCodec, SIPTrunk, …
├── cli.py # `voxtra` CLI: start, init, info, check
├── ari/ # Asterisk ARI client
│ ├── client.py # async HTTP + WebSocket client
│ ├── events.py # ARIEvent typed model
│ └── models.py # Channel / Bridge / Playback Pydantic models
├── audio/ # AudioSocket — TCP audio I/O with Asterisk
│ ├── socket.py # AudioSocketServer + AudioSocketConnection
│ └── codec.py # μ-law / A-law / PCM-S16LE conversion
├── telephony/ # Backend abstraction
│ ├── base.py # BaseTelephonyAdapter ABC
│ ├── asterisk/adapter.py # AsteriskAdapter (wraps ARIClient)
│ └── livekit/ # LiveKit adapter (stub)
├── media/ # Frame-oriented media stack used by VoicePipeline
│ ├── audio.py # AudioFrame + codec helpers
│ ├── base.py # BaseMediaTransport ABC
│ ├── websocket.py # WebSocket transport
│ ├── buffer.py # Audio buffering
│ └── session_transport.py # Bridges CallSession ↔ BaseMediaTransport
├── core/
│ └── pipeline.py # VoicePipeline — STT → LLM → TTS orchestration
├── provisioning/ # Per-tenant Asterisk config generation (optional)
│ └── provisioner.py # pjsip / extensions / ari fragment writer
└── ai/
├── stt/ # Speech-to-Text providers (Deepgram, …)
├── tts/ # Text-to-Speech providers (ElevenLabs, Cartesia, …)
├── llm/ # LLM / Agent providers (OpenAI, …)
└── vad/ # Voice Activity Detection
Documentation
- Architecture — Deep-dive into every layer, component, data flow, and design decision
- Contributing — How to set up dev environment, add providers, submit PRs, and code standards
Development
git clone git@github.com:rexplore-ai/voxtra.git
cd voxtra
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
Roadmap
Shipped in 0.3.0:
- Core abstractions (VoxtraApp, Router, CallSession, Events)
- Asterisk ARI adapter (wraps async
ARIClient, conforms toBaseTelephonyAdapter) - AudioSocket TCP transport + μ-law / A-law / PCM codec helpers
- AI provider interfaces (STT, TTS, LLM, VAD) +
Registryplugin system - Voice pipeline (STT → LLM → TTS), auto-wired per session when providers configured
- High-level session API:
say(text),listen(timeout=),agent.respond(text) -
VoxtraApp.from_yaml(path)/from_config(VoxtraConfig)+ workingvoxtra startCLI - WebSocket media transport
- Per-tenant Asterisk provisioning (config file generation)
Planned:
- End-to-end Asterisk + AI demo with recordings
- LiveKit adapter (currently a stub)
- FreeSWITCH adapter
- HMAC-signed backend webhook emitter
- Recording sinks (GCS / S3) — auto-upload after
record_stop() - Multi-tenant runtime supervisor (one ARI app per tenant)
- Provisioner stage 2: SSH +
asterisk -rx reload - Prometheus metrics ASGI sub-app
- LangGraph agent integration
- Multi-agent handoff
- Dashboard / Admin API
- Conversation analytics
Contributors
Thanks to everyone who has contributed to Voxtra!
Patrick Byamasu — Creator & Lead Maintainer
Want to contribute? Check out our Contributing Guide.
License
Apache 2.0 — See LICENSE
Voxtra — The LangGraph of AI Telephony Built by Rexplore Research Labs
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file voxtra-0.3.1.tar.gz.
File metadata
- Download URL: voxtra-0.3.1.tar.gz
- Upload date:
- Size: 136.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
75a2d9b1b5d829b47e9835cda2704174dcd0b97be5e019e90021b6e4465885d4
|
|
| MD5 |
3f19c872e6ae845334d5c76a6a3c5d33
|
|
| BLAKE2b-256 |
9befa90251f7fa884ba918cb74995442a74b7eecf9ce768e13b90f5de32c2c3d
|
File details
Details for the file voxtra-0.3.1-py3-none-any.whl.
File metadata
- Download URL: voxtra-0.3.1-py3-none-any.whl
- Upload date:
- Size: 89.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
54cd4334026bdebaeb0123785265958276fc37123d69febbab9df9678d39d9cc
|
|
| MD5 |
4fa877b567824662fc9d014ecac5886f
|
|
| BLAKE2b-256 |
ea1d9c9f25c58f5a3c5bc915f1a670580d7ae51172654a052a0c9c0a594af86d
|