Voxtra

Open voice infrastructure for AI agents.

Built by Rexplore Research Labs

Voxtra is a Python framework that bridges telephony infrastructure (Asterisk today; LiveKit and FreeSWITCH adapters are planned) with AI voice agents (STT, LLM, TTS). It lets developers build AI-powered call centers without needing to understand telecom internals.

Architecture

graph LR
    A[Cellular Provider] -->|SIP Trunk| B[Asterisk PBX]
    B -->|ARI + Media| C[Voxtra]
    C --> D[STT]
    C --> E[LLM]
    C --> F[TTS]

    D -->|transcript| E
    E -->|response| F
    F -->|audio| C

    style A fill:#4a90d9,stroke:#333,color:#fff
    style B fill:#e67e22,stroke:#333,color:#fff
    style C fill:#2ecc71,stroke:#333,color:#fff
    style D fill:#9b59b6,stroke:#333,color:#fff
    style E fill:#e74c3c,stroke:#333,color:#fff
    style F fill:#1abc9c,stroke:#333,color:#fff
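
The STT → LLM → TTS flow in the diagram can be sketched as a single conversational turn. This is a minimal illustration, not Voxtra's VoicePipeline: the StubSTT, StubLLM, and StubTTS classes and the run_turn function are hypothetical names, and the stubs treat bytes as text so the example runs without any provider account.

```python
import asyncio


class StubSTT:
    """Hypothetical stand-in for a speech-to-text provider."""

    async def transcribe(self, audio: bytes) -> str:
        # Pretend the incoming audio bytes are already UTF-8 text.
        return audio.decode("utf-8")


class StubLLM:
    """Hypothetical stand-in for an LLM provider."""

    async def respond(self, transcript: str) -> str:
        return f"You said: {transcript}"


class StubTTS:
    """Hypothetical stand-in for a text-to-speech provider."""

    async def synthesize(self, text: str) -> bytes:
        # Pretend the synthesized audio is just the encoded text.
        return text.encode("utf-8")


async def run_turn(audio_in: bytes, stt, llm, tts) -> bytes:
    """Run one turn: caller audio -> transcript -> response -> reply audio."""
    transcript = await stt.transcribe(audio_in)
    response = await llm.respond(transcript)
    return await tts.synthesize(response)


if __name__ == "__main__":
    reply_audio = asyncio.run(run_turn(b"hello", StubSTT(), StubLLM(), StubTTS()))
    print(reply_audio)
```

A real pipeline would stream partial results between the stages instead of awaiting each one in full; the sequential version only shows the direction of the data flow.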

Layer Design

| Layer | Package | Responsibility |
| --- | --- | --- |
| Core | voxtra.app, voxtra.router, voxtra.session | App lifecycle, decorator-based routing, call sessions with say / listen / agent |
| Telephony | voxtra.telephony, voxtra.ari | BaseTelephonyAdapter ABC; AsteriskAdapter wraps the async ARIClient |
| Audio | voxtra.audio | AudioSocketServer: TCP audio I/O with Asterisk; μ-law / A-law / PCM codec helpers |
| Media | voxtra.media | AudioFrame + BaseMediaTransport; CallSessionMediaTransport bridges sessions into the pipeline |
| AI | voxtra.ai | STT, TTS, LLM, VAD provider abstractions; Registry plugin system |
| Pipeline | voxtra.core.pipeline | Real-time STT → LLM → TTS orchestration; auto-wired per session when providers are configured |
| Provisioning | voxtra.provisioning | Per-tenant Asterisk pjsip / dialplan generation (optional, voxtra[provisioning]) |
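
To make the Audio layer's codec work concrete, here is a minimal sketch of G.711 μ-law decoding to 16-bit little-endian PCM. It uses the standard G.711 expansion and is not taken from voxtra.audio; the function names ulaw_byte_to_pcm16 and ulaw_to_pcm_s16le are hypothetical.

```python
import struct

_BIAS = 0x84  # 132, the bias added during G.711 mu-law encoding


def ulaw_byte_to_pcm16(value: int) -> int:
    """Decode one G.711 mu-law byte into a signed 16-bit linear sample."""
    u = ~value & 0xFF           # mu-law bytes are transmitted bit-inverted
    sign = u & 0x80             # top bit carries the sign
    exponent = (u >> 4) & 0x07  # 3-bit segment number
    mantissa = u & 0x0F         # 4-bit step within the segment
    magnitude = (((mantissa << 3) + _BIAS) << exponent) - _BIAS
    return -magnitude if sign else magnitude


def ulaw_to_pcm_s16le(data: bytes) -> bytes:
    """Decode a mu-law byte string into little-endian signed 16-bit PCM."""
    samples = [ulaw_byte_to_pcm16(b) for b in data]
    return struct.pack(f"<{len(samples)}h", *samples)
```

Each μ-law byte expands to two PCM bytes, so a 20 ms frame at 8000 Hz (160 bytes) becomes 320 bytes of linear audio.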

Quick Start

Installation

From PyPI:

pip install voxtra

With provider extras (Asterisk is part of the core install — no extra needed):

pip install "voxtra[deepgram,openai,elevenlabs,cartesia]"
# or grab everything in one go
pip install "voxtra[all]"

Available extras: deepgram, openai, elevenlabs, cartesia, livekit, provisioning, all, dev.

From GitHub (latest development version):

pip install git+https://github.com/rexplore-ai/voxtra.git

From source (for development):

git clone https://github.com/rexplore-ai/voxtra.git
cd voxtra
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

Code-First Usage

from voxtra import VoxtraApp

app = VoxtraApp.from_yaml("voxtra.yaml")

@app.route(extension="1000")
async def support_call(session):
    await session.answer()
    await session.say("Hello, welcome to support. How can I help you?")
    text = await session.listen()
    reply = await session.agent.respond(text)
    await session.say(reply.text)
    await session.hangup()

app.run()
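
For readers curious how decorator-based routing of this kind can work, the following is a toy sketch. MiniRouter and its dispatch method are hypothetical and are not Voxtra's Router; the sketch only assumes that a handler is registered under an extension string and later looked up when a call arrives.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Handler = Callable[[object], Awaitable[None]]


class MiniRouter:
    """Hypothetical toy router mapping dialed extensions to async handlers."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}

    def route(self, extension: str) -> Callable[[Handler], Handler]:
        """Return a decorator that registers the handler for an extension."""
        def decorator(handler: Handler) -> Handler:
            self._routes[extension] = handler
            return handler  # leave the function itself unchanged
        return decorator

    async def dispatch(self, extension: str, session: object) -> bool:
        """Run the handler for the extension; return False if none matches."""
        handler = self._routes.get(extension)
        if handler is None:
            return False
        await handler(session)
        return True
```

Returning the original function from the decorator keeps handlers directly callable, which makes them easy to unit-test without a running telephony backend.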

Config-First Usage

Create voxtra.yaml:

app_name: my-call-center

telephony:
  provider: asterisk
  asterisk:
    base_url: http://localhost:8088
    username: asterisk
    password: secret
    app_name: voxtra

media:
  transport: websocket
  codec: ulaw
  sample_rate: 8000

ai:
  stt:
    provider: deepgram
    api_key: ${DEEPGRAM_API_KEY}
    model: nova-2
  llm:
    provider: openai
    api_key: ${OPENAI_API_KEY}
    model: gpt-4o
    system_prompt: "You are a helpful voice assistant for a call center."
  tts:
    provider: elevenlabs
    api_key: ${ELEVENLABS_API_KEY}
    voice_id: your-voice-id

routes:
  - extension: "1000"
    agent: support_agent

Then run:

voxtra start
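
The `${DEEPGRAM_API_KEY}`-style placeholders in the config suggest environment-variable substitution. How Voxtra resolves them is not shown here, so the following is only a sketch of one common approach, applied to an already-parsed config dictionary; expand_env is a hypothetical helper, not part of Voxtra.

```python
import os
import re
from typing import Any

# Matches ${NAME} where NAME is a typical environment-variable identifier.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any) -> Any:
    """Recursively replace ${NAME} placeholders with environment values."""
    if isinstance(value, dict):
        return {key: expand_env(item) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item) for item in value]
    if isinstance(value, str):
        def _substitute(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in os.environ:
                # Fail loudly instead of passing a literal "${...}" as a key.
                raise KeyError(f"environment variable {name} is not set")
            return os.environ[name]
        return _PLACEHOLDER.sub(_substitute, value)
    return value  # numbers, booleans, None pass through untouched
```

Raising on a missing variable is a design choice in this sketch: a silent empty string would surface later as an opaque authentication failure at the provider.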

Asterisk Integration

Voxtra connects to Asterisk on two channels:

  • ARI (Asterisk REST Interface): the control plane. HTTP for call operations, WebSocket for events.
  • AudioSocket: the media plane. A simple framed TCP protocol (a 3-byte header made of a 1-byte type and a 2-byte big-endian length, followed by the payload). Voxtra's AudioSocketServer accepts the connection Asterisk opens, so there is no RTP, NAT, or SDP to worry about.
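
The AudioSocket framing can be illustrated with a small encoder and parser. This sketch assumes the header layout documented for Asterisk's AudioSocket (a 1-byte type followed by a 2-byte big-endian payload length) and the commonly documented type codes; it is not Voxtra's AudioSocketServer, and encode_frame and iter_frames are hypothetical names.

```python
import struct
from typing import Iterator, Tuple

# Assumed header layout: 1-byte type, 2-byte big-endian payload length.
HEADER = struct.Struct(">BH")

# Type codes as commonly documented for AudioSocket (treat as assumptions).
KIND_HANGUP = 0x00  # connection is being terminated
KIND_UUID = 0x01    # 16-byte call identifier sent at the start
KIND_AUDIO = 0x10   # signed 16-bit PCM audio payload
KIND_ERROR = 0xFF   # error report from the peer


def encode_frame(kind: int, payload: bytes = b"") -> bytes:
    """Prefix the payload with its type and length header."""
    return HEADER.pack(kind, len(payload)) + payload


def iter_frames(buffer: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, payload) for every complete frame in the buffer."""
    offset = 0
    while offset + HEADER.size <= len(buffer):
        kind, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # partial frame: wait for more bytes from the socket
        yield kind, buffer[offset + HEADER.size:end]
        offset = end
```

Because TCP delivers a byte stream rather than messages, a real reader has to keep any trailing partial frame and prepend it to the next chunk it receives.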

Add this to your dialplan to route inbound calls into the Voxtra Stasis app:

[voxtra-inbound]
exten => _X.,1,Stasis(voxtra)
 same => n,Hangup()

Voxtra opens AudioSocket connections on demand the first time a handler calls session.audio_stream(), session.say(), session.listen(), or any other audio I/O.
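
Opening a connection on first use is a lazy-initialization pattern. The sketch below shows one way to do it safely when several coroutines may request audio at the same time; LazyResource is a hypothetical helper and does not reflect how Voxtra implements this internally.

```python
import asyncio
from typing import Awaitable, Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyResource(Generic[T]):
    """Open a resource on first use and share it with later callers."""

    def __init__(self, opener: Callable[[], Awaitable[T]]) -> None:
        self._opener = opener
        self._value: Optional[T] = None
        self._lock = asyncio.Lock()

    async def get(self) -> T:
        """Return the resource, opening it only if nobody has yet."""
        if self._value is None:
            async with self._lock:
                # Re-check after acquiring the lock: another task may have
                # finished opening the resource while this one was waiting.
                if self._value is None:
                    self._value = await self._opener()
        return self._value
```

With this shape, each audio method can call `await resource.get()` first, so whichever runs first triggers the connection and the others reuse it.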

Supported Providers

Telephony

  • Asterisk (ARI) — Production ready
  • LiveKit (SIP) — Planned
  • FreeSWITCH — Planned

Speech-to-Text

  • Deepgram (streaming)
  • More coming soon

LLM / Agents

  • OpenAI (GPT-4o, streaming)
  • LangGraph integration planned

Text-to-Speech

  • ElevenLabs (streaming)
  • Cartesia (streaming)
  • More coming soon

Project Structure

src/voxtra/
├── app.py                       # VoxtraApp — entry point, lifecycle, from_yaml/from_config
├── session.py                   # CallSession + AgentClient (say/listen/agent)
├── router.py                    # Decorator-based call routing
├── registry.py                  # Provider plugin registry (STT/TTS/LLM/VAD/telephony/media)
├── events.py                    # VoxtraEvent + typed subclasses
├── config.py                    # Pydantic config models + VoxtraConfig.from_yaml
├── middleware.py                # Event middleware
├── exceptions.py                # Custom exceptions
├── types.py                     # AudioChunk, CallState, AudioCodec, SIPTrunk, …
├── cli.py                       # `voxtra` CLI: start, init, info, check
├── ari/                         # Asterisk ARI client
│   ├── client.py                #   async HTTP + WebSocket client
│   ├── events.py                #   ARIEvent typed model
│   └── models.py                #   Channel / Bridge / Playback Pydantic models
├── audio/                       # AudioSocket — TCP audio I/O with Asterisk
│   ├── socket.py                #   AudioSocketServer + AudioSocketConnection
│   └── codec.py                 #   μ-law / A-law / PCM-S16LE conversion
├── telephony/                   # Backend abstraction
│   ├── base.py                  #   BaseTelephonyAdapter ABC
│   ├── asterisk/adapter.py      #   AsteriskAdapter (wraps ARIClient)
│   └── livekit/                 #   LiveKit adapter (stub)
├── media/                       # Frame-oriented media stack used by VoicePipeline
│   ├── audio.py                 #   AudioFrame + codec helpers
│   ├── base.py                  #   BaseMediaTransport ABC
│   ├── websocket.py             #   WebSocket transport
│   ├── buffer.py                #   Audio buffering
│   └── session_transport.py     #   Bridges CallSession ↔ BaseMediaTransport
├── core/
│   └── pipeline.py              # VoicePipeline — STT → LLM → TTS orchestration
├── provisioning/                # Per-tenant Asterisk config generation (optional)
│   └── provisioner.py           #   pjsip / extensions / ari fragment writer
└── ai/
    ├── stt/                     # Speech-to-Text providers (Deepgram, …)
    ├── tts/                     # Text-to-Speech providers (ElevenLabs, Cartesia, …)
    ├── llm/                     # LLM / Agent providers (OpenAI, …)
    └── vad/                     # Voice Activity Detection

Documentation

  • Architecture — Deep-dive into every layer, component, data flow, and design decision
  • Contributing — How to set up dev environment, add providers, submit PRs, and code standards

Development

git clone git@github.com:rexplore-ai/voxtra.git
cd voxtra
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest

Roadmap

Shipped in 0.3.0:

  • Core abstractions (VoxtraApp, Router, CallSession, Events)
  • Asterisk ARI adapter (wraps async ARIClient, conforms to BaseTelephonyAdapter)
  • AudioSocket TCP transport + μ-law / A-law / PCM codec helpers
  • AI provider interfaces (STT, TTS, LLM, VAD) + Registry plugin system
  • Voice pipeline (STT → LLM → TTS), auto-wired per session when providers configured
  • High-level session API: say(text), listen(timeout=), agent.respond(text)
  • VoxtraApp.from_yaml(path) / from_config(VoxtraConfig) + working voxtra start CLI
  • WebSocket media transport
  • Per-tenant Asterisk provisioning (config file generation)

Planned:

  • End-to-end Asterisk + AI demo with recordings
  • LiveKit adapter (currently a stub)
  • FreeSWITCH adapter
  • HMAC-signed backend webhook emitter
  • Recording sinks (GCS / S3) — auto-upload after record_stop()
  • Multi-tenant runtime supervisor (one ARI app per tenant)
  • Provisioner stage 2: SSH + asterisk -rx reload
  • Prometheus metrics ASGI sub-app
  • LangGraph agent integration
  • Multi-agent handoff
  • Dashboard / Admin API
  • Conversation analytics

Contributors

Thanks to everyone who has contributed to Voxtra!

Patrick Byamasu — Creator & Lead Maintainer

Want to contribute? Check out our Contributing Guide.

License

Apache 2.0 — See LICENSE


Voxtra: The LangGraph of AI Telephony. Built by Rexplore Research Labs.
