Open voice infrastructure for AI agents — bridging telephony and AI

Voxtra

Open voice infrastructure for AI agents.

Voxtra is a Python framework that bridges telephony infrastructure with AI voice agents (STT, LLM, TTS). Asterisk is supported today; LiveKit and FreeSWITCH adapters are planned. It lets developers build AI-powered call centers without needing to understand telecom internals.

Architecture

graph LR
    A[Cellular Provider] -->|SIP Trunk| B[Asterisk PBX]
    B -->|ARI + Media| C[Voxtra]
    C --> D[STT]
    C --> E[LLM]
    C --> F[TTS]

    D -->|transcript| E
    E -->|response| F
    F -->|audio| C

    style A fill:#4a90d9,stroke:#333,color:#fff
    style B fill:#e67e22,stroke:#333,color:#fff
    style C fill:#2ecc71,stroke:#333,color:#fff
    style D fill:#9b59b6,stroke:#333,color:#fff
    style E fill:#e74c3c,stroke:#333,color:#fff
    style F fill:#1abc9c,stroke:#333,color:#fff
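
The data flow in the diagram (caller audio into STT, transcript into the LLM, response into TTS, audio back to the caller) can be sketched with plain asyncio. This is only an illustration of the flow, not Voxtra's pipeline implementation: `FakeSTT`, `FakeLLM`, `FakeTTS`, `run_turn`, and `demo` are hypothetical names, and the stubs pass text through as bytes instead of calling a real provider.

```python
import asyncio


class FakeSTT:
    """Hypothetical stand-in for a speech-to-text provider."""

    async def transcribe(self, audio: bytes) -> str:
        # A real provider would stream audio to a recognizer; here the
        # "audio" is just UTF-8 text so the example stays self-contained.
        return audio.decode("utf-8")


class FakeLLM:
    """Hypothetical stand-in for an LLM provider."""

    async def respond(self, transcript: str) -> str:
        return f"You said: {transcript}"


class FakeTTS:
    """Hypothetical stand-in for a text-to-speech provider."""

    async def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


async def run_turn(audio_in: bytes, stt, llm, tts) -> bytes:
    """One conversational turn: audio -> transcript -> response -> audio."""
    transcript = await stt.transcribe(audio_in)
    response = await llm.respond(transcript)
    return await tts.synthesize(response)


def demo() -> bytes:
    """Run a single turn with the stub providers."""
    return asyncio.run(run_turn(b"hello", FakeSTT(), FakeLLM(), FakeTTS()))
```

Keeping each stage behind a small async interface is what would let one provider be swapped for another without touching the turn logic.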

Layer Design

Layer	Package	Responsibility
Core	`voxtra.app`, `voxtra.router`, `voxtra.session`	App lifecycle, decorator-based routing, call sessions with `say` / `listen` / `agent`
Telephony	`voxtra.telephony`, `voxtra.ari`	`BaseTelephonyAdapter` ABC; `AsteriskAdapter` wraps the async `ARIClient`
Audio	`voxtra.audio`	`AudioSocketServer` — TCP audio I/O with Asterisk; μ-law / A-law / PCM codec helpers
Media	`voxtra.media`	`AudioFrame` + `BaseMediaTransport`; `CallSessionMediaTransport` bridges sessions into the pipeline
AI	`voxtra.ai`	STT, TTS, LLM, VAD provider abstractions; `Registry` plugin system
Pipeline	`voxtra.core.pipeline`	Real-time STT → LLM → TTS orchestration; auto-wired per session when providers configured
Provisioning	`voxtra.provisioning`	Per-tenant Asterisk pjsip / dialplan generation (optional, `voxtra[provisioning]`)
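
The Audio row mentions μ-law / A-law / PCM codec helpers. As a rough idea of what such a helper does, here is a sketch of standard G.711 μ-law companding for signed 16-bit PCM. The function names are hypothetical and are not taken from `voxtra.audio`; the real helpers may be structured differently.

```python
import struct

BIAS = 0x84   # 132, added to the magnitude before locating the segment
CLIP = 32635  # largest magnitude that still fits in 15 bits after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Compress one signed 16-bit PCM sample into one mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, 7 down to 0.
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are stored bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte: int) -> int:
    """Expand one mu-law byte back into a signed 16-bit PCM sample."""
    value = ~byte & 0xFF
    exponent = (value >> 4) & 0x07
    mantissa = value & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if value & 0x80 else magnitude


def pcm16le_to_ulaw(pcm: bytes) -> bytes:
    """Convert little-endian 16-bit PCM to mu-law (a trailing odd byte is dropped)."""
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    return bytes(linear_to_ulaw(s) for s in samples)


def ulaw_to_pcm16le(data: bytes) -> bytes:
    """Convert mu-law bytes to little-endian 16-bit PCM."""
    return struct.pack(f"<{len(data)}h", *(ulaw_to_linear(b) for b in data))
```

μ-law is lossy: a round trip returns a nearby value rather than the original sample, with the error growing for louder samples.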

Quick Start

Installation

From PyPI:

pip install voxtra

With provider extras (Asterisk is part of the core install — no extra needed):

pip install "voxtra[deepgram,openai,elevenlabs,cartesia]"
# or grab everything in one go
pip install "voxtra[all]"

Available extras: deepgram, openai, elevenlabs, cartesia, livekit, provisioning, all, dev.

From GitHub (latest development version):

pip install git+https://github.com/rexplore-ai/voxtra.git

From source (for development):

git clone https://github.com/rexplore-ai/voxtra.git
cd voxtra
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

Code-First Usage

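from voxtra import VoxtraApp

# Load telephony, media, and AI provider settings from the YAML config.
app = VoxtraApp.from_yaml("voxtra.yaml")

# Handle calls that arrive on extension 1000.
@app.route(extension="1000")
async def support_call(session):
    await session.answer()
    await session.say("Hello, welcome to support. How can I help you?")
    text = await session.listen()              # wait for the caller to speak
    reply = await session.agent.respond(text)  # ask the configured agent
    await session.say(reply.text)              # speak the reply to the caller
    await session.hangup()

app.run()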
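
For readers curious how decorator-based routing of this kind can work, the sketch below registers async handlers by extension and dispatches to them. It is a generic illustration, not Voxtra's router: `MiniRouter`, `resolve`, `dispatch`, and the `calls` list are hypothetical names introduced for the example.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

Handler = Callable[[object], Awaitable[None]]


class MiniRouter:
    """Hypothetical registry mapping dialed extensions to async handlers."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}

    def route(self, extension: str) -> Callable[[Handler], Handler]:
        """Return a decorator that registers the handler under `extension`."""
        def decorator(handler: Handler) -> Handler:
            self._routes[extension] = handler
            return handler  # leave the function usable as-is
        return decorator

    def resolve(self, extension: str) -> Optional[Handler]:
        return self._routes.get(extension)


router = MiniRouter()
calls = []  # records handled sessions so the behavior is observable


@router.route(extension="1000")
async def support_call(session) -> None:
    calls.append(("support", session))


async def dispatch(extension: str, session: object) -> bool:
    """Run the handler for `extension`; report whether one was found."""
    handler = router.resolve(extension)
    if handler is None:
        return False
    await handler(session)
    return True
```

Calling `asyncio.run(dispatch("1000", session))` runs `support_call`, while an unregistered extension returns `False` without raising.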

Config-First Usage

Create voxtra.yaml:

app_name: my-call-center

telephony:
  provider: asterisk
  asterisk:
    base_url: http://localhost:8088
    username: asterisk
    password: secret
    app_name: voxtra

media:
  transport: websocket
  codec: ulaw
  sample_rate: 8000

ai:
  stt:
    provider: deepgram
    api_key: ${DEEPGRAM_API_KEY}
    model: nova-2
  llm:
    provider: openai
    api_key: ${OPENAI_API_KEY}
    model: gpt-4o
    system_prompt: "You are a helpful voice assistant for a call center."
  tts:
    provider: elevenlabs
    api_key: ${ELEVENLABS_API_KEY}
    voice_id: your-voice-id

routes:
  - extension: "1000"
    agent: support_agent

Then run:

voxtra start
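
The config above references API keys as `${DEEPGRAM_API_KEY}`-style placeholders. How Voxtra resolves them is not described here; one plausible approach, shown purely as an assumption, is to substitute environment variables before the YAML is parsed. `expand_env` is a hypothetical helper, not part of Voxtra.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${NAME} with its value, failing loudly if NAME is unset."""
    source = os.environ if env is None else env

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in source:
            raise KeyError(f"environment variable {name} is not set")
        return source[name]

    return _PLACEHOLDER.sub(substitute, text)
```

Raising on a missing variable, rather than substituting an empty string, surfaces a forgotten `export` at startup instead of as an authentication failure on the first call.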

Asterisk Integration

Voxtra connects to Asterisk on two channels:

ARI (Asterisk REST Interface): control plane. HTTP for call operations, WebSocket for events.
AudioSocket: media plane. A simple framed TCP protocol (a 3-byte header made of a 1-byte type and a 2-byte big-endian length, followed by the payload). Voxtra's AudioSocketServer accepts the connection Asterisk opens; no RTP/NAT/SDP to worry about.
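
To make the AudioSocket framing concrete, here is a small encoder and incremental decoder. It assumes Asterisk's published AudioSocket layout of a 3-byte header (1-byte type followed by a 2-byte big-endian payload length); the type constants follow that documentation as best understood, and the function names are hypothetical rather than Voxtra's `AudioSocketConnection` API.

```python
import struct
from typing import List, Tuple

# Frame types, per the Asterisk AudioSocket documentation (assumed here).
KIND_HANGUP = 0x00
KIND_UUID = 0x01
KIND_AUDIO = 0x10
KIND_ERROR = 0xFF

HEADER = struct.Struct(">BH")  # 1-byte type, 2-byte big-endian length


def encode_frame(kind: int, payload: bytes = b"") -> bytes:
    """Prefix the payload with its type and length."""
    return HEADER.pack(kind, len(payload)) + payload


def decode_frames(buffer: bytes) -> Tuple[List[Tuple[int, bytes]], bytes]:
    """Return complete (kind, payload) frames plus any trailing partial bytes."""
    frames: List[Tuple[int, bytes]] = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        kind, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet; wait for more data
        frames.append((kind, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

Because TCP delivers a byte stream rather than discrete messages, the decoder returns the unconsumed tail so the caller can prepend it to the next read.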

Add this to your dialplan to route inbound calls into the Voxtra Stasis app:

[voxtra-inbound]
exten => _X.,1,Stasis(voxtra)
 same => n,Hangup()

Voxtra opens AudioSocket connections on demand the first time a handler calls session.audio_stream(), session.say(), session.listen(), or any other audio I/O.

Supported Providers

Telephony

Asterisk (ARI) — Production ready
LiveKit (SIP) — Planned
FreeSWITCH — Planned

Speech-to-Text

Deepgram (streaming)
More coming soon

LLM / Agents

OpenAI (GPT-4o, streaming)
LangGraph integration planned

Text-to-Speech

ElevenLabs (streaming)
Cartesia (streaming)
More coming soon

Project Structure

src/voxtra/
├── app.py                       # VoxtraApp — entry point, lifecycle, from_yaml/from_config
├── session.py                   # CallSession + AgentClient (say/listen/agent)
├── router.py                    # Decorator-based call routing
├── registry.py                  # Provider plugin registry (STT/TTS/LLM/VAD/telephony/media)
├── events.py                    # VoxtraEvent + typed subclasses
├── config.py                    # Pydantic config models + VoxtraConfig.from_yaml
├── middleware.py                # Event middleware
├── exceptions.py                # Custom exceptions
├── types.py                     # AudioChunk, CallState, AudioCodec, SIPTrunk, …
├── cli.py                       # `voxtra` CLI: start, init, info, check
├── ari/                         # Asterisk ARI client
│   ├── client.py                #   async HTTP + WebSocket client
│   ├── events.py                #   ARIEvent typed model
│   └── models.py                #   Channel / Bridge / Playback Pydantic models
├── audio/                       # AudioSocket — TCP audio I/O with Asterisk
│   ├── socket.py                #   AudioSocketServer + AudioSocketConnection
│   └── codec.py                 #   μ-law / A-law / PCM-S16LE conversion
├── telephony/                   # Backend abstraction
│   ├── base.py                  #   BaseTelephonyAdapter ABC
│   ├── asterisk/adapter.py      #   AsteriskAdapter (wraps ARIClient)
│   └── livekit/                 #   LiveKit adapter (stub)
├── media/                       # Frame-oriented media stack used by VoicePipeline
│   ├── audio.py                 #   AudioFrame + codec helpers
│   ├── base.py                  #   BaseMediaTransport ABC
│   ├── websocket.py             #   WebSocket transport
│   ├── buffer.py                #   Audio buffering
│   └── session_transport.py     #   Bridges CallSession ↔ BaseMediaTransport
├── core/
│   └── pipeline.py              # VoicePipeline — STT → LLM → TTS orchestration
├── provisioning/                # Per-tenant Asterisk config generation (optional)
│   └── provisioner.py           #   pjsip / extensions / ari fragment writer
└── ai/
    ├── stt/                     # Speech-to-Text providers (Deepgram, …)
    ├── tts/                     # Text-to-Speech providers (ElevenLabs, Cartesia, …)
    ├── llm/                     # LLM / Agent providers (OpenAI, …)
    └── vad/                     # Voice Activity Detection

Documentation

Architecture: Deep dive into every layer, component, data flow, and design decision
Contributing: How to set up a dev environment, add providers, and submit PRs, plus code standards

Development

git clone git@github.com:rexplore-ai/voxtra.git
cd voxtra
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest

Roadmap

Shipped in 0.3.0:

Core abstractions (VoxtraApp, Router, CallSession, Events)
Asterisk ARI adapter (wraps async ARIClient, conforms to BaseTelephonyAdapter)
AudioSocket TCP transport + μ-law / A-law / PCM codec helpers
AI provider interfaces (STT, TTS, LLM, VAD) + Registry plugin system
Voice pipeline (STT → LLM → TTS), auto-wired per session when providers configured
High-level session API: say(text), listen(timeout=), agent.respond(text)
VoxtraApp.from_yaml(path) / from_config(VoxtraConfig) + working voxtra start CLI
WebSocket media transport
Per-tenant Asterisk provisioning (config file generation)

Planned:

End-to-end Asterisk + AI demo with recordings
LiveKit adapter (currently a stub)
FreeSWITCH adapter
HMAC-signed backend webhook emitter
Recording sinks (GCS / S3) — auto-upload after record_stop()
Multi-tenant runtime supervisor (one ARI app per tenant)
Provisioner stage 2: SSH + asterisk -rx reload
Prometheus metrics ASGI sub-app
LangGraph agent integration
Multi-agent handoff
Dashboard / Admin API
Conversation analytics

Contributors

Thanks to everyone who has contributed to Voxtra!

Patrick Byamasu — Creator & Lead Maintainer

Want to contribute? Check out our Contributing Guide.

License

Apache 2.0 — See LICENSE

Voxtra: The LangGraph of AI Telephony
Built by Rexplore Research Labs

Release history

0.3.1 (this version): May 4, 2026
0.3.0: May 4, 2026
0.1.0b2 (pre-release): Mar 9, 2026
0.1.0b1 (pre-release): Mar 9, 2026
0.1.0a1 (pre-release): Mar 9, 2026

Download files

Source Distribution

voxtra-0.3.1.tar.gz (136.1 kB)

Uploaded May 4, 2026 Source

Built Distribution

voxtra-0.3.1-py3-none-any.whl (89.8 kB)

Uploaded May 4, 2026 Python 3

File details

Details for the file voxtra-0.3.1.tar.gz.

File metadata

Download URL: voxtra-0.3.1.tar.gz
Upload date: May 4, 2026
Size: 136.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for voxtra-0.3.1.tar.gz
Algorithm	Hash digest
SHA256	`75a2d9b1b5d829b47e9835cda2704174dcd0b97be5e019e90021b6e4465885d4`
MD5	`3f19c872e6ae845334d5c76a6a3c5d33`
BLAKE2b-256	`9befa90251f7fa884ba918cb74995442a74b7eecf9ce768e13b90f5de32c2c3d`

File details

Details for the file voxtra-0.3.1-py3-none-any.whl.

File metadata

Download URL: voxtra-0.3.1-py3-none-any.whl
Upload date: May 4, 2026
Size: 89.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for voxtra-0.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`54cd4334026bdebaeb0123785265958276fc37123d69febbab9df9678d39d9cc`
MD5	`4fa877b567824662fc9d014ecac5886f`
BLAKE2b-256	`ea1d9c9f25c58f5a3c5bc915f1a670580d7ae51172654a052a0c9c0a594af86d`
