Cost tracking and reconciliation for LiveKit voice agents: modality-aware unit accounting (audio-minutes, tokens, characters) backed by voice-prices.

🎙️ VoiceGateway

Voice AI cost transparency. Self-hosted, on your keys.

Drop-in for livekit.agents.inference. Per-call cost rows, voice metrics, conversation replay, multi-tenant attribution, cross-modality routing, voice guardrails. All open source.

License: MIT · Python 3.11+ · LiveKit Agents 1.x

🚀 Install in 60 seconds · 📚 Docs · 📊 Dashboard · 🎯 Roadmap · 🤝 Contributing

from livekit.agents import AgentSession
from voicegateway import inference          # <- the only line that changed

session = AgentSession(
    stt=inference.STT("deepgram/nova-3"),
    llm=inference.LLM("openai/gpt-4o-mini"),
    tts=inference.TTS("cartesia/sonic-3"),
)
# every call logged: provider, model, tokens, $cost, latency, session_id

A drop-in cost and quality observability layer for LiveKit Agents. Modality-aware unit accounting (audio-minutes, tokens, characters) with LLM, STT, and TTS prices from voice-prices. Reconcile recorded numbers against your actual provider invoices with one command. Self-hosted. Your keys. No data leaves your infra.
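
The reconciliation idea can be sketched in a few lines. This is only an illustration of comparing recorded totals with invoiced totals, not how voicegw reconcile is implemented; the function name, the per-provider dictionaries, the one-cent tolerance, and all amounts are assumptions made up for the example.

```python
from decimal import Decimal


def reconcile(recorded, invoiced, tolerance=Decimal("0.01")):
    """Return {provider: invoiced - recorded} for every gap above tolerance."""
    mismatches = {}
    for provider in sorted(set(recorded) | set(invoiced)):
        rec = recorded.get(provider, Decimal("0"))
        inv = invoiced.get(provider, Decimal("0"))
        delta = inv - rec
        if abs(delta) > tolerance:
            mismatches[provider] = delta
    return mismatches


# Made-up monthly totals in dollars, for illustration only.
recorded = {"deepgram": Decimal("12.40"), "openai": Decimal("3.15")}
invoiced = {"deepgram": Decimal("12.40"), "openai": Decimal("3.42")}

print(reconcile(recorded, invoiced))  # {'openai': Decimal('0.27')}
```

Decimal is used instead of float so that cent-level differences are not blurred by binary rounding.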

🚀 Install in 60 seconds

curl -fsSL https://voicegateway.mahimai.ca/install.sh | bash
voicegw onboard --install-daemon

The installer detects your OS, ensures Python 3.11+ and pipx, installs voicegateway[cloud,dashboard], runs a five-question wizard, validates your provider key, and registers a per-user daemon (LaunchAgent / systemd / Scheduled Task). Open the dashboard at http://localhost:9090 to see your first cost row land.

If uv is already on your PATH, the installer auto-detects it and uses uv tool install instead of pipx. Same ~/.local/bin/voicegw outcome, faster cold install.

Prefer a manual install? Run pipx install "voicegateway[cloud,dashboard,mcp]", then voicegw init && voicegw dashboard.

🎯 Why VoiceGateway

Voice AI vendors hide three numbers. VoiceGateway exposes them.

Is this working? Voice has metrics text stacks do not: latency p50/p95 across the STT → LLM → TTS loop, interruption rate, dead air, talk-over. The dashboard shows all of them per call.

What does it cost? STT bills by audio duration (accounted in audio-minutes), LLM bills by tokens, TTS bills by characters. Every call is broken down by modality and totaled to the cent. Run voicegw reconcile to verify recorded numbers against your actual provider invoices.
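
As a rough sketch of what a per-modality breakdown involves, the example below prices one call from its audio duration, token count, and character count. The Usage class, the call_cost function, and all three prices are hypothetical placeholders; real prices come from voice-prices, and the actual cost-row schema is not shown here.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class Usage:
    audio_seconds: int    # STT input duration
    llm_tokens: int       # prompt + completion tokens
    tts_characters: int   # characters sent to TTS


# Hypothetical prices, NOT real provider rates.
PRICE_PER_AUDIO_MINUTE = Decimal("0.0060")
PRICE_PER_1K_TOKENS = Decimal("0.0010")
PRICE_PER_1K_CHARACTERS = Decimal("0.0300")


def call_cost(usage):
    """Break one call's cost down by modality and total it to the cent."""
    stt = Decimal(usage.audio_seconds) / Decimal(60) * PRICE_PER_AUDIO_MINUTE
    llm = Decimal(usage.llm_tokens) / Decimal(1000) * PRICE_PER_1K_TOKENS
    tts = Decimal(usage.tts_characters) / Decimal(1000) * PRICE_PER_1K_CHARACTERS
    total = (stt + llm + tts).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return {"stt": stt, "llm": llm, "tts": tts, "total": total}


cost = call_cost(Usage(audio_seconds=600, llm_tokens=5000, tts_characters=4000))
print(cost["total"])  # 0.19
```

Rounding happens once, on the total, so per-modality fractions of a cent are not lost before they are summed.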

How do I make it cheaper? Route by combined STT + LLM + TTS latency budget across providers. Switch models per call type. Attribute cost per tenant so agency clients see only their own usage.
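
One way to read "route by combined latency budget" is: among all STT + LLM + TTS combinations whose summed latency fits the budget, pick the cheapest. The sketch below does exactly that and nothing more. The selection policy, the Candidate type, the model names, and every latency and cost figure are assumptions for illustration; VoiceGateway's actual router may work differently, and adding per-stage p95 values is itself a simplification.

```python
from itertools import product
from typing import NamedTuple, Optional, Sequence, Tuple


class Candidate(NamedTuple):
    model: str        # hypothetical model ID
    p95_ms: int       # observed p95 latency for this stage
    cost_micros: int  # estimated cost per call, in millionths of a dollar


def pick_route(
    stt: Sequence[Candidate],
    llm: Sequence[Candidate],
    tts: Sequence[Candidate],
    budget_ms: int,
) -> Optional[Tuple[str, ...]]:
    """Cheapest (stt, llm, tts) combination within the latency budget, or None."""
    best = None
    for combo in product(stt, llm, tts):
        if sum(c.p95_ms for c in combo) > budget_ms:
            continue  # over budget, skip
        cost = sum(c.cost_micros for c in combo)
        if best is None or cost < best[0]:
            best = (cost, combo)
    if best is None:
        return None
    return tuple(c.model for c in best[1])


# Made-up observations.
stt = [Candidate("stt-fast", 150, 900), Candidate("stt-cheap", 400, 400)]
llm = [Candidate("llm-fast", 300, 1500), Candidate("llm-cheap", 700, 500)]
tts = [Candidate("tts-fast", 120, 2000), Candidate("tts-cheap", 350, 1200)]

print(pick_route(stt, llm, tts, budget_ms=1000))
# ('stt-fast', 'llm-cheap', 'tts-fast')
```

The exhaustive search is fine for a handful of candidates per modality; a larger roster would call for pruning.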

If you are building a text-only LLM application without a voice component, LiteLLM is likely a better fit. See the decision tree.

📦 What's in the box

| Capability | What it gives you |
| --- | --- |
| LiveKit Cloud parity | Drop-in for livekit.agents.inference. Your keys, your config |
| Daemon-first onboarding | Curl-bash install, OS daemon, five-question wizard, voicegw doctor |
| Terminal UI | voicegw tui opens a vim-key Textual UI for SSH-in inspection |
| Public-API discipline | Subpackage layout, CHANGELOG, CONTRIBUTING, SECURITY, explicit __all__ |
| Voice-conversation metrics | Per-minute cost, latency p50/p95, interruptions, dead air, talk-over |
| Conversation replay | Scrub any past call. STT chunks, LLM tokens, TTS frames with timing and cost |
| Multi-tenant attribution | Per-tenant cost, virtual API keys per team, agency-ready |
| Cross-modality routing | Route by combined STT + LLM + TTS latency budget. Per-project rosters. White-label branding |
| Voice-specific guardrails | Real-time PII detection in STT, prompt-injection detection, compliance hooks |

Full release history: CHANGELOG.md.

🚧 Roadmap

  • Enterprise auth, audit log, SOC 2 prep
  • One-tap latency probe
  • Stability commitment, LTS branch policy

📊 The dashboard

A self-hosted web UI at http://localhost:9090. Bundled. No SaaS account. No data leaves your stack.

  • Overview — total requests, cost today, active models, per-project summary cards
  • Costs — daily spend with per-provider / model / project / tenant breakdown
  • Sessions — every call, every cost row, routing decisions, budget overruns
  • Metrics — p50/p95/p99 latency, interruption rate, dead air, talk-over
  • Replay — scrub through STT chunks, LLM tokens, TTS frames with timing
  • Routing — live per-provider latency observations, sortable
  • Virtual Keys — issue + revoke per-team scoped keys
  • Guardrails — PII / prompt-injection counts per project, session drilldown
  • Settings — providers, projects, branding (logo, accent color, product name)
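
For readers unfamiliar with the p50/p95/p99 figures on the Metrics page, here is a minimal nearest-rank percentile over per-turn latencies. The dashboard's exact percentile method is not documented in this README, so nearest-rank is an assumption, and the sample latencies are invented.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Invented end-to-end turn latencies in milliseconds.
latencies_ms = [820, 640, 910, 700, 1500, 760, 880, 690, 1020, 730]

print(percentile(latencies_ms, 50))  # 760
print(percentile(latencies_ms, 95))  # 1500
```

With only ten samples, p95 and p99 both land on the slowest turn, which is why tail percentiles need a reasonable number of calls before they mean much.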

White-label brand support: upload a logo, pick an accent color, set a product name; the whole dashboard re-skins for your project.

🤖 Manage from your coding agent (MCP)

VoiceGateway ships a first-class Model Context Protocol server. Claude Code, Cursor, Codex, and Cline can configure providers, create projects, check costs, and tail logs through natural language.

Local (stdio):

pipx inject voicegateway "voicegateway[mcp]"
claude mcp add voicegateway --command "voicegw mcp --transport stdio"

Remote (HTTP/SSE with bearer auth):

export VOICEGW_MCP_TOKEN=$(openssl rand -hex 32)
voicegw mcp --transport http --port 8090
claude mcp add voicegateway \
  --transport sse \
  --url https://your-host.fly.dev/mcp/sse \
  --header "Authorization: Bearer $VOICEGW_MCP_TOKEN"

17 tools exposed: observability, providers, models, projects. Destructive ops (delete_*) require explicit confirm=True after a preview. Full MCP reference →
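
The preview-then-confirm rule for destructive operations can be pictured as follows. This is a sketch of the pattern only: the delete_project name follows the delete_* convention mentioned above, but its signature, the in-memory projects dictionary, and the return shape are all hypothetical, not the real MCP tool contract.

```python
def delete_project(projects, name, confirm=False):
    """Delete a project only when confirm=True; otherwise return a preview."""
    if name not in projects:
        raise KeyError(f"unknown project: {name}")
    if not confirm:
        # First call: describe what would happen, change nothing.
        return {"deleted": False, "preview": f"Would delete project {name!r}"}
    del projects[name]
    return {"deleted": True, "preview": None}


projects = {"demo": {"models": 3}}  # stand-in for real project storage

first = delete_project(projects, "demo")                 # preview only
second = delete_project(projects, "demo", confirm=True)  # actually deletes
print(first["preview"], second["deleted"])
```

Defaulting confirm to False means an agent that forgets the flag can never delete anything by accident.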

🛠️ Supported providers

11 providers across cloud and local. Mix and match per call.

| Modality | Cloud | Local |
| --- | --- | --- |
| STT | Deepgram, OpenAI Whisper, AssemblyAI, Groq, Cartesia | faster-whisper |
| LLM | OpenAI, Anthropic, Groq | Ollama (any compatible) |
| TTS | Cartesia, ElevenLabs, Deepgram Aura-2, OpenAI | Kokoro, Piper |
| VAD | Silero | Silero |
| Turn detector | LiveKit MultilingualModel | |

Per-model IDs: voicegateway.mahimai.ca/docs/configuration/providers. Adding a provider takes ~10 steps: contributing/adding-a-provider.

🧱 Architecture

flowchart TB
    A[LiveKit Agent] --> B[voicegateway.inference]
    B --> C[Router]
    C --> D[Cloud Providers]
    C --> E[Local Providers]
    B --> F[Middleware Pipeline]
    F --> F1[Cost Tracker]
    F --> F2[Latency Monitor]
    F --> F3[Guardrails]
    F --> F4[Multi-tenant Attribution]
    F --> G[(SQLite · encrypted)]
    G --> H[Dashboard UI]
    G --> I[MCP Server]
    I --> J[Claude Code · Cursor · Codex]

Async throughout. Modular provider installs: pip install "voicegateway[openai,deepgram]" pulls only what you use. YAML config with ${ENV_VAR} substitution. SQLite at the bottom for portability; encrypted with Fernet at rest.
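
To make the ${ENV_VAR} substitution concrete, here is a minimal sketch that expands placeholders in raw config text before it is parsed. It is not VoiceGateway's loader: the substitute_env helper, the choice to fail on an unset variable, and the YAML keys in the sample are assumptions for illustration.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(text, env=None):
    """Replace each ${NAME} in text with env[NAME]; raise if NAME is unset."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(replace, text)


# Hypothetical config fragment; the real voicegw.yaml schema may differ.
raw = "providers:\n  deepgram:\n    api_key: ${DEEPGRAM_API_KEY}\n"
print(substitute_env(raw, {"DEEPGRAM_API_KEY": "dg-example"}))
```

Failing loudly on a missing variable keeps an empty API key from silently reaching a provider.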

Architecture deep dive →

🐳 Docker Compose

services:
  voicegateway:
    image: mahimailabs/voicegateway:latest
    ports: ["8080:8080"]
    env_file: .env
    volumes:
      - ./voicegw.yaml:/app/voicegw.yaml:ro
      - voicegw_data:/data

  dashboard:
    image: mahimailabs/voicegateway-dashboard:latest
    ports: ["9090:9090"]
    depends_on: [voicegateway]

# Named volumes must be declared at the top level.
volumes:
  voicegw_data:

docker compose up -d                      # core + dashboard
docker compose --profile local up -d      # + Ollama for local LLMs

🌐 HTTP API

voicegw serve --port 8080

| Endpoint | Purpose |
| --- | --- |
| GET /health | Health check |
| GET /v1/status | Provider health + model count |
| GET /v1/models · GET /v1/providers · GET /v1/projects | Resource CRUD |
| GET /v1/costs?period=today&project=X&tenant=Y | Cost summary |
| GET /v1/sessions/{id}/turns · /v1/sessions/{id}/replay · /v1/sessions/{id}/dead_air | Voice-conversation surfaces |
| GET /v1/routing/observations | Live per-provider latency |
| GET /v1/virtual_keys + CRUD | Per-team scoped keys |
| GET /v1/audit-log · GET /v1/metrics | Audit + Prometheus metrics |

Full reference: voicegateway.mahimai.ca/docs/api/http-api.
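
A small client for the cost summary endpoint might look like the sketch below. The path and the period, project, and tenant query parameters come from the table above; the helper names are hypothetical, the sketch assumes no authentication header is needed, and the response is returned as parsed JSON without assuming anything about its fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def costs_url(base_url, period="today", project=None, tenant=None):
    """Build the GET /v1/costs URL, omitting filters that are not set."""
    params = {"period": period}
    if project:
        params["project"] = project
    if tenant:
        params["tenant"] = tenant
    return f"{base_url.rstrip('/')}/v1/costs?{urlencode(params)}"


def fetch_costs(base_url, **filters):
    """Fetch the cost summary; requires a running `voicegw serve`."""
    with urlopen(costs_url(base_url, **filters), timeout=10) as response:
        return json.load(response)


print(costs_url("http://localhost:8080", project="X", tenant="Y"))
# http://localhost:8080/v1/costs?period=today&project=X&tenant=Y
```

Calling fetch_costs("http://localhost:8080", project="X") would perform the request against a local server started with voicegw serve --port 8080.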

📦 Install options

pip install voicegateway                              # core engine
pip install "voicegateway[dashboard]"                 # + web UI
pip install "voicegateway[cloud]"                     # + cloud provider plugins
pip install "voicegateway[local]"                     # + local runtimes (Whisper, Kokoro, Piper)
pip install "voicegateway[mcp]"                       # + MCP server
pip install "voicegateway[tui]"                       # + voicegw tui
pip install "voicegateway[all,dashboard,mcp,tui]"     # everything

Python 3.11+. Local extras pull larger ML runtimes.

Zero-install one-shot (uvx). For CI smoke tests, status checks, or quick runs without a persistent install:

uvx --from "voicegateway[cloud]" voicegw status
uvx --from "voicegateway[cloud,dashboard]" voicegw serve --port 8080

uvx pulls the wheel into a throwaway environment per run; uv's wheel cache makes second runs fast. Pin a version in scripts (uvx --from "voicegateway[cloud]==0.5.0" voicegw status) to avoid surprise upgrades. Not for daemon mode: uvx cannot register a LaunchAgent / systemd unit; use the curl-bash installer for the persistent flow.

📚 Docs

Full documentation: voicegateway.mahimai.ca/docs.

Quick links: Quick start · First agent · Projects · Configuration · CLI reference · Decision tree

🤝 Contributing

Issues and PRs welcome.

git clone https://github.com/mahimailabs/voicegateway
cd voicegateway
pip install -e ".[all,dashboard,mcp,dev]"
pytest

Before submitting a PR, read CONTRIBUTING.md and CODE_OF_CONDUCT.md. Security issues go through the disclosure flow in SECURITY.md, not a public issue.

📜 License

MIT. Fork it, ship it.

🙌 Built by

Mahimai Raja, founder of Mahimai AI, a voice AI company. Building VoiceGateway in public.

Built on the shoulders of giants: LiveKit Agents, FastAPI, Pydantic, voice-prices (a fork of pydantic/genai-prices), cryptography, Model Context Protocol.
