Cost tracking and reconciliation for LiveKit voice agents: modality-aware unit accounting (audio-minutes, tokens, characters) backed by voice-prices.

🎙️ VoiceGateway

Voice AI cost transparency. Self-hosted, on your keys.

Drop-in for livekit.agents.inference. Per-call cost rows, voice metrics, conversation replay, multi-tenant attribution, cross-modality routing, voice guardrails. All open source.

License: MIT · Python 3.11+ · LiveKit Agents 1.x

🚀 Install in 60 seconds · 📚 Docs · 📊 Dashboard · 🎯 Roadmap · 🤝 Contributing

from livekit.agents import AgentSession
from voicegateway import inference          # <- the only line that changed

session = AgentSession(
    stt=inference.STT("deepgram/nova-3"),
    llm=inference.LLM("openai/gpt-4o-mini"),
    tts=inference.TTS("cartesia/sonic-3"),
)
# every call logged: provider, model, tokens, $cost, latency, session_id

A drop-in cost and quality observability layer for LiveKit Agents. Modality-aware unit accounting (audio-minutes, tokens, characters) with LLM, STT, and TTS prices from voice-prices. Reconcile recorded numbers against your actual provider invoices with one command. Self-hosted. Your keys. No data leaves your infra.

🚀 Install in 60 seconds

curl -fsSL https://voicegateway.mahimai.ca/install.sh | bash
voicegw onboard --install-daemon

The installer detects your OS, ensures Python 3.11+ and pipx, installs voicegateway[cloud,dashboard], runs a five-question wizard, validates your provider key, and registers a per-user daemon (LaunchAgent / systemd / Scheduled Task). Open the dashboard at http://localhost:9090 to see your first cost row land.

If uv is already on your PATH, the installer auto-detects it and uses uv tool install instead of pipx. Same ~/.local/bin/voicegw outcome, faster cold install.

Prefer manual install? pipx install "voicegateway[cloud,dashboard,mcp]" then voicegw init && voicegw dashboard.

🎯 Why VoiceGateway

Voice AI vendors hide three numbers. VoiceGateway exposes them.

Is this working? Voice has metrics text stacks do not: latency p50/p95 across the STT → LLM → TTS loop, interruption rate, dead air, talk-over. The dashboard shows all of them per call.

What does it cost? STT bills by audio duration, LLMs bill by tokens, and TTS bills by characters. Every call is broken down by modality and totaled to the cent. Run voicegw reconcile to verify recorded numbers against your actual provider invoices.

How do I make it cheaper? Route by combined STT + LLM + TTS latency budget across providers. Switch models per call type. Per-tenant cost attribution so agency clients see only their own usage.
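
The per-modality billing described above can be sketched in a few lines. This is an illustration only, not VoiceGateway's implementation: the prices, the `CallUsage` shape, and the function names are all hypothetical (real prices come from voice-prices), and the tolerance check is just one plausible way to compare a recorded total against an invoice line.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical prices for illustration only; real prices come from voice-prices.
STT_PER_AUDIO_MINUTE = Decimal("0.0050")
LLM_PER_1K_INPUT_TOKENS = Decimal("0.00015")
LLM_PER_1K_OUTPUT_TOKENS = Decimal("0.00060")
TTS_PER_1K_CHARACTERS = Decimal("0.0300")


@dataclass(frozen=True)
class CallUsage:
    """Usage for one call, in each modality's native billing unit (assumed shape)."""
    audio_seconds: int
    input_tokens: int
    output_tokens: int
    tts_characters: int


def cost_breakdown(usage: CallUsage) -> dict[str, Decimal]:
    """Price each modality in its own unit, then total the call."""
    stt = Decimal(usage.audio_seconds) / 60 * STT_PER_AUDIO_MINUTE
    llm = (
        Decimal(usage.input_tokens) / 1000 * LLM_PER_1K_INPUT_TOKENS
        + Decimal(usage.output_tokens) / 1000 * LLM_PER_1K_OUTPUT_TOKENS
    )
    tts = Decimal(usage.tts_characters) / 1000 * TTS_PER_1K_CHARACTERS
    return {"stt": stt, "llm": llm, "tts": tts, "total": stt + llm + tts}


def to_cents(amount: Decimal) -> Decimal:
    """Round a dollar amount to whole cents."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def within_tolerance(
    recorded: Decimal, invoiced: Decimal, tolerance: Decimal = Decimal("0.01")
) -> bool:
    """True when a recorded total matches an invoice total within the tolerance."""
    return abs(recorded - invoiced) <= tolerance


usage = CallUsage(audio_seconds=120, input_tokens=2000, output_tokens=500, tts_characters=1500)
breakdown = cost_breakdown(usage)
print({name: str(value) for name, value in breakdown.items()})
print("rounded total:", to_cents(breakdown["total"]))
```

Decimal is used instead of float so that per-call amounts well under one cent add up without binary rounding drift.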

If you are building a text-only LLM application without a voice component, LiteLLM is likely a better fit. See the decision tree.

📦 What's in the box

| Capability | What it gives you |
|---|---|
| LiveKit Cloud parity | Drop-in for livekit.agents.inference. Your keys, your config |
| Daemon-first onboarding | Curl-bash install, OS daemon, five-question wizard, voicegw doctor |
| Terminal UI | voicegw tui opens a vim-key Textual UI for SSH-in inspection |
| Public-API discipline | Subpackage layout, CHANGELOG, CONTRIBUTING, SECURITY, explicit __all__ |
| Voice-conversation metrics | Per-minute cost, latency p50/p95, interruptions, dead air, talk-over |
| Conversation replay | Scrub any past call. STT chunks, LLM tokens, TTS frames with timing and cost |
| Multi-tenant attribution | Per-tenant cost, virtual API keys per team, agency-ready |
| Cross-modality routing | Route by combined STT + LLM + TTS latency budget. Per-project rosters. White-label branding |
| Voice-specific guardrails | Real-time PII detection in STT, prompt-injection detection, compliance hooks |
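
To make the cross-modality routing row concrete, here is a minimal sketch of choosing one STT, one LLM, and one TTS candidate under a combined latency budget. It is not VoiceGateway's router: the `Candidate` type, the `pick_pipeline` function, the placeholder model names, and all latency and cost figures are assumptions made up for the example.

```python
from itertools import product
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    """One provider/model option with observed latency and estimated cost (hypothetical)."""
    model: str
    p95_latency_ms: float
    cost_per_call: float


def pick_pipeline(
    stt: Sequence[Candidate],
    llm: Sequence[Candidate],
    tts: Sequence[Candidate],
    budget_ms: float,
) -> Optional[tuple[Candidate, Candidate, Candidate]]:
    """Return the cheapest (stt, llm, tts) combination whose summed latency fits the budget."""
    feasible = [
        combo
        for combo in product(stt, llm, tts)
        if sum(c.p95_latency_ms for c in combo) <= budget_ms
    ]
    if not feasible:
        return None  # nothing fits; the caller decides how to degrade
    return min(feasible, key=lambda combo: sum(c.cost_per_call for c in combo))


# Placeholder candidates; names and numbers are invented for illustration.
STT_OPTIONS = [Candidate("stt-a", 150, 0.004), Candidate("stt-b", 300, 0.002)]
LLM_OPTIONS = [Candidate("llm-a", 400, 0.010), Candidate("llm-b", 700, 0.003)]
TTS_OPTIONS = [Candidate("tts-a", 120, 0.006), Candidate("tts-b", 250, 0.004)]

choice = pick_pipeline(STT_OPTIONS, LLM_OPTIONS, TTS_OPTIONS, budget_ms=1000)
print([c.model for c in choice] if choice else "no combination fits the budget")
```

Summing per-stage p95 values is a rough simplification, since the p95 of the whole loop is not the sum of the stage p95s. It does show why the budget is applied to the combination rather than to each stage on its own.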

Full release history: CHANGELOG.md.

🚧 Roadmap

  • Enterprise auth, audit log, SOC 2 prep
  • One-tap latency probe
  • Stability commitment, LTS branch policy

📊 The dashboard

A self-hosted web UI at http://localhost:9090. Bundled. No SaaS account. No data leaves your stack.

  • Overview — total requests, cost today, active models, per-project summary cards
  • Costs — daily spend with per-provider / model / project / tenant breakdown
  • Sessions — every call, every cost row, routing decisions, budget overruns
  • Metrics — p50/p95/p99 latency, interruption rate, dead air, talk-over
  • Replay — scrub through STT chunks, LLM tokens, TTS frames with timing
  • Routing — live per-provider latency observations, sortable
  • Virtual Keys — issue + revoke per-team scoped keys
  • Guardrails — PII / prompt-injection counts per project, session drilldown
  • Settings — providers, projects, branding (logo, accent color, product name)
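
For readers unfamiliar with the percentile figures in the Metrics view, the sketch below shows one common way to derive p50/p95/p99 from raw latency samples using only the standard library. How the dashboard actually computes them is not documented here, so treat the function name and the interpolation method as assumptions.

```python
from statistics import quantiles


def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute p50/p95/p99 from raw latency samples (milliseconds)."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to compute percentiles")
    # n=100 yields 99 cut points; cut point i (1-based) is the i-th percentile.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


# Synthetic samples: 1 ms through 101 ms, one of each.
print(latency_percentiles(list(range(1, 102))))
```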

White-label brand support: upload a logo, pick an accent color, set a product name; the whole dashboard re-skins for your project.

🤖 Manage from your coding agent (MCP)

VoiceGateway ships a first-class Model Context Protocol server. Claude Code, Cursor, Codex, and Cline can configure providers, create projects, check costs, and tail logs through natural language.

Local (stdio):

pipx inject voicegateway "voicegateway[mcp]"
claude mcp add voicegateway -- voicegw mcp --transport stdio

Remote (HTTP/SSE with bearer auth):

export VOICEGW_MCP_TOKEN=$(openssl rand -hex 32)
voicegw mcp --transport http --port 8090
claude mcp add --transport sse voicegateway \
  https://your-host.fly.dev/mcp/sse \
  --header "Authorization: Bearer $VOICEGW_MCP_TOKEN"

17 tools exposed: observability, providers, models, projects. Destructive ops (delete_*) require explicit confirm=True after a preview. Full MCP reference →
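
The preview-then-confirm behavior of destructive tools can be pictured with a small sketch. The `delete_project` function and the in-memory `PROJECTS` store below are hypothetical stand-ins, not the real MCP tool implementations. Only the `confirm=True` convention is taken from the description above.

```python
from typing import Any

# Hypothetical in-memory store standing in for real project storage.
PROJECTS: dict[str, dict[str, Any]] = {"demo": {"sessions": 42}}


def delete_project(name: str, confirm: bool = False) -> dict[str, Any]:
    """Preview the deletion by default; delete only when confirm=True is passed."""
    if name not in PROJECTS:
        raise KeyError(f"unknown project: {name}")
    if not confirm:
        # First call: report what would be removed and change nothing.
        return {"deleted": False, "preview": {"project": name, **PROJECTS[name]}}
    removed = PROJECTS.pop(name)
    return {"deleted": True, "removed": {"project": name, **removed}}


print(delete_project("demo"))                # preview only
print(delete_project("demo", confirm=True))  # actually deletes
```

Defaulting `confirm` to False means an agent that calls the tool with only a name gets a preview rather than an irreversible change.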

🛠️ Supported providers

11 providers across cloud and local. Mix and match per call.

Modality Cloud Local
STT Deepgram, OpenAI Whisper, AssemblyAI, Groq, Cartesia faster-whisper
LLM OpenAI, Anthropic, Groq Ollama (any compatible)
TTS Cartesia, ElevenLabs, Deepgram Aura-2, OpenAI Kokoro, Piper
VAD Silero Silero
Turn detector LiveKit MultilingualModel

Per-model IDs: voicegateway.mahimai.ca/docs/configuration/providers. Adding a provider takes ~10 steps: contributing/adding-a-provider.

🧱 Architecture

flowchart TB
    A[LiveKit Agent] --> B[voicegateway.inference]
    B --> C[Router]
    C --> D[Cloud Providers]
    C --> E[Local Providers]
    B --> F[Middleware Pipeline]
    F --> F1[Cost Tracker]
    F --> F2[Latency Monitor]
    F --> F3[Guardrails]
    F --> F4[Multi-tenant Attribution]
    F --> G[(SQLite · encrypted)]
    G --> H[Dashboard UI]
    G --> I[MCP Server]
    I --> J[Claude Code · Cursor · Codex]

Async throughout. Modular provider installs: pip install "voicegateway[openai,deepgram]" pulls only what you use. YAML config with ${ENV_VAR} substitution. SQLite at the bottom for portability; encrypted with Fernet at rest.
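
As an illustration of the ${ENV_VAR} substitution mentioned above, the following sketch expands such references in config text before it is parsed as YAML. It is an assumption about the general technique, not VoiceGateway's loader: `substitute_env`, the sample config keys, and the choice to fail on unset variables are all hypothetical.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME}, where NAME is a conventional environment variable identifier.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace every ${NAME} in text with its value; raise if a variable is unset."""
    values = os.environ if env is None else env

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"environment variable {name} is not set")
        return values[name]

    return _ENV_REF.sub(replace, text)


# Hypothetical config fragment; the key names are invented for the example.
raw = "providers:\n  deepgram:\n    api_key: ${DEEPGRAM_API_KEY}\n"
print(substitute_env(raw, {"DEEPGRAM_API_KEY": "dg-example"}))
```

Failing loudly on an unset variable, rather than substituting an empty string, surfaces a missing key at startup instead of at the first provider call.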

Architecture deep dive →

🐳 Docker Compose

services:
  voicegateway:
    image: mahimailabs/voicegateway:latest
    ports: ["8080:8080"]
    env_file: .env
    volumes:
      - ./voicegw.yaml:/app/voicegw.yaml:ro
      - voicegw_data:/data

  dashboard:
    image: mahimailabs/voicegateway-dashboard:latest
    ports: ["9090:9090"]
    depends_on: [voicegateway]

docker compose up -d                      # core + dashboard
docker compose --profile local up -d      # + Ollama for local LLMs

🌐 HTTP API

voicegw serve --port 8080

| Endpoint | Purpose |
|---|---|
| GET /health | Health check |
| GET /v1/status | Provider health + model count |
| GET /v1/models · GET /v1/providers · GET /v1/projects | Resource CRUD |
| GET /v1/costs?period=today&project=X&tenant=Y | Cost summary |
| GET /v1/sessions/{id}/turns · /v1/sessions/{id}/replay · /v1/sessions/{id}/dead_air | Voice-conversation surfaces |
| GET /v1/routing/observations | Live per-provider latency |
| GET /v1/virtual_keys + CRUD | Per-team scoped keys |
| GET /v1/audit-log · GET /v1/metrics | Audit + Prometheus metrics |

Full reference: voicegateway.mahimai.ca/docs/api/http-api.
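
A minimal client sketch for the cost summary endpoint, using only the standard library. The path and query parameters come from the table above. The helper names, the assumption that the response body is JSON, and the absence of any auth header are guesses, so check the full reference before relying on them.

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def costs_url(
    base_url: str,
    period: str = "today",
    project: Optional[str] = None,
    tenant: Optional[str] = None,
) -> str:
    """Build the GET /v1/costs URL, omitting filters that were not supplied."""
    params = {"period": period}
    if project:
        params["project"] = project
    if tenant:
        params["tenant"] = tenant
    return f"{base_url.rstrip('/')}/v1/costs?{urlencode(params)}"


def fetch_costs(base_url: str, **filters: Any) -> Any:
    """Fetch the cost summary; assumes a JSON body (the response shape is not documented here)."""
    with urlopen(costs_url(base_url, **filters), timeout=10) as response:
        return json.load(response)


print(costs_url("http://localhost:8080", project="support-bot", tenant="acme"))
```

With voicegw serve running on port 8080, `fetch_costs("http://localhost:8080", project="support-bot")` would issue the request. The project and tenant names here are placeholders.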

📦 Install options

pip install voicegateway                              # core engine
pip install "voicegateway[dashboard]"                 # + web UI
pip install "voicegateway[cloud]"                     # + cloud provider plugins
pip install "voicegateway[local]"                     # + local runtimes (Whisper, Kokoro, Piper)
pip install "voicegateway[mcp]"                       # + MCP server
pip install "voicegateway[tui]"                       # + voicegw tui
pip install "voicegateway[all,dashboard,mcp,tui]"     # everything

Python 3.11+. Local extras pull larger ML runtimes.

Zero-install one-shot (uvx). For CI smoke tests, status checks, or quick runs without a persistent install:

uvx --from "voicegateway[cloud]" voicegw status
uvx --from "voicegateway[cloud,dashboard]" voicegw serve --port 8080

uvx pulls the wheel into a throwaway environment per run; uv's wheel cache makes second runs fast. Pin a version in scripts (uvx --from "voicegateway[cloud]==0.5.0" voicegw status) to avoid surprise upgrades. Not for daemon mode: uvx cannot register a LaunchAgent / systemd unit; use the curl-bash installer for the persistent flow.

📚 Docs

Full documentation: voicegateway.mahimai.ca/docs.

Quick links: Quick start · First agent · Projects · Configuration · CLI reference · Decision tree

🤝 Contributing

Issues and PRs welcome.

git clone https://github.com/mahimailabs/voicegateway
cd voicegateway
pip install -e ".[all,dashboard,mcp,dev]"
pytest

Before submitting a PR, read CONTRIBUTING.md and CODE_OF_CONDUCT.md. Security issues go through the disclosure flow in SECURITY.md, not a public issue.

📜 License

MIT. Fork it, ship it.

🙌 Built by

Mahimai Raja, founder of Mahimai AI, a voice AI company. Building VoiceGateway in public.

Built on the shoulders of giants: LiveKit Agents, FastAPI, Pydantic, voice-prices (a fork of pydantic/genai-prices), cryptography, Model Context Protocol.
