Cost tracking and reconciliation for LiveKit voice agents: modality-aware unit accounting (audio-minutes, tokens, characters) backed by voice-prices.
🎙️ VoiceGateway
Voice AI cost transparency. Self-hosted, on your keys.
Drop-in for livekit.agents.inference. Per-call cost rows, voice metrics, conversation replay, multi-tenant attribution, cross-modality routing, voice guardrails. All open source.
🚀 Install in 60 seconds · 📚 Docs · 📊 Dashboard · 🎯 Roadmap · 🤝 Contributing
from livekit.agents import AgentSession
from voicegateway import inference  # <- the only line that changed

session = AgentSession(
    stt=inference.STT("deepgram/nova-3"),
    llm=inference.LLM("openai/gpt-4o-mini"),
    tts=inference.TTS("cartesia/sonic-3"),
)
# every call logged: provider, model, tokens, $cost, latency, session_id
A drop-in cost and quality observability layer for LiveKit Agents. Modality-aware unit accounting (audio-minutes, tokens, characters) with LLM, STT, and TTS prices from voice-prices. Reconcile recorded numbers against your actual provider invoices with one command. Self-hosted. Your keys. No data leaves your infra.
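To make the unit accounting concrete, here is a minimal sketch of how a per-call cost could be split by modality and then checked against an invoice total. It is not VoiceGateway's implementation: the `PRICES` values, `CallUsage`, `call_cost`, and `reconcile` are hypothetical names and numbers chosen for illustration, and real prices would come from voice-prices.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical per-unit prices in USD (illustration only, not real quotes).
PRICES = {
    "stt": Decimal("0.0043"),    # per audio minute
    "llm_in": Decimal("0.15"),   # per 1M input tokens
    "llm_out": Decimal("0.60"),  # per 1M output tokens
    "tts": Decimal("30.00"),     # per 1M characters
}


@dataclass
class CallUsage:
    audio_seconds: Decimal
    input_tokens: int
    output_tokens: int
    tts_characters: int


def call_cost(usage: CallUsage) -> dict[str, Decimal]:
    """Return the cost of one call, broken down by modality."""
    million = Decimal(1_000_000)
    stt = usage.audio_seconds / 60 * PRICES["stt"]
    llm = (usage.input_tokens * PRICES["llm_in"]
           + usage.output_tokens * PRICES["llm_out"]) / million
    tts = usage.tts_characters * PRICES["tts"] / million
    return {"stt": stt, "llm": llm, "tts": tts, "total": stt + llm + tts}


def reconcile(recorded: Decimal, invoiced: Decimal,
              tolerance: Decimal = Decimal("0.01")) -> bool:
    """True when recorded spend is within a fractional tolerance of the invoice."""
    if invoiced == 0:
        return recorded == 0
    return abs(recorded - invoiced) / invoiced <= tolerance


usage = CallUsage(Decimal(120), 2_000, 500, 1_000)
cost = call_cost(usage)
```

`Decimal` is used instead of floats so that small per-call amounts add up without rounding drift when many rows are summed.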
🚀 Install in 60 seconds
curl -fsSL https://voicegateway.mahimai.ca/install.sh | bash
voicegw onboard --install-daemon
The installer detects your OS, ensures Python 3.11+ and pipx, installs voicegateway[cloud,dashboard], runs a five-question wizard, validates your provider key, and registers a per-user daemon (LaunchAgent / systemd / Scheduled Task). Open the dashboard at http://localhost:9090 to see your first cost row land.
If uv is already on your PATH, the installer auto-detects it and uses uv tool install instead of pipx. Same ~/.local/bin/voicegw outcome, faster cold install.
Prefer manual install? pipx install "voicegateway[cloud,dashboard,mcp]" then voicegw init && voicegw dashboard.
🎯 Why VoiceGateway
Voice AI vendors hide three numbers. VoiceGateway exposes them.
Is this working? Voice has metrics text stacks do not: latency p50/p95 across the STT → LLM → TTS loop, interruption rate, dead air, talk-over. The dashboard shows all of them per call.
What does it cost? STT bills by audio seconds, LLM bills by tokens, TTS bills by characters. Every call is broken down by modality and totaled to the cent. Run voicegw reconcile to verify recorded numbers against your actual provider invoices.
How do I make it cheaper? Route by combined STT + LLM + TTS latency budget across providers. Switch models per call type. Per-tenant cost attribution so agency clients see only their own usage.
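Routing by a combined latency budget can be pictured as a small search over provider triples. The sketch below is one possible approach, not the router VoiceGateway ships: the `OBSERVED` table, its latency and cost-score numbers, and the `groq/example-model`, `local/faster-whisper`, and `local/piper` IDs are all hypothetical.

```python
from itertools import product

# Hypothetical observations: model -> (p95 latency in ms, relative cost score).
OBSERVED = {
    "stt": {"deepgram/nova-3": (180, 3), "local/faster-whisper": (420, 1)},
    "llm": {"openai/gpt-4o-mini": (350, 2), "groq/example-model": (150, 3)},
    "tts": {"cartesia/sonic-3": (120, 3), "local/piper": (260, 1)},
}


def pick_stack(budget_ms: int):
    """Return the cheapest (stt, llm, tts) triple whose summed latency fits the budget.

    Ties on cost are broken by lower latency. Returns None if nothing fits.
    """
    best = None
    for stt, llm, tts in product(OBSERVED["stt"], OBSERVED["llm"], OBSERVED["tts"]):
        parts = (OBSERVED["stt"][stt], OBSERVED["llm"][llm], OBSERVED["tts"][tts])
        latency = sum(p[0] for p in parts)
        cost = sum(p[1] for p in parts)
        if latency > budget_ms:
            continue  # over budget for the full STT -> LLM -> TTS loop
        if best is None or (cost, latency) < best[:2]:
            best = (cost, latency, (stt, llm, tts))
    return None if best is None else best[2]
```

The point of budgeting the sum rather than each stage separately is that a fast LLM can pay for a slower but cheaper TTS, which per-stage limits would rule out.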
If you are building a text-only LLM application without a voice component, LiteLLM is likely a better fit. See the decision tree.
📦 What's in the box
| Capability | What it gives you |
|---|---|
| LiveKit Cloud parity | Drop-in for livekit.agents.inference. Your keys, your config |
| Daemon-first onboarding | Curl-bash install, OS daemon, five-question wizard, voicegw doctor |
| Terminal UI | voicegw tui opens a vim-key Textual UI for SSH-in inspection |
| Public-API discipline | Subpackage layout, CHANGELOG, CONTRIBUTING, SECURITY, explicit __all__ |
| Voice-conversation metrics | Per-minute cost, latency p50/p95, interruptions, dead air, talk-over |
| Conversation replay | Scrub any past call. STT chunks, LLM tokens, TTS frames with timing and cost |
| Multi-tenant attribution | Per-tenant cost, virtual API keys per team, agency-ready |
| Cross-modality routing | Route by combined STT + LLM + TTS latency budget. Per-project rosters. White-label branding |
| Voice-specific guardrails | Real-time PII detection in STT, prompt-injection detection, compliance hooks |
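As a rough illustration of the multi-tenant attribution row above, per-tenant totals are a group-by over cost rows, and tenant isolation is a filter on the same key. The row shape and the `cost_by_tenant` and `visible_rows` helpers are assumptions for this sketch; the actual schema may differ.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical cost rows (illustration only).
rows = [
    {"tenant": "acme", "modality": "stt", "cost": Decimal("0.0086")},
    {"tenant": "acme", "modality": "tts", "cost": Decimal("0.0300")},
    {"tenant": "globex", "modality": "llm", "cost": Decimal("0.0006")},
]


def cost_by_tenant(rows):
    """Sum cost per tenant across all modalities."""
    totals = defaultdict(Decimal)  # Decimal() starts each tenant at 0
    for row in rows:
        totals[row["tenant"]] += row["cost"]
    return dict(totals)


def visible_rows(rows, tenant):
    """Return only the rows a given tenant is allowed to see."""
    return [row for row in rows if row["tenant"] == tenant]
```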
Full release history: CHANGELOG.md.
🚧 Roadmap
- Enterprise auth, audit log, SOC 2 prep
- One-tap latency probe
- Stability commitment, LTS branch policy
📊 The dashboard
A self-hosted web UI at http://localhost:9090. Bundled. No SaaS account. No data leaves your stack.
- Overview — total requests, cost today, active models, per-project summary cards
- Costs — daily spend with per-provider / model / project / tenant breakdown
- Sessions — every call, every cost row, routing decisions, budget overruns
- Metrics — p50/p95/p99 latency, interruption rate, dead air, talk-over
- Replay — scrub through STT chunks, LLM tokens, TTS frames with timing
- Routing — live per-provider latency observations, sortable
- Virtual Keys — issue + revoke per-team scoped keys
- Guardrails — PII / prompt-injection counts per project, session drilldown
- Settings — providers, projects, branding (logo, accent color, product name)
White-label brand support: upload a logo, pick an accent color, set a product name; the whole dashboard re-skins for your project.
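For readers unfamiliar with the p50/p95/p99 figures in the Metrics view, this is a small sketch of how such percentiles can be derived from raw latency samples with the standard library. The `latency_percentiles` helper is hypothetical, and the dashboard may compute its numbers differently.

```python
import statistics


def latency_percentiles(samples_ms):
    """Return p50, p95 and p99 for a list of latency samples (needs >= 2 samples)."""
    # n=100 yields 99 cut points; index i-1 holds the i-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

Tail percentiles matter more than the mean for voice: a single slow turn is audible as dead air even when the average looks healthy.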
🤖 Manage from your coding agent (MCP)
VoiceGateway ships a first-class Model Context Protocol server. Claude Code, Cursor, Codex, and Cline can configure providers, create projects, check costs, and tail logs through natural language.
Local (stdio):
pipx inject voicegateway "voicegateway[mcp]"
claude mcp add voicegateway -- voicegw mcp --transport stdio
Remote (HTTP/SSE with bearer auth):
export VOICEGW_MCP_TOKEN=$(openssl rand -hex 32)
voicegw mcp --transport http --port 8090
claude mcp add --transport sse voicegateway \
  https://your-host.fly.dev/mcp/sse \
  --header "Authorization: Bearer $VOICEGW_MCP_TOKEN"
17 tools exposed: observability, providers, models, projects. Destructive ops (delete_*) require explicit confirm=True after a preview. Full MCP reference →
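The preview-then-confirm rule for destructive tools can be sketched as follows. This is an illustration of the pattern only: `delete_project`, the in-memory `projects` store, and the shape of the returned dictionaries are hypothetical and not taken from the MCP server's code.

```python
# Stand-in for real storage (illustration only).
projects = {"demo": {"calls": 12}}


def delete_project(name: str, confirm: bool = False) -> dict:
    """Delete a project, but only when the caller passes confirm=True.

    Without confirmation the call is a dry run that reports what would be removed.
    """
    if name not in projects:
        return {"status": "not_found", "project": name}
    if not confirm:
        # Preview: nothing is modified.
        return {"status": "preview", "project": name, "would_delete": projects[name]}
    removed = projects.pop(name)
    return {"status": "deleted", "project": name, "deleted": removed}
```

Defaulting `confirm` to `False` means an agent that calls the tool without thinking gets a preview rather than a deletion.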
🛠️ Supported providers
11 providers across cloud and local. Mix and match per call.
| Modality | Cloud | Local |
|---|---|---|
| STT | Deepgram, OpenAI Whisper, AssemblyAI, Groq, Cartesia | faster-whisper |
| LLM | OpenAI, Anthropic, Groq | Ollama (any compatible) |
| TTS | Cartesia, ElevenLabs, Deepgram Aura-2, OpenAI | Kokoro, Piper |
| VAD | Silero | Silero |
| Turn detector | LiveKit MultilingualModel | — |
Per-model IDs: voicegateway.mahimai.ca/docs/configuration/providers. Adding a provider takes ~10 steps: contributing/adding-a-provider.
🧱 Architecture
flowchart TB
A[LiveKit Agent] --> B[voicegateway.inference]
B --> C[Router]
C --> D[Cloud Providers]
C --> E[Local Providers]
B --> F[Middleware Pipeline]
F --> F1[Cost Tracker]
F --> F2[Latency Monitor]
F --> F3[Guardrails]
F --> F4[Multi-tenant Attribution]
F --> G[(SQLite · encrypted)]
G --> H[Dashboard UI]
G --> I[MCP Server]
I --> J[Claude Code · Cursor · Codex]
Async throughout. Modular provider installs: pip install "voicegateway[openai,deepgram]" pulls only what you use. YAML config with ${ENV_VAR} substitution. SQLite at the bottom for portability; encrypted with Fernet at rest.
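The `${ENV_VAR}` substitution mentioned above could work roughly like this. It is a sketch under assumptions, not the project's loader: `substitute_env` is a hypothetical helper, and failing loudly on an unset variable is a design choice made here for illustration.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(text: str, env=None) -> str:
    """Replace every ${NAME} in text with its value from env (default: os.environ)."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            # An unset key is reported instead of silently becoming an empty string.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _VAR.sub(replace, text)
```

Applying the substitution to the raw YAML text before parsing keeps secrets out of the config file while leaving the rest of the file untouched.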
🐳 Docker Compose
services:
  voicegateway:
    image: mahimailabs/voicegateway:latest
    ports: ["8080:8080"]
    env_file: .env
    volumes:
      - ./voicegw.yaml:/app/voicegw.yaml:ro
      - voicegw_data:/data
  dashboard:
    image: mahimailabs/voicegateway-dashboard:latest
    ports: ["9090:9090"]
    depends_on: [voicegateway]

# Named volumes must be declared at the top level.
volumes:
  voicegw_data:
docker compose up -d # core + dashboard
docker compose --profile local up -d # + Ollama for local LLMs
🌐 HTTP API
voicegw serve --port 8080
| Endpoint | Purpose |
|---|---|
| GET /health | Health check |
| GET /v1/status | Provider health + model count |
| GET /v1/models · GET /v1/providers · GET /v1/projects | Resource CRUD |
| GET /v1/costs?period=today&project=X&tenant=Y | Cost summary |
| GET /v1/sessions/{id}/turns · /v1/sessions/{id}/replay · /v1/sessions/{id}/dead_air | Voice-conversation surfaces |
| GET /v1/routing/observations | Live per-provider latency |
| GET /v1/virtual_keys + CRUD | Per-team scoped keys |
| GET /v1/audit-log · GET /v1/metrics | Audit + Prometheus metrics |
Full reference: voicegateway.mahimai.ca/docs/api/http-api.
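A minimal client sketch for the cost summary endpoint, using only the standard library. The query parameters come from the table above; the `costs_url` and `fetch_costs` helpers are hypothetical, the response is assumed to be JSON, and any authentication the server requires is not shown.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def costs_url(base_url, period="today", project=None, tenant=None):
    """Build the /v1/costs URL, including only the filters that were given."""
    params = {"period": period}
    if project:
        params["project"] = project
    if tenant:
        params["tenant"] = tenant
    return f"{base_url.rstrip('/')}/v1/costs?{urlencode(params)}"


def fetch_costs(base_url="http://localhost:8080", **filters):
    """Fetch the cost summary; assumes a JSON body whose shape is not specified here."""
    with urlopen(costs_url(base_url, **filters), timeout=5) as response:
        return json.load(response)
```

With `voicegw serve --port 8080` running, `fetch_costs(project="X", tenant="Y")` would request the same URL shown in the table.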
📦 Install options
pip install voicegateway # core engine
pip install "voicegateway[dashboard]" # + web UI
pip install "voicegateway[cloud]" # + cloud provider plugins
pip install "voicegateway[local]" # + local runtimes (Whisper, Kokoro, Piper)
pip install "voicegateway[mcp]" # + MCP server
pip install "voicegateway[tui]" # + voicegw tui
pip install "voicegateway[all,dashboard,mcp,tui]" # everything
Python 3.11+. Local extras pull larger ML runtimes.
Zero-install one-shot (uvx). For CI smoke tests, status checks, or quick runs without a persistent install:
uvx --from "voicegateway[cloud]" voicegw status
uvx --from "voicegateway[cloud,dashboard]" voicegw serve --port 8080
uvx pulls the wheel into a throwaway environment per run; uv's wheel cache makes second runs fast. Pin a version in scripts (uvx --from "voicegateway[cloud]==0.5.0" voicegw status) to avoid surprise upgrades. Not for daemon mode: uvx cannot register a LaunchAgent / systemd unit; use the curl-bash installer for the persistent flow.
📚 Docs
Full documentation: voicegateway.mahimai.ca/docs.
Quick links: Quick start · First agent · Projects · Configuration · CLI reference · Decision tree
🤝 Contributing
Issues and PRs welcome.
git clone https://github.com/mahimailabs/voicegateway
cd voicegateway
pip install -e ".[all,dashboard,mcp,dev]"
pytest
Before submitting a PR, read CONTRIBUTING.md and CODE_OF_CONDUCT.md. Security issues go through the disclosure flow in SECURITY.md, not a public issue.
📜 License
MIT. Fork it, ship it.
🙌 Built by
Mahimai Raja, founder of Mahimai AI, a voice AI company. Building VoiceGateway in public.
Built on the shoulders of giants: LiveKit Agents, FastAPI, Pydantic, voice-prices (a fork of pydantic/genai-prices), cryptography, Model Context Protocol.