Cost tracking and reconciliation for LiveKit voice agents: modality-aware unit accounting (audio-minutes, tokens, characters) backed by pydantic/genai-prices.


VoiceGateway

Cost tracking and reconciliation for LiveKit voice agents. Modality-aware unit accounting (audio-minutes, tokens, characters). LLM prices from pydantic/genai-prices. Verify against provider invoices with voicegw reconcile.

Docs · Quick Start · MCP Setup · Deploy

Why VoiceGateway

VoiceGateway is purpose-built for LiveKit voice agents. Four things make it different from general-purpose LLM gateways. Since v0.1.0 it ships a daemon-first onboarding flow: one curl command, a five-question wizard, and your first provider call lands in the dashboard within 60 seconds.

1. One-line drop-in for `livekit.agents.inference`

voicegateway.inference.STT/LLM/TTS mirror LiveKit's inference module signature for signature. Swap the import line, point your YAML at your provider keys, and the rest of your agent code works unchanged.

from livekit.agents import AgentSession
from voicegateway import inference          # <- the only line that changed

session = AgentSession(
    stt=inference.STT("deepgram/nova-3"),
    llm=inference.LLM("openai/gpt-4o-mini"),
    tts=inference.TTS("cartesia/sonic-3"),
)

Per-conversation cost, latency, and session correlation are recorded transparently. Pick a project explicitly with inference.set_project("my-app") or set default_project: my-app in voicegw.yaml; otherwise requests fall through to the auto-created default project.

2. Modality-aware unit accounting

LLM cost is per-1k-token, STT cost is per-audio-minute, TTS cost is per-character. Each modality is billed natively against its own provider unit rather than flattened to a single token-equivalent.
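As a rough illustration of per-modality billing, the sketch below prices one conversational turn in each modality's native unit. The three prices and the `TurnUsage` / `turn_cost` names are made up for this example; they are not VoiceGateway's API or real provider rates.

```python
from dataclasses import dataclass

# Hypothetical prices for illustration only; not real provider rates.
LLM_USD_PER_1K_TOKENS = 0.002
STT_USD_PER_AUDIO_MINUTE = 0.006
TTS_USD_PER_CHARACTER = 0.00003


@dataclass
class TurnUsage:
    """Raw usage for one turn, kept in each modality's own unit."""
    llm_tokens: int
    stt_audio_seconds: float
    tts_characters: int


def turn_cost(usage: TurnUsage) -> dict[str, float]:
    """Price each modality against its native unit, then sum."""
    llm = usage.llm_tokens / 1000 * LLM_USD_PER_1K_TOKENS
    stt = usage.stt_audio_seconds / 60 * STT_USD_PER_AUDIO_MINUTE
    tts = usage.tts_characters * TTS_USD_PER_CHARACTER
    return {"llm": llm, "stt": stt, "tts": tts, "total": llm + stt + tts}


costs = turn_cost(TurnUsage(llm_tokens=1500, stt_audio_seconds=30, tts_characters=200))
print(costs)
```

Keeping the three units separate means a long silent audio stream or a very long TTS reply shows up in its own line item instead of being hidden inside a token-equivalent estimate.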

LLM prices come from pydantic/genai-prices: 1,100+ models, monthly releases, historic price tracking. VG does not maintain its own LLM pricing catalog. STT and TTS prices live in a local catalog with an explicit pricing_source_date per entry; CI fails when any entry is more than 60 days old, forcing a manual refresh per release.
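The 60-day freshness rule could be checked with something like the sketch below. The catalog layout (a dict keyed by model ID) and the `stale_entries` helper are assumptions for illustration; only the `pricing_source_date` field name and the 60-day limit come from the text above.

```python
from datetime import date

MAX_AGE_DAYS = 60  # limit stated in the README


def stale_entries(catalog: dict[str, dict], today: date) -> list[str]:
    """Return model IDs whose pricing_source_date is more than MAX_AGE_DAYS old."""
    stale = []
    for model_id, entry in catalog.items():
        source_date = date.fromisoformat(entry["pricing_source_date"])
        if (today - source_date).days > MAX_AGE_DAYS:
            stale.append(model_id)
    return sorted(stale)


# Hypothetical catalog entries; the dates are invented for the example.
catalog = {
    "deepgram/nova-3": {"pricing_source_date": "2026-04-01"},
    "cartesia/sonic-3": {"pricing_source_date": "2026-01-15"},
}
stale = stale_entries(catalog, today=date(2026, 5, 12))
print(stale)
```

A CI job would fail the build whenever the returned list is non-empty.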

3. Reconciliation tooling

voicegw export-costs --start 2026-04-01 --end 2026-04-30 --format csv
voicegw reconcile --provider openai --provider-usage-file openai-usage.csv

Per-request line items carry pricing_source attribution (genai-prices@<version> for LLM, voicegateway-catalog@<date> for STT/TTS). The reconcile command compares VG's logged costs against your provider's usage export and produces a per-model diff. LLM costs are estimated and may drift up to ~5%; reconciliation against your provider invoice is the verification path.
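Conceptually, a per-model reconciliation diff looks like the sketch below. The CSV column names (`model`, `cost_usd`), the sample rows, and the `reconcile` function are assumptions for illustration; the real export and provider usage files may use different layouts. The 5% tolerance mirrors the drift figure quoted above.

```python
import csv
import io
from collections import defaultdict


def totals_by_model(csv_text: str) -> dict[str, float]:
    """Sum the cost_usd column per model."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["model"]] += float(row["cost_usd"])
    return dict(totals)


def reconcile(logged_csv: str, provider_csv: str, tolerance: float = 0.05) -> list[dict]:
    """Compare logged cost against billed cost, one report row per model."""
    logged = totals_by_model(logged_csv)
    billed = totals_by_model(provider_csv)
    report = []
    for model in sorted(set(logged) | set(billed)):
        ours, theirs = logged.get(model, 0.0), billed.get(model, 0.0)
        if ours == theirs:
            drift = 0.0
        elif theirs:
            drift = abs(ours - theirs) / theirs
        else:
            drift = float("inf")  # logged a cost the provider never billed
        report.append({
            "model": model,
            "logged_usd": ours,
            "billed_usd": theirs,
            "within_tolerance": drift <= tolerance,
        })
    return report


# Invented sample data.
logged_csv = "model,cost_usd\ngpt-4o-mini,1.00\ngpt-4o-mini,0.50\ngpt-4o,2.00\n"
provider_csv = "model,cost_usd\ngpt-4o-mini,1.53\ngpt-4o,2.30\n"
report = reconcile(logged_csv, provider_csv)
for line in report:
    print(line)
```

Models that fall outside the tolerance are the ones worth checking against the invoice line by line.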

4. MCP server for agent-managed configuration

A first-class Model Context Protocol server exposes 17 tools (configure providers, create projects with daily budgets, query costs, tail logs, run health checks) over stdio and HTTP/SSE. Claude Code, Cursor, Codex, and Cline can all manage your gateway conversationally.

Is VoiceGateway right for you? If you are building a text-only LLM application without a voice component, LiteLLM is likely a better fit. It has a broader LLM-provider catalog and an OpenAI-compatible HTTP proxy. See the decision tree for a longer breakdown.

Deploy

One command to Fly.io: public HTTPS URL, persistent storage, MCP-ready.

git clone https://github.com/mahimailabs/voicegateway
cd voicegateway/deploy/fly
./deploy.sh

You get a *.fly.dev URL, an MCP endpoint your coding agent can connect to, and encrypted API key storage. Fly uses pay-as-you-go pricing (~$1-3/month for light use; volumes billed even when suspended). Deployment guide →

Other options: Docker Compose locally, Hetzner/Oracle for cheap self-host, or any Docker host. See docs.voicegateway.dev/guide/installation.

Quick Start

Option 1: One-line install (recommended)

curl -fsSL https://voicegateway.mahimai.ca/install.sh | bash
voicegw onboard --install-daemon

The installer detects your OS (macOS / Linux / WSL), refuses cleanly if Python 3.11+ is missing, ensures pipx is on PATH, then runs pipx install "voicegateway[cloud,dashboard]". The wizard asks five questions (project, provider, API key, port, and whether to install the daemon), validates the key against the upstream API, registers a per-user daemon (LaunchAgent / systemd --user / Scheduled Task), and offers an end-to-end smoke test.

For ad-hoc operations once the daemon is running:

voicegw status            # daemon + provider status
voicegw doctor            # ten-check punch list with fix steps
voicegw start             # also: voicegw stop, voicegw restart
voicegw uninstall-daemon  # remove registration; preserves config + DB
voicegw tui               # four-tab terminal UI (Sessions / Costs / Logs / Providers)

voicegw tui opens a Textual-based terminal UI with vim navigation for live monitoring of sessions, costs, logs, and providers without leaving the shell. In Gateway mode (the default) it polls the daemon every 1 s; pass --local for read-only inspection of the SQLite call DB when the daemon is down. Install the extra with pipx inject voicegateway "voicegateway[tui]" after the one-line install, or include it directly in any manual pip install (pip install "voicegateway[tui]"). Full reference →

If you prefer to skip the curl-bash one-liner, run the same pipx step manually:

pipx install "voicegateway[cloud,dashboard,mcp]"
voicegw onboard --install-daemon

For pip-based installs (e.g. inside an existing virtualenv):

pip install "voicegateway[cloud,dashboard,mcp]"
voicegw init              # creates voicegw.yaml
voicegw status            # verify providers
voicegw dashboard         # http://localhost:9090

Option 2: Docker (production-ready)

Pull the official image from Docker Hub (no build required):

docker run -p 8080:8080 \
  -v "$(pwd)/voicegw-data:/data" \
  -e OPENAI_API_KEY=sk-... \
  -e DEEPGRAM_API_KEY=dg_... \
  mahimairaja/voicegateway:latest

Multi-arch images for linux/amd64 and linux/arm64. Docker Hub →

Option 3: Docker Compose (recommended for self-hosting)

# docker-compose.yml
services:
  voicegateway:
    image: mahimairaja/voicegateway:latest
    ports: ["8080:8080"]
    volumes: ["./voicegw-data:/data"]
    env_file: .env

  dashboard:
    image: mahimairaja/voicegateway-dashboard:latest
    ports: ["9090:9090"]
    volumes: ["./voicegw-data:/data:ro"]
    depends_on: [voicegateway]

cp .env.example .env      # edit with your API keys
docker compose up -d
open http://localhost:9090      # macOS; on Linux use xdg-open

Your first agent

The example below runs a LiveKit Agents worker. You need a LiveKit server and credentials before it will connect.

LiveKit Cloud (free tier): sign up at livekit.io, create a project, and copy the URL plus API key and secret from project settings.

Self-hosted (local dev):

docker run --rm -p 7880:7880 -p 7881:7881 -p 7882:7882/udp \
  livekit/livekit-server --dev

Default keys are devkey / secret. Full self-host guide: livekit.io self-hosting.

Install the agents SDK and export credentials:

pip install livekit-agents
export LIVEKIT_URL=wss://<project>.livekit.cloud   # or ws://localhost:7880
export LIVEKIT_API_KEY=<key>                       # `devkey` for local --dev
export LIVEKIT_API_SECRET=<secret>                 # `secret` for local --dev

Without these the agent fails with ConnectionError: Failed to connect.

from livekit.agents import AgentSession
from voicegateway import inference

session = AgentSession(
    stt=inference.STT("deepgram/nova-3"),
    llm=inference.LLM("openai/gpt-4o-mini"),
    tts=inference.TTS("cartesia/sonic-3:voice_id"),
)
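Because a missing credential only shows up as a connection failure, it can help to check the three environment variables before starting the worker. This is a minimal sketch; the `missing_livekit_env` helper is hypothetical and not part of VoiceGateway or the LiveKit SDK.

```python
import os

REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_livekit_env(env=os.environ) -> list[str]:
    """Return the required LiveKit variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_livekit_env()
if missing:
    print("Set these before starting the agent:", ", ".join(missing))
else:
    print("LiveKit credentials found.")
```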

Full tutorial: docs.voicegateway.dev/guide/first-agent

Manage from your coding agent (MCP)

VoiceGateway ships a first-class Model Context Protocol server. Your Claude Code, Cursor, or Codex instance can configure providers, create projects, check costs, and tail logs, all through natural language.

Local (stdio)

pip install "voicegateway[mcp]"
claude mcp add voicegateway -- voicegw mcp --transport stdio

Remote (HTTP/SSE with bearer auth)

export VOICEGW_MCP_TOKEN=$(openssl rand -hex 32)
voicegw mcp --transport http --port 8090

Then in Claude Code:

claude mcp add --transport sse voicegateway \
  https://your-host.fly.dev/mcp/sse \
  --header "Authorization: Bearer $VOICEGW_MCP_TOKEN"

What you can ask your agent

"List all my providers"
"Add Deepgram with API key dg_live_..."
"Create a project for Tony's Pizza with a $5 daily budget using the premium stack"
"Show me yesterday's costs for tonys-pizza"
"What's our P95 TTFB this week?"
"Delete the dev-testing project" (agent shows preview, asks for confirmation)

17 tools available

Category	Tools
Observability	`get_health`, `get_provider_status`, `get_costs`, `get_latency_stats`, `get_logs`
Providers	`list_providers`, `get_provider`, `test_provider`, `add_provider`, `delete_provider`
Models	`list_models`, `register_model`, `delete_model`
Projects	`list_projects`, `get_project`, `create_project`, `delete_project`

Destructive operations (delete_*) require an explicit confirm=True: the agent first receives a preview with impact details and deletes only after you confirm. Full tool reference →
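The preview-then-confirm behaviour can be pictured with a small sketch. The `ProjectStore` class, its return shapes, and the `requests_affected` field are all hypothetical; only the `confirm=True` convention comes from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectStore:
    # Hypothetical in-memory store: project name -> number of logged requests.
    projects: dict[str, int] = field(default_factory=dict)

    def delete_project(self, name: str, confirm: bool = False) -> dict:
        """Without confirm=True, return an impact preview and delete nothing."""
        if name not in self.projects:
            return {"status": "not_found", "project": name}
        if not confirm:
            return {
                "status": "preview",
                "project": name,
                "requests_affected": self.projects[name],
            }
        del self.projects[name]
        return {"status": "deleted", "project": name}


store = ProjectStore({"dev-testing": 42})
preview = store.delete_project("dev-testing")                # nothing removed yet
result = store.delete_project("dev-testing", confirm=True)   # actually deletes
print(preview, result)
```

Making the safe path the default means an agent that forgets the flag can only ever produce a preview.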

Projects

Organize agents into projects for per-project provider keys, budgets, and cost tracking:

# voicegw.yaml
projects:
  restaurant-agent:
    name: "Restaurant Receptionist"
    description: "AI receptionist for Tony's Pizza"
    daily_budget: 5.00
    budget_action: warn       # warn | throttle | block
    tags: ["production", "client-ian"]
    providers:
      openai:
        api_key: ${RESTAURANT_OPENAI_KEY}
      deepgram:
        api_key: ${RESTAURANT_DEEPGRAM_KEY}
      cartesia:
        api_key: ${RESTAURANT_CARTESIA_KEY}

  dev-testing:
    name: "Development Testing"
    daily_budget: 0.00
    tags: ["development"]

default_project: restaurant-agent
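One way to read the three budget_action values is sketched below. The `check_budget` function and the exact behaviour of each action are assumptions; this README does not spell out how throttle differs from warn, or how a daily_budget of 0.00 is treated, so the sketch does not special-case it.

```python
from enum import Enum


class BudgetAction(str, Enum):
    WARN = "warn"
    THROTTLE = "throttle"
    BLOCK = "block"


def check_budget(spent_today: float, daily_budget: float, action: BudgetAction) -> tuple[bool, str]:
    """Return (allow_request, message) for a project's spend so far today."""
    if spent_today < daily_budget:
        return True, "within budget"
    if action is BudgetAction.WARN:
        return True, "over budget: warning logged"
    if action is BudgetAction.THROTTLE:
        # Assumed meaning: still served, but the caller should slow down.
        return True, "over budget: request should be rate-limited"
    return False, "over budget: request blocked"


print(check_budget(4.20, 5.00, BudgetAction.WARN))
print(check_budget(5.10, 5.00, BudgetAction(("block"))))
```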

Use in code:

from voicegateway import inference

# Either set a default_project: in voicegw.yaml, or pick the project
# explicitly per call context:
inference.set_project("restaurant-agent")

stt = inference.STT("deepgram/nova-3")
llm = inference.LLM("openai/gpt-4o-mini")
tts = inference.TTS("cartesia/sonic-3")

CLI:

voicegw projects                          # list all projects
voicegw project restaurant-agent          # project details
voicegw costs --project restaurant-agent  # project costs today
voicegw logs --project restaurant-agent   # recent requests

Projects guide →

Fallback Chains

Fallback is resolved once, at agent startup. Walk the chain manually and pass the first model whose provider plugin imports cleanly and whose API key resolves to the inference factory; the remaining entries stay available as backups for the next worker spawn:

# voicegw.yaml
fallbacks:
  stt: [deepgram/nova-3, groq/whisper-large-v3, local/whisper-large-v3]
  llm: [openai/gpt-4o-mini, groq/llama-3.3-70b-versatile, ollama/qwen2.5:7b]
  tts: [cartesia/sonic-3, elevenlabs/eleven_turbo_v2_5, local/kokoro]

See examples/fallback_agent.py for the worked startup-walk pattern. Once AgentSession starts, the resolved model is used for the whole call: VoiceGateway does not swap providers mid-call. For runtime mid-call failover, compose LiveKit's FallbackAdapter around VG inference.* instances. v0.0.6 adds a first-class fallback= parameter to the inference factories so the manual walk goes away.
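The startup walk might look roughly like the following. This is not the contents of examples/fallback_agent.py; the `PROVIDERS` table, the module names, the environment variable names, and the `resolve_chain` helper are all assumptions made for the sketch.

```python
import importlib.util
import os

# Hypothetical mapping: provider prefix -> (plugin module, API key env var or None).
PROVIDERS = {
    "deepgram": ("livekit.plugins.deepgram", "DEEPGRAM_API_KEY"),
    "groq": ("livekit.plugins.groq", "GROQ_API_KEY"),
    "local": ("faster_whisper", None),  # local runtimes need no key
}


def plugin_available(module_name: str) -> bool:
    """True if the module can be located without importing it fully."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False  # e.g. the parent package is not installed


def resolve_chain(chain, env=os.environ, is_available=plugin_available):
    """Return (first usable model ID, remaining entries kept as backups)."""
    for index, model_id in enumerate(chain):
        provider = model_id.split("/", 1)[0]
        if provider not in PROVIDERS:
            continue
        module_name, key_var = PROVIDERS[provider]
        if not is_available(module_name):
            continue
        if key_var is not None and not env.get(key_var):
            continue
        return model_id, list(chain[index + 1:])
    raise RuntimeError(f"no usable model in chain: {chain}")


stt_chain = ["deepgram/nova-3", "groq/whisper-large-v3", "local/whisper-large-v3"]
# Plugin availability is stubbed so the example runs without the plugins installed.
primary, backups = resolve_chain(
    stt_chain, env={"GROQ_API_KEY": "gsk_example"}, is_available=lambda module: True
)
print(primary, backups)
```

The chosen ID would then be passed to inference.STT(...) before AgentSession starts, in line with the no-mid-call-swap behaviour described above.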

Supported Models

11 providers across cloud and local. Add more with one line in voicegw.yaml or let your coding agent do it via MCP.

STT

Model ID	Provider	Type
`deepgram/nova-3`	Deepgram	cloud
`deepgram/nova-2-conversationalai`	Deepgram	cloud
`assemblyai/universal-2`	AssemblyAI	cloud
`openai/whisper-1`	OpenAI	cloud
`groq/whisper-large-v3`	Groq	cloud
`local/whisper-large-v3`	faster-whisper	local
`local/whisper-turbo`	faster-whisper	local

LLM

Model ID	Provider	Type
`openai/gpt-4.1`	OpenAI	cloud
`openai/gpt-4o`	OpenAI	cloud
`openai/gpt-4o-mini`	OpenAI	cloud
`anthropic/claude-opus-4-7`	Anthropic	cloud
`anthropic/claude-sonnet-4-6`	Anthropic	cloud
`anthropic/claude-haiku-4-5`	Anthropic	cloud
`groq/llama-3.3-70b-versatile`	Groq	cloud
`groq/llama-3.1-8b-instant`	Groq	cloud
`ollama/qwen2.5:7b`	Ollama	local
`ollama/qwen2.5:3b`	Ollama	local
`ollama/llama3.2:3b`	Ollama	local

TTS

Model ID	Provider	Type
`cartesia/sonic-3`	Cartesia	cloud
`elevenlabs/eleven_turbo_v2_5`	ElevenLabs	cloud
`elevenlabs/eleven_flash_v2_5`	ElevenLabs	cloud
`deepgram/aura-2`	Deepgram	cloud
`openai/tts-1-hd`	OpenAI	cloud
`local/kokoro`	Kokoro ONNX	local
`local/piper`	Piper	local

Full reference: docs.voicegateway.dev/configuration/providers

Architecture

flowchart TB
    A[LiveKit Agent] --> B[VoiceGateway]
    B --> C[Router]
    C --> D[Cloud Providers]
    C --> E[Local Providers]
    D --> D1[OpenAI · Deepgram · Anthropic · Cartesia · Groq · ElevenLabs · AssemblyAI]
    E --> E1[Ollama · Whisper · Kokoro · Piper]
    B --> F[Middleware Pipeline]
    F --> F1[Cost Tracker]
    F --> F2[Latency Monitor]
    F --> F3[Budget Enforcer]
    F --> F4[Fallback Router]
    F --> G[(SQLite · encrypted)]
    G --> H[Dashboard UI]
    G --> I[MCP Server]
    I --> J[Claude Code · Cursor · Codex]

Architecture deep dive →

Dashboard

A self-hosted web UI at http://localhost:9090 with:

Overview - total requests, cost today, active models, project summary cards
Settings - add/edit providers, register models, manage general config with Source badges (YAML vs Custom vs Env)
Projects - full CRUD with budget gauges, cost charts, recent requests per project
Costs - daily spend with per-provider/model/project breakdown
Latency - P50/P95/P99 TTFB and total latency per model
Logs - recent requests with filters for project, modality, status

API keys are encrypted with Fernet before storage. The sidebar project switcher filters every page.
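For readers unfamiliar with the P50/P95/P99 figures on the Latency page, the sketch below derives them from raw TTFB samples using the standard library. The `latency_percentiles` helper and the choice of the inclusive interpolation method are assumptions; the dashboard may compute its percentiles differently.

```python
import statistics


def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute P50/P95/P99 from latency samples given in milliseconds."""
    # n=100 yields 99 cut points; index k-1 holds the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


# Synthetic samples: 0 ms, 1 ms, ..., 100 ms.
stats = latency_percentiles([float(ms) for ms in range(101)])
print(stats)
```

P95 and P99 matter more than the mean for voice agents, since a single slow first token is audible as a pause in the conversation.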

HTTP API

voicegw serve --port 8080

Endpoint	Purpose
`GET /health`	Health check
`GET /v1/status`	Provider health + model count
`GET /v1/models`	List registered models
`GET /v1/providers` + CRUD	Manage providers
`GET /v1/projects` + CRUD	Manage projects
`GET /v1/costs?period=today&project=X`	Cost summary
`GET /v1/latency?period=week`	Latency stats
`GET /v1/logs?project=X&modality=stt`	Request logs
`GET /v1/audit-log`	Config change history
`GET /v1/metrics`	Prometheus-format metrics

Full reference: docs.voicegateway.dev/api/http-api

Installation

# Core engine
pip install voicegateway

# With web dashboard
pip install "voicegateway[dashboard]"

# With cloud providers (OpenAI, Deepgram, Anthropic, etc.)
pip install "voicegateway[cloud]"

# With local model runtimes (Whisper, Kokoro, Piper)
pip install "voicegateway[local]"

# With MCP server for agent management
pip install "voicegateway[mcp]"

# With the four-tab terminal UI (voicegw tui)
pip install "voicegateway[tui]"

# Everything
pip install "voicegateway[all,dashboard,mcp,tui]"

Python 3.11+. MCP extra pulls in mcp>=1.2.0. Local extras pull larger ML runtimes.

Docker Compose

services:
  voicegateway:
    image: mahimailabs/voicegateway:latest
    ports: ["8080:8080"]
    env_file: .env
    volumes:
      - ./voicegw.yaml:/app/voicegw.yaml:ro
      - voicegw_data:/data

  dashboard:
    image: mahimailabs/voicegateway-dashboard:latest
    ports: ["9090:9090"]
    depends_on: [voicegateway]

  # Optional: local LLM with Ollama
  ollama:
    image: ollama/ollama
    profiles: [local]
    ports: ["11434:11434"]

# Named volume used by the voicegateway service above
volumes:
  voicegw_data:

docker compose up -d                      # core + dashboard
docker compose --profile local up -d      # + Ollama for local LLMs
docker compose --profile local exec ollama ollama pull qwen2.5:3b

Contributing

We welcome provider additions, bug fixes, and documentation improvements.

git clone https://github.com/mahimailabs/voicegateway
cd voicegateway
pip install -e ".[all,dashboard,mcp,dev]"
pytest

Add a provider (10-step guide): docs.voicegateway.dev/contributing/adding-a-provider

Before submitting a PR, please read CONTRIBUTING.md and CODE_OF_CONDUCT.md. Found a security issue? Do not open a public issue: follow the disclosure flow in SECURITY.md.

Project metadata

CHANGELOG.md -- canonical changelog (mirrored into the docs site at build time).
CONTRIBUTING.md -- one-page contribution flow; deeper guides under docs/contributing/.
SECURITY.md -- vulnerability disclosure policy and supported-versions matrix.
CODE_OF_CONDUCT.md -- Contributor Covenant.
LICENSE -- MIT.
docker/ -- both Dockerfiles live here as of v0.1.2 (voicegateway.Dockerfile, dashboard.Dockerfile); docker-compose.yml at repo root references these paths.

License

MIT © Mahimai Labs

Built on the shoulders of giants: LiveKit Agents, FastAPI, Pydantic, cryptography, Model Context Protocol.

Project details


Release history

0.8.6

Jun 6, 2026

0.8.5

Jun 6, 2026

0.8.4

Jun 6, 2026

0.8.3

Jun 5, 2026

0.8.2

Jun 5, 2026

0.8.1

Jun 5, 2026

0.8.0

Jun 5, 2026

0.7.0

May 29, 2026

0.6.0

May 12, 2026

0.5.0

May 12, 2026

This version

0.4.0

May 12, 2026

0.3.0

May 11, 2026

0.2.0

May 11, 2026

0.1.2

May 11, 2026

0.1.1

May 10, 2026

0.1.0

May 10, 2026

0.0.5

May 10, 2026

0.0.4

May 6, 2026

0.0.3

Apr 18, 2026

0.0.2

Apr 17, 2026

0.0.1

Apr 13, 2026

Download files


Source Distribution

voicegateway-0.4.0.tar.gz (918.3 kB view details)

Uploaded May 12, 2026 Source

Built Distribution


voicegateway-0.4.0-py3-none-any.whl (286.9 kB view details)

Uploaded May 12, 2026 Python 3

File details

Details for the file voicegateway-0.4.0.tar.gz.

File metadata

Download URL: voicegateway-0.4.0.tar.gz
Upload date: May 12, 2026
Size: 918.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for voicegateway-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`52f89dfaf08d92fe64e034336c35acef1b40976209ef3fe4b04131937ff3834e`
MD5	`63432f2496d5825543ee0916258c6608`
BLAKE2b-256	`02e99aff578b45a350ecb4df983503a25d07ee99e7945fe041b1bdb2dba61bb3`


File details

Details for the file voicegateway-0.4.0-py3-none-any.whl.

File metadata

Download URL: voicegateway-0.4.0-py3-none-any.whl
Upload date: May 12, 2026
Size: 286.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for voicegateway-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`90d51c4b4b8f1c9c7a8e390b0b3c311e427f430afb0c99424d9b6896e68cb41a`
MD5	`36cb7c875608a805bedf358dcb72fe3c`
BLAKE2b-256	`5d7f66c87fc7ec888de03d58dd6a01a643ac111a00b69e66f2b0e8703bef5fe5`


voicegateway 0.4.0
