Privacy-first API proxy routing voice+vision to local models with cloud fallback

Multimodal Voice API Proxy

A privacy-first API proxy that intelligently routes voice and vision requests to local models with automatic cloud fallback.

What is this?

This proxy runs speech transcription (Whisper) and vision analysis (LLaVA) locally for privacy and cost savings, and automatically falls back to cloud APIs (OpenAI, Anthropic) when local resources are unavailable. It includes caching, usage metering, cost tracking, and multi-tenant API key management. It is aimed at developers building voice-enabled applications who want local-first privacy with cloud reliability as a safety net.
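The local-first routing described above can be illustrated with a minimal sketch. This is not the project's actual implementation: `LocalFirstRouter`, the `Backend` type, and the `max_local` parameter are hypothetical names chosen for illustration, and the sketch assumes that "local resources are unavailable" means either the local backend raised an error or too many local requests are already running.

```python
import asyncio
from typing import Awaitable, Callable, Tuple

# Hypothetical backend type: takes raw input bytes, returns a text result.
Backend = Callable[[bytes], Awaitable[str]]


class LocalFirstRouter:
    """Try the local backend first; use the cloud backend on failure or overload."""

    def __init__(self, local: Backend, cloud: Backend, max_local: int = 5) -> None:
        self._local = local
        self._cloud = cloud
        self._max_local = max_local  # assumed cap on concurrent local requests
        self._in_flight = 0          # local requests currently running

    async def handle(self, payload: bytes) -> Tuple[str, str]:
        """Return (result, name of the backend that produced it)."""
        if self._in_flight < self._max_local:
            self._in_flight += 1
            try:
                return await self._local(payload), "local"
            except Exception:
                pass  # local failure: fall through to the cloud backend
            finally:
                self._in_flight -= 1
        return await self._cloud(payload), "cloud"
```

A caller would wrap its local model and its cloud client in two coroutines with the `Backend` signature and call `await router.handle(audio_bytes)`. The returned backend name is what a metering layer could record to distinguish local from cloud requests.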

Features

Intelligent routing: Local-first processing with automatic cloud fallback
Privacy-focused: Audio and images processed on your hardware by default
Smart caching: Redis-backed caching prevents reprocessing identical inputs
Cost tracking: Real-time dashboard showing savings vs. cloud-only approach
Multi-tenant: API key management with per-key usage limits and metrics
OpenAI-compatible: Drop-in replacement for OpenAI/Anthropic endpoints
Production-ready: Rate limiting, monitoring middleware, and async processing
Easy deployment: Docker Compose for local, one-click configs for Railway/Fly.io
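To make the caching feature concrete, here is a small sketch of how identical inputs could map to the same cache entry. It assumes a content-hash key scheme, which this README does not confirm, and it uses an in-memory dictionary as a stand-in for Redis so the example stays self-contained. `cache_key` and `TTLCache` are hypothetical names.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


def cache_key(kind: str, payload: bytes, params: str = "") -> str:
    """Derive a deterministic key from the input bytes and request parameters."""
    digest = hashlib.sha256(payload + b"\x00" + params.encode("utf-8")).hexdigest()
    return f"{kind}:{digest}"


class TTLCache:
    """In-memory stand-in for a Redis cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the expired entry
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._store[key] = (self._clock() + self._ttl, value)
```

With this scheme, uploading the same audio file twice with the same model parameter yields the same key, so the second request can be served from the cache without running a model.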

Quick Start

Prerequisites

Docker and Docker Compose
8GB+ RAM (for local models)
GPU recommended but optional

Installation

Clone and configure

git clone <repository-url>
cd multimodal-voice-api-proxy
cp .env.example .env

Edit .env with your settings

# Required for cloud fallback
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Database
DATABASE_URL=postgresql://user:pass@db:5432/proxy

# Redis cache
REDIS_URL=redis://redis:6379/0

Launch with Docker Compose

docker-compose up -d

Run migrations

docker-compose exec api alembic upgrade head

Create your first API key

curl -X POST http://localhost:8000/keys \
  -H "Content-Type: application/json" \
  -d '{"name": "My App", "rate_limit": 100}'

The API will be available at http://localhost:8000
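For scripted setups, the key-creation call above could be issued from Python instead of curl. The sketch below only builds the request with the standard library; the function name `build_create_key_request` is hypothetical, and the shape of the response is not documented here, so it is left unparsed.

```python
import json
import urllib.request


def build_create_key_request(base_url: str, name: str, rate_limit: int) -> urllib.request.Request:
    """Build the POST /keys request shown in the curl example."""
    body = json.dumps({"name": name, "rate_limit": rate_limit}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/keys",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(...)` sends it to a running proxy.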

Usage

Audio Transcription

curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F file=@audio.mp3 \
  -F model=whisper-1
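The same upload can be sketched in Python with only the standard library by assembling the multipart form body by hand. `build_transcription_request` is a hypothetical helper, and the `application/octet-stream` part type is an assumption; the proxy may expect a specific audio MIME type.

```python
import uuid
import urllib.request


def build_transcription_request(base_url: str, api_key: str, filename: str,
                                audio: bytes, model: str = "whisper-1") -> urllib.request.Request:
    """Build a multipart/form-data POST with `model` and `file` fields."""
    boundary = uuid.uuid4().hex
    sep = b"--" + boundary.encode("ascii")
    crlf = b"\r\n"
    body = b"".join([
        # Plain text field: model
        sep + crlf,
        b'Content-Disposition: form-data; name="model"' + crlf + crlf,
        model.encode("utf-8") + crlf,
        # File field: raw audio bytes
        sep + crlf,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode("utf-8") + crlf,
        b"Content-Type: application/octet-stream" + crlf + crlf,
        audio + crlf,
        # Closing boundary
        sep + b"--" + crlf,
    ])
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/transcriptions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

In practice an HTTP client library that handles multipart encoding would be less error-prone; the manual version is shown only to keep the example dependency-free.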

Vision Analysis

curl -X POST http://localhost:8000/v1/vision/analyze \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/image.jpg",
    "prompt": "Describe this image"
  }'

Check Usage Stats

curl http://localhost:8000/keys/YOUR_API_KEY/stats \
  -H "Authorization: Bearer YOUR_API_KEY"

Response includes:

Total requests (local vs. cloud)
Cost savings
Cache hit rate
Rate limit status
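How the proxy derives the rate limit status is not documented here. One common approach is a fixed-window counter per API key, sketched below under that assumption. `FixedWindowLimiter` is a hypothetical name, and the defaults mirror the `rate_limit` value from the key-creation example and the `RATE_LIMIT_WINDOW` default.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each fixed time window."""

    def __init__(self, limit: int = 100, window_seconds: int = 60,
                 clock: Callable[[], float] = time.time) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        # (api_key, window_index) -> request count.
        # Old windows are never pruned in this sketch.
        self._counts: DefaultDict[Tuple[str, int], int] = defaultdict(int)

    def _bucket(self, api_key: str) -> Tuple[str, int]:
        return (api_key, int(self._clock() // self._window))

    def allow(self, api_key: str) -> bool:
        """Record the request and return True if it fits in the current window."""
        bucket = self._bucket(api_key)
        if self._counts[bucket] >= self._limit:
            return False
        self._counts[bucket] += 1
        return True

    def remaining(self, api_key: str) -> int:
        """Requests still allowed in the current window."""
        return max(0, self._limit - self._counts.get(self._bucket(api_key), 0))
```

A production deployment with several API workers would keep these counters in a shared store such as Redis rather than in process memory.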

Deployment

Railway

railway up

Fly.io

fly deploy

Self-Hosted

Use the included Dockerfile and docker-compose.yml for custom deployments.

Tech Stack

Framework: FastAPI (Python 3.11+)
Local Models: faster-whisper, llama-cpp-python
Cloud APIs: OpenAI, Anthropic
Database: PostgreSQL + SQLAlchemy + Alembic
Cache: Redis
Deployment: Docker, Railway, Fly.io

Configuration

Key environment variables:

Variable	Description	Default
`LOCAL_MODELS_ENABLED`	Enable local model processing	`true`
`MAX_LOCAL_REQUESTS`	Concurrent local requests before fallback	`5`
`CACHE_TTL`	Cache expiration in seconds	`3600`
`RATE_LIMIT_WINDOW`	Rate limit window in seconds	`60`

See .env.example for complete configuration options.
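As an illustration of how these variables and their defaults might be read at startup, here is a minimal sketch. `ProxySettings` and `load_settings` are hypothetical names, and the set of accepted boolean spellings is an assumption; the project may parse its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ProxySettings:
    local_models_enabled: bool = True
    max_local_requests: int = 5
    cache_ttl: int = 3600          # seconds
    rate_limit_window: int = 60    # seconds


def _parse_bool(raw: str) -> bool:
    # Assumed truthy spellings; anything else counts as false.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> ProxySettings:
    """Read the documented variables, using the documented defaults when unset."""
    return ProxySettings(
        local_models_enabled=_parse_bool(env.get("LOCAL_MODELS_ENABLED", "true")),
        max_local_requests=int(env.get("MAX_LOCAL_REQUESTS", "5")),
        cache_ttl=int(env.get("CACHE_TTL", "3600")),
        rate_limit_window=int(env.get("RATE_LIMIT_WINDOW", "60")),
    )
```

Taking the environment as a parameter makes the loader easy to exercise with a plain dictionary, without touching the real process environment.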

License

MIT License - see LICENSE file for details.

Built for developers who value privacy without sacrificing reliability.

Version 0.1.0, released Mar 14, 2026
