A local-first Python framework that turns voice messages into executable actions.

VoiceRouter

Author: Alessandro Valenti

VoiceRouter is a local-first Python framework that turns voice/audio messages into executable Python actions.

Pipeline:

audio → transcription → intent detection → parameter extraction → function routing

No paid APIs. Local by default.

Install

If your system Python is externally-managed (PEP 668), use a virtual environment:

python3 -m venv .venv
. .venv/bin/activate
python -m pip install -U pip

Base (interfaces only):

pip install -e .

Local speech-to-text (STT) and embedding-based intent detection:

pip install -e ".[local,cli]"

Microphone live capture (optional):

pip install -e ".[audio]"

HTTP + WebSocket streaming server (optional):

pip install -e ".[server]"

If you want CPU-only PyTorch wheels (recommended for local-first CPU setups), install PyTorch from the CPU index first, then install VoiceRouter:

pip install --index-url https://download.pytorch.org/whl/cpu torch
pip install -e ".[local,cli]"

Optional NLP entities (spaCy):

pip install -e ".[nlp]"
python -m spacy download en_core_web_sm

Optional Ollama fallback:

pip install -e ".[ollama]"

Optional YAML intent config support:

pip install -e ".[yaml]"

Quickstart

Create a router, register intents, and route an audio file:

from voicerouter import VoiceRouter

router = VoiceRouter()


@router.intent(
    "create_ticket",
    examples=["open a ticket", "server is down", "I have a problem"],
)
def create_ticket(ctx):
    return {"status": "created", "params": ctx.params}


result = router.route("examples/server_down_en.wav")
print(result)

The first run may download open models from Hugging Face (e.g. the MiniLM embedding model). No paid APIs are used.
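
To make the decorator pattern from the quickstart concrete, here is a minimal sketch of how decorator-based registration can work. It is not VoiceRouter's implementation: `MiniRouter`, `Context`, and `dispatch` are hypothetical names, and the sketch skips transcription and intent detection entirely.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Context:
    """Hypothetical stand-in for the context object passed to handlers."""
    transcript: str
    params: Dict[str, Any] = field(default_factory=dict)


class MiniRouter:
    """Toy registry for illustration; not VoiceRouter's real class."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[[Context], Any]] = {}
        self.examples: Dict[str, List[str]] = {}

    def intent(self, name: str, examples: List[str]):
        """Return a decorator that records the handler and its example phrases."""
        def register(func: Callable[[Context], Any]) -> Callable[[Context], Any]:
            self.handlers[name] = func
            self.examples[name] = list(examples)
            return func  # the function stays callable on its own
        return register

    def dispatch(self, name: str, ctx: Context) -> Any:
        """Call the handler registered for an already-detected intent."""
        return self.handlers[name](ctx)


router = MiniRouter()


@router.intent("create_ticket", examples=["open a ticket", "server is down"])
def create_ticket(ctx: Context):
    return {"status": "created", "params": ctx.params}


outcome = router.dispatch("create_ticket", Context("server is down", {"host": "db1"}))
```

The example phrases are stored next to the handler so that an intent detector can later compare a transcript against them.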

Return format:

{
  "transcript": "...",
  "intent": "...",
  "confidence": 0.0,
  "params": {},
  "used_llm": false,
  "action_result": {}
}
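
A short sketch of consuming a result with this shape, assuming it is available as a plain dict (or as JSON, as the HTTP endpoints return). The sample values, the `summarize` helper, and the `MIN_CONFIDENCE` threshold are invented for illustration.

```python
import json

# Sample payload shaped like the return format above; the values are invented.
raw = """{
  "transcript": "the server is down",
  "intent": "create_ticket",
  "confidence": 0.82,
  "params": {},
  "used_llm": false,
  "action_result": {"status": "created"}
}"""

MIN_CONFIDENCE = 0.5  # hypothetical threshold chosen by the caller


def summarize(result: dict, min_confidence: float = MIN_CONFIDENCE) -> str:
    """Turn a routing result into a one-line summary, flagging weak matches."""
    if result["confidence"] < min_confidence:
        return f"low confidence ({result['confidence']:.2f}): {result['transcript']!r}"
    source = "LLM fallback" if result["used_llm"] else "intent detector"
    return f"{result['intent']} via {source}: {result['action_result']}"


result = json.loads(raw)
summary = summarize(result)
```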

CLI

voicerouter intents
voicerouter route path/to/audio.wav
voicerouter listen
voicerouter serve --host 127.0.0.1 --port 8000

The listen command is best-effort and depends on optional local audio tooling (e.g. sounddevice, numpy, scipy).

Streaming (HTTP / WebSocket)

Start the server:

voicerouter serve --host 127.0.0.1 --port 8000

Or run the example server (lets you customize intents in code):

.venv/bin/python examples/server_runner.py --host 127.0.0.1 --port 8000

Endpoints:

  • GET /health → {"status":"ok"}
  • GET /intents → {"intents":[...]}
  • POST /route (multipart upload: file=@audio.wav) → routing result JSON
  • POST /route/bytes (raw body bytes; default assumes wav) → routing result JSON
  • WS /ws/route (send binary audio chunks, then send text "end") → result JSON
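
As an illustration of the raw-bytes endpoint in the list above, the following sketch builds a POST /route/bytes request with only the standard library. The helper names are hypothetical, and the Content-Type header is an assumption; the README only says the server assumes wav by default.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # matches the serve command shown earlier


def build_route_bytes_request(audio: bytes, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /route/bytes request with the audio as the raw body."""
    return urllib.request.Request(
        url=f"{base_url}/route/bytes",
        data=audio,
        method="POST",
        # Assumption: the server does not require a specific content type.
        headers={"Content-Type": "application/octet-stream"},
    )


def route_bytes(audio: bytes, base_url: str = BASE_URL, timeout: float = 60.0) -> dict:
    """Send the request to a running server and decode the routing result JSON."""
    request = build_route_bytes_request(audio, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request needs no server; the bytes here are a placeholder, not real audio.
request = build_route_bytes_request(b"RIFF....WAVE")
```

With the server running, `route_bytes(open("examples/server_down_en.wav", "rb").read())` would return the routing result as a dict.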

HTTP Upload (curl)

curl -sS -X POST "http://127.0.0.1:8000/route" \
  -F "file=@examples/server_down_en.wav"

HTTP Upload (Python)

pip install httpx
.venv/bin/python examples/http_upload_client.py examples/server_down_en.wav

WebSocket Audio Stream (Python)

pip install websockets
.venv/bin/python examples/ws_audio_stream_client.py examples/server_down_en.wav
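
The message sequence for WS /ws/route (binary audio chunks, then the text "end") can be sketched without a network connection. The helper names and the 4096-byte chunk size are assumptions; the included example client may frame the audio differently.

```python
from typing import Iterator, List, Union

Message = Union[bytes, str]


def iter_chunks(audio: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield consecutive slices of the audio, each at most chunk_size bytes."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def stream_messages(audio: bytes, chunk_size: int = 4096) -> List[Message]:
    """Return the messages to send in order: binary chunks, then the text 'end'."""
    messages: List[Message] = list(iter_chunks(audio, chunk_size))
    messages.append("end")  # text frame that tells the server the audio is complete
    return messages


# Placeholder bytes stand in for real audio data.
messages = stream_messages(b"x" * 10, chunk_size=4)
```

With the websockets package, each item can be passed to `await ws.send(message)`: bytes go out as binary frames and the final string as a text frame, after which `await ws.recv()` reads the result JSON.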

Example Script

Run the included example:

.venv/bin/python examples/basic_usage.py

Or pass your own audio file:

.venv/bin/python examples/basic_usage.py /path/to/audio.wav

Microphone Live Example

pip install -e ".[local,audio]"
.venv/bin/python examples/microphone_live.py --seconds 5

Design

  • Transcription: faster-whisper backend (CPU-friendly int8 by default)
  • Intent detection: sentence-transformers embedding similarity against each intent's example phrases
  • Parameter extraction: regex rules by default, with optional spaCy entity extraction
  • Routing: decorator-based handlers that receive a typed context object
  • Optional fallback: a local Ollama model returns intent and params as JSON when confidence is low
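
The detection, extraction, and fallback steps listed above can be sketched with a toy stand-in. This is an assumption-heavy illustration: word counts replace the sentence-transformers embeddings, and the `check_status` intent, the `priority` regex rule, and `FALLBACK_THRESHOLD` are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, Tuple

INTENT_EXAMPLES = {
    "create_ticket": ["open a ticket", "server is down", "I have a problem"],
    "check_status": ["what is the status", "is the server up"],  # hypothetical intent
}


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real sentence vector)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def detect_intent(transcript: str) -> Tuple[str, float]:
    """Score each intent by its best-matching example and return the top one."""
    query = embed(transcript)
    scored = [
        (max(cosine(query, embed(example)) for example in examples), name)
        for name, examples in INTENT_EXAMPLES.items()
    ]
    confidence, intent = max(scored)
    return intent, confidence


def extract_params(transcript: str) -> Dict[str, str]:
    """Regex rule: pull a hypothetical ticket priority such as 'priority 2'."""
    match = re.search(r"\bpriority (\d+)\b", transcript.lower())
    return {"priority": match.group(1)} if match else {}


FALLBACK_THRESHOLD = 0.5  # hypothetical cut-off below which an LLM fallback would run

transcript = "the server is down priority 2"
intent, confidence = detect_intent(transcript)
params = extract_params(transcript)
needs_fallback = confidence < FALLBACK_THRESHOLD
```

Taking the maximum similarity over an intent's examples, rather than the average, lets a single close example win even when the intent's other examples are unrelated to the transcript.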

Extensibility

All major components are pluggable:

  • Custom STT engines
  • Custom intent detectors
  • Custom parameter extractors
  • Optional LLM fallback client

See voicerouter/router.py for the interfaces.
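
As a rough illustration of what a pluggable component can look like, the sketch below uses a structural `Protocol`. The `SpeechToText` interface, its `transcribe` method, and the `FixedTranscript` class are hypothetical; check voicerouter/router.py for the actual interface names and signatures.

```python
from typing import Protocol


class SpeechToText(Protocol):
    """Hypothetical interface; the real one is defined in voicerouter/router.py."""

    def transcribe(self, audio_path: str) -> str:
        ...


class FixedTranscript:
    """Test double that ignores the audio and returns a canned transcript."""

    def __init__(self, text: str) -> None:
        self.text = text

    def transcribe(self, audio_path: str) -> str:
        return self.text


def first_word(engine: SpeechToText, audio_path: str) -> str:
    """Accept any object with a matching transcribe(); no inheritance is needed."""
    return engine.transcribe(audio_path).split()[0]


word = first_word(FixedTranscript("server is down"), "ignored.wav")
```

A canned engine like this is handy for testing intent and routing logic without loading a speech model.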

License

MIT
