
Hivemind Voice Relay — local wakeword, remote STT/TTS via hivemind-core


HiveMind Voice Relay


Local wakeword detection; STT and TTS handled remotely by hivemind-core running the hivemind-audio-binary-protocol plugin.

Voice Relay runs the microphone, VAD, and wakeword engine on-device, which keeps wakeword detection private and low-latency. It forwards audio to hivemind-core (running the hivemind-audio-binary-protocol plugin) for speech-to-text and receives synthesised audio back for playback. No STT or TTS models run on the device.

Full documentation: docs/


Satellite spectrum

| Satellite | Mic | VAD | Wake word | STT | TTS | Connects to |
|---|---|---|---|---|---|---|
| HiveMind-cli | - | - | - | - | - | hivemind-core |
| hivemind-mic-satellite | local | local | server | server | server | core + audio-binary-protocol |
| HiveMind-voice-relay (this repo) | local | local | local | server | server | core + audio-binary-protocol |
| HiveMind-voice-sat | local | local | local | local | local | hivemind-core |

Voice Relay keeps wakeword detection on-device (low latency, no audio leaves until activation) while STT and TTS run on the hive. The point is not mainly resource savings — it is what it means for the hive to own speech services (see below).
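
As a rough illustration, the spectrum can be written down as data and queried. This is only a sketch transcribed from the satellite spectrum table; the dictionary layout and the helper names (`stages_on`, `satellites_matching`) are hypothetical, and the `None` entries for HiveMind-cli are an assumption based on its cells being empty.

```python
# Where each pipeline stage runs, transcribed from the satellite spectrum table.
# "local" = on the satellite, "server" = on the hive.
# None = stage not listed for that satellite (assumption: the cell is empty).
STAGES = ("mic", "vad", "wake_word", "stt", "tts")


def _row(*placements):
    """Pair the stage names with one table row of placements."""
    return dict(zip(STAGES, placements))


SATELLITES = {
    "HiveMind-cli": _row(None, None, None, None, None),
    "hivemind-mic-satellite": _row("local", "local", "server", "server", "server"),
    "HiveMind-voice-relay": _row("local", "local", "local", "server", "server"),
    "HiveMind-voice-sat": _row("local", "local", "local", "local", "local"),
}


def stages_on(satellite, where):
    """List the stages of one satellite that run at the given location."""
    return [stage for stage in STAGES if SATELLITES[satellite][stage] == where]


def satellites_matching(**placement):
    """Find satellites whose stages match every given stage=location pair."""
    return [
        name
        for name, stages in SATELLITES.items()
        if all(stages[stage] == where for stage, where in placement.items())
    ]


if __name__ == "__main__":
    print(stages_on("HiveMind-voice-relay", "server"))
    print(satellites_matching(wake_word="local", stt="server"))
```

Asking for a local wake word with server-side STT selects only HiveMind-voice-relay, which is the niche this project fills.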


Why voice-relay — HiveMind as a service

Voice-relay's real lesson is architectural. STT and TTS run inside the hive (the hivemind-audio-binary-protocol plugin on hivemind-core) and sit behind the same access-key authentication as the rest of the mesh. For a developer, the consequences matter more than the saved CPU:

  • The hive owns STT/TTS. A voice-sat can point at any STT/TTS plugin it likes — including a public ovos-stt-plugin-server / ovos-tts-plugin-server. A relay cannot: it does not choose the engine, model, or voice. The hive operator decides, centrally and uniformly, for every relay that connects.
  • Speech is authenticated. STT/TTS are not an open endpoint anyone can hit — access is gated by the client's HiveMind credentials, exactly like every other message on the protocol.
  • It is the reference for the b64 speech API. The relay sends audio for STT and receives speech for TTS as base64-encoded WAV over the HiveMessage bus (recognizer_loop:b64_transcribe, speak:b64_audio), doing the same work hivemind-mic-satellite does over the binary protocol. The relay illustrates the b64 path, though it could equally use the binary one.

Choose voice-relay when you want HiveMind to operate STT/TTS as a governed, authenticated service — uniform and centrally controlled — with wakeword kept local for latency and privacy. Lower device resource use is a consequence, not the goal.
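
To make the b64 path concrete, here is a minimal standard-library sketch of wrapping WAV audio as base64 and unwrapping it again. Only the two message type strings come from this README. The payload field names (`audio`, `lang`), the plain-dict message shape, and all function names are assumptions for illustration and are not the relay's actual code; the real client goes through hivemind-bus-client.

```python
import base64
import io
import wave


def make_silent_wav(seconds=0.5, sample_rate=16000):
    """Return the bytes of a mono 16-bit PCM WAV file containing silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


def build_transcribe_message(wav_bytes, lang="en-us"):
    """Wrap WAV bytes as a base64 STT request.

    The "audio" and "lang" field names are hypothetical.
    """
    return {
        "type": "recognizer_loop:b64_transcribe",
        "data": {
            "audio": base64.b64encode(wav_bytes).decode("ascii"),
            "lang": lang,
        },
    }


def decode_speech_message(message):
    """Recover WAV bytes from a base64 TTS message (same assumed shape)."""
    if message["type"] != "speak:b64_audio":
        raise ValueError(f"unexpected message type: {message['type']}")
    return base64.b64decode(message["data"]["audio"])


def wav_duration(wav_bytes):
    """Duration in seconds, read from the WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


if __name__ == "__main__":
    request = build_transcribe_message(make_silent_wav())
    print(request["type"], len(request["data"]["audio"]), "base64 chars")
```

Base64 inflates the audio by roughly a third compared with raw bytes, which is the usual trade-off against a binary transport: the payload fits in an ordinary JSON message at the cost of size.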


Server requirements

⚠️ Your hivemind-core server must have the hivemind-audio-binary-protocol binary plugin installed. Plain hivemind-core does not handle STT or TTS — connecting to it will result in silence (no transcription, no spoken response).

Alternatively, run hivemind-core together with ovos-audio and ovos-dinkum-listener to provide the same capabilities.


Install

pip install HiveMind-voice-relay

60-second quickstart

1. Configure identity (one-time):

hivemind-client set-identity --key YOUR_ACCESS_KEY --password YOUR_PASSWORD --host wss://your-listener-host

2. Run:

hivemind-voice-relay

3. Speak your wake word. The default wake word is hey mycroft (configured in ~/.config/mycroft/mycroft.conf).


CLI flags

Usage: hivemind-voice-relay [OPTIONS]

  connect to hivemind-core running the audio binary protocol

Options:
  --host TEXT      hivemind host (ws:// or wss://)
  --key TEXT       Access Key
  --password TEXT  Password for key derivation
  --port INTEGER   HiveMind port number (default: 5678)
  --selfsigned     Accept self-signed TLS certificates
  --siteid TEXT    Location identifier for message context
  --help           Show this message and exit.

All flags fall back to values stored by hivemind-client set-identity.
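
When launching the relay from a script, one way to respect that fallback is to pass only the flags you want to override. The sketch below builds the argument list from the options listed in the help text; the helper name `build_relay_command` is hypothetical, and the example host is a placeholder.

```python
def build_relay_command(host=None, key=None, password=None, port=None,
                        selfsigned=False, siteid=None):
    """Build an argv list for hivemind-voice-relay.

    Options left as None are omitted, so the relay falls back to the
    values stored by `hivemind-client set-identity`.
    """
    cmd = ["hivemind-voice-relay"]
    valued_flags = (
        ("--host", host),
        ("--key", key),
        ("--password", password),
        ("--port", port),
        ("--siteid", siteid),
    )
    for flag, value in valued_flags:
        if value is not None:
            cmd += [flag, str(value)]
    if selfsigned:
        cmd.append("--selfsigned")  # boolean switch, takes no value
    return cmd


if __name__ == "__main__":
    # Override only the host and accept a self-signed certificate.
    print(build_relay_command(host="wss://your-listener-host", selfsigned=True))
    # To actually start the relay:
    #   import subprocess
    #   subprocess.run(build_relay_command(selfsigned=True), check=True)
```

Passing the command as a list rather than a shell string avoids quoting problems with keys or passwords that contain special characters.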


Configuration

Voice Relay reads ~/.config/mycroft/mycroft.conf (standard OVOS config).

| Plugin type | Config key | Default | Required |
|---|---|---|---|
| Microphone | microphone.module | ovos-microphone-plugin-alsa | Yes |
| VAD | listener.VAD.module | ovos-vad-plugin-silero | Yes |
| Wake word | listener.wake_word | hey_mycroft | Yes |
| G2P | tts.g2p_module | | No |
| Media Playback | Audio.backends | | No |
| OCP Plugins | | | No |
| Dialog Transformers | | | No |
| TTS Transformers | | | No |
| PHAL | | | No (auto-loaded if installed) |

See docs/configuration.md for full details and plugin swap instructions.
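
To check which required plugins a given config would select, the dotted config keys can be resolved against the file by hand. This is a simplified sketch, not how the relay loads its configuration: it assumes the file is plain JSON without comments, reads only the user-level file, and ignores whatever merging of system and default configs OVOS performs. The helper names are hypothetical; the keys and defaults are the three required rows of the configuration table.

```python
import json
from pathlib import Path

# Required plugin keys and their documented defaults.
REQUIRED_DEFAULTS = {
    "microphone.module": "ovos-microphone-plugin-alsa",
    "listener.VAD.module": "ovos-vad-plugin-silero",
    "listener.wake_word": "hey_mycroft",
}


def lookup(config, dotted_key, default=None):
    """Resolve a key like "listener.VAD.module" in a nested dict."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


def effective_plugins(config):
    """Return the value in effect for each required key."""
    return {
        key: lookup(config, key, default)
        for key, default in REQUIRED_DEFAULTS.items()
    }


def load_user_config(path=None):
    """Load the user config, or {} if the file does not exist.

    Assumption: the file parses as strict JSON.
    """
    if path is None:
        path = Path.home() / ".config" / "mycroft" / "mycroft.conf"
    path = Path(path)
    if not path.is_file():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    for key, value in effective_plugins(load_user_config()).items():
        print(f"{key} = {value}")
```

For example, a config containing only `{"listener": {"wake_word": "hey_jarvis"}}` would report that wake word while the microphone and VAD entries stay at their defaults (`hey_jarvis` is just an illustrative value).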


Features and limitations

Built on ovos-simple-listener. Compared to the full voice-satellite:

Present:

  • Microphone capture, VAD, wakeword detection — all local
  • Audio forwarded to hivemind-core (hivemind-audio-binary-protocol plugin) for STT (base64-encoded WAV over the HiveMessage bus)
  • TTS audio synthesised server-side and streamed back for local playback
  • PHAL (platform hardware abstraction) auto-loaded if installed
  • Standard OVOS plugin system for mic, VAD, and wakeword

Not supported (use HiveMind-voice-sat if you need these):

  • Local STT / TTS plugins
  • Audio Transformers
  • Continuous / Hybrid / Recording / Sleep listening modes
  • Multiple wake words

Related

| Project | Role |
|---|---|
| hivemind-audio-binary-protocol | Required hivemind-core plugin; provides server-side STT + TTS |
| hivemind-core | Base mesh node (no STT/TTS) |
| HiveMind-cli | Text-only satellite |
| hivemind-mic-satellite | Thinnest audio satellite (no local wakeword) |
| HiveMind-voice-sat | Full local stack satellite |
| hivemind-bus-client | HiveMind WebSocket client library |
| ovos-simple-listener | Lightweight listener library used internally |

Development

Install from source with the end-to-end test extra, then run the suite:

uv pip install -e ".[e2e]"
pytest tests/

pyproject.toml is the single packaging source of truth. The E2E suite runs a real hivemind-core master in-process and the real relay client over a real HiveMessageBusClient, with the microphone/wakeword and the remote STT/TTS endpoints mocked — see docs/development.md.


License

Apache-2.0
