Audio binary protocol for HiveMind

hivemind-audio-binary-protocol

Binary audio plugin for hivemind-core.

Adds server-side WakeWord detection, VAD, STT, and TTS to a hivemind-core hub so that lightweight satellites (e.g. hivemind-mic-satellite) can stream raw audio and receive transcriptions or synthesised speech without running those models locally.

This plugin is the direct replacement for the old "HiveMind-listener" proof-of-concept.

Where it fits

hivemind-core
  └── hivemind-plugin-manager  (BinaryDataHandlerFactory loads plugins by entry-point)
        └── hivemind-audio-binary-protocol  ← this repo
              ├── ovos-simple-listener  (WakeWord + VAD + STT pipeline)
              └── OVOSTTSFactory / OVOSSTTFactory / OVOSVADFactory / OVOSWakeWordFactory

The plugin registers under the hivemind.binary.protocol entry-point group as hivemind-audio-binary-protocol-plugin.
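
To confirm that the plugin is discoverable after installation, the entry-point group can be inspected with the Python standard library. This is a minimal sketch: only the group name comes from the text above, and `list_binary_protocol_plugins` is a hypothetical helper name, not part of hivemind-core or hivemind-plugin-manager.

```python
from importlib.metadata import entry_points


def list_binary_protocol_plugins(group="hivemind.binary.protocol"):
    """Return the sorted names of entry points registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry-point collection
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: plain dict keyed by group name
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


if __name__ == "__main__":
    # Prints an empty list when no plugin is installed in this environment.
    print(list_binary_protocol_plugins())
```

If the package is installed in the active environment, the printed list should include hivemind-audio-binary-protocol-plugin.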

Install

pip install hivemind-audio-binary-protocol

You also need OVOS STT, TTS, VAD, and WakeWord plugins. Install them as you would in a standard OVOS setup:

pip install ovos-stt-plugin-server ovos-tts-plugin-piper ovos-vad-plugin-silero \
            ovos-ww-plugin-precise-lite

Quickstart

Add the binary_protocol block to ~/.config/hivemind-core/server.json:

{
  "binary_protocol": {
    "module": "hivemind-audio-binary-protocol-plugin",
    "hivemind-audio-binary-protocol-plugin": {
      "stt": {
        "module": "ovos-stt-plugin-server",
        "ovos-stt-plugin-server": {"url": "https://stt.openvoiceos.org"}
      },
      "tts": {
        "module": "ovos-tts-plugin-piper",
        "ovos-tts-plugin-piper": {"voice": "en_US-lessac-medium"}
      },
      "vad": {
        "module": "ovos-vad-plugin-silero"
      },
      "wake_word": "hey_mycroft",
      "hotwords": {
        "hey_mycroft": {
          "module": "ovos-ww-plugin-precise-lite",
          "model": "https://github.com/OpenVoiceOS/precise-lite-models/raw/master/wakewords/en/hey_mycroft.tflite"
        }
      }
    }
  }
}
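
If server.json already holds other settings, the block can be merged in rather than pasted by hand. The following is a sketch under assumptions: `merge_binary_protocol` is a hypothetical helper, it assumes the file (when present) contains a valid JSON object, and it replaces any existing binary_protocol key while leaving the other top-level keys untouched.

```python
import json
import os
import tempfile


def merge_binary_protocol(path, plugin_config,
                          module="hivemind-audio-binary-protocol-plugin"):
    """Set the binary_protocol block in the JSON file at `path`, keeping other keys."""
    config = {}
    if os.path.isfile(path):
        with open(path, "r", encoding="utf-8") as f:
            config = json.load(f)
    # Same shape as the block above: "module" names the plugin,
    # and a key with that name holds the plugin's own config.
    config["binary_protocol"] = {"module": module, module: plugin_config}
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return config


if __name__ == "__main__":
    # Demo against a temporary file so no real configuration is touched.
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "server.json")
        merged = merge_binary_protocol(demo_path, {"wake_word": "hey_mycroft"})
        print(json.dumps(merged, indent=2))
```

For a real installation, pass `os.path.expanduser("~/.config/hivemind-core/server.json")` as `path` and the full plugin config from the block above as `plugin_config`.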

Then start hivemind-core with the listen subcommand:

hivemind-core listen

Audio streaming modes

This plugin handles three distinct binary audio flows:

| Mode | Client sends | Hub returns | Use case |
|------|--------------|-------------|----------|
| Microphone stream | Raw PCM audio chunks | Bus messages (wakeword/utterance events) | Mic satellite; hub does all pipeline processing |
| STT transcription | Raw PCM audio | recognizer_loop:transcribe.response | Client wants transcription without triggering skills |
| STT handle | Raw PCM audio | Triggers recognizer_loop:utterance on the bus | Client wants the hub to handle the utterance |
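
All three flows start with the client sending raw PCM. As a rough illustration of the "chunks" in the microphone-stream mode, the sketch below splits a PCM buffer into fixed-size pieces. The chunk size and the 16-bit sample width are assumptions (the text above does not state the expected audio format), `iter_pcm_chunks` is a hypothetical helper, and how each chunk is wrapped and sent over the HiveMind connection is not shown.

```python
def iter_pcm_chunks(pcm, chunk_bytes=4096, sample_width=2):
    """Yield consecutive slices of raw PCM bytes that start on sample boundaries.

    chunk_bytes and sample_width are illustrative defaults, not values
    taken from the plugin.
    """
    if chunk_bytes <= 0 or chunk_bytes % sample_width:
        raise ValueError("chunk_bytes must be a positive multiple of sample_width")
    for start in range(0, len(pcm), chunk_bytes):
        # The final slice may be shorter than chunk_bytes.
        yield pcm[start:start + chunk_bytes]


if __name__ == "__main__":
    silence = bytes(10000)  # 5000 samples of 16-bit silence
    print([len(chunk) for chunk in iter_pcm_chunks(silence)])
```

Keeping chunks aligned to the sample width avoids splitting a sample across two messages.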

TTS runs in the opposite direction: a speak:synth or speak:b64_audio bus message triggers synthesis, and the hub returns the result to the client as binary WAV audio or as a Base64-encoded string.
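
On the client side, a Base64 reply has to be decoded before it can be played. The sketch below assumes the Base64 string encodes a complete WAV file, which the text above suggests but does not spell out. `describe_b64_wav` and `make_silent_wav` are hypothetical helpers, and the name of the message field that carries the string is not covered here.

```python
import base64
import io
import wave


def describe_b64_wav(b64_audio):
    """Decode Base64 text to WAV bytes and return (channels, sample_rate, n_frames)."""
    wav_bytes = base64.b64decode(b64_audio)
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnchannels(), wav.getframerate(), wav.getnframes()


def make_silent_wav(n_frames=160, sample_rate=16000):
    """Build a small mono 16-bit WAV in memory, used only to exercise the decoder."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


if __name__ == "__main__":
    encoded = base64.b64encode(make_silent_wav()).decode("ascii")
    print(describe_b64_wav(encoded))
```

The binary variant skips the `base64.b64decode` step: the received bytes can be passed to `wave.open` directly.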

Configuration reference

The plugin's config block mirrors the OVOS plugin config convention. Each sub-plugin (stt, tts, vad) takes its standard OVOS config:

| Key | Description |
|-----|-------------|
| stt | STT plugin config. module selects the OVOS STT plugin. |
| tts | TTS plugin config. module selects the OVOS TTS plugin. |
| vad | VAD plugin config. module selects the OVOS VAD plugin. |
| wake_word | WakeWord name (key into hotwords). |
| hotwords | Dict of wakeword configurations, keyed by wakeword name. |
| utterance_transformers | List of OVOS utterance transformer plugin names. |
| dialog_transformers | List of OVOS dialog transformer plugin names. |
| metadata_transformers | List of OVOS metadata transformer plugin names. |

If the config block is omitted, the plugin falls back to reading mycroft.conf (the standard OVOS configuration file) to select plugins.
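
The rules in the table can be checked before starting the hub. This is a sketch only: `check_plugin_config` is a hypothetical helper, the checks are derived from the key descriptions above, and the plugin itself may validate differently or not at all. Absent sections are not flagged, since the text describes a fallback to mycroft.conf; whether that fallback applies per key is not stated here.

```python
def check_plugin_config(cfg):
    """Return a list of human-readable problems found in the plugin config block."""
    problems = []
    # Each sub-plugin section, when present, should name its plugin via "module".
    for key in ("stt", "tts", "vad"):
        section = cfg.get(key)
        if section is not None and "module" not in section:
            problems.append(f"{key}: missing 'module'")
    # wake_word is described as a key into hotwords.
    wake_word = cfg.get("wake_word")
    hotwords = cfg.get("hotwords", {})
    if wake_word is not None and wake_word not in hotwords:
        problems.append(f"wake_word {wake_word!r} has no entry in hotwords")
    # Transformer settings are described as lists of plugin names.
    for key in ("utterance_transformers", "dialog_transformers",
                "metadata_transformers"):
        if key in cfg and not isinstance(cfg[key], list):
            problems.append(f"{key}: expected a list of plugin names")
    return problems


if __name__ == "__main__":
    example = {
        "stt": {"module": "ovos-stt-plugin-server"},
        "wake_word": "hey_mycroft",
        "hotwords": {"hey_mycroft": {"module": "ovos-ww-plugin-precise-lite"}},
    }
    print(check_plugin_config(example))
```

An empty list means none of these particular checks failed; it does not guarantee that the named plugins are installed.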

Access control

This plugin respects hivemind-core's per-client allowed_types whitelist. Clients must be provisioned with appropriate access to send binary audio or receive TTS output.
