Audio binary protocol for HiveMind

hivemind-audio-binary-protocol

Binary audio plugin for hivemind-core.

Adds server-side WakeWord detection, VAD, STT, and TTS to a hivemind-core hub so that lightweight satellites (e.g. hivemind-mic-satellite) can stream raw audio and receive transcriptions or synthesised speech without running those models locally.

This plugin is the direct replacement for the old "HiveMind-listener" proof-of-concept.

Where it fits

hivemind-core
  └── hivemind-plugin-manager  (BinaryDataHandlerFactory loads plugins by entry-point)
        └── hivemind-audio-binary-protocol  ← this repo
              ├── ovos-simple-listener  (WakeWord + VAD + STT pipeline)
              └── OVOSTTSFactory / OVOSSTTFactory / OVOSVADFactory / OVOSWakeWordFactory

The plugin registers under the hivemind.binary.protocol entry-point group as hivemind-audio-binary-protocol-plugin.
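To confirm that the plugin is visible to the plugin manager after installation, the entry-point group can be inspected with the standard library. This is a minimal sketch that relies only on `importlib.metadata`; the helper name `list_binary_protocol_plugins` is hypothetical and not part of hivemind-core or hivemind-plugin-manager.

```python
from importlib.metadata import entry_points

GROUP = "hivemind.binary.protocol"  # entry-point group named in this README


def list_binary_protocol_plugins(group=GROUP):
    """Return the sorted names of entry points registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and newer expose a select() method.
        found = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name.
        found = eps.get(group, [])
    return sorted(ep.name for ep in found)


if __name__ == "__main__":
    # An empty list means no plugin is installed for this group.
    print(list_binary_protocol_plugins())
```

If the package is installed, the printed list should include hivemind-audio-binary-protocol-plugin.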

Install

pip install hivemind-audio-binary-protocol

You also need OVOS STT, TTS, VAD, and WakeWord plugins. Install them as you would in a standard OVOS setup:

pip install ovos-stt-plugin-server ovos-tts-plugin-piper ovos-vad-plugin-silero \
            ovos-ww-plugin-precise-lite

Quickstart

Add the binary_protocol block to ~/.config/hivemind-core/server.json:

{
  "binary_protocol": {
    "module": "hivemind-audio-binary-protocol-plugin",
    "hivemind-audio-binary-protocol-plugin": {
      "stt": {
        "module": "ovos-stt-plugin-server",
        "ovos-stt-plugin-server": {"url": "https://stt.openvoiceos.org"}
      },
      "tts": {
        "module": "ovos-tts-plugin-piper",
        "ovos-tts-plugin-piper": {"voice": "en_US-lessac-medium"}
      },
      "vad": {
        "module": "ovos-vad-plugin-silero"
      },
      "wake_word": "hey_mycroft",
      "hotwords": {
        "hey_mycroft": {
          "module": "ovos-ww-plugin-precise-lite",
          "model": "https://github.com/OpenVoiceOS/precise-lite-models/raw/master/wakewords/en/hey_mycroft.tflite"
        }
      }
    }
  }
}

Then start hivemind-core with the listen subcommand:

hivemind-core listen
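The configuration step above can also be scripted, which helps when provisioning several hubs. The sketch below writes the binary_protocol block into an existing server.json while leaving other top-level keys untouched. The function name `set_binary_protocol` is hypothetical, and the sketch assumes server.json is plain JSON without comments.

```python
import json
from pathlib import Path

PLUGIN = "hivemind-audio-binary-protocol-plugin"


def set_binary_protocol(config_path, plugin_config):
    """Write a binary_protocol block into server.json, keeping other keys."""
    path = Path(config_path).expanduser()
    # Start from the existing file if there is one, else from an empty config.
    config = json.loads(path.read_text()) if path.exists() else {}
    config["binary_protocol"] = {"module": PLUGIN, PLUGIN: plugin_config}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `set_binary_protocol("~/.config/hivemind-core/server.json", {...})` with the stt, tts, vad, wake_word, and hotwords entries produces the same layout as the Quickstart example. Any existing binary_protocol block is replaced, not merged.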

Audio streaming modes

This plugin handles three distinct binary audio flows:

| Mode | Client sends | Hub returns | Use case |
|---|---|---|---|
| Microphone stream | Raw PCM audio chunks | Bus messages (wakeword/utterance events) | Mic satellite; hub does all pipeline processing |
| STT transcription | Raw PCM audio | recognizer_loop:transcribe.response | Client wants transcription without triggering skills |
| STT handle | Raw PCM audio | Triggers recognizer_loop:utterance on the bus | Client wants the hub to handle the utterance |

TTS is triggered by bus messages (speak:synth or speak:b64_audio). The hub returns the synthesised speech to the client as binary WAV audio or as a Base64-encoded string.
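All three client-to-hub modes carry raw PCM, meaning audio samples without a WAV header. As an illustration of what "raw PCM audio chunks" means in practice, the sketch below strips the header from a WAV source and yields fixed-size chunks. The helper names, the chunk size, and the 16 kHz mono 16-bit format are assumptions made for the example, not values taken from the plugin. Sending the chunks to the hub is left out because it depends on the client library in use.

```python
import io
import wave


def iter_pcm_chunks(wav_file, frames_per_chunk=1024):
    """Yield raw PCM byte chunks from a WAV file or file-like object."""
    with wave.open(wav_file, "rb") as wav:
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break  # end of audio data
            yield chunk


def make_silence_wav(n_frames, sample_rate=16000):
    """Build an in-memory mono 16-bit WAV of silence, for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono (assumed format)
        wav.setsampwidth(2)   # 16-bit samples (assumed format)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    sizes = [len(c) for c in iter_pcm_chunks(make_silence_wav(2500))]
    print(sizes)  # [2048, 2048, 904]
```

Each chunk holds at most 1024 frames of 2 bytes, so 2500 frames split into two full chunks and one shorter final chunk.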

Configuration reference

The plugin's config block mirrors the OVOS plugin config convention. Each sub-plugin (stt, tts, vad) takes its standard OVOS config:

| Key | Description |
|---|---|
| stt | STT plugin config. module selects the OVOS STT plugin. |
| tts | TTS plugin config. module selects the OVOS TTS plugin. |
| vad | VAD plugin config. module selects the OVOS VAD plugin. |
| wake_word | WakeWord name (key into hotwords). |
| hotwords | Dict of wakeword configurations, keyed by wakeword name. |
| utterance_transformers | List of OVOS utterance transformer plugin names. |
| dialog_transformers | List of OVOS dialog transformer plugin names. |
| metadata_transformers | List of OVOS metadata transformer plugin names. |

If the config block is omitted, the plugin falls back to reading mycroft.conf (the standard OVOS configuration file) to select plugins.
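The relationships between these keys can be checked before starting the hub. The following sketch is a hypothetical pre-flight check, not something the plugin provides: it only encodes the rules stated in the table above (each sub-plugin section names a module, wake_word is a key in hotwords, transformer entries are lists). Missing keys are not reported, on the assumption that they are optional.

```python
TRANSFORMER_KEYS = (
    "utterance_transformers",
    "dialog_transformers",
    "metadata_transformers",
)


def check_plugin_config(cfg):
    """Return a list of problems found in the plugin's config block."""
    problems = []
    # Each sub-plugin section that is present should select a module.
    for key in ("stt", "tts", "vad"):
        section = cfg.get(key)
        if section is not None and "module" not in section:
            problems.append(f"'{key}' has no 'module' entry")
    # wake_word is documented as a key into hotwords.
    wake_word = cfg.get("wake_word")
    if wake_word is not None and wake_word not in cfg.get("hotwords", {}):
        problems.append(f"wake_word '{wake_word}' is not a key in 'hotwords'")
    # Transformer settings are documented as lists of plugin names.
    for key in TRANSFORMER_KEYS:
        if key in cfg and not isinstance(cfg[key], list):
            problems.append(f"'{key}' should be a list of plugin names")
    return problems
```

An empty list means the block is consistent with the reference table. It does not guarantee that the named plugins are installed or that they load.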

Access control

This plugin respects hivemind-core's per-client allowed_types whitelist. Clients must be provisioned with appropriate access to send binary audio or receive TTS output.
