
OVOS wake word verifier plugin: reject wake words from non-enrolled speakers


ovos-ww-verifier-plugin-speaker

OVOS wake word verifier plugin that accepts voice commands only from enrolled household members.

After a wake word engine detects an activation, this verifier extracts a speaker embedding from the captured audio and compares it against enrolled profiles. Activations from unrecognised speakers are silently dropped.
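The comparison step can be pictured as a cosine-similarity check between the embedding of the new activation and each enrolled embedding. The sketch below is illustrative only: `cosine_similarity` and `best_match` are hypothetical helper names, the three-dimensional vectors are toy data, and the plugin's real internals may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_match(embedding, profiles):
    """Return (name, score) of the closest profile, or (None, -1.0) if there are none."""
    best_name, best_score = None, -1.0
    for name, enrolled in profiles.items():
        score = cosine_similarity(embedding, enrolled)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Toy embeddings (assumption): real ones have hundreds of dimensions.
profiles = {"Alice": [1.0, 0.0, 0.0], "Bob": [0.0, 1.0, 0.0]}
name, score = best_match([0.9, 0.1, 0.0], profiles)
```

The resulting score would then be compared against the configured threshold to accept or drop the activation.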

Use case

Alice and Bob live together and use OVOS at home. They enroll their voices once. When a guest visits, their "Hey Mycroft" triggers the wake word detector — but the speaker verifier rejects it before any intent is processed. Alice and Bob's commands go through normally.

Privacy note

Speaker profiles are stored as fixed-length numeric vectors (embeddings) in a local JSON file at ~/.local/share/ovos_speaker_verifier/profiles.json. No audio is retained after embedding extraction. Embeddings cannot be reversed into audio.

Install

pip install ovos-ww-verifier-plugin-speaker

Enroll household members

ovos-speaker-enroll Alice clip1.wav clip2.wav clip3.wav
ovos-speaker-enroll Bob morning_command.wav evening_command.wav

Adding more clips (5–30 s of audio in total per person) makes the profile more robust.
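To check whether a set of enrollment clips falls in the suggested 5–30 s range, their durations can be summed with Python's standard `wave` module. This is a hedged sketch: `total_duration_seconds` and `write_silence` are hypothetical helpers, not part of the plugin, and the approach only handles uncompressed PCM WAV files.

```python
import os
import tempfile
import wave


def total_duration_seconds(paths):
    """Sum the playback duration of PCM WAV files, in seconds."""
    total = 0.0
    for path in paths:
        with wave.open(path, "rb") as wav:
            total += wav.getnframes() / wav.getframerate()
    return total


def write_silence(path, seconds, sample_rate=16000):
    """Write a mono 16-bit silent clip; used only to make this example runnable."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))


with tempfile.TemporaryDirectory() as tmp:
    # Three generated 2-second clips stand in for real recordings.
    clips = [os.path.join(tmp, f"clip{i}.wav") for i in range(3)]
    for clip in clips:
        write_silence(clip, 2.0)
    total = total_duration_seconds(clips)
    in_range = 5.0 <= total <= 30.0
```

With real recordings, pass the same list of paths that would be given to ovos-speaker-enroll.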

OVOS configuration

Add to ~/.config/mycroft/mycroft.conf (or OpenVoiceOS equivalent):

{
  "hotwords": {
    "hey mycroft": {
      "module": "...",
      "verifier": "ovos-ww-verifier-speaker",
      "verifier_config": {
        "model": "wespeaker-resnet34",
        "threshold": 0.45,
        "fail_open": true
      }
    }
  }
}

Configuration keys

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | "wespeaker-resnet34" | speakeronnx model alias or .onnx path |
| threshold | float | 0.45 | Cosine similarity acceptance threshold |
| fail_open | bool | true | Accept all activations when no profiles enrolled |
| profiles_path | str | ~/.local/share/ovos_speaker_verifier/profiles.json | Override profile storage path |
| per_profile_thresholds | dict | {} | Per-name threshold overrides, e.g. {"Alice": 0.5} |
| sample_rate | int | 16000 | PCM sample rate of audio chunks passed to verify() |
| sample_width | int | 2 | PCM sample width in bytes (2 = 16-bit) |
| channels | int | 1 | PCM channel count |
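One plausible reading of how threshold, per_profile_thresholds and fail_open interact is sketched below. The key descriptions do not spell out the exact decision logic, so treat this as an assumption: `decide` is a hypothetical function, and whether the real comparison is `>=` or `>` is a guess.

```python
def decide(scores, threshold=0.45, per_profile_thresholds=None, fail_open=True):
    """Accept or reject one activation.

    `scores` maps each enrolled name to the cosine similarity between that
    profile and the current activation (hypothetical input shape).
    """
    per_profile_thresholds = per_profile_thresholds or {}
    if not scores:
        # No profiles enrolled: fail_open decides whether everything passes.
        return fail_open
    for name, score in scores.items():
        # A per-name override wins over the global threshold.
        limit = per_profile_thresholds.get(name, threshold)
        if score >= limit:  # assumption: inclusive comparison
            return True
    return False


accepted_default = decide({"Alice": 0.48})
accepted_strict = decide({"Alice": 0.48}, per_profile_thresholds={"Alice": 0.5})
```

Here the same score of 0.48 passes the global default of 0.45 but fails once Alice is given a stricter override of 0.5.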

Supported models

The model key accepts any alias from speakeronnx's registry (models are downloaded from HuggingFace on first use and cached):

| Alias | Architecture |
| --- | --- |
| wespeaker-resnet34 (default) | WeSpeaker ResNet34 r-vector |
| wespeaker-ecapa512 | WeSpeaker ECAPA-TDNN-512 |
| wespeaker-resnet293 | WeSpeaker ResNet293 (large) |
| campplus | WeSpeaker CAM++ |
| campplus-zh-en | CAM++ (zh/en) |
| eres2net | ERes2Net |
| titanet-small | NVIDIA TitaNet-Small |
| titanet-large | NVIDIA TitaNet-Large |
| redimnet-b2 | ReDimNet-B2 |

Threshold tuning

The acceptance threshold is model-specific and does not transfer between models. Cosine-similarity scales differ enormously across architectures: in our tests the same enrolled-vs-guest pair scored about 0.95 (enrolled) and 0.89 (guest) on titanet-small, but only about 0.17 and 0.14 on campplus. The default of 0.45 is calibrated for the default model, wespeaker-resnet34; if you change model, you must re-tune threshold.

To pick a value, enroll a speaker, compare the verify() scores for genuine and guest clips, and choose a threshold that sits between them (tests/test_ovoscope_models_e2e.py calibrates this per model automatically). For a given model, lower the threshold for noisier or distant-microphone setups and raise it for stricter security.
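As a worked example of choosing a value between the two score groups, the sketch below takes the midpoint between the weakest genuine score and the strongest guest score. The midpoint rule is just one simple choice and is not claimed to be what the test suite does; `pick_threshold` is a hypothetical helper, and the inputs reuse the figures quoted above, read as enrolled / guest scores.

```python
def pick_threshold(genuine_scores, guest_scores):
    """Midpoint between the weakest genuine score and the strongest guest score."""
    if not genuine_scores or not guest_scores:
        raise ValueError("need at least one score of each kind")
    lowest_genuine = min(genuine_scores)
    highest_guest = max(guest_scores)
    if lowest_genuine <= highest_guest:
        # The two groups overlap, so no single cut-off separates them.
        raise ValueError("scores overlap; no single threshold separates them")
    return (lowest_genuine + highest_guest) / 2.0


# Figures quoted in this section, assumed to be (enrolled, guest) scores.
titanet_threshold = pick_threshold([0.95], [0.89])    # roughly 0.92
campplus_threshold = pick_threshold([0.17], [0.14])   # roughly 0.155
```

The two results differ by a wide margin, which is why a threshold tuned for one model cannot be reused for another.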

Python API

from ovos_ww_verifier_plugin_speaker import SpeakerVerifier

v = SpeakerVerifier(config={"threshold": 0.45, "fail_open": False})
v.enroll("Alice", ["alice1.wav", "alice2.wav"])

# In wake word callback:
accepted = v.verify(pcm_bytes)  # True if Alice spoke
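For offline experiments, raw PCM bytes matching the documented defaults (16 kHz, 16-bit, mono) can be read from a WAV file with the standard `wave` module. This is a minimal sketch under stated assumptions: `read_pcm` is a hypothetical helper rather than part of the plugin, and the demo clip is generated silence.

```python
import os
import tempfile
import wave


def read_pcm(path, sample_rate=16000, sample_width=2, channels=1):
    """Return the raw PCM frames of a WAV file after checking its format."""
    with wave.open(path, "rb") as wav:
        actual = (wav.getframerate(), wav.getsampwidth(), wav.getnchannels())
        expected = (sample_rate, sample_width, channels)
        if actual != expected:
            raise ValueError(
                f"expected (rate, width, channels)={expected}, got {actual}"
            )
        return wav.readframes(wav.getnframes())


with tempfile.TemporaryDirectory() as tmp:
    # Generate half a second of silence so the example runs without real audio.
    path = os.path.join(tmp, "sample.wav")
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        wav.writeframes(b"\x00\x00" * 8000)
    pcm_bytes = read_pcm(path)
```

The resulting `pcm_bytes` has the shape of input that `v.verify(pcm_bytes)` is described as taking; a file recorded at a different rate or width raises a ValueError instead of being scored silently in the wrong format.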

Testing

pip install -e ".[test]"
pytest tests/test_unit.py tests/test_ovoscope_e2e.py   # fast, offline
  • test_unit.py — verifier policy logic (enrolment, thresholds, fail-open).
  • test_ovoscope_e2e.py — drives the verifier through a real listener (ovoscope.MiniVoiceLoop) and asserts a rejected speaker suppresses recognizer_loop:record_begin on the bus. Fast; no model download.
  • test_e2e.py / test_ovoscope_models_e2e.py — real-model tests over every speakeronnx model, using edge-tts synthetic voices to confirm only the enrolled speaker triggers the wake word. Require edge-tts + ffmpeg and download models; they skip automatically when unavailable.

Dependencies

  • speakeronnx (onnxruntime + numpy + huggingface_hub)
  • ovos-plugin-manager

Credits

Developed by TigreGotico for OpenVoiceOS.

Funded by NGI0 Commons Fund / NLnet under grant agreement No 101135429, through the European Commission's Next Generation Internet programme.
