
ovos-core listener daemon client


OpenVoiceOS Dinkum Listener

Documentation can be found in the technical manual

Install

Run pip install ovos-dinkum-listener[extras] to install this package and the default plugins.

Without extras, you will also need to manually install, and possibly configure, the STT, WW (wake word), and VAD modules as described below.

Configuration

You can set the wake word, VAD, STT, and microphone plugins.

For example, to run under macOS you should use https://github.com/OpenVoiceOS/ovos-microphone-plugin-sounddevice

A non-exhaustive list of config options:

{
  "stt": {
    "module": "ovos-stt-plugin-server",
    "fallback_module": "",
    "ovos-stt-plugin-server": {"url": "https://stt.openvoiceos.com/stt"}
  },
  "listener": {
    // NOTE, multiple hotwords are supported, these fields define the main wake_word,
    // this is equivalent to setting "active": true in the "hotwords" section
    // see "hotwords" section at https://github.com/OpenVoiceOS/ovos-config/blob/dev/ovos_config/mycroft.conf
    "wake_word": "hey_mycroft",
    "stand_up_word": "wake_up",
    "microphone": {
      "module": "ovos-microphone-plugin-alsa"
    },
    // If enabled will only check for wakeword if VAD also detected speech
    // this should reduce false activations
    "vad_pre_wake_enabled": true,
    // Voice Activity Detection is used to determine when users are speaking
    "VAD": {
      // recommended plugin: "ovos-vad-plugin-silero"
      "module": "ovos-vad-plugin-silero",
      "ovos-vad-plugin-silero": {"threshold": 0.2},
      "ovos-vad-plugin-webrtcvad": {"vad_mode": 3}
    },
    // Seconds of speech before voice command has begun
    "speech_begin": 0.1,
    // Seconds of silence before a voice command has finished
    "silence_end": 0.5,
    // Settings used by microphone to set recording timeout with and without speech detected
    "recording_timeout": 10.0,
    // Settings used by microphone to set recording timeout without speech detected.
    "recording_timeout_with_silence": 3.0,
    // max time allowed without user speaking before exiting RECORDING mode
    "recording_mode_max_silence_seconds": 30.0,
    // Setting to remove all silence/noise from start and end of recorded speech (only non-streaming)
    "remove_silence": true,
    // continuous listen is an experimental setting, it removes the need for
    // wake words and uses VAD only, a streaming STT is strongly recommended
    // NOTE: depending on hardware this may cause mycroft to hear its own TTS responses as questions
    "continuous_listen": false,

    // hybrid listen is an experimental setting,
    // it will not require a wake word for X seconds after a user interaction
    // this means you don't need to say "hey mycroft" for follow up questions
    "hybrid_listen": false,
    // number of seconds to wait for an interaction before requiring wake word again
    "listen_timeout": 45
  }
}
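
The config above uses // comments, which a standard JSON parser rejects. To sanity-check an edited config before restarting the listener, a minimal sketch along these lines may help. It is a standalone illustration, not how OVOS itself loads configuration; the function names strip_line_comments and load_commented_json are hypothetical, and only // comments are handled.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            # Inside a string: copy everything, tracking escapes and the closing quote
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip the comment up to (not including) the end of the line
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def load_commented_json(text: str) -> dict:
    """Parse JSON text that may contain // comments."""
    return json.loads(strip_line_comments(text))


sample = '''
{
  "stt": {
    "module": "ovos-stt-plugin-server",
    "ovos-stt-plugin-server": {"url": "https://stt.openvoiceos.com/stt"}
  },
  "listener": {
    // the main wake word
    "wake_word": "hey_mycroft",
    "VAD": {"module": "ovos-vad-plugin-silero"}
  }
}
'''

config = load_commented_json(sample)
print(config["listener"]["VAD"]["module"])
```

Tracking string state matters here because the STT URL contains "//" inside a string value. A missing quote, such as writing VAD": instead of "VAD":, makes json.loads raise an error, which is usually easier to read than a listener that fails to start.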

Tips and tricks

Saving Transcriptions

You can enable saving recordings to file. This should be your first step when diagnosing problems: is the audio intelligible? Is it being cropped? Is it too noisy, or is the volume too low?

Set "save_utterances": true in your listener config. Recordings will be saved to ~/.local/share/mycroft/listener/utterances
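
Listening to the recordings is the real test, but a quick numeric summary can flag obviously cropped or very quiet files. The sketch below assumes the recordings are 16-bit PCM WAV files with a .wav extension, which is not stated above; describe_wav and report are hypothetical helper names.

```python
import math
import struct
import wave
from pathlib import Path


def describe_wav(path):
    """Return (duration_seconds, rms) for a 16-bit PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        n_frames = wav.getnframes()
        rate = wav.getframerate()
        raw = wav.readframes(n_frames)
    # Unpack little-endian signed 16-bit samples (all channels together)
    count = len(raw) // 2
    samples = struct.unpack(f"<{count}h", raw[: count * 2])
    rms = math.sqrt(sum(s * s for s in samples) / count) if count else 0.0
    return n_frames / rate, rms


def report(folder):
    """Print duration and RMS level for every .wav file in folder."""
    folder = Path(folder).expanduser()
    if not folder.is_dir():
        print(f"no such folder: {folder}")
        return
    for path in sorted(folder.glob("*.wav")):
        duration, rms = describe_wav(path)
        print(f"{path.name}: {duration:.2f}s, RMS {rms:.0f} of 32767")


if __name__ == "__main__":
    # Path taken from the text above; assumes the default location is in use
    report("~/.local/share/mycroft/listener/utterances")
```

A very short duration suggests the command is being cut off, and an RMS value close to zero suggests the microphone gain is too low.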

If the recorded audio sounds good to you, you may need a different STT plugin: the one you are using may not work well with your microphone, or may simply not be very good for your language.

Wrong Transcriptions

If you consistently get specific words or utterances transcribed wrong, you can work around this to some extent by using the ovos-utterance-corrections-plugin.

You can define word-level replacements in ~/.local/share/mycroft/word_corrections.json

For example, Whisper STT often gets artist names wrong; this allows you to correct them:

{
    "Jimmy Hendricks": "Jimi Hendrix",
    "Eric Klapptern": "Eric Clapton",
    "Eric Klappton": "Eric Clapton"
}
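
To make the effect of such a mapping concrete, here is a rough sketch of word-level replacement applied to a transcription. It is not the plugin's implementation: the function apply_word_corrections is hypothetical, and matching whole words case-insensitively is an assumption about the behaviour, not something documented above.

```python
import json
import re


def apply_word_corrections(utterance, corrections):
    """Replace each wrong phrase with its correction.

    Assumption: whole-word, case-insensitive matching.
    """
    for wrong, right in corrections.items():
        pattern = r"\b" + re.escape(wrong) + r"\b"
        # A function replacement avoids backslash handling in the replacement text
        utterance = re.sub(pattern, lambda _match: right, utterance,
                           flags=re.IGNORECASE)
    return utterance


corrections = json.loads('''
{
    "Jimmy Hendricks": "Jimi Hendrix",
    "Eric Klapptern": "Eric Clapton",
    "Eric Klappton": "Eric Clapton"
}
''')

print(apply_word_corrections("play purple haze by jimmy hendricks", corrections))
# prints: play purple haze by Jimi Hendrix
```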

Silence Removal

By default, OVOS applies VAD (Voice Activity Detection) to crop silence from the audio sent to STT. This helps both performance and accuracy (it reduces hallucinations in plugins like FasterWhisper).

Depending on your microphone and VAD plugin, this might remove too much audio.

If so, set "remove_silence": false in your listener config. This will send the full audio recording to STT.

Listen Sound

Does your listen sound contain speech? Some users replace the "ding" sound with words such as "yes?"

In this case the listen sound will be sent to STT and might negatively affect the transcription.

Set "instant_listen": false in your listener config. This drops the listen sound from the STT audio buffer, but you will need to wait for the listen sound to finish before speaking your command.
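
Each of the tips in this section changes a single key under "listener". As a rough sketch of how such overrides layer on top of existing settings, the snippet below merges a troubleshooting override into a defaults dictionary. The merge_config helper is hypothetical and is not the ovos-config implementation; the defaults shown are a small subset chosen for illustration.

```python
import json


def merge_config(base, override):
    """Recursively merge override into a copy of base."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Illustrative subset of existing settings
defaults = {"listener": {"wake_word": "hey_mycroft", "remove_silence": True}}

# Keys taken from the tips in this section
troubleshooting = {
    "listener": {
        "save_utterances": True,
        "remove_silence": False,
        "instant_listen": False,
    }
}

merged = merge_config(defaults, troubleshooting)
print(json.dumps(merged, indent=2))
```

Only the keys named in the override change; other listener settings, such as the wake word, are left as they were.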

Credits

Voice Loop state machine implementation by @Synesthesiam for mycroft-dinkum


This version

0.5.0
