A voice recognition plugin for OVOS

VoiceEmbeddingsRecognitionPlugin

The VoiceEmbeddingsRecognitionPlugin is an OVOS plugin that recognizes speakers by their voice embeddings and manages the stored embeddings.

It uses Resemblyzer to extract speaker embeddings and integrates with ovos-chromadb-embeddings-plugin for storing and retrieving voice embeddings.

Features

  • Voice Embeddings Extraction: Converts audio data into voice embeddings using the VoiceEncoder from resemblyzer.
  • Voice Data Storage: Stores and retrieves voice embeddings using ChromaEmbeddingsDB.
  • Voice Data Management: Adds voice embeddings under a user ID, queries them, and predicts which user ID an audio sample belongs to.
  • Supports Multiple Audio Formats: Can handle audio data in various formats, including wav and flac.
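
The idea behind embedding-based recognition can be illustrated with a small, self-contained sketch. This is not the plugin's implementation: `ToyVoiceStore` and `cosine_similarity` are hypothetical names, the three-dimensional vectors are made up for illustration, and the use of cosine similarity is an assumption (the metric the plugin and its database actually use is not stated here). Real speaker embeddings have many more dimensions.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVoiceStore:
    """Hypothetical in-memory stand-in for an embeddings database."""

    def __init__(self) -> None:
        self._embeddings: Dict[str, List[float]] = {}

    def add(self, user_id: str, embedding: List[float]) -> None:
        # One embedding per user ID; adding again replaces the old one.
        self._embeddings[user_id] = list(embedding)

    def closest(self, embedding: List[float]) -> Optional[Tuple[str, float]]:
        # Linear scan for the stored embedding most similar to the query.
        best: Optional[Tuple[str, float]] = None
        for user_id, stored in self._embeddings.items():
            score = cosine_similarity(embedding, stored)
            if best is None or score > best[1]:
                best = (user_id, score)
        return best


store = ToyVoiceStore()
store.add("user", [0.9, 0.1, 0.0])      # made-up vectors, not real embeddings
store.add("donald", [0.1, 0.8, 0.3])

match = store.closest([0.2, 0.7, 0.4])
print(match[0])  # prints "donald"
```

In this sketch, prediction amounts to returning the user ID whose stored vector lies closest to the query vector.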

Usage

Here is a quick example of how to use the VoiceEmbeddingsRecognitionPlugin:

from ovos_voice_embeddings import VoiceEmbeddingsRecognitionPlugin
from resemblyzer import preprocess_wav
from speech_recognition import Recognizer, AudioFile
from ovos_chromadb_embeddings import ChromaEmbeddingsDB

# Persist voice embeddings in a local ChromaDB directory
db = ChromaEmbeddingsDB("./voice_db")
v = VoiceEmbeddingsRecognitionPlugin(db)

# Sample recordings; replace these paths with your own audio files
a = "/home/miro/PycharmProjects/ovos-user-id/2609-156975-0001.flac"
b = "/home/miro/PycharmProjects/ovos-user-id/qCCWXoCURKY.mp3"
b2 = "/home/miro/PycharmProjects/ovos-user-id/4glfwiMXgwQ.mp3"

# Register a voice from a speech_recognition AudioData object
with AudioFile(a) as source:
    audio = Recognizer().record(source)
v.add_voice("user", audio)

# Register a voice from a waveform preprocessed by resemblyzer
wav = preprocess_wav(b)
v.add_voice("donald", wav)

# Match a different recording against the stored voices
wav = preprocess_wav(b2)
print(v.predict(wav))
print(v.prompt(wav))
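
When several users need to be registered, the `add_voice` calls from the example above can be wrapped in a small helper. The sketch below is only a suggestion: `enroll_voices` is a hypothetical function, not part of the plugin, and it assumes nothing beyond an object with an `add_voice(user_id, audio)` method. The audio loader is passed in as a parameter so the helper itself has no dependencies.

```python
from typing import Any, Callable, Dict


def enroll_voices(plugin: Any,
                  samples: Dict[str, str],
                  load_audio: Callable[[str], Any]) -> int:
    """Register one audio sample per user ID and return how many were added.

    plugin     -- any object exposing add_voice(user_id, audio)
    samples    -- mapping of user ID to the path of that user's recording
    load_audio -- turns a path into audio data, e.g. resemblyzer.preprocess_wav
    """
    added = 0
    for user_id, path in samples.items():
        plugin.add_voice(user_id, load_audio(path))
        added += 1
    return added
```

With the objects from the example above, a call could look like `enroll_voices(v, {"donald": b}, preprocess_wav)`.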