A voice recognition plugin for OVOS
VoiceEmbeddingsRecognitionPlugin
The VoiceEmbeddingsRecognitionPlugin is a plugin for recognizing and managing voice embeddings.
It uses Resemblyzer to extract speaker embeddings and integrates with ovos-chromadb-embeddings-plugin for storing and retrieving voice embeddings.
Features
- Voice Embeddings Extraction: Converts audio data into voice embeddings using the VoiceEncoder from resemblyzer.
- Voice Data Storage: Stores and retrieves voice embeddings using ChromaEmbeddingsDB.
- Voice Data Management: Allows for adding, querying, and predicting voice embeddings associated with user IDs.
- Supports Multiple Audio Formats: Can handle audio data in various formats, including wav and flac.
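The add, query, and predict operations listed above can be illustrated with a small standalone sketch. This is not the plugin's implementation: the `InMemoryVoiceStore` class, the cosine-similarity matching, and the `threshold` value are all assumptions made for illustration, and the three-dimensional vectors are toy stand-ins for real speaker embeddings.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


class InMemoryVoiceStore:
    """Hypothetical stand-in for an embeddings database keyed by user ID."""

    def __init__(self) -> None:
        self._embeddings: Dict[str, List[float]] = {}

    def add_voice(self, user_id: str, embedding: Sequence[float]) -> None:
        # Store (or overwrite) the embedding for this user ID.
        self._embeddings[user_id] = list(embedding)

    def query(self, embedding: Sequence[float], top_k: int = 1) -> List[Tuple[str, float]]:
        # Score every stored voice and return the best matches first.
        scored = [
            (user_id, cosine_similarity(embedding, stored))
            for user_id, stored in self._embeddings.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]

    def predict(self, embedding: Sequence[float], threshold: float = 0.75) -> Optional[str]:
        # Return the closest user ID, or None if nothing is similar enough.
        # The threshold is an arbitrary illustrative value.
        matches = self.query(embedding, top_k=1)
        if not matches or matches[0][1] < threshold:
            return None
        return matches[0][0]


store = InMemoryVoiceStore()
store.add_voice("user", [0.9, 0.1, 0.0])
store.add_voice("donald", [0.0, 0.2, 0.9])
print(store.predict([0.8, 0.2, 0.1]))
```

A persistent vector database plays the role of the dictionary here, and a real speaker encoder produces the vectors; the matching idea stays the same.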
Usage
Here is a quick example of how to use the VoiceEmbeddingsRecognitionPlugin:
from ovos_voice_embeddings import VoiceEmbeddingsRecognitionPlugin
from resemblyzer import preprocess_wav
from speech_recognition import Recognizer, AudioFile
from ovos_chromadb_embeddings import ChromaEmbeddingsDB

# Open (or create) the embeddings database and hand it to the plugin
db = ChromaEmbeddingsDB("./voice_db")
v = VoiceEmbeddingsRecognitionPlugin(db)

# Sample recordings; replace these paths with your own audio files
a = "/home/miro/PycharmProjects/ovos-user-id/2609-156975-0001.flac"
b = "/home/miro/PycharmProjects/ovos-user-id/qCCWXoCURKY.mp3"
b2 = "/home/miro/PycharmProjects/ovos-user-id/4glfwiMXgwQ.mp3"

# Register a voice from speech_recognition audio data
with AudioFile(a) as source:
    audio = Recognizer().record(source)
v.add_voice("user", audio)

# Register a second voice from a waveform loaded with resemblyzer
wav = preprocess_wav(b)
v.add_voice("donald", wav)

# Query the database with a recording that was not added
wav = preprocess_wav(b2)
print(v.predict(wav))
print(v.prompt(wav))
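The example passes two kinds of input to add_voice: the object returned by Recognizer().record and a waveform loaded with preprocess_wav. How the plugin bridges the two is not shown here. As a rough sketch only, assuming mono 16-bit little-endian PCM bytes, raw samples could be scaled to floating-point values as follows; `pcm16_to_float` is a hypothetical helper, not part of the plugin or of either library.

```python
import struct
from typing import List


def pcm16_to_float(raw: bytes) -> List[float]:
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    if len(raw) % 2 != 0:
        raise ValueError("16-bit PCM data must contain an even number of bytes")
    count = len(raw) // 2
    # "<" forces little-endian, "h" is a signed 16-bit integer
    samples = struct.unpack("<%dh" % count, raw)
    # Dividing by 2**15 maps the integer range onto [-1.0, 1.0)
    return [sample / 32768.0 for sample in samples]


# Three hand-made samples: silence, half scale, and the most negative value
raw = struct.pack("<3h", 0, 16384, -32768)
print(pcm16_to_float(raw))
```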
File details
Details for the file ovos-voice-embeddings-plugin-0.0.0a2.tar.gz.
File metadata
- Download URL: ovos-voice-embeddings-plugin-0.0.0a2.tar.gz
- Upload date:
- Size: 3.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | ba05b9e8bd7b734b9d2b7d606de3333c7e63aeda789ce4352669d8aedb512d47
MD5 | a2856a813ec9c4c12073f1be4ebc138b
BLAKE2b-256 | 2a95ffc0ab294b79d4762ba4ecb38e7ab8b000235ec39151bd461ca73c1e85de
File details
Details for the file ovos_voice_embeddings_plugin-0.0.0a2-py3-none-any.whl.
File metadata
- Download URL: ovos_voice_embeddings_plugin-0.0.0a2-py3-none-any.whl
- Upload date:
- Size: 3.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 47289cb007642bca919791210ecec6c9cea4cf065d667a2d53cbbdff77e09509
MD5 | 9018683b60946af92f065400895a2fa4
BLAKE2b-256 | 7cecee6a7e650ca7ccc20c16dbb95d5e5b66c4b77c226cb13c282c292ae6da81