A voice recognition plugin for OVOS
VoiceEmbeddingsRecognitionPlugin
The VoiceEmbeddingsRecognitionPlugin is a plugin for recognizing and managing voice embeddings.
It uses Resemblyzer to extract speaker embeddings and integrates with ovos-chromadb-embeddings-plugin for storing and retrieving voice embeddings.
Features
- Voice Embeddings Extraction: Converts audio data into voice embeddings using the VoiceEncoder from resemblyzer.
- Voice Data Storage: Stores and retrieves voice embeddings using ChromaEmbeddingsDB.
- Voice Data Management: Allows for adding, querying, and predicting voice embeddings associated with user IDs.
- Supports Multiple Audio Formats: Can handle audio data in various formats, including wav and flac.
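To make the add, query, and predict workflow concrete, here is a minimal sketch of nearest-neighbour speaker lookup over stored embeddings. It is an illustration only, not the plugin's implementation: the `InMemoryVoiceStore` class, its method names, and the choice of cosine similarity are assumptions made for this example, and the three-element vectors stand in for real speaker embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


class InMemoryVoiceStore:
    """Hypothetical stand-in for an embeddings database (not ChromaEmbeddingsDB)."""

    def __init__(self):
        self._embeddings = {}  # user_id -> embedding vector

    def add(self, user_id, embedding):
        """Store (or overwrite) the embedding for a user ID."""
        self._embeddings[user_id] = list(embedding)

    def query(self, embedding, top_k=1):
        """Return up to top_k (user_id, similarity) pairs, best match first."""
        scored = [
            (user_id, cosine_similarity(embedding, stored))
            for user_id, stored in self._embeddings.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]

    def predict(self, embedding):
        """Return the closest user ID, or None if the store is empty."""
        matches = self.query(embedding, top_k=1)
        return matches[0][0] if matches else None


store = InMemoryVoiceStore()
store.add("alice", [1.0, 0.0, 0.0])
store.add("bob", [0.0, 1.0, 0.0])
closest = store.predict([0.9, 0.1, 0.0])
print(closest)
```

A real deployment would delegate storage and similarity search to the embeddings database rather than scanning a dictionary.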
Usage
Here is a quick example of how to use the VoiceEmbeddingsRecognitionPlugin:
from ovos_voice_embeddings import VoiceEmbeddingsRecognitionPlugin
from resemblyzer import preprocess_wav
from speech_recognition import Recognizer, AudioFile
from ovos_chromadb_embeddings import ChromaEmbeddingsDB

# Create the embeddings database and the plugin that uses it
db = ChromaEmbeddingsDB("./voice_db")
v = VoiceEmbeddingsRecognitionPlugin(db)

# Sample recordings; replace these paths with your own audio files
a = "/home/miro/PycharmProjects/ovos-user-id/2609-156975-0001.flac"
b = "/home/miro/PycharmProjects/ovos-user-id/qCCWXoCURKY.mp3"
b2 = "/home/miro/PycharmProjects/ovos-user-id/4glfwiMXgwQ.mp3"

# Register a voice from a speech_recognition audio recording
with AudioFile(a) as source:
    audio = Recognizer().record(source)
v.add_voice("user", audio)

# Register a second voice from a waveform preprocessed by Resemblyzer
wav = preprocess_wav(b)
v.add_voice("donald", wav)

# Identify the speaker of a new recording
wav = preprocess_wav(b2)
print(v.predict(wav))
print(v.prompt(wav))
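The example above passes both a speech_recognition recording and a Resemblyzer-preprocessed waveform to add_voice. How the plugin converts between the two is not described here. As a rough illustration of the kind of normalisation involved, the sketch below turns raw 16-bit PCM bytes into floating-point samples. The helper name `pcm16_to_floats` is hypothetical, and the input is assumed to be little-endian signed 16-bit mono audio.

```python
import struct


def pcm16_to_floats(raw_bytes):
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    if len(raw_bytes) % 2 != 0:
        raise ValueError("16-bit PCM data must have an even number of bytes")
    count = len(raw_bytes) // 2
    # "<" = little-endian, "h" = signed 16-bit integer, repeated `count` times
    samples = struct.unpack("<%dh" % count, raw_bytes)
    # Dividing by 32768 maps the int16 range onto [-1.0, 1.0)
    return [sample / 32768.0 for sample in samples]


# Four hand-made samples: silence, half scale, and the two int16 extremes
raw = struct.pack("<4h", 0, 16384, -32768, 32767)
floats = pcm16_to_floats(raw)
print(floats)
```

Bytes of this shape can be obtained from a speech_recognition AudioData object via get_raw_data(convert_width=2); resampling to the rate the encoder expects is a separate step not shown here.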