A voice recognition plugin for OVOS
VoiceEmbeddingsRecognitionPlugin
The VoiceEmbeddingsRecognitionPlugin is a plugin for recognizing and managing voice embeddings.
It uses Resemblyzer to extract speaker embeddings and integrates with ovos-chromadb-embeddings-plugin for storing and retrieving voice embeddings.
Features
- Voice Embeddings Extraction: Converts audio data into voice embeddings using the `VoiceEncoder` from `resemblyzer`.
- Voice Data Storage: Stores and retrieves voice embeddings using `ChromaEmbeddingsDB`.
- Voice Data Management: Allows for adding, querying, and predicting voice embeddings associated with user IDs.
- Supports Multiple Audio Formats: Can handle audio data in various formats, including `wav` and `flac`.
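The add, query, and predict operations listed above can be pictured with a small in-memory sketch. This is not the plugin's implementation: `InMemoryVoiceStore` and `cosine_similarity` are hypothetical names, the store is a plain dictionary rather than `ChromaEmbeddingsDB`, the use of cosine similarity is an assumption, and the three-element vectors stand in for real speaker embeddings, which have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVoiceStore:
    """Hypothetical stand-in for an embeddings database (not ChromaEmbeddingsDB)."""

    def __init__(self):
        self._embeddings = {}  # user_id -> embedding (list of floats)

    def add(self, user_id, embedding):
        """Store (or overwrite) the embedding associated with a user ID."""
        self._embeddings[user_id] = list(embedding)

    def query(self, embedding, top_k=3):
        """Return up to top_k (user_id, similarity) pairs, most similar first."""
        scored = [
            (user_id, cosine_similarity(embedding, stored))
            for user_id, stored in self._embeddings.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]

    def predict(self, embedding):
        """Return the user ID of the closest stored embedding, or None if empty."""
        matches = self.query(embedding, top_k=1)
        return matches[0][0] if matches else None


# Toy vectors for illustration only.
store = InMemoryVoiceStore()
store.add("user", [0.9, 0.1, 0.0])
store.add("donald", [0.0, 0.2, 0.9])
best = store.predict([0.1, 0.1, 0.8])
print(best)
```

In this sketch, prediction is simply the top result of a similarity query, so any threshold for rejecting unknown speakers would have to be added on top.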
Usage
Here is a quick example of how to use the VoiceEmbeddingsRecognitionPlugin:
    from ovos_voice_embeddings import VoiceEmbeddingsRecognitionPlugin
    from resemblyzer import preprocess_wav
    from speech_recognition import Recognizer, AudioFile
    from ovos_chromadb_embeddings import ChromaEmbeddingsDB

    # Open (or create) the embeddings database and attach the plugin to it
    db = ChromaEmbeddingsDB("./voice_db")
    v = VoiceEmbeddingsRecognitionPlugin(db)

    # Example audio files (paths from the author's machine; replace with your own)
    a = "/home/miro/PycharmProjects/ovos-user-id/2609-156975-0001.flac"
    b = "/home/miro/PycharmProjects/ovos-user-id/qCCWXoCURKY.mp3"
    b2 = "/home/miro/PycharmProjects/ovos-user-id/4glfwiMXgwQ.mp3"

    # Register a voice from a speech_recognition AudioData object
    with AudioFile(a) as source:
        audio = Recognizer().record(source)
    v.add_voice("user", audio)

    # Register a voice from a waveform preprocessed by resemblyzer
    wav = preprocess_wav(b)
    v.add_voice("donald", wav)

    # Run the plugin on a new recording
    wav = preprocess_wav(b2)
    print(v.predict(wav))
    print(v.prompt(wav))
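The example above passes both a `speech_recognition` audio object and a preprocessed waveform to `add_voice`. How the plugin normalises these inputs is not shown here. As a rough illustration of the kind of conversion involved, the sketch below decodes 16-bit PCM WAV bytes into mono floating-point samples using only the standard library. The function names `pcm16_wav_to_floats` and `make_test_wav` are hypothetical, and the scaling to the range [-1.0, 1.0) is an assumption about what an encoder would expect.

```python
import array
import io
import math
import sys
import wave


def pcm16_wav_to_floats(wav_bytes):
    """Decode 16-bit PCM WAV bytes into (sample_rate, mono floats in [-1.0, 1.0))."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        if wav_file.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav_file.getnchannels()
        sample_rate = wav_file.getframerate()
        frames = wav_file.readframes(wav_file.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    # Average interleaved channels down to mono, then scale to floats.
    mono = [
        sum(samples[i:i + channels]) / channels
        for i in range(0, len(samples), channels)
    ]
    return sample_rate, [value / 32768.0 for value in mono]


def make_test_wav(sample_rate=16000, n_samples=160):
    """Build a short mono 440 Hz tone as WAV bytes (test data for this sketch)."""
    tone = array.array(
        "h",
        (
            int(16000 * math.sin(2 * math.pi * 440 * i / sample_rate))
            for i in range(n_samples)
        ),
    )
    if sys.byteorder == "big":
        tone.byteswap()
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(tone.tobytes())
    return buffer.getvalue()


rate, waveform = pcm16_wav_to_floats(make_test_wav())
print(rate, len(waveform))
```

Compressed formats such as `flac` or `mp3` need a decoder first, which is why the usage example relies on `speech_recognition` and `preprocess_wav` for loading.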
File details
Details for the file ovos-voice-embeddings-plugin-0.0.0a2.tar.gz.
File metadata
- Download URL: ovos-voice-embeddings-plugin-0.0.0a2.tar.gz
- Upload date:
- Size: 3.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ba05b9e8bd7b734b9d2b7d606de3333c7e63aeda789ce4352669d8aedb512d47 |
| MD5 | a2856a813ec9c4c12073f1be4ebc138b |
| BLAKE2b-256 | 2a95ffc0ab294b79d4762ba4ecb38e7ab8b000235ec39151bd461ca73c1e85de |
File details
Details for the file ovos_voice_embeddings_plugin-0.0.0a2-py3-none-any.whl.
File metadata
- Download URL: ovos_voice_embeddings_plugin-0.0.0a2-py3-none-any.whl
- Upload date:
- Size: 3.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 47289cb007642bca919791210ecec6c9cea4cf065d667a2d53cbbdff77e09509 |
| MD5 | 9018683b60946af92f065400895a2fa4 |
| BLAKE2b-256 | 7cecee6a7e650ca7ccc20c16dbb95d5e5b66c4b77c226cb13c282c292ae6da81 |