A voice recognition plugin for OVOS
VoiceEmbeddingsRecognitionPlugin
The VoiceEmbeddingsRecognitionPlugin is a plugin for recognizing and managing voice embeddings. It uses Resemblyzer to extract speaker embeddings and integrates with ovos-chromadb-embeddings-plugin for storing and retrieving voice embeddings.
Features
- Voice Embeddings Extraction: Converts audio data into voice embeddings using the VoiceEncoder from resemblyzer.
- Voice Data Storage: Stores and retrieves voice embeddings using ChromaEmbeddingsDB.
- Voice Data Management: Allows for adding, querying, and predicting voice embeddings associated with user IDs.
- Supports Multiple Audio Formats: Can handle audio data in various formats, including wav and flac.
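To make the add/predict workflow concrete, here is a minimal sketch of nearest-speaker matching by cosine similarity. It is not the plugin's implementation: `InMemoryVoiceStore` and `cosine_similarity` are hypothetical names, the three-dimensional vectors are toy stand-ins for real speaker embeddings, and the actual storage and search are delegated to ChromaEmbeddingsDB, whose distance metric may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVoiceStore:
    """Hypothetical stand-in for an embeddings database keyed by user ID."""

    def __init__(self):
        self._embeddings = {}

    def add(self, user_id, embedding):
        # Keep one embedding per user; a real store may keep several.
        self._embeddings[user_id] = list(embedding)

    def predict(self, embedding):
        # Return the user ID whose stored embedding is most similar, or None if empty.
        if not self._embeddings:
            return None
        return max(
            self._embeddings,
            key=lambda uid: cosine_similarity(self._embeddings[uid], embedding),
        )


store = InMemoryVoiceStore()
store.add("user", [0.9, 0.1, 0.0])
store.add("donald", [0.0, 0.2, 0.9])
best = store.predict([0.1, 0.1, 0.8])  # closest to the "donald" vector
```

A dictionary scan like this is linear in the number of enrolled users; a vector database is a reasonable choice once the number of stored voices grows.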
Usage
Here is a quick example of how to use the VoiceEmbeddingsRecognitionPlugin:
from ovos_voice_embeddings import VoiceEmbeddingsRecognitionPlugin
from resemblyzer import preprocess_wav
from speech_recognition import Recognizer, AudioFile
from ovos_chromadb_embeddings import ChromaEmbeddingsDB

# Embeddings database stored under ./voice_db
db = ChromaEmbeddingsDB("./voice_db")
v = VoiceEmbeddingsRecognitionPlugin(db)

# Sample recordings; replace these with paths to your own audio files
a = "/home/miro/PycharmProjects/ovos-user-id/2609-156975-0001.flac"
b = "/home/miro/PycharmProjects/ovos-user-id/qCCWXoCURKY.mp3"
b2 = "/home/miro/PycharmProjects/ovos-user-id/4glfwiMXgwQ.mp3"

# Enroll a voice from a speech_recognition AudioData object
with AudioFile(a) as source:
    audio = Recognizer().record(source)
v.add_voice("user", audio)

# Enroll a voice from a waveform preprocessed by Resemblyzer
wav = preprocess_wav(b)
v.add_voice("donald", wav)

# Identify the speaker of a new recording
wav = preprocess_wav(b2)
print(v.predict(wav))
print(v.prompt(wav))
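The example above passes both a speech_recognition AudioData object and a preprocessed waveform to add_voice. How the plugin converts between the two internally is not documented here, so the following is only a sketch of one plausible conversion: turning 16-bit PCM bytes into floats in the range [-1.0, 1.0). `pcm16_to_floats` is a hypothetical helper, and little-endian sample order is an assumption.

```python
import array
import sys


def pcm16_to_floats(raw):
    """Convert signed 16-bit PCM bytes (assumed little-endian) to floats in [-1.0, 1.0)."""
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)      # raises ValueError if len(raw) is odd
    if sys.byteorder == "big":
        samples.byteswap()      # the input is assumed little-endian
    return [s / 32768.0 for s in samples]


# Three samples: silence, maximum positive value, minimum negative value
floats = pcm16_to_floats(b"\x00\x00\xff\x7f\x00\x80")
```

With speech_recognition, bytes of this kind can be obtained from `AudioData.get_raw_data(convert_rate=16000, convert_width=2)`; 16 kHz matches the sampling rate Resemblyzer works with.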