Project description

WeSpeaker Python Binding

This is a Python binding of WeSpeaker.

WeSpeaker is a production-first, production-ready end-to-end speaker recognition toolkit.

Two ONNX models are available: the voxceleb model and the cnceleb model.
Embeddings can be extracted from a wav file or from features (Fbank/MFCC).
Embeddings can be saved with kaldiio.

Install

Python 3.6+ is required.

pip3 install wespeakerruntime

Usage

Extract embedding from wav file

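import sys
import wespeakerruntime as wespeaker

wav_file = sys.argv[1]  # path to the input wav file
# lang='chs' selects the cnceleb model; lang='en' selects the voxceleb model
speaker = wespeaker.Speaker(lang='chs')
ans = speaker.extract_embedding(wav_file)
print(ans)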

You can also specify the following parameters in wespeaker.Speaker:

onnx_path (str, optional): the path of the ONNX model. Default: the ONNX model is downloaded from the server.
lang (str): chs for the cnceleb model, en for the voxceleb model.
inter_op_num_threads and intra_op_num_threads (int): the number of threads used while the model is running. For details, see https://onnxruntime.ai/docs/
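As a rough illustration of how these options fit together, the sketch below collects them into a keyword dictionary before constructing the speaker. The helper name speaker_kwargs and its num_threads argument are hypothetical (not part of wespeakerruntime); only the keyword names lang, onnx_path, inter_op_num_threads and intra_op_num_threads are taken from the list above.

```python
from typing import Any, Dict, Optional

# Language codes listed above: 'chs' -> cnceleb model, 'en' -> voxceleb model.
SUPPORTED_LANGS = ("chs", "en")


def speaker_kwargs(lang: str = "chs",
                   onnx_path: Optional[str] = None,
                   num_threads: int = 1) -> Dict[str, Any]:
    """Collect keyword arguments for wespeaker.Speaker (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"lang must be one of {SUPPORTED_LANGS}, got {lang!r}")
    kwargs: Dict[str, Any] = {
        "lang": lang,
        # Assumption: the same thread count is used for both settings.
        "inter_op_num_threads": num_threads,
        "intra_op_num_threads": num_threads,
    }
    if onnx_path is not None:
        # Pass onnx_path only when a local model is given; otherwise the
        # default (download from the server) applies.
        kwargs["onnx_path"] = onnx_path
    return kwargs


# Usage (requires wespeakerruntime to be installed):
#   import wespeakerruntime as wespeaker
#   speaker = wespeaker.Speaker(**speaker_kwargs(lang="en", num_threads=2))
```

Leaving onnx_path out of the dictionary, rather than passing None, avoids depending on how the binding treats an explicit None.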

The parameters of extract_embedding:

wav_path (str): the path of the wav file
resample_rate (int): resampling rate in Hz. Default: 16000
num_mel_bins (int): dimension of the fbank features. Default: 80
frame_length (int): frame length in milliseconds. Default: 25
frame_shift (int): frame shift in milliseconds. Default: 10
cmn (bool): if true, cepstral mean normalization (CMN) is applied. Default: True
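To see what the default framing parameters imply, the sketch below estimates how many feature frames a wav of a given length produces. It assumes Kaldi-style framing without edge padding and that frame_length and frame_shift are in milliseconds; neither is confirmed by this page, and num_frames is a hypothetical helper, not part of wespeakerruntime.

```python
def num_frames(num_samples: int,
               sample_rate: int = 16000,
               frame_length_ms: int = 25,
               frame_shift_ms: int = 10) -> int:
    """Estimate the frame count, assuming Kaldi-style framing without padding."""
    window = sample_rate * frame_length_ms // 1000  # samples per frame
    hop = sample_rate * frame_shift_ms // 1000      # samples between frame starts
    if num_samples < window:
        return 0  # too short to fill a single frame
    return 1 + (num_samples - window) // hop


# One second of 16 kHz audio: 400-sample windows, 160-sample hop.
print(num_frames(16000))  # 98
```

Under these assumptions, one second of audio with num_mel_bins=80 would give a feature matrix of roughly 98 x 80.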

Compute cosine score

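import sys
import wespeakerruntime as wespeaker

# paths of the two wav files to compare
wav1_path, wav2_path = sys.argv[1], sys.argv[2]
speaker = wespeaker.Speaker(lang='chs')
emb1 = speaker.extract_embedding_wav(wav1_path)
emb2 = speaker.extract_embedding_wav(wav2_path)
score = speaker.compute_cosine_score(emb1, emb2)
print(score)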

The parameters of compute_cosine_score:

emb1 (numpy.ndarray): embedding of speaker 1
emb2 (numpy.ndarray): embedding of speaker 2
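For reference, the textbook cosine similarity between two embeddings can be sketched with NumPy as follows. This is the standard definition only; whether compute_cosine_score applies additional scaling or normalization is not stated here, and cosine_score is a hypothetical stand-in, not the library function.

```python
import numpy as np


def cosine_score(emb1, emb2) -> float:
    """Standard cosine similarity between two embeddings (illustrative only)."""
    # Flatten so that shapes such as (D,) and (1, D) are both accepted.
    a = np.asarray(emb1, dtype=np.float64).ravel()
    b = np.asarray(emb2, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("embeddings must have the same number of elements")
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("embeddings must be non-zero")
    return float(np.dot(a, b) / denom)
```

The result lies in [-1, 1]; higher values indicate that the two embeddings point in more similar directions.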

[Optional] Extract embedding from features (Fbank/MFCC)

import wespeakerruntime as wespeaker

# your_fbank is a placeholder for your own features:
# a numpy.ndarray of shape [B, T, D]
feat = your_fbank
speaker = wespeaker.Speaker(lang='chs')
ans = speaker.extract_embedding_feat(feat)
print(ans)

The parameters of extract_embedding_feat:

feats (numpy.ndarray): the input features, with shape [B, T, D] (batch size, number of frames, feature dimension).
cmn (bool): if true, cepstral mean normalization (CMN) is applied. Default: True
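To make the cmn option concrete, the sketch below shows one common form of cepstral mean normalization on a [B, T, D] batch: the per-utterance mean over the time axis is subtracted from every frame. This is an assumption about what CMN means here, not the binding's actual code, and apply_cmn is a hypothetical helper.

```python
import numpy as np


def apply_cmn(feats) -> np.ndarray:
    """Subtract the per-utterance mean over time from [B, T, D] features."""
    feats = np.asarray(feats, dtype=np.float32)
    if feats.ndim != 3:
        raise ValueError(f"expected shape [B, T, D], got {feats.shape}")
    # Mean over the time axis (axis=1), kept as [B, 1, D] for broadcasting.
    return feats - feats.mean(axis=1, keepdims=True)
```

If your features are already mean-normalized, passing cmn=False avoids normalizing them a second time.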

[Optional] Extract embedding from wav.scp

import sys
import wespeakerruntime as wespeaker
wav_scp = sys.argv[1]
speaker = wespeaker.Speaker(lang='chs')
speaker.extract_embedding_kaldiio(wav_scp, 'embed.ark')

The parameters of extract_embedding_kaldiio:

wav_path (str): the path of the wav.scp file
embed_ark (str): the path of the output .ark file
resample_rate (int): resampling rate in Hz. Default: 16000
num_mel_bins (int): dimension of the fbank features. Default: 80
frame_length (int): frame length in milliseconds. Default: 25
frame_shift (int): frame shift in milliseconds. Default: 10
cmn (bool): if true, cepstral mean normalization (CMN) is applied. Default: True
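A wav.scp file in the usual Kaldi convention holds one "utterance-id path" pair per line. The sketch below writes such a file from a dictionary; it assumes the binding expects this plain two-column layout, and write_wav_scp is a hypothetical helper, not part of wespeakerruntime.

```python
from typing import Dict


def write_wav_scp(utt2wav: Dict[str, str], scp_path: str) -> None:
    """Write a Kaldi-style wav.scp: one '<utt_id> <wav_path>' pair per line."""
    with open(scp_path, "w", encoding="utf-8") as f:
        for utt_id in sorted(utt2wav):
            if not utt_id or any(ch.isspace() for ch in utt_id):
                raise ValueError(f"invalid utterance id: {utt_id!r}")
            f.write(f"{utt_id} {utt2wav[utt_id]}\n")


# Usage:
#   write_wav_scp({"spk1_utt1": "/data/a.wav", "spk2_utt1": "/data/b.wav"},
#                 "wav.scp")
```

Assuming embed.ark is a standard Kaldi archive, the saved embeddings could then be read back with kaldiio.load_ark('embed.ark'), which yields (key, array) pairs.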

Build on Your Local Machine

git clone git@github.com:wenet-e2e/wespeaker.git
cd wespeaker/runtime/binding/python
python setup.py install

Release history

1.0.1 (Feb 27, 2023)
1.0.0 (Sep 23, 2022)

Download files

Source Distribution: wespeakerruntime-1.0.1.tar.gz (5.4 kB), uploaded Feb 27, 2023
Built Distribution: wespeakerruntime-1.0.1-py3-none-any.whl (6.3 kB), uploaded Feb 27, 2023, Python 3

Hashes for wespeakerruntime-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`863e0e72bfb8a1271c6192aa6c1f90e89c72fb353aff44a1fd19764d0191f343`
MD5	`61b8d1527482a4d5aa08a48a6cf39cdb`
BLAKE2b-256	`3ff113ac146a127c1fd5cc4806e51498758db68c287005d9161e0076499a1f57`

Hashes for wespeakerruntime-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`862eafc82f8d788ea2bb30ccdbd519b82019b75725399dcf3b0db388e87600da`
MD5	`021889ed919626120c89b2cf52dc5056`
BLAKE2b-256	`54468e5cb4453fe0ae17762ab0ac029e259f3a2ddbd8ce7eaaf321cba35f8acd`