WeSpeaker Python Binding

This is a Python binding of WeSpeaker.

WeSpeaker is a production-first, production-ready end-to-end speaker recognition toolkit.

  1. Two ONNX models are available: the voxceleb model and the cnceleb model.
  2. Extract embeddings from a wav file or from features (Fbank/MFCC).
  3. Supports saving embeddings with kaldiio.

Install

Python 3.6+ is required.

pip3 install wespeakerruntime

Usage

Extract embedding from wav file

import sys
import wespeakerruntime as wespeaker
wav_file = sys.argv[1]
speaker = wespeaker.Speaker(lang='chs')
ans = speaker.extract_embedding(wav_file)
print(ans)

You can also specify the following parameters in wespeaker.Speaker:

  • onnx_path (str, optional): the path to an ONNX model.
    • Default: the ONNX model is downloaded from the server.
  • lang (str): chs for the cnceleb model, en for the voxceleb model.
  • inter_op_num_threads and intra_op_num_threads (int): the number of threads used while the model is running. For details, see https://onnxruntime.ai/docs/

The parameters of extract_embedding:

  • wav_path (str): the path to the wav file
  • resample_rate (int): resampling rate in Hz. Default: 16000
  • num_mel_bins (int): dimension of the fbank features. Default: 80
  • frame_length (int): frame length in milliseconds. Default: 25
  • frame_shift (int): frame shift in milliseconds. Default: 10
  • cmn (bool): if True, cepstral mean normalization is applied. Default: True
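
To make frame_length and frame_shift concrete, here is a minimal sketch of how many feature frames a recording yields. It assumes Kaldi-style framing with no padding at the edges, which this README does not confirm; the function name num_frames is hypothetical and not part of wespeakerruntime.

```python
def num_frames(num_samples, sample_rate=16000, frame_length_ms=25, frame_shift_ms=10):
    """Count full frames, assuming Kaldi-style framing without edge padding."""
    window = sample_rate * frame_length_ms // 1000  # samples per frame
    shift = sample_rate * frame_shift_ms // 1000    # samples between frame starts
    if num_samples < window:
        return 0  # too short to hold even one full frame
    return 1 + (num_samples - window) // shift


# One second of 16 kHz audio with the default 25 ms / 10 ms settings.
print(num_frames(16000))  # 98
```

Under these assumptions, one second of audio with num_mel_bins=80 gives a feature matrix of 98 frames by 80 dimensions.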

Compute cosine score

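import sys
import wespeakerruntime as wespeaker
# Paths of the two recordings to compare, taken from the command line
wav1_path, wav2_path = sys.argv[1], sys.argv[2]
speaker = wespeaker.Speaker(lang='chs')
# extract_embedding is the wav-file method documented above
emb1 = speaker.extract_embedding(wav1_path)
emb2 = speaker.extract_embedding(wav2_path)
score = speaker.compute_cosine_score(emb1, emb2)
print(score)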

The parameters of compute_cosine_score:

  • emb1 (numpy.ndarray): embedding of speaker 1
  • emb2 (numpy.ndarray): embedding of speaker 2
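
For reference, a cosine score is conventionally the dot product of the two embeddings divided by the product of their lengths. The sketch below shows that standard definition in plain Python; it is not taken from the wespeakerruntime source, and the library may differ in details such as scaling. The name cosine_score is hypothetical.

```python
import math


def cosine_score(emb1, emb2):
    """Standard cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(emb1, emb2))
    norm1 = math.sqrt(sum(a * a for a in emb1))
    norm2 = math.sqrt(sum(b * b for b in emb2))
    return dot / (norm1 * norm2)


print(cosine_score([1.0, 0.0], [1.0, 0.0]))  # 1.0, same direction
print(cosine_score([1.0, 0.0], [0.0, 1.0]))  # 0.0, orthogonal
```

Scores closer to 1 mean the two embeddings point in more similar directions, which is usually read as the two recordings being more likely to come from the same speaker.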

[Optional] Extract embedding from features (Fbank/MFCC)

import wespeakerruntime as wespeaker
# your_fbank is a placeholder for your own features: a numpy.ndarray of shape [B, T, D]
feat = your_fbank
speaker = wespeaker.Speaker(lang='chs')
ans = speaker.extract_embedding_feat(feat)
print(ans)

The parameters of extract_embedding_feat:

  • feats (numpy.ndarray): features of shape [B, T, D] (batch, frames, feature dimension).
  • cmn (bool): if True, cepstral mean normalization is applied. Default: True
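
As an illustration of what the cmn flag refers to, cepstral mean normalization subtracts the per-dimension mean over time from every frame. The following is a minimal sketch of that idea on a single [T, D] feature matrix held in nested lists; it is not the library's implementation, and apply_cmn is a hypothetical name.

```python
def apply_cmn(feats):
    """Subtract the per-dimension mean over time from a [T][D] feature matrix."""
    num_frames = len(feats)
    dim = len(feats[0])
    # Mean of each feature dimension across all frames.
    means = [sum(frame[d] for frame in feats) / num_frames for d in range(dim)]
    # Return a new matrix; the input is left unchanged.
    return [[frame[d] - means[d] for d in range(dim)] for frame in feats]


feats = [[1.0, 10.0], [3.0, 20.0]]  # T=2 frames, D=2 dimensions
print(apply_cmn(feats))  # [[-1.0, -5.0], [1.0, 5.0]]
```

For a batched [B, T, D] input, the same operation would be applied to each item in the batch separately.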

[Optional] Extract embedding from wav.scp

import sys
import wespeakerruntime as wespeaker
wav_scp = sys.argv[1]
speaker = wespeaker.Speaker(lang='chs')
speaker.extract_embedding_kaldiio(wav_scp, 'embed.ark')

The parameters of extract_embedding_kaldiio:

  • wav_scp (str): the path of wav.scp
  • embed_ark (str): the path of the output ark file
  • resample_rate (int): resampling rate in Hz. Default: 16000
  • num_mel_bins (int): dimension of the fbank features. Default: 80
  • frame_length (int): frame length in milliseconds. Default: 25
  • frame_shift (int): frame shift in milliseconds. Default: 10
  • cmn (bool): if True, cepstral mean normalization is applied. Default: True
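
In the Kaldi convention, a wav.scp file lists one utterance per line as an utterance ID followed by the path to its wav file. The sketch below parses that simple form so you can check a file before passing it in; it assumes plain paths rather than piped commands, and both parse_wav_scp and the example paths are hypothetical.

```python
def parse_wav_scp(text):
    """Map utterance IDs to wav paths from wav.scp content (simple form only)."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The first whitespace separates the ID from the rest of the line.
        utt_id, wav_path = line.split(maxsplit=1)
        entries[utt_id] = wav_path
    return entries


# Hypothetical wav.scp content with two utterances.
example = "spk1_utt1 /data/spk1/utt1.wav\nspk2_utt1 /data/spk2/utt1.wav\n"
print(parse_wav_scp(example))
```

The embeddings written to the ark file can most likely be read back with kaldiio's load_ark, which iterates over (key, array) pairs, though this README does not describe the output layout.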

Build on Your Local Machine

git clone git@github.com:wenet-e2e/wespeaker.git
cd wespeaker/runtime/binding/python
python setup.py install
