Project description

WeNet Python Binding

This is a Python binding of Wespeaker.

Wespeaker is a production-first and production-ready end-to-end speaker verification toolkit.

The main features of the binding are:

- Multi-language support, including English and Chinese. Other languages are in development.
- Non-streaming and streaming APIs.
- N-best, contextual biasing, and timestamp support, which are very important for speech products.
- Alignment support. You can get phone-level alignments with this tool (in development).

Install

Python 3.6+ is required.

pip3 install wespeakerruntime

Usage

Non-streaming Usage

import sys
import wenetruntime as wenet
wav_file = sys.argv[1]
decoder = wenet.Decoder(lang='chs')
ans = decoder.decode_wav(wav_file)
print(ans)

You can also specify the following parameters in wenet.Decoder:

- lang (str): The language to use: chs for Chinese, en for English.
- model_dir (str): The Runtime Model directory. If not provided, the official model for the given lang is downloaded automatically. The directory contains the following files:
  - final.zip: runtime TorchScript ASR model.
  - units.txt: modeling units file.
  - TLG.fst: optional; decoding uses the LM when TLG.fst is given.
  - words.txt: optional; word-level symbol table for decoding with TLG.fst.

  Please refer to https://github.com/wenet-e2e/wenet/blob/main/docs/pretrained_models.md for the details of the Runtime Model.
- nbest (int): Output the top-n best results.
- enable_timestamp (bool): Whether to enable word-level timestamps.
- context (List[str]): A list of context biasing words.
- context_score (float): Context bonus score.
- continuous_decoding (bool): Whether to enable continuous (long) decoding.

For example:

decoder = wenet.Decoder(model_dir,
                        lang='chs',
                        nbest=5,
                        enable_timestamp=True,
                        context=['不忘初心', '牢记使命'],
                        context_score=3.0)
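
Before passing model_dir to the decoder, it can help to confirm that the directory holds the files listed above. The following is a minimal sketch that assumes only the file names given in this description; `check_model_dir` is a hypothetical helper and is not part of the binding.

```python
import os

# File names taken from the model_dir description above.
REQUIRED_FILES = ("final.zip", "units.txt")
LM_FILE = "TLG.fst"  # optional; its presence means decoding with LM


def check_model_dir(model_dir):
    """Hypothetical helper, not part of the binding.

    Returns (missing, has_lm): the required files that are absent from
    model_dir, and whether the optional TLG.fst is present.
    """
    present = set(os.listdir(model_dir))
    missing = [name for name in REQUIRED_FILES if name not in present]
    has_lm = LM_FILE in present
    return missing, has_lm
```

An empty `missing` list means both required files are in place; `has_lm` only reports whether TLG.fst was found and does not validate its contents.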

Streaming Usage

import sys
import wave
import wenetruntime as wenet
test_wav = sys.argv[1]
with wave.open(test_wav, 'rb') as fin:
    assert fin.getnchannels() == 1
    wav = fin.readframes(fin.getnframes())
decoder = wenet.Decoder(lang='chs')
# Assume the wav is 16 kHz, 16-bit mono, and decode every 0.5 seconds
interval = int(0.5 * 16000) * 2  # bytes per chunk: samples * 2 bytes per sample
for i in range(0, len(wav), interval):
    last = i + interval >= len(wav)  # True only for the final chunk
    chunk_wav = wav[i: min(i + interval, len(wav))]
    ans = decoder.decode(chunk_wav, last)
    print(ans)

The parameters introduced above can also be used here to control the behavior of wenet.Decoder.
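
The chunk size in the streaming loop is 0.5 s × 16000 samples/s × 2 bytes per sample = 16000 bytes. The sketch below performs the same chunking with the sample rate and sample width passed as arguments instead of hard-coded; `iter_chunks` is a hypothetical helper for illustration and is not part of the binding.

```python
def iter_chunks(pcm, sample_rate=16000, sample_width=2, seconds=0.5):
    """Yield (chunk, is_last) pairs over raw mono PCM bytes.

    Hypothetical helper for illustration; not part of the binding.
    """
    # Bytes per chunk: samples in `seconds` times bytes per sample.
    interval = int(seconds * sample_rate) * sample_width
    for start in range(0, len(pcm), interval):
        end = min(start + interval, len(pcm))
        # The chunk that reaches the end of the buffer is the last one.
        yield pcm[start:end], end == len(pcm)
```

With the standard `wave` module, `fin.getframerate()` and `fin.getsampwidth()` can supply the two values, and each `(chunk, is_last)` pair can be passed to `decoder.decode` in the same way as in the loop above.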

Build on Your Local Machine

git clone git@github.com:wenet-e2e/wenet.git
cd wenet/runtime/binding/python
python setup.py install

Release history

1.0.1 (Feb 27, 2023)

1.0.0 (Sep 23, 2022, this version)

Download files

Source Distribution

wespeakerruntime-1.0.0.tar.gz (5.1 kB)

Uploaded Sep 23, 2022 Source

Built Distribution

wespeakerruntime-1.0.0-py3-none-any.whl (6.0 kB)

Uploaded Sep 23, 2022 Python 3

Hashes for wespeakerruntime-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`78c183fe858d8e36ddc64448e91ed750f048b2865dc3690befb53ae58b285f57`
MD5	`f7570b638eeabb0d135b23e821ebf562`
BLAKE2b-256	`dc032896a4641c1ddafa14f1d581869bf8b110437b001621f0ecf38ed9d8950f`

Hashes for wespeakerruntime-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`732e3ccbc84662f95f89af849918908bbf32bffac56f3fece663fc58d075a48a`
MD5	`84e1218afbbd5f5ec311bf50ab9a7956`
BLAKE2b-256	`32929f76a14bf0d3401b198b0714fc20a036a61802b5c041578ba25441469f08`
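
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the Python standard library; `sha256_of` is a hypothetical helper name, and the file path in the usage note assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of(path, block_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fin:
        # Read in blocks so large files are not loaded into memory at once.
        for block in iter(lambda: fin.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

For example, `sha256_of("wespeakerruntime-1.0.0.tar.gz")` should equal the SHA256 value listed for the source distribution if the download is intact.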