Project description
WeNet Python Binding
This is a Python binding of Wespeaker.
Wespeaker is a production-first and production-ready end-to-end speaker verification toolkit.
The main features of the binding are:
- Multiple language support, including English and Chinese. Other languages are in development.
- Non-streaming and streaming APIs.
- N-best, contextual biasing, and timestamp support, which are very important for speech products.
- Alignment support. You can get phone-level alignments with this tool (in development).
Install
Python 3.6+ is required.
pip3 install wespeakerruntime
Usage
Non-streaming Usage
import sys
import wenetruntime as wenet
wav_file = sys.argv[1]
decoder = wenet.Decoder(lang='chs')
ans = decoder.decode_wav(wav_file)
print(ans)
You can also specify the following parameters in wenet.Decoder:
- lang (str): The language to use: chs for Chinese and en for English.
- model_dir (str): The Runtime Model directory, which contains the following files. If not provided, the official model for the specified lang will be downloaded automatically.
  - final.zip: runtime TorchScript ASR model.
  - units.txt: modeling units file.
  - TLG.fst: optional; decoding uses the LM when TLG.fst is given.
  - words.txt: optional; word-level symbol table for decoding with TLG.fst.
  Please refer to https://github.com/wenet-e2e/wenet/blob/main/docs/pretrained_models.md for the details of the Runtime Model.
- nbest (int): Output the top-n best results.
- enable_timestamp (bool): Whether to enable word-level timestamps.
- context (List[str]): A list of context biasing words.
- context_score (float): Context bonus score.
- continuous_decoding (bool): Whether to enable continuous (long) decoding.
For example:
decoder = wenet.Decoder(model_dir,
                        lang='chs',
                        nbest=5,
                        enable_timestamp=True,
                        context=['不忘初心', '牢记使命'],
                        context_score=3.0)
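Because model_dir is expected to hold the files listed above, it can be useful to check a local directory before passing it to wenet.Decoder. The following is a minimal sketch and not part of the binding: the helper name check_model_dir and its return format are hypothetical, and only the file names come from the parameter description above.

```python
import os

# File names taken from the model_dir description above.
REQUIRED_FILES = ("final.zip", "units.txt")
OPTIONAL_FILES = ("TLG.fst", "words.txt")


def check_model_dir(model_dir):
    """Hypothetical helper: report which runtime model files are present.

    Returns a dict listing the required files that are missing and the
    optional files that were found.
    """
    present = set(os.listdir(model_dir))
    missing = [name for name in REQUIRED_FILES if name not in present]
    optional_present = [name for name in OPTIONAL_FILES if name in present]
    return {"missing": missing, "optional_present": optional_present}
```

An empty "missing" list suggests the directory has the files the decoder needs; "TLG.fst" appearing in "optional_present" corresponds to the LM decoding case described above.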
Streaming Usage
import sys
import wave
import wenetruntime as wenet
test_wav = sys.argv[1]
# Read the whole file as raw PCM bytes; the audio must be mono
with wave.open(test_wav, 'rb') as fin:
    assert fin.getnchannels() == 1
    wav = fin.readframes(fin.getnframes())
decoder = wenet.Decoder(lang='chs')
# We suppose the wav is 16k, 16bits, and decode every 0.5 seconds
interval = int(0.5 * 16000) * 2
for i in range(0, len(wav), interval):
    # Mark the final chunk so the decoder can finalize its result
    last = i + interval >= len(wav)
    chunk_wav = wav[i: min(i + interval, len(wav))]
    ans = decoder.decode(chunk_wav, last)
    print(ans)
You can use the same parameters introduced above to control the behavior of wenet.Decoder.
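The chunk size in the streaming example follows from the assumed audio format: 0.5 seconds at 16000 samples per second with 2 bytes per sample gives 16000 bytes per chunk. The sketch below isolates that arithmetic and the format check using only the standard wave module. The helper names read_pcm16_mono and iter_chunks are hypothetical and not part of the binding.

```python
import wave


def read_pcm16_mono(path, sample_rate=16000):
    """Read raw PCM bytes, checking the format the streaming example assumes."""
    with wave.open(path, "rb") as fin:
        if fin.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if fin.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if fin.getframerate() != sample_rate:
            raise ValueError("expected %d Hz audio" % sample_rate)
        return fin.readframes(fin.getnframes())


def iter_chunks(pcm, seconds=0.5, sample_rate=16000, bytes_per_sample=2):
    """Yield (chunk, is_last) pairs covering pcm in fixed-duration pieces."""
    interval = int(seconds * sample_rate) * bytes_per_sample
    for start in range(0, len(pcm), interval):
        end = min(start + interval, len(pcm))
        # is_last is True only for the piece that reaches the end of pcm.
        yield pcm[start:end], end == len(pcm)
```

With a decoder created as in the example above, each pair could be passed on as decoder.decode(chunk, is_last). Note that an empty input yields no chunks at all.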
Build on Your Local Machine
git clone git@github.com:wenet-e2e/wenet.git
cd wenet/runtime/binding/python
python setup.py install
Hashes for wespeakerruntime-1.0.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 732e3ccbc84662f95f89af849918908bbf32bffac56f3fece663fc58d075a48a
MD5 | 84e1218afbbd5f5ec311bf50ab9a7956
BLAKE2b-256 | 32929f76a14bf0d3401b198b0714fc20a036a61802b5c041578ba25441469f08