asrp
ASR text preprocessing utility
Install
pip install asrp
Preprocess
Input: a dictionary with a 'sentence' key.
Output: the preprocessed result. The dictionary is modified in place.
import asrp
batch_data = {
    'sentence': "I'm fine, thanks."
}
asrp.fun_en(batch_data)
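In-place handling means the function mutates the dictionary it receives, so the caller reads the cleaned text back from the same 'sentence' key. The sketch below illustrates that contract with a hypothetical normalizer, toy_preprocess. It is not asrp's fun_en, whose exact normalization rules are not described here.

```python
import re


def toy_preprocess(batch: dict) -> dict:
    """Hypothetical stand-in for an asrp preprocessor (not asrp.fun_en)."""
    text = batch["sentence"].lower()
    # Keep letters, digits, apostrophes and spaces; replace everything else with a space.
    text = re.sub(r"[^a-z0-9' ]+", " ", text)
    # Collapse runs of whitespace left behind by the substitution.
    batch["sentence"] = " ".join(text.split())
    return batch


batch_data = {"sentence": "I'm fine, thanks."}
toy_preprocess(batch_data)  # return value ignored: the dict itself was updated
print(batch_data["sentence"])
```

Returning the same dictionary as well as mutating it keeps the function usable in pipelines that expect a return value.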
Dynamic loading
import asrp
batch_data = {
    'sentence': "I'm fine, thanks."
}
preprocessor = getattr(asrp, 'fun_en')
preprocessor(batch_data)
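Looking the function up by name is useful when the language is only known at run time. A minimal sketch, assuming preprocessors follow the fun_<language> naming seen in fun_en. The stand-in namespace fake_asrp and the helper get_preprocessor are hypothetical, used here so the example runs without asrp installed.

```python
from types import SimpleNamespace


def _fun_en(batch):
    # Placeholder behaviour only; the real asrp.fun_en does its own normalization.
    batch["sentence"] = batch["sentence"].lower()
    return batch


# Hypothetical stand-in for the asrp module; swap in `import asrp` in real use.
fake_asrp = SimpleNamespace(fun_en=_fun_en)


def get_preprocessor(module, lang):
    """Return module.fun_<lang>, raising a clear error if it is missing."""
    name = f"fun_{lang}"
    func = getattr(module, name, None)
    if not callable(func):
        raise ValueError(f"no preprocessor named {name!r}")
    return func


batch_data = {"sentence": "I'm fine, thanks."}
get_preprocessor(fake_asrp, "en")(batch_data)
print(batch_data["sentence"])
```

Passing a default to getattr turns a missing attribute into an explicit error message instead of a bare AttributeError.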
Evaluation
import asrp
targets = ['HuggingFace is great!', 'Love Transformers!', 'Let\'s wav2vec!']
preds = ['HuggingFace is awesome!', 'Transformers is powerful.', 'Let\'s finetune wav2vec!']
print("chunk size WER: {:.2f}".format(100 * asrp.chunked_wer(targets, preds, chunk_size=None)))
print("chunk size CER: {:.2f}".format(100 * asrp.chunked_cer(targets, preds, chunk_size=None)))
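For reference, word error rate is the word-level edit distance between prediction and target divided by the number of target words, and character error rate is the same ratio over characters. The sketch below is a plain-Python illustration of a corpus-level score accumulated chunk by chunk. It is not asrp's implementation, the helper names are hypothetical, and treating chunk_size=None as a single chunk is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def chunked_error_rate(targets, preds, chunk_size=None, unit="word"):
    """Sum errors and reference lengths over chunks, then divide once."""
    split = str.split if unit == "word" else list
    # Assumption: chunk_size=None means "process everything as one chunk".
    step = chunk_size or len(targets) or 1
    errors = total = 0
    for start in range(0, len(targets), step):
        chunk = zip(targets[start:start + step], preds[start:start + step])
        for ref, hyp in chunk:
            ref_units, hyp_units = split(ref), split(hyp)
            errors += edit_distance(ref_units, hyp_units)
            total += len(ref_units)
    return errors / total


targets = ["HuggingFace is great!", "Love Transformers!", "Let's wav2vec!"]
preds = ["HuggingFace is awesome!", "Transformers is powerful.", "Let's finetune wav2vec!"]
print("sketch WER: {:.2f}".format(100 * chunked_error_rate(targets, preds)))
print("sketch CER: {:.2f}".format(100 * chunked_error_rate(targets, preds, unit="char")))
```

Accumulating error and length counts, rather than averaging per-chunk rates, keeps the result of this sketch independent of the chunk size.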
Speech to Hubert code
import asrp
import nlp2
nlp2.download_file(
    'https://huggingface.co/voidful/mhubert-base/resolve/main/mhubert_base_vp_en_es_fr_it3_L11_km1000.bin', './')
hc = asrp.HubertCode("voidful/mhubert-base", './mhubert_base_vp_en_es_fr_it3_L11_km1000.bin', 11,
                     chunk_sec=30,
                     worker=20)
hc('voice file path')
Hubert code to speech
import asrp
code = [] # discrete unit
# download the TTS checkpoint and the WaveGlow checkpoint from https://github.com/pytorch/fairseq/tree/main/examples/textless_nlp/gslm/unit2speech
cs = asrp.Code2Speech(tts_checkpoint='./tts_checkpoint_best.pt', waveglow_checkpint='waveglow_256channels_new.pt')
cs(code)
# play on notebook
import IPython.display as ipd
ipd.Audio(data=cs(code), autoplay=False, rate=cs.sample_rate)
Speech Enhancement
Denoiser copied from fairseq
from asrp import SpeechEnhancer
ase = SpeechEnhancer()
print(ase('./test/xxx.wav'))
Usage - liveASR
Modified from https://github.com/oliverguhr/wav2vec2-live
from asrp.live import LiveSpeech
english_model = "voidful/wav2vec2-xlsr-multilingual-56"
asr = LiveSpeech(english_model, device_name="default")
asr.start()
try:
    while True:
        text, sample_length, inference_time = asr.get_last_text()
        print(f"{sample_length:.3f}s"
              + f"\t{inference_time:.3f}s"
              + f"\t{text}")
except KeyboardInterrupt:
    asr.stop()
Usage - liveASR - whisper
from asrp.live import LiveSpeech
whisper_model = "tiny"
asr = LiveSpeech(whisper_model, vad_mode=2, language='zh')
asr.start()
last_text = ""
while True:
    asr_text = ""
    try:
        asr_text, sample_length, inference_time = asr.get_last_text()
        if len(asr_text) > 0:
            print(asr_text, sample_length, inference_time)
    except KeyboardInterrupt:
        asr.stop()
        break
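The loop above declares last_text without using it. One possible use, sketched here as an assumption rather than the author's intent, is to skip a transcript that repeats the previous one. The FakeASR class is a hypothetical stand-in that only mimics the (text, sample_length, inference_time) tuple shape returned by get_last_text, so the example runs without a microphone.

```python
class FakeASR:
    """Hypothetical stand-in for LiveSpeech; replays canned results."""

    def __init__(self, results):
        self._results = iter(results)

    def get_last_text(self):
        # Mimics the (text, sample_length, inference_time) tuple used above.
        return next(self._results)


def new_transcripts(asr, max_polls):
    """Poll the recognizer, keeping non-empty text that differs from the last kept one."""
    last_text = ""
    kept = []
    for _ in range(max_polls):
        text, sample_length, inference_time = asr.get_last_text()
        if text and text != last_text:
            kept.append(text)
            last_text = text
    return kept


canned = [("hello", 1.0, 0.1), ("hello", 1.0, 0.1), ("", 0.5, 0.1), ("world", 1.2, 0.1)]
print(new_transcripts(FakeASR(canned), max_polls=4))
```

With a real LiveSpeech instance the same comparison would sit inside the while loop, next to the existing length check.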