asrp

ASR text preprocessing utility

Install

pip install asrp

Preprocess

Input: a dictionary with the key `sentence`.
Output: the preprocessed result. The dictionary is modified in place.

import asrp

batch_data = {
    'sentence': "I'm fine, thanks."
}
asrp.fun_en(batch_data)

Dynamic loading

import asrp

batch_data = {
    'sentence': "I'm fine, thanks."
}
preprocessor = getattr(asrp, 'fun_en')
preprocessor(batch_data)

Evaluation

import asrp

targets = ['HuggingFace is great!', 'Love Transformers!', 'Let\'s wav2vec!']
preds = ['HuggingFace is awesome!', 'Transformers is powerful.', 'Let\'s finetune wav2vec!']
print("chunk size WER: {:.2f}".format(100 * asrp.chunked_wer(targets, preds, chunk_size=None)))
print("chunk size CER: {:.2f}".format(100 * asrp.chunked_cer(targets, preds, chunk_size=None)))
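
For readers who want to see what these metrics measure, here is a minimal pure-Python sketch of word and character error rate computed chunk by chunk. It is not asrp's implementation: `edit_distance` and `chunked_error_rate` are hypothetical names, and the assumption is that chunking only limits how many pairs are processed at once while error and reference counts are accumulated across chunks, so the result does not depend on `chunk_size`.

```python
from typing import List, Optional, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def chunked_error_rate(targets: List[str], preds: List[str],
                       chunk_size: Optional[int] = None,
                       char_level: bool = False) -> float:
    """Total edit errors divided by total reference length, summed over chunks."""
    if len(targets) != len(preds):
        raise ValueError("targets and preds must have the same length")
    step = chunk_size or len(targets) or 1  # None means one single chunk
    errors, total = 0, 0
    for start in range(0, len(targets), step):
        chunk = zip(targets[start:start + step], preds[start:start + step])
        for target, pred in chunk:
            ref = list(target) if char_level else target.split()
            hyp = list(pred) if char_level else pred.split()
            errors += edit_distance(ref, hyp)
            total += len(ref)
    return errors / total if total else 0.0


targets = ["HuggingFace is great!", "Love Transformers!", "Let's wav2vec!"]
preds = ["HuggingFace is awesome!", "Transformers is powerful.", "Let's finetune wav2vec!"]
print("WER: {:.2f}".format(100 * chunked_error_rate(targets, preds)))
print("CER: {:.2f}".format(100 * chunked_error_rate(targets, preds, char_level=True)))
```

Note that this sketch splits on whitespace and keeps punctuation, so `Transformers!` and `Transformers` count as different words. A real evaluation would normally preprocess both sides first.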

Speech to Hubert code

import asrp

hc = asrp.HubertCode("facebook/hubert-large-ll60k", './km_feat_100_layer_20', 20)
hc('voice file path')

Hubert code to speech

import asrp

code = []  # discrete units (HuBERT codes)
# download the TTS checkpoint and the WaveGlow checkpoint from https://github.com/pytorch/fairseq/tree/main/examples/textless_nlp/gslm/unit2speech
cs = asrp.Code2Speech(tts_checkpoint='./tts_checkpoint_best.pt', waveglow_checkpint='waveglow_256channels_new.pt')
cs(code)

# play on notebook
import IPython.display as ipd

ipd.Audio(data=cs(code), autoplay=False, rate=cs.sample_rate)

Speech Enhancement

The denoiser is copied from fairseq.

from asrp import SpeechEnhancer

ase = SpeechEnhancer()
print(ase('./test/xxx.wav'))

Usage - liveASR

from asrp.live import LiveSpeech

english_model = "voidful/wav2vec2-xlsr-multilingual-56"
asr = LiveSpeech(english_model, device_name="default")
asr.start()

try:
    while True:
        text, sample_length, inference_time = asr.get_last_text()
        print(f"{sample_length:.3f}s"
              + f"\t{inference_time:.3f}s"
              + f"\t{text}")

except KeyboardInterrupt:
    asr.stop()

Usage - liveASR - whisper

from asrp.live import LiveSpeech

whisper_model = "tiny"
asr = LiveSpeech(whisper_model)
asr.start()
while True:
    try:
        asr_text, sample_length, inference_time = asr.get_last_text()
        if len(asr_text) > 0:
            print(asr_text, sample_length, inference_time)
    except KeyboardInterrupt:
        asr.stop()
        break
