Thunder speech

A Hackable speech recognition library.

What to expect from this project:

End-to-end speech recognition models
Simple fine-tuning to new languages
Inference support as a first-class feature
Developer-oriented API

What it's not:

A general-purpose speech toolkit
A collection of complex systems that require thousands of GPU-hours and expert knowledge and focus only on state-of-the-art results

Quick usage guide

Install

Install the library from PyPI:

pip install thunder-speech

Load the model and train it

from thunder.registry import load_pretrained
from thunder.quartznet.compatibility import QuartznetCheckpoint

# Tab completion works to discover other QuartznetCheckpoint.*
module = load_pretrained(QuartznetCheckpoint.QuartzNet5x5LS_En)
# It also accepts the string identifier
module = load_pretrained("QuartzNet5x5LS_En")
# Or models from the huggingface hub
module = load_pretrained("facebook/wav2vec2-large-960h")

Export to a pure PyTorch model using TorchScript

import torch
from thunder.data.dataset import AudioFileLoader

# Export the module loaded above as a TorchScript file
module.to_torchscript("model_ready_for_inference.pt")

# Optional step: also export the audio loading pipeline
loader = AudioFileLoader(sample_rate=16000)
scripted_loader = torch.jit.script(loader)
scripted_loader.save("audio_loader.pt")

Run inference in production

import torch
import torchaudio

model = torch.jit.load("model_ready_for_inference.pt")
loader = torch.jit.load("audio_loader.pt")
# Open audio
audio = loader("audio_file.wav")
# transcriptions is a list of strings with the captions.
transcriptions = model.predict(audio)
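
The inference steps above can be wrapped in a small helper when several files need to be transcribed. This is a minimal sketch, not part of the library: the names `transcribe_files` and `transcribe_with_exported_model` are hypothetical, and the code assumes only what the snippet above shows, namely that the loader is callable on a file path and that `model.predict` returns a list of strings.

```python
from typing import Dict, Iterable, List


def transcribe_files(model, loader, audio_paths: Iterable[str]) -> Dict[str, List[str]]:
    """Run the loader and model on each path and collect the transcriptions.

    Hypothetical helper: `loader` is any callable mapping a path to audio,
    and `model` is any object exposing `predict(audio)`.
    """
    results: Dict[str, List[str]] = {}
    for path in audio_paths:
        audio = loader(str(path))
        # Keep the full list returned by predict, as in the README snippet.
        results[str(path)] = list(model.predict(audio))
    return results


def transcribe_with_exported_model(
    model_path: str, loader_path: str, audio_paths: Iterable[str]
) -> Dict[str, List[str]]:
    """Load the exported TorchScript files and transcribe the given audio files."""
    # Imported lazily so the helper above stays usable without PyTorch installed.
    import torch
    import torchaudio  # noqa: F401  (imported as in the README snippet)

    model = torch.jit.load(model_path)
    loader = torch.jit.load(loader_path)
    return transcribe_files(model, loader, audio_paths)
```

Keeping the loop separate from the loading code makes it possible to exercise the helper with stand-in objects before any exported model is available.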

More quick tips

To learn how to access the raw probabilities and decode them manually, or how to fine-tune the models, see the project documentation.

Contributing

The first step to contribute is to do an editable installation of the library:

git clone https://github.com/scart97/thunder-speech.git
cd thunder-speech
poetry install
pre-commit install

Then, make sure that everything is working. You can run the test suite, which is based on pytest:

RUN_SLOW=1 poetry run pytest

Here the RUN_SLOW flag is used to run all the tests, including the ones marked as slow because they might download checkpoints or do small training runs. If you don't have a CUDA-capable GPU, some tests will be skipped unconditionally.
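
To illustrate how an environment flag like RUN_SLOW can gate slow tests, here is a minimal sketch. It is an assumption about the general pattern, not the project's actual test configuration: the helper name `slow_tests_enabled` is hypothetical, and the project may read the flag differently.

```python
import os
from typing import Mapping, Optional


def slow_tests_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the RUN_SLOW flag is set to "1".

    Hypothetical helper: `environ` defaults to the process environment and
    can be replaced by a plain dict when checking the logic in isolation.
    """
    env = os.environ if environ is None else environ
    return env.get("RUN_SLOW", "0") == "1"
```

With pytest, such a helper could back a skip marker, for example `pytest.mark.skipif(not slow_tests_enabled(), reason="set RUN_SLOW=1 to run slow tests")`, applied to the tests that download checkpoints or train.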

Influences

This library is heavily influenced by the best practices of the PyTorch ecosystem. The original model code, including checkpoints, is based on the NeMo ASR toolkit. NeMo also inspired the fine-tuning and prediction APIs.

The data loading and processing is loosely based on my experience using fast.ai. It tries to decouple transforms that happen at the item level from the ones that are efficiently implemented for the whole batch on the GPU. It also adopts the idea that default parameters should work well out of the box.

The overall organization of code and decoupling follows the pytorch-lightning ideals, with self-contained modules that try to reduce the boilerplate necessary.

Finally, the transformers library inspired the simple model implementations, with a clear separation into folders containing the specific code that you need to understand each architecture and its preprocessing, as well as the strong test suite.

Release history

3.2.0 (Aug 4, 2022)
3.1.3 (Jul 5, 2022)
3.1.2 (May 11, 2022)
3.1.1 (May 11, 2022)
3.1.0 (May 10, 2022)
3.0.4 (May 9, 2022)
3.0.3 (Apr 18, 2022)
3.0.2 (Apr 18, 2022)
3.0.1 (Apr 17, 2022)
3.0.0 (Feb 27, 2022)
2.2.3 (Sep 14, 2021)
2.2.2 (Jul 29, 2021)
2.2.1 (Jul 28, 2021)
2.2.0 (Jul 21, 2021)
2.1.0 (Jul 19, 2021)
2.0.0 (Jul 19, 2021)
1.3.0 (Jun 15, 2021)
1.2.0 (May 26, 2021)
1.1.0 (May 26, 2021)
1.0.1 (May 25, 2021)
1.0.0 (May 23, 2021)

Download files

Source Distribution

thunder-speech-3.2.0.tar.gz (36.1 kB view details)

Uploaded Aug 4, 2022 Source

Built Distribution

thunder_speech-3.2.0-py3-none-any.whl (48.9 kB view details)

Uploaded Aug 4, 2022 Python 3

File details

Details for the file thunder-speech-3.2.0.tar.gz.

File metadata

Download URL: thunder-speech-3.2.0.tar.gz
Upload date: Aug 4, 2022
Size: 36.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.8.0 pkginfo/1.8.3 readme-renderer/35.0 requests/2.28.1 requests-toolbelt/0.9.1 urllib3/1.26.11 tqdm/4.64.0 importlib-metadata/4.12.0 keyring/23.7.0 rfc3986/2.0.0 colorama/0.4.5 CPython/3.9.13

File hashes

Hashes for thunder-speech-3.2.0.tar.gz
Algorithm	Hash digest
SHA256	`b48642899b43c2b75391c8c024057efc41cab293d285b0c201860689d13814be`
MD5	`929444ab28c100f77bd02e693591a1ff`
BLAKE2b-256	`cb4f7466d21cf0777cbbd551f40a305497e684d8883ae308eade5d574c8ccf1b`

File details

Details for the file thunder_speech-3.2.0-py3-none-any.whl.

File metadata

Download URL: thunder_speech-3.2.0-py3-none-any.whl
Upload date: Aug 4, 2022
Size: 48.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.8.0 pkginfo/1.8.3 readme-renderer/35.0 requests/2.28.1 requests-toolbelt/0.9.1 urllib3/1.26.11 tqdm/4.64.0 importlib-metadata/4.12.0 keyring/23.7.0 rfc3986/2.0.0 colorama/0.4.5 CPython/3.9.13

File hashes

Hashes for thunder_speech-3.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2ef727485605eab72de24b40e8b6bc1cd8ef9d068d6399e41811aa3fd81761af`
MD5	`47e5ad1cb5d794a4186e60c0113f2b4c`
BLAKE2b-256	`c3a6045e9d3bdf5ec1eb8f1c2bd600ada553fe8f99b0dc7928e5cb59444c2c84`
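
The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library; the function names `sha256_of_file` and `matches_published_hash` are hypothetical, and the expected digest would be copied from the SHA256 rows listed above.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("thunder-speech-3.2.0.tar.gz", "<SHA256 from the table>")` should return True for an intact download of the source distribution.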
