An insanely fast whisper CLI

Project description

Insanely Fast Whisper

Powered by 🤗 Transformers, Optimum & flash-attn

TL;DR - Transcribe 150 minutes (2.5 hours) of audio in less than 98 seconds - with OpenAI's Whisper Large v3. Blazingly fast transcription is now a reality!⚡️

Not convinced? Here are some benchmarks we ran on a free Google Colab T4 GPU! 👇

Optimisation type                                          | Time to transcribe 150 min of audio
---------------------------------------------------------- | ------------------------------------
Transformers (fp32)                                        | ~31 min (31 min 1 sec)
Transformers (fp16 + batching [24] + bettertransformer)    | ~5 min (5 min 2 sec)
Transformers (fp16 + batching [24] + Flash Attention 2)    | ~2 min (1 min 38 sec)
distil-whisper (fp16 + batching [24] + bettertransformer)  | ~3 min (3 min 16 sec)
distil-whisper (fp16 + batching [24] + Flash Attention 2)  | ~1 min (1 min 18 sec)
Faster Whisper (fp16 + beam_size [1])                      | ~9 min (9 min 23 sec)
Faster Whisper (8-bit + beam_size [1])                     | ~8 min (8 min 15 sec)

🆕 Blazingly fast transcriptions via your terminal! ⚡️

We've added a CLI to enable fast transcriptions. Here's how you can use it:

Install insanely-fast-whisper with pipx:

pipx install insanely-fast-whisper

Run inference from any path on your computer:

insanely-fast-whisper --file-name <filename or URL>

🔥 You can run Whisper-large-v3 w/ Flash Attention 2 from this CLI too:

insanely-fast-whisper --file-name <filename or URL> --flash True 

🌟 You can run distil-whisper directly from this CLI too:

insanely-fast-whisper --model-name distil-whisper/large-v2 --file-name <filename or URL> 

Don't want to install insanely-fast-whisper? Just use pipx run:

pipx run insanely-fast-whisper --file-name <filename or URL>

Note: The CLI is opinionated and currently only works for Nvidia GPUs. Make sure to check out the defaults and the list of options you can play around with to maximise your transcription throughput. Run insanely-fast-whisper --help or pipx run insanely-fast-whisper --help to get all the CLI arguments and defaults.
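
Since the options are regular flags, they compose; for example, here is a sketch combining only the flags documented above (the file name is a placeholder):

insanely-fast-whisper --model-name distil-whisper/large-v2 --file-name audio.mp3 --flash True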

How to use it without the CLI?

For older GPUs (without Flash Attention support), all you need to run is:

import torch
from transformers import pipeline

# Load Whisper large-v2 in half precision on the first CUDA device.
pipe = pipeline("automatic-speech-recognition",
                "openai/whisper-large-v2",
                torch_dtype=torch.float16,
                device="cuda:0")

# Swap in the BetterTransformer fastpath (requires the optimum package).
pipe.model = pipe.model.to_bettertransformer()

# Split long audio into 30-second chunks and batch them for throughput.
outputs = pipe("<FILE_NAME>",
               chunk_length_s=30,
               batch_size=24,
               return_timestamps=True)

outputs["text"]

For newer GPUs (A10, A100, H100), use Flash Attention 2:

import torch
from transformers import pipeline

# Load Whisper large-v2 in half precision with Flash Attention 2
# (requires the flash-attn package and an Ampere-or-newer GPU).
pipe = pipeline("automatic-speech-recognition",
                "openai/whisper-large-v2",
                torch_dtype=torch.float16,
                model_kwargs={"use_flash_attention_2": True},
                device="cuda:0")

outputs = pipe("<FILE_NAME>",
               chunk_length_s=30,
               batch_size=24,
               return_timestamps=True)

outputs["text"]

Roadmap

  • Add a light CLI script (done; see above)
  • Deployment script with Inference API

Community showcase

@ochen1 created a brilliant MVP for a CLI here: https://github.com/ochen1/insanely-fast-whisper-cli (Try it out now!)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

insanely_fast_whisper-0.0.5.tar.gz (7.3 kB)


Built Distribution

insanely_fast_whisper-0.0.5-py3-none-any.whl (8.4 kB)


File details

Details for the file insanely_fast_whisper-0.0.5.tar.gz.

File metadata

  • Download URL: insanely_fast_whisper-0.0.5.tar.gz
  • Upload date:
  • Size: 7.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.10.1 CPython/3.12.0

File hashes

Hashes for insanely_fast_whisper-0.0.5.tar.gz

Algorithm   | Hash digest
----------- | -----------
SHA256      | f397a4054eb081c37f8255e7989e3c4286f4c4d7b58439c99f0cd8f8b8a974be
MD5         | 2ca633c5e0ab49d6b9ce2adffaddddbc
BLAKE2b-256 | 9f29bbf17b2074c72fdf2961a9d94c8443c0696e21ac1d5f6fd45ddd85a245f6

See the PyPI documentation for more details on using hashes.
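
To check a downloaded file against the published digests before installing, you can hash it locally; a minimal sketch using Python's standard hashlib (the local file path is an assumption):

import hashlib

# SHA256 digest published above for the source distribution.
EXPECTED = "f397a4054eb081c37f8255e7989e3c4286f4c4d7b58439c99f0cd8f8b8a974be"

# Assumed local path to the downloaded sdist.
with open("insanely_fast_whisper-0.0.5.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

assert digest == EXPECTED, f"hash mismatch: {digest}"
print("sha256 OK")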

File details

Details for the file insanely_fast_whisper-0.0.5-py3-none-any.whl.

File hashes

Hashes for insanely_fast_whisper-0.0.5-py3-none-any.whl

Algorithm   | Hash digest
----------- | -----------
SHA256      | e2a1dfc307def6dd497752852cab5f11f1bb84e1e4eaa660eafa3a7ffcd4ba63
MD5         | fc8daa42032544896f8db2b612d25d72
BLAKE2b-256 | 68f63c3928da2a07a6193b5b94a03b8c85763ef5e7a8cee2a569f9c3afdff97a

See the PyPI documentation for more details on using hashes.
