Whisper command line client that uses CTranslate2

Introduction

A Whisper command line client, based on CTranslate2, that is compatible with the original OpenAI client.

It uses CTranslate2 and the faster-whisper implementation of Whisper, which is up to 4 times faster than openai/whisper for the same accuracy while using less memory.

Goals of the project:

  • Provide an easy way to use the CTranslate2 Whisper implementation
  • Ease the migration for people using the OpenAI Whisper CLI

Installation

To install the latest stable version, run:

pip install -U whisper-ctranslate2

Alternatively, if you are interested in the latest development (non-stable) version from this repository, run:

pip install git+https://github.com/jordimas/whisper-ctranslate2.git

CPU and GPU support

GPU and CPU support are provided by CTranslate2.

CTranslate2 is compatible with x86-64 and AArch64/ARM64 CPUs and integrates multiple backends that are optimized for these platforms: Intel MKL, oneDNN, OpenBLAS, Ruy, and Apple Accelerate.

GPU execution requires the NVIDIA libraries cuBLAS 11.x and cuDNN 8.x to be installed on the system. Please refer to the CTranslate2 documentation for details.

By default, the best available hardware is selected for inference. Use the --device and --device_index options to control the selection manually.
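
As a rough sketch of manual device selection, the Python snippet below assembles a command line that pins inference to a given device. The helper name build_device_command is hypothetical, and the value cuda for --device is an assumption based on CTranslate2 device naming, not something stated here; check whisper-ctranslate2 --help for the accepted values.

```python
import shlex


def build_device_command(audio_path, device=None, device_index=None):
    """Return an argv list for whisper-ctranslate2 (hypothetical helper)."""
    argv = ["whisper-ctranslate2", audio_path]
    if device is not None:
        # Leaving --device out keeps the default automatic selection.
        argv += ["--device", device]
    if device_index is not None:
        argv += ["--device_index", str(device_index)]
    return argv


if __name__ == "__main__":
    # "cuda" is an assumed value; it is not listed in this document.
    print(shlex.join(build_device_command("myfile.mp3", "cuda", 0)))
```

The returned list can be passed to subprocess.run as-is, which avoids shell quoting problems with file names that contain spaces.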

Usage

The command line is the same as that of OpenAI Whisper.

To transcribe:

whisper-ctranslate2 inaguracio2011.mp3 --model medium

To translate:

whisper-ctranslate2 inaguracio2011.mp3 --model medium --task translate

To list all the supported options with their help text, run:

whisper-ctranslate2 --help

CTranslate2 specific options

On top of the OpenAI Whisper command line options, there are some specific options provided by CTranslate2 or whisper-ctranslate2.

Quantization

The --compute_type option sets the type of quantization to use. It accepts the values default, auto, int8, int8_float16, int16, float16, and float32. On CPU, int8 gives the best performance:

whisper-ctranslate2 myfile.mp3 --compute_type int8
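
When the command is generated from a script, it can help to check the quantization value before launching a long transcription. The following is a minimal sketch; the helper name compute_type_args is hypothetical, and the set of values is simply copied from the list above.

```python
# Values listed for --compute_type in the text above.
COMPUTE_TYPES = (
    "default", "auto", "int8", "int8_float16", "int16", "float16", "float32",
)


def compute_type_args(compute_type="int8"):
    """Return the --compute_type arguments, rejecting unlisted values."""
    if compute_type not in COMPUTE_TYPES:
        raise ValueError(
            f"unsupported compute type {compute_type!r}; "
            f"expected one of {', '.join(COMPUTE_TYPES)}"
        )
    return ["--compute_type", compute_type]


if __name__ == "__main__":
    argv = ["whisper-ctranslate2", "myfile.mp3"] + compute_type_args("int8")
    print(" ".join(argv))
```

The default of int8 follows the CPU recommendation above; it is not necessarily the best choice on a GPU.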

Loading the model from a directory

The --model_directory option specifies the directory from which to load a CTranslate2 Whisper model, for example your own quantized version of a Whisper model or your own fine-tuned version. The model must be in CTranslate2 format.

Using Voice Activity Detection (VAD) filter

The --vad_filter option enables voice activity detection (VAD) to filter out parts of the audio without speech. This step uses the Silero VAD model:

whisper-ctranslate2 myfile.mp3 --vad_filter True

The VAD filter accepts several additional options that determine its behavior:

--vad_threshold VALUE (float)

Probabilities above this value are considered speech.

--vad_min_speech_duration_ms VALUE (int)

Final speech chunks shorter than min_speech_duration_ms are thrown out.

--vad_max_speech_duration_s VALUE (int)

Maximum duration of speech chunks in seconds. Longer chunks will be split at the timestamp of the last silence.
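
To make the threshold and minimum duration options more concrete, here is a simplified Python illustration of how they interact. It is not the Silero VAD implementation: the function speech_chunks, the fixed frame length frame_ms, and the default values are all assumptions made for the example, and the splitting of over-long chunks controlled by --vad_max_speech_duration_s is left out.

```python
def speech_chunks(probs, threshold=0.5, min_speech_duration_ms=250, frame_ms=32):
    """Group per-frame speech probabilities into (start_ms, end_ms) chunks.

    Simplified illustration only; defaults and frame length are assumed.
    """
    chunks = []
    start = None
    # A trailing -inf sentinel is never speech, so it closes any open chunk.
    for i, p in enumerate(list(probs) + [float("-inf")]):
        is_speech = p > threshold  # probabilities above the threshold are speech
        if is_speech and start is None:
            start = i
        elif not is_speech and start is not None:
            duration_ms = (i - start) * frame_ms
            # Chunks shorter than the minimum duration are thrown out.
            if duration_ms >= min_speech_duration_ms:
                chunks.append((start * frame_ms, i * frame_ms))
            start = None
    return chunks


if __name__ == "__main__":
    probabilities = [0.1, 0.9, 0.9, 0.9, 0.2, 0.8, 0.1]
    print(speech_chunks(probabilities, min_speech_duration_ms=200, frame_ms=100))
```

In this toy input, the three consecutive high-probability frames form one kept chunk, while the isolated single frame is shorter than the minimum duration and is dropped.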

Print colors

The --print_colors True option prints the transcribed text using an experimental color coding strategy, based on whisper.cpp, that highlights words with high or low confidence:

whisper-ctranslate2 myfile.mp3 --print_colors True

Live transcribe from your microphone

The --live_transcribe True option activates live transcription from your microphone:

whisper-ctranslate2 --live_transcribe True

https://user-images.githubusercontent.com/309265/231533784-e58c4b92-e9fb-4256-b4cd-12f1864131d9.mov

Need help?

See our frequently asked questions for answers to common questions.

Contact

Jordi Mas jmas@softcatala.org
