Whisper command line client that uses CTranslate2
Introduction
Whisper command line client compatible with the original OpenAI client, based on CTranslate2.
It uses CTranslate2 and the Faster-Whisper implementation of Whisper, which is up to 4 times faster than openai/whisper for the same accuracy while using less memory.
Goals of the project:
- Provide an easy way to use the CTranslate2 Whisper implementation
- Ease the migration for people using OpenAI Whisper CLI
Installation
Just type:
pip install -U whisper-ctranslate2
Alternatively, the following command will pull and install the latest commit from this repository, along with its Python dependencies:
pip install git+https://github.com/jordimas/whisper-ctranslate2.git
Usage
It supports the same command line interface as OpenAI Whisper.
To transcribe:
whisper-ctranslate2 inaguracio2011.mp3 --model medium
To translate:
whisper-ctranslate2 inaguracio2011.mp3 --model medium --task translate
To see all the supported options and their descriptions, run:
whisper-ctranslate2 --help
CTranslate2 specific options
On top of the OpenAI Whisper command line options, there are some additional options specific to CTranslate2.
--compute_type {default,auto,int8,int8_float16,int16,float16,float32}
Type of quantization to use. On CPU, int8 gives the best performance.
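For example, to transcribe on CPU with int8 quantization (the audio file name is the one used in the examples above):

```shell
whisper-ctranslate2 inaguracio2011.mp3 --model medium --compute_type int8
```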
--model_directory MODEL_DIRECTORY
Directory containing a CTranslate2 Whisper model, for example a fine-tuned Whisper model. The model must be in CTranslate2 format.
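For example, assuming you have a fine-tuned model already converted to CTranslate2 format in a local directory (the directory name below is illustrative):

```shell
whisper-ctranslate2 inaguracio2011.mp3 --model_directory ./my-whisper-ct2-model
```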
--device_index [DEVICE_INDEX ...]
Device ID(s) on which to place the model.
--vad_filter VAD_FILTER
Enable voice activity detection (VAD) to filter out parts of the audio without speech. This step uses the Silero VAD model: https://github.com/snakers4/silero-vad.
--vad_min_silence_duration_ms VAD_MIN_SILENCE_DURATION_MS
When vad_filter is enabled, audio segments without speech for at least this number of milliseconds will be ignored.
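For example, to enable VAD filtering and ignore silences of at least two seconds (the threshold value is illustrative):

```shell
whisper-ctranslate2 inaguracio2011.mp3 --model medium --vad_filter True --vad_min_silence_duration_ms 2000
```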
Whisper-ctranslate2 specific options
On top of the OpenAI Whisper and CTranslate2 options, whisper-ctranslate2 provides some additional options:
--print_colors PRINT_COLORS
Adding the --print_colors True argument will print the transcribed text using an experimental color coding strategy, based on whisper.cpp, to highlight words with high or low confidence.
--live_transcribe LIVE_TRANSCRIBE
Adding the --live_transcribe True argument will activate the live transcription mode from your microphone.
https://user-images.githubusercontent.com/309265/231533784-e58c4b92-e9fb-4256-b4cd-12f1864131d9.mov
Need help?
Check our frequently asked questions for common questions.
Contact
Jordi Mas jmas@softcatala.org