
Push-to-talk transcription

Project description

faster-whisper Hotkey

A minimalist push-to-talk transcription tool built on cutting-edge ASR models.

Hold the hotkey, speak, release ==> and bam, the text lands in your text field!

In the terminal, in a text editor, or even in the text chat of your online video game, anywhere!

Features

  • Model downloading: missing models are automatically downloaded from Hugging Face.
  • User-friendly interface: set the input device, transcription model, compute type, device, and language directly through the menu.
  • Fast: near-instant transcription, even on CPU when picking parakeet or canary.

Current models

  • (NEW) CohereLabs/cohere-transcribe-03-2026:

    • 15 languages supported (en, de, fr, it, es, pt, el, nl, pl, ar, vi, zh, ja, ko)
    • Transcription only
    • No automatic language recognition
    • Runs well on CPU
    • Quite smart; deals well with hesitations and stutters

  • nvidia/canary-1b-v2:

    • 25 languages supported
    • Transcription and translation
    • No automatic language recognition
    • Crazy fast even on CPU in F16
  • nvidia/parakeet-tdt-0.6b-v3:

    • 25 languages supported
    • Transcription only
    • Automatic language recognition
    • Crazy fast even on CPU in F16
  • mistralai/Voxtral-Mini-3B-2507:

    • English, Spanish, French, Portuguese, Hindi, German, Dutch, Italian
    • Transcription only
    • Automatic language recognition
    • Smart (it even guesses when to add quotes, etc.) and less error-prone for non-native English speakers
    • GPU only
  • Systran/faster-whisper:

    • Many languages
    • Transcription only

Experimental: LLM Correction

Optionally correct the transcribed text via any OpenAI-compatible API endpoint. This works best with models that have strong language understanding (tiny models are not recommended).
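The optional LLM correction described under "Experimental: LLM Correction" can be sketched as a single POST to a chat-completions endpoint. This is a simplified illustration, not the tool's actual implementation; the base URL, model name, and system prompt below are placeholder assumptions:

```python
import json
import urllib.request


def build_correction_payload(text: str, model: str = "local-model") -> dict:
    """Build a chat-completions payload asking the LLM to clean up a transcript."""
    return {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": "Fix punctuation and obvious transcription errors. "
                           "Return only the corrected text.",
            },
            {"role": "user", "content": text},
        ],
        "temperature": 0.0,  # deterministic corrections
    }


def correct_text(text: str, base_url: str = "http://localhost:8080/v1") -> str:
    """POST the payload to an OpenAI-compatible /chat/completions endpoint."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_correction_payload(text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Any server speaking the OpenAI chat-completions format (llama.cpp, vLLM, Ollama, the OpenAI API itself) should work as the endpoint.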

What do I personally use currently?

- Almost always cohere-transcribe-03-2026, on CPU, when I need all my VRAM to run my LMs (replacing parakeet-tdt-0.6b-v3, which is less smart)

- Sometimes Voxtral-Mini-3B-2507, on GPU, when I run smaller LMs and can fit it alongside them

Installation

See https://docs.astral.sh/uv/ for more information on uv. uv is fast :)

From PyPI

  • As a pip package:

    uv pip install faster-whisper-hotkey
    
  • or as a tool, so that you can run faster-whisper-hotkey from any venv:

    uv tool install faster-whisper-hotkey
    

From source

  1. Clone the repository:

    git clone https://github.com/blakkd/faster-whisper-hotkey
    cd faster-whisper-hotkey
    
  2. Install the package and dependencies:

  • as a pip package:

    uv pip install .
    
  • or as an uv tool:

    uv tool install .
    

For Nvidia GPU

You need to install cuDNN: https://developer.nvidia.com/cudnn-downloads

Usage

  1. Whether you installed from PyPI or from source, just run faster-whisper-hotkey
  2. Go through the menu steps.
  3. Once the model is loaded, focus on any text field.
  4. Then simply press and hold the hotkey (PAUSE, F4, or F8) while you speak, release it when you're done, and watch the magic happen!

Once the script is running you can forget about it: the model stays loaded, ready to transcribe at any time.
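The press/record/release flow above can be sketched as a tiny state machine. This is a simplified illustration, not the tool's actual implementation; in practice the press/release events would come from a global-hotkey library and the frames from an audio capture loop:

```python
class PushToTalk:
    """Minimal press/record/release state machine (illustration only)."""

    def __init__(self, transcribe):
        self.transcribe = transcribe  # callback: list of audio frames -> text
        self.frames = []
        self.recording = False

    def on_press(self):
        # Held keys emit repeated press events; only the first one starts a take.
        if not self.recording:
            self.recording = True
            self.frames = []

    def on_feed(self, frame):
        # Called by the audio capture loop while the key is held.
        if self.recording:
            self.frames.append(frame)

    def on_release(self):
        # Stop recording and hand the buffered audio to the transcriber.
        self.recording = False
        return self.transcribe(self.frames)
```

Guarding `on_press` against key-repeat matters: holding a key typically fires many press events, and only the first should clear the buffer.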

Configuration File

The script automatically saves your settings to ~/.config/faster_whisper_hotkey/transcriber_settings.json.
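As a sketch, loading such a settings file with fallbacks might look like this. The key names and default values below are illustrative assumptions, not the tool's exact schema:

```python
import json
from pathlib import Path

# Illustrative defaults; the real settings schema may differ.
DEFAULT_SETTINGS = {
    "input_device": "default",
    "model": "parakeet-tdt-0.6b-v3",
    "device": "cpu",
    "compute_type": "float16",
    "language": "en",
}

SETTINGS_PATH = Path.home() / ".config/faster_whisper_hotkey/transcriber_settings.json"


def load_settings(path=SETTINGS_PATH):
    """Return saved settings merged over defaults; use defaults if no file exists."""
    settings = dict(DEFAULT_SETTINGS)
    path = Path(path)
    if path.exists():
        settings.update(json.loads(path.read_text()))
    return settings
```

Merging over defaults means a settings file from an older version that lacks a newer key still loads cleanly.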

Limitations

  • voxtral: because of some limitations, and to keep the automatic language recognition capabilities, we split the audio into 30-second chunks. So even though we can still transcribe long speech, best results come from audio shorter than this. In the current state it seems impossible to reconcile long audio as one chunk with automatic language detection. We may need to patch upstream: https://huggingface.co/docs/transformers/v4.56.1/en/model_doc/voxtral#transformers.VoxtralProcessor.apply_transcription_request

  • Due to the window type detection used to send the appropriate keystrokes, the VSCodium/VSCode terminal unfortunately isn't supported for now. No clue if we can work around this.

  • Windows support is not planned. That said, you can use eutychius's branch, which seems to work fine. See this comment for instructions.
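The 30-second chunking described in the voxtral limitation above boils down to slicing the sample buffer. A minimal sketch, assuming 16 kHz mono samples in a flat sequence (the real tool works on captured audio frames):

```python
def chunk_audio(samples, sample_rate=16000, chunk_seconds=30):
    """Split a 1-D sequence of samples into consecutive chunks of at most
    chunk_seconds each; the final chunk may be shorter."""
    size = sample_rate * chunk_seconds
    return [samples[i:i + size] for i in range(0, len(samples), size)]
```

Each chunk is then transcribed independently, which is why sentences straddling a 30-second boundary can come out slightly worse.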

Tips

  • If you pick a multilingual faster-whisper model and select en as the source while speaking another language, it will be translated to English, provided you speak for at least a few seconds.
  • If you pick parakeet-tdt-0.6b-v3, you can even use multiple languages during your recording!

Acknowledgements

Many thanks to:

  • the developers of faster-whisper for providing such an efficient transcription inference engine
  • NVIDIA for their blazing fast parakeet and canary models
  • Mistral for their impressively accurate Voxtral-Mini-3B model
  • Cohere for their cohere-transcribe-03-2026 model
  • and to all the contributors of the libraries I used

Also thanks to wgabrys88 and MohamedRashad for their Hugging Face Spaces, which have been helpful!

And to finish, a special mention to @siddhpant, who gave me a mic, for their useful broo tool <3


Download files

Download the file for your platform.

Source Distribution

faster_whisper_hotkey-0.4.6.tar.gz (41.1 kB)

Uploaded Source

Built Distribution


faster_whisper_hotkey-0.4.6-py3-none-any.whl (23.6 kB)

Uploaded Python 3

File details

Details for the file faster_whisper_hotkey-0.4.6.tar.gz.

File metadata

  • Download URL: faster_whisper_hotkey-0.4.6.tar.gz
  • Upload date:
  • Size: 41.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for faster_whisper_hotkey-0.4.6.tar.gz
Algorithm Hash digest
SHA256 96d7a552f880cbcde51f4943125a701cc4b618e3e24eb458fe96c3dfa62123fb
MD5 79173d1eb7da13f1f93b866a89e49212
BLAKE2b-256 6f1af37f14ace92a4162dfc5fb8c87403addc7f9e7e9130216d4fb003e1fbf6e

See more details on using hashes here.
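To check a downloaded file against the published SHA256 digest yourself, the standard library is enough. A small sketch that streams the file so large archives don't need to fit in memory:

```python
import hashlib


def sha256_of(path, buf_size=1 << 16):
    """Return the hex SHA256 digest of a file, reading it in 64 KiB blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(buf_size):
            h.update(chunk)
    return h.hexdigest()
```

Compare the result against the SHA256 value in the table above; any mismatch means the download is corrupt or tampered with.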

File details

Details for the file faster_whisper_hotkey-0.4.6-py3-none-any.whl.

File metadata

File hashes

Hashes for faster_whisper_hotkey-0.4.6-py3-none-any.whl
Algorithm Hash digest
SHA256 3bfffc812d0486253153fdfe835ed1d56ceb262afbe5466f147c209ca0f5d6be
MD5 98f979cf9d0a18c193c2647ea31dd74f
BLAKE2b-256 b16aa32f07eaad4b6977dfb79ead90567822bbc70e1ad562caf325e7451e5d81

See more details on using hashes here.
