whspr
A minimalist dictation tool for local speech recognition using OpenAI's Whisper models. Its interface is fully keyboard-driven and sound-based so as not to interfere with windowing or application focus.
Processing is done locally using faster-whisper. If the optional whspr[gpu] dependencies
are installed and an Nvidia GPU is available, the whisper-large-v3-turbo model
is used; otherwise, whisper-small.en is used. whspr is currently available only
on Linux and can be installed from PyPI.
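The model selection rule described above can be written out as a short sketch. The function name pick_model and its two boolean parameters are hypothetical and not part of whspr's API; how whspr itself detects the optional dependencies or the GPU is not shown here.

```python
def pick_model(gpu_extras_installed: bool, nvidia_gpu_available: bool) -> str:
    """Return the model name according to the rule in the description.

    Both arguments are illustrative inputs; whspr's real detection
    logic is not documented here and may differ.
    """
    # The large model is chosen only when BOTH conditions hold.
    if gpu_extras_installed and nvidia_gpu_available:
        return "whisper-large-v3-turbo"
    # In every other case the small English-only model is the fallback.
    return "whisper-small.en"
```

Installing whspr[gpu] on a machine without an Nvidia GPU therefore still results in whisper-small.en being used.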
Usage
Bind the following commands to your preferred keyboard shortcuts (examples given here).
whspr # Super+C
whspr --paste # Super+V
whspr --cancel # Super+X
In the example given, Super+C and Super+V will both start or stop dictation and copy the
result to the clipboard. The difference is that Super+V will additionally paste the result
into the currently focussed application. Super+X will cancel any dictation currently in progress.
Sounds will indicate when whspr is listening and when it has finished processing.
whspr can also be accessed from within Python:
from whspr import transcribe
result = transcribe("path/to/audio.mp3")
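As a possible extension of the call above, several files could be transcribed in a loop. This is only a sketch: transcribe_many is a hypothetical helper, not part of whspr, and it makes no assumption about what transcribe returns beyond storing the result as-is. The transcription function is passed in as an argument so the helper can be tried without whspr installed.

```python
from typing import Any, Callable, Dict, Iterable


def transcribe_many(
    paths: Iterable[str],
    transcribe_fn: Callable[[str], Any],
) -> Dict[str, Any]:
    """Call transcribe_fn once per path and collect the results.

    transcribe_fn is expected to accept a file path, as in the
    example above; its return value is stored unchanged.
    """
    results: Dict[str, Any] = {}
    for path in paths:
        results[str(path)] = transcribe_fn(str(path))
    return results
```

With whspr installed, this would be called as transcribe_many(["a.mp3", "b.mp3"], transcribe) after from whspr import transcribe; the file names are placeholders.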
Installation
whspr depends on:
- the aplay, arecord, ydotool commands. The former two are part of the alsa-utils package and installed on most distros by default. ydotool is optional and only required for the --paste flag (see Usage).
- a clipboard backend compatible with pyperclip, e.g. wl-clipboard on Wayland or xclip on X11.
- for optional GPU-accelerated speech recognition, an Nvidia GPU and drivers are required.
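The command-line dependencies listed above can be checked with a short script. This is a sketch, not something shipped with whspr: the helper names are hypothetical, and it assumes the clipboard backends are detected through the wl-copy executable (installed by wl-clipboard) and the xclip executable.

```python
import shutil
from typing import Callable, Dict, Iterable, List, Optional

REQUIRED = ("aplay", "arecord")          # from alsa-utils
PASTE_ONLY = ("ydotool",)                # only needed for --paste
CLIPBOARD_BACKENDS = ("wl-copy", "xclip")  # assumed executable names

Which = Callable[[str], Optional[str]]


def missing_commands(names: Iterable[str], which: Which = shutil.which) -> List[str]:
    """Return the commands from names that are not found on PATH."""
    return [name for name in names if which(name) is None]


def dependency_report(which: Which = shutil.which) -> Dict[str, object]:
    """Summarise which of the listed dependencies are available."""
    return {
        "required_missing": missing_commands(REQUIRED, which),
        "paste_missing": missing_commands(PASTE_ONLY, which),
        # One working clipboard backend is enough.
        "clipboard_ok": any(which(name) is not None for name in CLIPBOARD_BACKENDS),
    }
```

Calling print(dependency_report()) shows what is missing on the current machine. The which parameter exists only so the lookup can be replaced in tests.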
On Ubuntu, simply run:
sudo apt update && sudo apt install -y alsa-utils wl-clipboard xclip ydotool pipx
pipx install whspr[gpu] # gpu support is optional; omit [gpu] if it's not desired
whspr --finish-setup # optional: download the model ahead of its first use