An interactive dictation tool for local speech recognition using OpenAI's Whisper models.

whspr

A minimalist dictation tool for local speech recognition using OpenAI's Whisper models. Its interface is fully keyboard-driven and sound-based so as not to interfere with windowing or application focus.

Processing is done locally using faster-whisper. If the optional whspr[gpu] dependencies are installed and an Nvidia GPU is available, the whisper-large-v3-turbo model is used; otherwise, whisper-small.en is used. whspr is currently available only on Linux and can be installed from PyPI.
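
The selection rule above can be sketched as a small function. This is only an illustration, not whspr's actual code: the function names are hypothetical, and looking for the nvidia-smi command is an assumed stand-in for whatever GPU check whspr really performs.

```python
import shutil

# Model names as given in the paragraph above.
GPU_MODEL = "whisper-large-v3-turbo"
CPU_MODEL = "whisper-small.en"


def nvidia_gpu_probably_available() -> bool:
    """Heuristic (an assumption): Nvidia driver installs usually ship nvidia-smi."""
    return shutil.which("nvidia-smi") is not None


def choose_model(gpu_extras_installed: bool, gpu_available: bool) -> str:
    """Return the larger model only when both conditions hold."""
    if gpu_extras_installed and gpu_available:
        return GPU_MODEL
    return CPU_MODEL


if __name__ == "__main__":
    # Pretend the [gpu] extras are installed and probe for a GPU.
    print(choose_model(True, nvidia_gpu_probably_available()))
```

Either condition failing on its own is enough to fall back to the smaller English-only model.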

Usage

Bind the following commands to your preferred keyboard shortcuts (example shortcuts are shown in the comments).

whspr            # Super+C
whspr --paste    # Super+V
whspr --cancel   # Super+X

In the example given, Super+C and Super+V will both start or stop dictation and copy the result to the clipboard. The difference is that Super+V will additionally paste the result into the currently focussed application. Super+X will cancel any dictation currently in progress. Sounds will indicate when whspr is listening and when it has finished processing.
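
The toggle behaviour described above can be modelled as a small state function. This is a hypothetical sketch for illustration and says nothing about how whspr is implemented; in particular, it assumes that the invocation which stops a dictation decides whether the result is pasted.

```python
from typing import List, Tuple


def handle_command(recording: bool, flag: str = "") -> Tuple[bool, List[str]]:
    """Return (new recording state, actions) for one whspr invocation.

    flag is "" for plain `whspr`, or "--paste" / "--cancel".
    """
    if flag == "--cancel":
        # Cancelling discards any dictation in progress; nothing is copied.
        return False, (["discard"] if recording else [])
    if flag not in ("", "--paste"):
        raise ValueError(f"unknown flag: {flag!r}")
    if not recording:
        # First press: start listening.
        return True, ["start"]
    # Second press: stop, transcribe and copy; --paste additionally pastes.
    actions = ["stop", "transcribe", "copy"]
    if flag == "--paste":
        actions.append("paste")
    return False, actions


if __name__ == "__main__":
    state, actions = handle_command(False)            # Super+C: start
    print(state, actions)
    state, actions = handle_command(state, "--paste")  # Super+V: stop and paste
    print(state, actions)
```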

whspr can also be accessed from within Python:

from whspr import transcribe
result = transcribe("path/to/audio.mp3")

Installation

whspr depends on:

  • the aplay, arecord, and ydotool commands. The first two are part of the alsa-utils package, which most distros install by default. ydotool is optional and only required for the --paste flag (see Usage).
  • a clipboard backend compatible with pyperclip, e.g. wl-clipboard on Wayland or xclip on X11.
  • an Nvidia GPU and drivers, if GPU-accelerated speech recognition is wanted (optional).
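
Before installing, the command-line dependencies listed above can be checked with a short script. This is a minimal sketch and not part of whspr: the function name and report keys are made up for illustration, and it assumes wl-clipboard provides the wl-copy command and xclip provides xclip.

```python
import shutil
from typing import Callable, Dict, List, Optional

REQUIRED = ["aplay", "arecord"]   # from alsa-utils
OPTIONAL = ["ydotool"]            # only needed for --paste
CLIPBOARD = ["wl-copy", "xclip"]  # one clipboard backend is enough


def check_dependencies(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, List[str]]:
    """Report which of the expected commands are not on PATH."""

    def found(cmd: str) -> bool:
        return which(cmd) is not None

    report = {
        "missing_required": [c for c in REQUIRED if not found(c)],
        "missing_optional": [c for c in OPTIONAL if not found(c)],
        "missing_clipboard": [],
    }
    # Only complain about the clipboard if no backend at all is present.
    if not any(found(c) for c in CLIPBOARD):
        report["missing_clipboard"] = list(CLIPBOARD)
    return report


if __name__ == "__main__":
    for key, cmds in check_dependencies().items():
        print(f"{key}: {', '.join(cmds) or 'none'}")
```

The `which` parameter exists only so the lookup can be replaced in tests; by default the real PATH is searched.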

On Ubuntu, simply run:

sudo apt update && sudo apt install -y alsa-utils wl-clipboard xclip ydotool pipx
pipx install "whspr[gpu]"  # GPU support is optional; omit [gpu] if you don't want it
whspr --finish-setup       # optional: pre-download the model before first use
