Local speech-to-text for desktop using faster-whisper
stt2desktop
Lets you dictate text into any application without sending audio to a cloud service. Everything runs locally on your machine; no internet connection is required after the initial model download.
Currently only tested under Linux with KDE ;)
How it works
- Run ./cli.py listen (Whisper model downloaded on first run, cached on disk)
- Hold Scroll Lock to record from your microphone
- Release Scroll Lock; the audio is then transcribed locally by faster-whisper
- The transcribed text is copied to the clipboard via wl-copy and pasted into the focused window via ydotool key ctrl+v
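The final step above (clipboard copy, then a simulated Ctrl+V) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function name `paste_via_clipboard` and the injectable `run` parameter are assumptions made for this example, and it assumes `wl-copy` and `ydotool` are installed and `ydotoold` is running.

```python
import subprocess
from typing import Callable


def paste_via_clipboard(text: str, run: Callable[..., object] = subprocess.run) -> None:
    """Copy `text` to the Wayland clipboard, then simulate Ctrl+V.

    `run` is injectable (hypothetical design choice) so the command
    sequence can be inspected without touching the real clipboard.
    """
    # wl-copy reads the clipboard content from stdin when no text argument is given.
    run(["wl-copy"], input=text.encode("utf-8"), check=True)
    # "ctrl+v" is the key syntax quoted in this README. Some newer ydotool
    # releases expect raw keycodes instead, so this may need adapting.
    run(["ydotool", "key", "ctrl+v"], check=True)


if __name__ == "__main__":
    paste_via_clipboard("Hello from stt2desktop")
```

Going through the clipboard means the text does not depend on the active keyboard layout, which is the reason given in the tool list below.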
Used tools:
- faster-whisper for local speech recognition
- ydotool to simulate keyboard input (works on Wayland and X11)
- wl-clipboard (wl-copy) to paste text via clipboard, which avoids keyboard layout issues
- chime to play notification sounds
Prepare installation
Requirements: Python 3.12+, a working microphone, wl-clipboard, ydotool, and ydotoold:
sudo apt install ydotool ydotoold wl-clipboard
sudo usermod -aG input $USER
echo 'KERNEL=="uinput", GROUP="input", MODE="0660"' | sudo tee /etc/udev/rules.d/60-uinput.rules
sudo udevadm control --reload-rules && sudo udevadm trigger
Then re-login (or run newgrp input in the current shell) so the group change takes effect.
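To verify the preparation steps, a small check along these lines may help. It is only a sketch: the helper names `missing_tools` and `in_group` are made up for this example and are not part of stt2desktop. It looks for the required executables on `PATH` and tests whether the current process already carries the `input` group.

```python
import grp
import os
import shutil

# Executables named in the requirements above.
REQUIRED_TOOLS = ("ydotool", "ydotoold", "wl-copy")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def in_group(group_name: str = "input") -> bool:
    """Return True if the current process is a member of `group_name`."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        return False  # the group does not exist on this system
    return gid == os.getgid() or gid in os.getgroups()


if __name__ == "__main__":
    print("missing tools:", missing_tools() or "none")
    print("member of 'input' group:", in_group())
    print("/dev/uinput writable:", os.access("/dev/uinput", os.W_OK))
```

If the group check still reports `False` right after `usermod`, the session has most likely not picked up the new group yet.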
Install via pipx
You can install "stt2desktop" with pipx:
sudo apt install pipx
pipx install stt2desktop
Then run:
stt2desktop listen
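The default global hotkey is Scroll Lock (labeled "Rollen" on German keyboards).
You can change it with the --hotkey option (see below).
Suggested alternatives are the right-hand modifier keys (ctrl_r, alt_r, cmd_r, shift_r), given as evdev key names such as KEY_RIGHTCTRL or KEY_RIGHTALT ;)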
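The hold-to-record behaviour can be pictured as a small state machine over evdev key events. The sketch below is an assumption about how such a listener could look, not the project's implementation: `HoldTracker` and `watch` are hypothetical names, `device_path` (e.g. a `/dev/input/event*` node) must be supplied by the caller, and `watch` needs the third-party python-evdev package.

```python
from typing import Iterator, Optional

# Values carried by evdev EV_KEY events.
KEY_UP, KEY_DOWN, KEY_REPEAT = 0, 1, 2


class HoldTracker:
    """Turn raw key events into single "start" / "stop" actions."""

    def __init__(self) -> None:
        self.recording = False

    def feed(self, value: int) -> Optional[str]:
        if value == KEY_DOWN and not self.recording:
            self.recording = True
            return "start"
        if value == KEY_UP and self.recording:
            self.recording = False
            return "stop"
        return None  # auto-repeat or duplicate events are ignored


def watch(device_path: str, key_name: str = "KEY_SCROLLLOCK") -> Iterator[str]:
    """Yield "start"/"stop" while reading one input device (blocks forever)."""
    from evdev import InputDevice, ecodes  # third-party: python-evdev

    key_code = ecodes.ecodes[key_name]  # map the evdev key name to its code
    tracker = HoldTracker()
    for event in InputDevice(device_path).read_loop():
        if event.type == ecodes.EV_KEY and event.code == key_code:
            action = tracker.feed(event.value)
            if action:
                yield action
```

Ignoring repeat events and duplicate presses keeps one key hold from triggering more than one recording.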
CLI listen
usage: stt2desktop listen [-h] [LISTEN OPTIONS]
Start the STT listener. Hold the hotkey to record, release to transcribe and insert.
╭─ options ────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ -h, --help show this help message and exit │
│ -v, --verbosity Verbosity level; e.g.: -v, -vv, -vvv, etc. (repeatable) │
│ --model {tiny_en,tiny,base_en,base,small_en,small,medium_en,medium,large_v1,large_v2,large_v3,large,distil_large_v2, │
│ distil_medium_en,distil_small_en,distil_large_v3,distil_large_v3_5,large_v3_turbo,turbo} │
│ Whisper model to use for transcription. (default: small) │
│ --hotkey STR evdev key name to hold for recording. Release to transcribe and insert text. Examples: │
│ KEY_SCROLLLOCK, KEY_RIGHTCTRL, KEY_RIGHTALT. (default: KEY_SCROLLLOCK) │
│ --sample-rate INT Audio sample rate in Hz. Whisper expects 16000. (default: 16000) │
│ --device STR Device to run inference on, e.g. cpu or cuda. (default: auto) │
│ --compute-type STR Quantization type, e.g. int8, float16, float32. (default: int8) │
│ --num-workers {None}|INT Number of parallel transcription workers. Defaults to CPU count. (default: None) │
│ --sounds, --no-sounds Play notification sounds via chime. (default: True) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
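For orientation, the --model, --device and --compute-type options correspond to arguments of faster-whisper's `WhisperModel`. The sketch below shows one way they might be passed through. How stt2desktop does this internally is not documented here, so the name mapping in `to_faster_whisper_name` and the `transcribe` helper are assumptions for illustration only.

```python
def to_faster_whisper_name(cli_name: str) -> str:
    """Map a CLI choice such as "tiny_en" to a faster-whisper name ("tiny.en").

    Assumption: the CLI choices are the faster-whisper model names with
    "." and "-" replaced by "_".
    """
    name = cli_name
    if name.endswith("_en"):
        name = name[: -len("_en")] + ".en"
    name = name.replace("v3_5", "v3.5")
    return name.replace("_", "-")


def transcribe(audio, model: str = "small", device: str = "auto",
               compute_type: str = "int8") -> str:
    """Transcribe a file path or a 16 kHz float32 array and return the text."""
    from faster_whisper import WhisperModel  # third-party: faster-whisper

    whisper = WhisperModel(
        to_faster_whisper_name(model),
        device=device,
        compute_type=compute_type,
    )
    # transcribe() returns a lazy generator of segments plus metadata.
    segments, _info = whisper.transcribe(audio)
    return " ".join(segment.text.strip() for segment in segments)
```

The defaults mirror the help text above: the small model with int8 quantization and automatic device selection.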
Whisper models
A selection of the available models, with approximate values:
| Model | Size | Speed | Accuracy |
|---|---|---|---|
| tiny | ~75 MB | fastest | lowest |
| base | ~145 MB | fast | good |
| small | ~460 MB | slower | better (default) |
| medium | ~1.5 GB | slow | high |
Larger models produce more accurate transcriptions but take longer to process ;)
Troubleshooting
Use pavucontrol to check your audio setup and make sure the correct microphone is selected and working.
Test audio recording:
./cli.py test-recording
Some terminal commands to check your audio setup:
# List capture devices in PulseAudio sound server:
pactl list sources short
# Check current volume:
pactl list sources | grep -A1 "Name: .*input\|Volume:"
# Displays the current state in PipeWire:
wpctl status
Set up loopback mode to hear yourself:
# Start:
pactl load-module module-loopback
# Undo:
pactl unload-module module-loopback
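If a recording exists but the transcription stays empty, checking the signal level can show whether the microphone delivered anything at all. The following stdlib-only sketch computes the peak level of 16-bit PCM audio. The helper names `peak_dbfs` and `wav_peak_dbfs` are hypothetical and not part of stt2desktop, and the input is assumed to be a 16-bit PCM WAV file.

```python
import array
import math
import sys
import wave


def peak_dbfs(pcm: bytes) -> float:
    """Peak level of little-endian 16-bit PCM in dBFS (-inf for silence)."""
    samples = array.array("h")
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    peak = max((abs(sample) for sample in samples), default=0)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak / 32768)


def wav_peak_dbfs(path: str) -> float:
    """Peak level of a 16-bit PCM WAV file in dBFS."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        return peak_dbfs(wav.readframes(wav.getnframes()))
```

A result of `-inf` means every sample was zero, which usually points to a muted or wrongly selected source. Values close to 0 dBFS indicate clipping.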
Start development
At least uv is required. Install it, e.g. via pipx:
sudo apt-get install pipx
pipx install uv
Clone the project and just start the CLI help commands. A virtual environment will be created/updated automatically.
~$ git clone https://github.com/jedie/stt2desktop.git
~$ cd stt2desktop
~/stt2desktop$ ./cli.py --help
~/stt2desktop$ ./dev-cli.py --help
usage: ./dev-cli.py [-h] {coverage,install,lint,mypy,nox,pip-audit,publish,test,update,update-readme-history,update-test-snapshot-files,version}
╭─ options ────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ -h, --help show this help message and exit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ subcommands ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ (required) │
│ • coverage Run tests and show coverage report. │
│ • install Install requirements and 'stt2desktop' via pip as editable. │
│ • lint Check/fix code style by run: "ruff check --fix" │
│ • mypy Run Mypy (configured in pyproject.toml) │
│ • nox Run nox │
│ • pip-audit │
│ Run pip-audit check against current requirements files │
│ • publish Build and upload this project to PyPi │
│ • test Run unittests │
│ • update Update dependencies (uv.lock) and git pre-commit hooks │
│ • update-readme-history │
│ Update project history base on git commits/tags in README.md Will be exited with 1 if the README.md │
│ was updated otherwise with 0. │
│ │
│ Also, callable via e.g.: │
│ python -m cli_base update-readme-history -v │
│ • update-test-snapshot-files │
│ Update all test snapshot files (by remove and recreate all snapshot files) │
│ • version Print version and exit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
History
- v0.3.0
- 2026-04-23 - avoid double hotkey processing
- 2026-04-23 - nicer exit
- 2026-04-23 - fix code style
- 2026-04-23 - Use a lock file to ensure that only one instance is running
- 2026-04-23 - restore old clipboard after pasting the STT text
- v0.2.0
- 2026-04-22 - paste text via clipboard to avoid keyboard layout issues
- 2026-04-16 - Add test commands and migrate to ydotool
- v0.1.2
- 2026-03-30 - print warning when not running on Linux
- 2026-03-30 - Update requirements
- 2026-03-27 - Update README
- v0.1.1
- 2026-03-27 - +Proposal for alternative hotkey
- 2026-03-27 - fix color outputs
- 2026-03-27 - Update requirements
- 2026-03-27 - add missing license file.