Real-time speech-to-text for Linux. Hold a hotkey, speak, release — your words appear wherever your cursor is.
PushToType
PushToType is a local, real-time speech-to-text tool for Linux. It transcribes your voice using a local Whisper model and types the result directly into whatever application has focus — no clipboard, no cloud, no API keys.
An open-source alternative to Wispr Flow, which has no Linux support.
Features
- Works everywhere: types into any focused app, including browsers, editors, terminals, and search bars
- Local-only: faster-whisper runs on your GPU (CUDA) with automatic CPU fallback
- No cloud: no API keys, no network required after the one-time model download
- Fast: ~250 ms from hotkey release to text appearing
- Configurable: TOML config file, interactive setup wizard, CLI flags
- Wayland + X11: works on both display servers via evdev
Quick Start
# Install
uv tool install pushtotype # or: pip install pushtotype
# System dependencies (X11)
sudo apt install libportaudio2 xdotool
# Add yourself to the input group (required for hotkey detection)
sudo usermod -aG input $USER
# Log out and back in for this to take effect
# Run the setup wizard
pushtotype config
# Start
pushtotype
Hold your configured hotkey (default: right Ctrl), speak, release. Text appears at the cursor.
How It Works
[Hold hotkey] → [Record audio] → [Whisper transcription] → [Type into focused app]
    evdev        sounddevice         faster-whisper             xdotool type
PushToType runs as a background daemon. A global hotkey listener (via evdev, reading directly from /dev/input/) starts recording when the hotkey is pressed. When you release the hotkey, the captured audio is sent to faster-whisper for transcription, and xdotool type then injects the text into whichever window is focused.
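The press, record, release, type flow can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not PushToType's actual code: the class name PushToTalk and the injected callables transcribe and type_text are hypothetical stand-ins for the faster-whisper and xdotool steps, and on_press, on_audio, and on_release stand in for the evdev and sounddevice callbacks.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PushToTalk:
    """Hypothetical press/record/release/type loop with injected backends."""

    transcribe: Callable[[List[bytes]], str]  # stand-in for faster-whisper
    type_text: Callable[[str], None]          # stand-in for xdotool type
    _chunks: List[bytes] = field(default_factory=list)
    _recording: bool = False

    def on_press(self) -> None:
        # Hotkey went down: start a fresh recording buffer.
        self._chunks = []
        self._recording = True

    def on_audio(self, chunk: bytes) -> None:
        # Audio callback: keep chunks only while the hotkey is held.
        if self._recording:
            self._chunks.append(chunk)

    def on_release(self) -> None:
        # Hotkey went up: stop recording, transcribe, and type the result.
        if not self._recording:
            return
        self._recording = False
        text = self.transcribe(self._chunks).strip()
        if text:
            self.type_text(text)
```

Injecting the backends keeps the control flow testable without a microphone, a GPU, or access to /dev/input/.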
Installation
Recommended: uv
uv tool install pushtotype
pip / pipx
pip install pushtotype
# or
pipx install pushtotype
From source
git clone https://github.com/danielgraviet/pushtotype.git
cd pushtotype
uv pip install -e ".[dev]"
System Requirements
| Requirement | Notes |
|---|---|
| Linux | X11 or Wayland |
| Python 3.10+ | |
| libportaudio2 | sudo apt install libportaudio2 |
| xdotool | X11 only: sudo apt install xdotool |
| wtype + wl-clipboard | Wayland only: sudo apt install wtype wl-clipboard |
| input group | sudo usermod -aG input $USER |
| NVIDIA GPU | Recommended for speed — CPU works but is slower |
Configuration
Config file lives at ~/.config/pushtotype/config.toml. Run pushtotype config to create it interactively.
[hotkey]
keys = ["KEY_RIGHTCTRL"]
[audio]
device = "default"
sample_rate = 16000
[model]
name = "base.en"
device = "auto"
compute_type = "float16"
[feedback]
enabled = true
volume = 0.5
[output]
method = "auto" # "auto", "x11", or "wayland"
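As a rough sketch of how a file like this could be read, the snippet below parses TOML text and overlays it on a table of defaults. It assumes the defaults match the example values shown above (only the right Ctrl hotkey default is stated in this README), and the names DEFAULTS and load_config are hypothetical rather than part of PushToType's API.

```python
import copy

try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport, assumed installed on 3.10

# Assumed defaults, copied from the example config above.
DEFAULTS = {
    "hotkey": {"keys": ["KEY_RIGHTCTRL"]},
    "audio": {"device": "default", "sample_rate": 16000},
    "model": {"name": "base.en", "device": "auto", "compute_type": "float16"},
    "feedback": {"enabled": True, "volume": 0.5},
    "output": {"method": "auto"},
}


def load_config(toml_text: str) -> dict:
    """Parse TOML text and overlay it on the defaults, section by section."""
    config = copy.deepcopy(DEFAULTS)
    for section, values in tomllib.loads(toml_text).items():
        if isinstance(values, dict):
            # Keys missing from the file keep their default values.
            config.setdefault(section, {}).update(values)
    return config
```

With this approach a config file only needs to list the settings that differ from the defaults.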
Config priority (highest to lowest)
- CLI flags (e.g. --model small.en)
- Environment variables (e.g. PUSHTOTYPE_MODEL=small.en)
- Config file (~/.config/pushtotype/config.toml)
- Built-in defaults
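One way to picture this priority order is as a chain of lookups in which the first layer that defines a key wins. The sketch below uses the standard library's collections.ChainMap over flat dotted keys. The function name effective_config and the flat key layout are illustrative assumptions, not how PushToType necessarily stores its settings.

```python
from collections import ChainMap


def effective_config(cli: dict, env: dict, file: dict, defaults: dict) -> ChainMap:
    """Layer the four sources so earlier arguments shadow later ones."""
    # ChainMap searches its mappings left to right and returns the first hit.
    return ChainMap(cli, env, file, defaults)


config = effective_config(
    cli={"model.name": "small.en"},
    env={"model.name": "medium.en", "model.device": "cpu"},
    file={"model.device": "cuda", "feedback.enabled": False},
    defaults={"model.name": "base.en", "model.device": "auto", "feedback.enabled": True},
)
```

Here the CLI value wins for model.name, the environment wins for model.device, and the config file wins for feedback.enabled.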
Environment variables
| Variable | Config key |
|---|---|
| PUSHTOTYPE_MODEL | model.name |
| PUSHTOTYPE_DEVICE | model.device |
| PUSHTOTYPE_AUDIO_DEV | audio.device |
| PUSHTOTYPE_FEEDBACK | feedback.enabled |
| PUSHTOTYPE_HOTKEY | hotkey.keys (comma-separated) |
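The mapping in this table could be applied along the following lines. This is a sketch, not the project's parser: the function name env_overrides is hypothetical, and the accepted boolean spellings for PUSHTOTYPE_FEEDBACK are an assumption, since the README does not document them.

```python
import os
from typing import Mapping

# Environment variable -> config key, as listed in the table above.
ENV_TO_KEY = {
    "PUSHTOTYPE_MODEL": "model.name",
    "PUSHTOTYPE_DEVICE": "model.device",
    "PUSHTOTYPE_AUDIO_DEV": "audio.device",
    "PUSHTOTYPE_FEEDBACK": "feedback.enabled",
    "PUSHTOTYPE_HOTKEY": "hotkey.keys",
}


def env_overrides(environ: Mapping[str, str] = os.environ) -> dict:
    """Collect config overrides from whichever variables are set."""
    overrides = {}
    for var, key in ENV_TO_KEY.items():
        raw = environ.get(var)
        if raw is None:
            continue
        if key == "hotkey.keys":
            # Comma-separated list, e.g. "KEY_LEFTCTRL,KEY_LEFTSHIFT".
            overrides[key] = [k.strip() for k in raw.split(",") if k.strip()]
        elif key == "feedback.enabled":
            # Assumed boolean spellings; the real parser may differ.
            overrides[key] = raw.strip().lower() in {"1", "true", "yes", "on"}
        else:
            overrides[key] = raw
    return overrides
```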
CLI Reference
pushtotype                    Start the push-to-talk daemon
pushtotype config             Run the interactive setup wizard
pushtotype config --show      Print the current effective config
pushtotype devices            List available audio input devices
pushtotype test               Record 5 seconds and transcribe (verify setup)
pushtotype download [MODEL]   Pre-download a Whisper model
Global flags:
-v, --verbose                 Enable debug logging (shows per-step timings)
-q, --quiet                   Suppress all output except errors
--log-file PATH               Write logs to a file
--model NAME                  Override model (e.g. small.en)
--hotkey COMBO                Override hotkey (e.g. ctrl+shift+s)
--device INDEX                Override audio device index
--no-feedback                 Disable start/stop beeps
Troubleshooting
Permission denied on /dev/input/
You need to be in the input group:
sudo usermod -aG input $USER
# Log out and back in
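To confirm that the group change has taken effect in your current session, a check along these lines may help. It is a small standalone sketch using Python's standard grp and os modules. The helper name in_group is hypothetical and is not a PushToType command.

```python
import grp
import os


def in_group(name: str = "input") -> bool:
    """Return True if the current process belongs to the named group."""
    try:
        gid = grp.getgrnam(name).gr_gid
    except KeyError:
        return False  # the group does not exist on this system
    # os.getgroups() reflects the session the process was started from.
    return gid in os.getgroups() or gid == os.getegid()


if __name__ == "__main__":
    print("input group active:", in_group())
```

If this prints False right after running usermod, the session has not picked up the new group yet, which is why logging out and back in is required.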
xdotool not found
sudo apt install xdotool
Text doesn't appear in my terminal
Most terminals paste with Ctrl+Shift+V rather than Ctrl+V, which can trip up clipboard-based tools. PushToType does not paste: it uses xdotool type, which bypasses the clipboard entirely, so it should work in any terminal without special configuration.
CUDA not available
PushToType automatically falls back to CPU. Transcription will be slower (~1-3s per 5s of audio on CPU vs ~0.2s on GPU). Run pushtotype -v and check the startup output to see which device is being used.
Model download fails / slow
Models are cached in ~/.cache/huggingface/hub/ after the first download. Pre-download manually:
pushtotype download base.en
wtype or wl-copy not found (Wayland)
sudo apt install wtype wl-clipboard
Known Limitations
- English only (base.en model)
- No AMD GPU (ROCm) support
- Wayland session detection relies on XDG_SESSION_TYPE or WAYLAND_DISPLAY
- No GUI, terminal only
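The session detection mentioned in the list above could look roughly like this. It is a sketch under assumptions: the function name detect_output_method, the order of the two checks, and the fallback to "x11" are illustrative choices, not confirmed details of PushToType.

```python
import os
from typing import Mapping


def detect_output_method(environ: Mapping[str, str] = os.environ) -> str:
    """Guess "wayland" or "x11" from the session environment."""
    if environ.get("XDG_SESSION_TYPE", "").lower() == "wayland":
        return "wayland"
    if environ.get("WAYLAND_DISPLAY"):
        # Some sessions set only the display socket name.
        return "wayland"
    return "x11"  # assumed fallback when neither variable indicates Wayland
```

Because the result depends only on environment variables, a session that sets neither one would be treated as X11 in this sketch. Setting method to "x11" or "wayland" explicitly in the [output] section avoids the guess.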
Contributing
See CONTRIBUTING.md. Issues and PRs welcome.
License
File details
Details for the file pushtotype-0.1.0.tar.gz.
File metadata
- Download URL: pushtotype-0.1.0.tar.gz
- Upload date:
- Size: 291.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 16f8f72959cdd2295dc9ade1c01eda9fbd9d32dfb201c1eb075ff7c03ec61077 |
| MD5 | df458622dfe71658c0ddbcd6eae1d687 |
| BLAKE2b-256 | d73fb9e12237c7e7eaa4db116f5a2d1b452e1a9154b781fb7c2d4489819b9827 |
File details
Details for the file pushtotype-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pushtotype-0.1.0-py3-none-any.whl
- Upload date:
- Size: 200.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bfc68bb7977d5621c8aecc6c95b98424316c5fe91d0d8c952f2bf4e28e1da761 |
| MD5 | f901810351929f01c36af49975f199e7 |
| BLAKE2b-256 | 9ee860c6b868e6811348399337e87c91ec16f35ca5593a7d33f79f9d7525568e |