Local voice-to-text with Whisper + LLM cleanup
voice2text
Local voice-to-text with Whisper + LLM cleanup. Push-to-talk (Right ⌘), pastes at cursor.
Voice-to-text tools like Wispr Flow, MacWhisper, and VoiceInk are becoming increasingly popular. It's a testament to our times that in 2025, ~270 lines of Python with local Whisper and a small language model served by Ollama (Qwen2.5-3B) can deliver a comparable experience on consumer hardware. Such tooling would have been unimaginable three years ago. This project is a proof of concept that demonstrates just that.
Note: Before anyone suggests splitting this into modules and submodules: keeping everything in one file is an intentional design choice, meant to show that the whole functionality fits in fewer than 300 lines of Python.
Note 2: This is macOS-only by design. We use:
- mlx-whisper — optimized for Apple Silicon
- osascript — for simulating Cmd+V paste via System Events
- pbcopy/pbpaste — macOS clipboard
- nowplaying-cli — macOS media control
- System Preferences URLs for permissions
You're welcome to fork this and make it work on Linux or Windows!
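To make the clipboard and paste pieces above concrete, here is a minimal sketch of how pbcopy and osascript can be combined to paste text at the cursor. The helper names (`paste_commands`, `paste_at_cursor`) and the injectable `run` parameter are assumptions made for illustration, not the project's actual code; the AppleScript line is the common System Events way to send Cmd+V, which macOS normally only allows once the terminal has been granted Accessibility permission.

```python
import subprocess
from typing import Callable

# AppleScript asking System Events to press Cmd+V in the frontmost app.
PASTE_SCRIPT = 'tell application "System Events" to keystroke "v" using command down'


def paste_commands() -> list[list[str]]:
    """Return the two argv lists used by this sketch: copy, then paste."""
    return [["pbcopy"], ["osascript", "-e", PASTE_SCRIPT]]


def paste_at_cursor(text: str, run: Callable[..., object] = subprocess.run) -> None:
    """Copy text to the macOS clipboard, then simulate Cmd+V (hypothetical helper)."""
    copy_cmd, paste_cmd = paste_commands()
    # pbcopy reads the clipboard contents from stdin.
    run(copy_cmd, input=text.encode("utf-8"), check=True)
    # osascript runs the one-line AppleScript that triggers the paste.
    run(paste_cmd, check=True)
```

Passing `run` as a parameter keeps the sketch testable off macOS; calling `paste_at_cursor("hello")` with the default runner only works on a Mac.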
Prerequisites
Skip this if using pixi, which handles ollama automatically.
brew install ollama
ollama pull qwen2.5:3b
Install
uvx (easiest)
uvx --from voice2text v2t
Or from GitHub:
uvx --from git+https://github.com/lucharo/voice2text v2t
pip
pip install voice2text
v2t
Development install
git clone https://github.com/lucharo/voice2text.git
cd voice2text
uv sync
uv run v2t
Pixi
Pixi handles the ollama dependency automatically:
git clone https://github.com/lucharo/voice2text.git
cd voice2text
pixi run ollama pull qwen2.5:3b
pixi run v2t
Note: We don't publish to conda-forge/pixi channels yet, but may in the future.
Usage
v2t # strict mode (restructures sentences)
v2t --casual # light cleanup (punctuation only)
v2t --pause-music # pause media while recording (macOS only, requires nowplaying-cli via brew)
Hold Right Command to record, release to transcribe and paste.
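The hold-to-record, release-to-paste behaviour can be sketched as a small state machine. This is an illustration under assumptions, not the project's implementation: the class name, the callback names, and the `"cmd_r"` key label are all hypothetical, and the source does not say which keyboard library delivers the press and release events.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PushToTalk:
    """Hypothetical push-to-talk controller driven by key events."""

    start_recording: Callable[[], None]
    stop_and_transcribe: Callable[[], str]
    paste: Callable[[str], None]
    hotkey: str = "cmd_r"  # assumed label for the Right Command key
    recording: bool = False

    def on_press(self, key: str) -> None:
        # A held key may fire repeated press events, so start only once.
        if key == self.hotkey and not self.recording:
            self.recording = True
            self.start_recording()

    def on_release(self, key: str) -> None:
        # Releasing the hotkey ends the recording and pastes the result.
        if key == self.hotkey and self.recording:
            self.recording = False
            text = self.stop_and_transcribe()
            if text.strip():  # skip pasting when nothing was transcribed
                self.paste(text)
```

A real keyboard listener would call `on_press` and `on_release` with its own key objects; plain strings are used here to keep the sketch self-contained.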
Strict vs Casual Mode
| Raw transcription | Strict | Casual |
|---|---|---|
| "Hey um I'll see you tomorrow at 9 actually no make it 10" | "Hey, I'll see you tomorrow at 10." | "Hey, I'll see you tomorrow at 9, actually no, make it 10." |
| "So basically I was thinking we could um you know maybe try the other approach" | "I was thinking we could try the other approach." | "So basically, I was thinking we could maybe try the other approach." |
Strict (default): Removes filler words, restructures for clarity, condenses.
Casual: Only adds punctuation and removes "um/uh", keeps your phrasing.
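One way the two modes could map onto the LLM cleanup step is a pair of prompts sent to the local Ollama server. The sketch below is an assumption-laden illustration: the prompt wording and the function names `build_request` and `cleanup` are invented here, and the project's real prompts may differ. The endpoint and the `model`, `prompt`, and `stream` fields follow Ollama's `/api/generate` REST API on its default port.

```python
import json
import urllib.request

# Hypothetical prompts; the project's actual wording is not shown in this README.
PROMPTS = {
    "strict": (
        "Clean up this dictated text: remove filler words, apply the speaker's "
        "self-corrections, restructure for clarity, and condense. "
        "Return only the cleaned text."
    ),
    "casual": (
        "Add punctuation to this dictated text and remove 'um' and 'uh'. "
        "Keep the speaker's phrasing. Return only the cleaned text."
    ),
}


def build_request(raw: str, casual: bool = False, model: str = "qwen2.5:3b") -> dict:
    """Build the JSON payload for a non-streaming Ollama generate call."""
    mode = "casual" if casual else "strict"
    return {
        "model": model,
        "prompt": f"{PROMPTS[mode]}\n\nTranscript: {raw}",
        "stream": False,  # ask for a single JSON object instead of chunks
    }


def cleanup(raw: str, casual: bool = False,
            url: str = "http://localhost:11434/api/generate") -> str:
    """Send the transcript to a locally running Ollama server and return its reply."""
    body = json.dumps(build_request(raw, casual)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()
```

`cleanup(...)` needs `ollama serve` running with the model already pulled; `build_request` can be inspected without a server.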
--pause-music (macOS only)
Pauses any playing media while recording and resumes after. Requires:
brew install nowplaying-cli
Not available via pixi/conda-forge for now, though that may change later.
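The pause-and-resume behaviour can be sketched as a context manager wrapped around the recording step. This is a simplified illustration, not the project's code: `paused_media` and its `run` parameter are hypothetical names, and the sketch always sends `play` afterwards, whereas a fuller version would first check whether anything was playing.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def paused_media(run: Callable[..., object] = subprocess.run) -> Iterator[None]:
    """Pause media on entry and resume on exit (simplified, hypothetical helper)."""
    # check=False: a failing media command should not abort the recording.
    run(["nowplaying-cli", "pause"], check=False)
    try:
        yield
    finally:
        # Runs even if recording raises, so media is not left paused.
        run(["nowplaying-cli", "play"], check=False)
```

Usage would look like `with paused_media(): record()`, where `record` stands in for whatever captures the audio.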
File details
Details for the file voice2text-0.2.0.tar.gz.
File metadata
- Download URL: voice2text-0.2.0.tar.gz
- Upload date:
- Size: 124.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3588311ad4af85f9dd1efe70d293edd6fad5ffc250d2cbdd0f1ea4c42445b306 |
| MD5 | 410576828fe341626ae29cd47f86de24 |
| BLAKE2b-256 | 013536364b3147532e8c960a148d4de0a9ec14500cbfe27af47e9273f09b60b4 |
File details
Details for the file voice2text-0.2.0-py3-none-any.whl.
File metadata
- Download URL: voice2text-0.2.0-py3-none-any.whl
- Upload date:
- Size: 13.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3b2c92e3eeef1c114908b68be1cc67f7fd1074e7aca29612fbb51b036e2c6fb7 |
| MD5 | 6246678c21597fec4627c7d491db8699 |
| BLAKE2b-256 | fa760734b3275d0b03ba8274581c6fee380929b7efab1a31c2926fb8b011a3c4 |