Voice dictation for macOS. Hold Option, speak, release. Text appears at your cursor.
voicekey
Open-source voice dictation for macOS.
Hold Option. Speak. Release. Text appears at your cursor.
No subscription, no third-party servers. Your API key, one hotkey, and you're dictating.
Uses OpenAI's gpt-4o-mini-transcribe by default, with a pluggable provider system so you can swap in other transcription backends. Audio goes straight from your mic to the API over HTTPS; text comes back in under a second and is pasted at your cursor. Costs about $0.003/minute with OpenAI.
Why this exists
Wispr Flow is good, but lots of companies (including OpenAI itself) block third-party services that process audio. If your employer won't let you install Wispr, or you'd rather not pay $10/month for dictation, this does the same thing with your own API key.
How it compares
| | voicekey | Wispr Flow | macOS Dictation | whisper.cpp |
|---|---|---|---|---|
| Cost | ~$0.003/min (API) | $10/month | Free | Free (local) |
| Privacy | Your API key, direct to provider | Audio goes to Wispr servers | Audio goes to Apple servers | Fully local |
| Allowed at work | Yes (your own key) | Often blocked by IT | Usually allowed | Yes |
| Accuracy | GPT-4o-mini-transcribe | Proprietary | Apple ML | Open-source Whisper |
| Latency | Sub-second | Sub-second | ~1-2s | Depends on hardware |
| Paste at cursor | Yes | Yes | Yes | No (manual copy) |
| Works in any app | Yes | Yes | Yes | No (terminal only) |
| Open source | Yes | No | No | Yes |
| Setup time | 2 minutes | Account + install | Built-in | Compile from source |
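The Cost row can be turned into a rough break-even estimate. The sketch below uses only the two prices quoted in the table (about $0.003 per minute and $10 per month); the function names are illustrative, and actual pricing may differ.

```python
# Figures taken from the comparison table; real usage and pricing may differ.
API_COST_PER_MINUTE = 0.003    # USD, approximate OpenAI cost quoted above
SUBSCRIPTION_PER_MONTH = 10.0  # USD, Wispr Flow price quoted above


def monthly_api_cost(minutes_per_day: float, days_per_month: int = 30) -> float:
    """Estimated monthly API spend for a given amount of daily dictation."""
    return minutes_per_day * days_per_month * API_COST_PER_MINUTE


def break_even_minutes() -> float:
    """Minutes of dictation per month at which the API costs as much as the subscription."""
    return SUBSCRIPTION_PER_MONTH / API_COST_PER_MINUTE


if __name__ == "__main__":
    print(f"30 min/day: ${monthly_api_cost(30):.2f} per month")
    print(f"Break-even: {break_even_minutes():.0f} minutes per month")
```

At these prices the API only costs more than the subscription past roughly 3,333 minutes (about 55 hours) of dictation per month.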
How it works
Hold Option → Recording → Release Option → Transcribe → Paste at cursor
- Hold the Option key. A red dot appears on screen, recording starts.
- Talk.
- Release Option. The audio is sent to the transcription API, and the text that comes back is pasted wherever your cursor is.
- Your clipboard is preserved: it's saved before the paste and restored after.
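The save, paste, restore sequence for the clipboard can be sketched as follows. This is a minimal illustration, not voicekey's actual code: the function name and the three callables are hypothetical, and are injected so the logic can be exercised without touching a real clipboard. On macOS they could be backed by `pbpaste`, `pbcopy`, and a simulated Cmd+V.

```python
from typing import Callable


def paste_preserving_clipboard(
    text: str,
    read_clipboard: Callable[[], str],
    write_clipboard: Callable[[str], None],
    send_paste: Callable[[], None],
) -> None:
    """Paste `text` via the clipboard, then put the previous contents back."""
    saved = read_clipboard()
    try:
        write_clipboard(text)
        send_paste()
    finally:
        # Restore even if the paste step raised.
        write_clipboard(saved)


# In-memory stand-ins for the real clipboard and the Cmd+V keystroke.
clipboard = {"value": "previous contents"}
pasted = []

paste_preserving_clipboard(
    "hello world",
    read_clipboard=lambda: clipboard["value"],
    write_clipboard=lambda s: clipboard.update(value=s),
    send_paste=lambda: pasted.append(clipboard["value"]),
)
```

A real implementation may need a short pause before restoring, so the target app has time to read the clipboard; that detail is not covered here.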
Install
Prerequisites
- macOS 13+
- Python 3.11+
- PortAudio:
brew install portaudio
pipx (recommended)
brew install portaudio
pipx install voicekey
voicekey setup
From source with uv
brew install portaudio
git clone https://github.com/adamkhakhar/voicekey.git
cd voicekey
uv sync
uv run voicekey setup
Setup will ask for your API key (stored in the macOS Keychain, never in a plain-text file) and walk you through Accessibility and Microphone permissions.
Run
voicekey # if installed via pipx
uv run voicekey # if installed via uv
A mic icon shows up in your menu bar. Hold Option to dictate.
Permissions
macOS needs two permissions granted:
| Permission | Why | How to grant |
|---|---|---|
| Accessibility | Global hotkey detection + simulating Cmd+V to paste | System Settings → Privacy & Security → Accessibility → add your terminal |
| Microphone | Recording audio | Dialog pops up on first use |
voicekey setup checks both and opens the right Settings pane if either is missing.
Configuration
Config file is at ~/.config/voicekey/config.toml.
voicekey config # view all settings
voicekey config language en # set language (ISO 639-1)
voicekey config model gpt-4o-transcribe # higher-accuracy model (~$0.006/min)
voicekey config hotkey left_option # trigger on left Option only
| Key | Default | Options |
|---|---|---|
| provider | openai | openai (more coming) |
| model | gpt-4o-mini-transcribe | gpt-4o-mini-transcribe, gpt-4o-transcribe |
| hotkey | option (either) | option, left_option, right_option |
| language | "" (auto-detect) | Any ISO 639-1 code |
Architecture
┌─────────────────────────────────────────────────────┐
│ Main thread (NSApp run loop) │
│ ┌──────────┐ ┌─────────┐ ┌────────────────────┐ │
│ │ CGEvent │→ │ rumps │ │ Overlay (NSWindow) │ │
│ │ tap │ │ menubar │ │ red dot indicator │ │
│ └──────────┘ └─────────┘ └────────────────────┘ │
├─────────────────────────────────────────────────────┤
│ Audio thread (sounddevice callback) │
│ 24kHz mono int16 PCM → WAV encoding │
├─────────────────────────────────────────────────────┤
│ Transcription thread (per utterance) │
│ Provider.transcribe() → paste at cursor │
└─────────────────────────────────────────────────────┘
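The audio thread's "24kHz mono int16 PCM → WAV encoding" step can be sketched with the standard-library `wave` module. This shows the container format only; how voicekey itself buffers and encodes audio is not specified above, and `pcm_to_wav` is a hypothetical name.

```python
import io
import wave

SAMPLE_RATE = 24_000  # Hz, from the diagram
CHANNELS = 1          # mono
SAMPLE_WIDTH = 2      # bytes per int16 sample


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap raw little-endian int16 mono PCM in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buffer.getvalue()


# One second of silence: 24,000 samples of 2 bytes each.
one_second_of_silence = bytes(SAMPLE_RATE * SAMPLE_WIDTH)
wav_bytes = pcm_to_wav(one_second_of_silence)
```

Encoding in memory keeps the recording off disk, which fits the privacy goals of the project.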
State machine: IDLE → RECORDING → TRANSCRIBING → INSERTING → IDLE
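A minimal sketch of that cycle, assuming each state has exactly one successor. Error paths (for example, a failed transcription returning to IDLE) are not described above and are left out; the class and method names are illustrative.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()
    TRANSCRIBING = auto()
    INSERTING = auto()


# Each state has exactly one successor in the documented cycle.
NEXT_STATE = {
    State.IDLE: State.RECORDING,
    State.RECORDING: State.TRANSCRIBING,
    State.TRANSCRIBING: State.INSERTING,
    State.INSERTING: State.IDLE,
}


class Dictation:
    def __init__(self) -> None:
        self.state = State.IDLE

    def advance(self, expected: State) -> State:
        """Move to the next state, but only if we are currently in `expected`."""
        if self.state is not expected:
            raise RuntimeError(
                f"expected {expected.name}, but state is {self.state.name}"
            )
        self.state = NEXT_STATE[self.state]
        return self.state
```

Passing the expected state guards against out-of-order events, such as an Option release arriving while a previous utterance is still being transcribed.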
Providers are pluggable. The providers/ package defines a Protocol that any transcription backend can implement. OpenAI is the default. To add a new provider, implement transcribe() in a new module and register it.
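What such a provider could look like, as a sketch only. The source confirms a Protocol with a transcribe() method; the parameter list, the registry functions, and `EchoProvider` are assumptions made for illustration, so check the providers/ package for the real signatures.

```python
from typing import Protocol


class TranscriptionProvider(Protocol):
    # Assumed signature: WAV bytes in, transcribed text out.
    def transcribe(self, wav_bytes: bytes, language: str = "") -> str: ...


# Hypothetical registry mapping provider names to instances.
_REGISTRY: dict[str, TranscriptionProvider] = {}


def register(name: str, provider: TranscriptionProvider) -> None:
    _REGISTRY[name] = provider


def get_provider(name: str) -> TranscriptionProvider:
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown provider: {name!r}") from None


class EchoProvider:
    """Toy backend for testing: reports the audio size instead of calling an API."""

    def transcribe(self, wav_bytes: bytes, language: str = "") -> str:
        return f"<{len(wav_bytes)} bytes of audio>"


register("echo", EchoProvider())
```

Because `Protocol` uses structural typing, `EchoProvider` does not need to inherit from anything; having a matching transcribe() method is enough.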
There's a 200ms debounce on the Option key so it doesn't fire when you're typing special characters (Option+E for accents, etc).
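One way such a debounce could work, sketched under a stated assumption: recording starts only once Option has been held on its own for 200 ms, and any other key pressed in the meantime cancels it. The class below is hypothetical and takes timestamps as arguments (for example from `time.monotonic()`) so it can be tested without real key events.

```python
DEBOUNCE_SECONDS = 0.2  # the 200 ms window mentioned above


class OptionDebouncer:
    def __init__(self, delay: float = DEBOUNCE_SECONDS) -> None:
        self.delay = delay
        self._pressed_at: float | None = None

    def option_down(self, now: float) -> None:
        """Option was pressed at time `now` (in seconds)."""
        self._pressed_at = now

    def other_key_down(self) -> None:
        # Option is being used as a modifier (e.g. Option+E), so cancel.
        self._pressed_at = None

    def option_up(self) -> None:
        self._pressed_at = None

    def should_record(self, now: float) -> bool:
        """True once Option has been held alone for at least `delay` seconds."""
        return self._pressed_at is not None and now - self._pressed_at >= self.delay
```

A quick Option+E for an accent therefore never triggers recording, because the second key cancels the pending press before the window elapses.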
Security and privacy
Your API key is stored in the macOS Keychain through the keyring library, not in a config file. Audio goes directly to your chosen provider's API over HTTPS with no middleman. There's no telemetry, no analytics, nothing phoning home. The clipboard gets saved before pasting transcribed text and restored right after. The whole thing is ~900 lines of Python you can read in 20 minutes.
Troubleshooting
"Failed to create event tap" - Your terminal doesn't have Accessibility permission. Add it in System Settings → Privacy & Security → Accessibility, then restart voicekey.
No audio captured - Check that your terminal has Microphone permission in System Settings → Privacy & Security → Microphone.
Text not appearing at cursor - A few apps with custom input handling don't respond to simulated Cmd+V. Most standard apps work.
PortAudio errors on install - Run brew install portaudio first.
Contributing
Contributions are welcome. Open an issue first if you want to discuss a change.
git clone https://github.com/adamkhakhar/voicekey.git
cd voicekey
uv sync
uv run pytest tests/ -v
License