Push-to-Talk Voice Dictation menu bar app for macOS using Apple Silicon MLX
Project description
Dictate
Push-to-talk voice dictation for macOS. Runs 100% on-device using Apple Silicon MLX models. No cloud, no API keys, no subscriptions.
Hold a key, speak, release — clean text appears wherever your cursor is.
Install
pip install dictate-mlx
dictate
That's it. Dictate launches in the background and appears in your menu bar. Close the terminal — it keeps running. Quit from the menu bar icon.
macOS will prompt for Accessibility and Microphone permissions on first run. Models download automatically in the background (~2-4GB total, cached in ~/.cache/huggingface/).
Install from source
git clone https://github.com/0xbrando/dictate.git
cd dictate
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
dictate
Requirements
- macOS with Apple Silicon (any M-series chip)
- Python 3.11+
- ~4GB RAM minimum, ~6GB recommended
How It Works
Hold PTT → Speak → Release → Clean text pasted into active window
Under the hood:
- Push-to-talk captures audio via the microphone
- VAD segments speech from silence
- STT transcribes locally (Whisper or Parakeet)
- Smart skip detects clean short phrases and skips cleanup entirely
- LLM fixes grammar, punctuation, and formatting
- Auto-paste puts the result wherever your cursor is
Everything runs locally. Nothing leaves your machine.
Controls
| Action | Key |
|---|---|
| Record | Hold Left Control |
| Lock recording (hands-free) | Press Space while holding PTT |
| Stop locked recording | Press PTT again |
The PTT key is configurable from the menu bar: Left Control, Right Control, Right Command, or either Option key.
STT Engines
Both engines are included. Switch anytime from the menu bar.
| Engine | Speed | Languages | Notes |
|---|---|---|---|
| Parakeet TDT 0.6B | ~50ms | English | Default. 4-8x faster than Whisper |
| Whisper Large V3 Turbo | ~300ms | 99+ | Best for multilingual or non-English |
Parakeet is the default for speed. Switch to Whisper from the menu bar if you need non-English STT.
Writing Styles
| Style | What it does |
|---|---|
| Clean Up | Fixes punctuation and capitalization — keeps your words |
| Formal | Rewrites in a professional tone |
| Bullet Points | Distills your dictation into concise key points |
Quality Presets
| Preset | Speed | RAM | Best for |
|---|---|---|---|
| Smart | ~250ms | 0 | Auto-routes: fast local for short, API server for long |
| Speedy (1.5B) | ~120ms | 1GB | Quick fixes, great for any chip |
| Fast (3B) | ~250ms | 2GB | Quick cleanup, everyday use |
| Balanced (7B) | ~350ms | 5GB | Longer dictation, formal rewriting |
| Quality (14B) | ~500ms | 9GB | Best accuracy for bullet points and rewrites |
Times measured on M3 Ultra. The app picks the best default for your chip — Ultra/Max get 3B, everything else gets 1.5B.
The Quality menu only shows models you've downloaded. To add a larger model:
python -c "from mlx_lm import load; load('mlx-community/Qwen2.5-7B-Instruct-4bit')"
Menu Bar
All settings accessible from the waveform icon in your menu bar:
Main menu:
- Writing Style — Clean Up, Formal, or Bullet Points
- Quality — model size (shows only downloaded models)
- Input Device — select microphone
- Recent — last 10 transcriptions, click to re-paste
Advanced settings:
- STT Engine — Whisper or Parakeet
- PTT Key — choose your push-to-talk modifier
- Languages — input and output language (12 languages for translation)
- Sounds — 6 tones or silent
- LLM Endpoint — configure API server
- LLM Cleanup — toggle on/off
- Personal Dictionary — names, brands, technical terms always spelled correctly
- Launch at Login — auto-start on boot
API Server Setup
If you run a local LLM server, Dictate can use it instead of loading its own model — zero additional RAM:
DICTATE_LLM_BACKEND=api DICTATE_LLM_API_URL=http://localhost:8005/v1/chat/completions dictate
Works with any OpenAI-compatible server: vllm-mlx, LM Studio, Ollama.
Smart Routing
The Smart preset auto-routes based on message length:
- Short (15 words or fewer) → fast local model (~120ms)
- Long (16+ words) → your API server for higher quality
Environment Variables
| Variable | Description | Default |
|---|---|---|
DICTATE_AUDIO_DEVICE |
Microphone device index | System default |
DICTATE_OUTPUT_MODE |
type or clipboard |
type |
DICTATE_INPUT_LANGUAGE |
auto, en, ja, ko, etc. |
auto |
DICTATE_OUTPUT_LANGUAGE |
Translation target (auto = same) |
auto |
DICTATE_LLM_CLEANUP |
Enable LLM text cleanup | true |
DICTATE_LLM_MODEL |
qwen-1.5b, qwen, qwen-7b, qwen-14b |
qwen |
DICTATE_LLM_BACKEND |
local or api |
local |
DICTATE_LLM_API_URL |
OpenAI-compatible endpoint | http://localhost:8005/v1/chat/completions |
DICTATE_ALLOW_REMOTE_API |
Allow non-localhost API URLs | unset |
Agent Integration
Dictate works well as a voice input layer for AI assistants and agent frameworks. If you're building with tools like Claude Code, OpenClaw, or similar — Dictate gives your setup a local, private voice interface with zero cloud dependency.
Debugging
Run in the foreground to see logs:
dictate --foreground
Or check the background log:
tail -f ~/Library/Logs/Dictate/dictate.log
Security
- All processing is local. Audio and text never leave your machine.
- LLM endpoints are restricted to localhost by default. Set
DICTATE_ALLOW_REMOTE_API=1to override. - Preferences stored with
0o600permissions (owner-only read/write). - No API keys, tokens, or accounts required.
License
MIT — See LICENSES.md for dependency licenses.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file dictate_mlx-2.2.2.tar.gz.
File metadata
- Download URL: dictate_mlx-2.2.2.tar.gz
- Upload date:
- Size: 63.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
5732cdac316609ab157b63d5ac665fdcdb1b0ae241d996c117300300be361206
|
|
| MD5 |
2ad8110897d25aa296e142121a30994f
|
|
| BLAKE2b-256 |
e959a1fa9ff61dc15123f48eaf330eb7ba5bcf5e5c4e1746dba04d8a7618ad7c
|
File details
Details for the file dictate_mlx-2.2.2-py3-none-any.whl.
File metadata
- Download URL: dictate_mlx-2.2.2-py3-none-any.whl
- Upload date:
- Size: 43.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6599002a33f7fbef00a102f1d4c5935a2437b02624bab94a2e099491612f10ab
|
|
| MD5 |
cdd64a2333004536ca211d785bfc2c09
|
|
| BLAKE2b-256 |
774c884cde3fbea68b53e2cd77b41e77f0153b09ad6085f7e4756db77961b0a5
|