Push-to-Talk Voice Dictation menu bar app for macOS using Apple Silicon MLX

These details have not been verified by PyPI

Project links

Project description

Dictate

Push-to-talk voice dictation for macOS. Runs 100% on-device using Apple Silicon MLX models. No cloud, no API keys, no subscriptions.

Hold a key, speak, release — clean text appears wherever your cursor is.

Install

pip install dictate-mlx
dictate

That's it. Dictate launches in the background and appears in your menu bar. Close the terminal — it keeps running. Quit from the menu bar icon.

macOS will prompt for Accessibility and Microphone permissions on first run. Models download automatically in the background (~2-4GB total, cached in ~/.cache/huggingface/).

Install from source

git clone https://github.com/0xbrando/dictate.git
cd dictate
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
dictate

Requirements

macOS with Apple Silicon (any M-series chip)
Python 3.11+
~4GB RAM minimum, ~6GB recommended

How It Works

Hold PTT → Speak → Release → Clean text pasted into active window

Under the hood:

Push-to-talk captures audio via the microphone
VAD segments speech from silence
STT transcribes locally (Whisper or Parakeet)
Smart skip detects clean short phrases and skips cleanup entirely
LLM fixes grammar, punctuation, and formatting
Auto-paste puts the result wherever your cursor is

Everything runs locally. Nothing leaves your machine.

Controls

Action	Key
Record	Hold Left Control
Lock recording (hands-free)	Press Space while holding PTT
Stop locked recording	Press PTT again

The PTT key is configurable from the menu bar: Left Control, Right Control, Right Command, or either Option key.

STT Engines

Both engines are included. Switch anytime from the menu bar.

Engine	Speed	Languages	Notes
Parakeet TDT 0.6B	~50ms	English	Default. 4-8x faster than Whisper
Whisper Large V3 Turbo	~300ms	99+	Best for multilingual or non-English

Parakeet is the default for speed. Switch to Whisper from the menu bar if you need non-English STT.

Writing Styles

Style	What it does
Clean Up	Fixes punctuation and capitalization — keeps your words
Formal	Rewrites in a professional tone
Bullet Points	Distills your dictation into concise key points

Quality Presets

Preset	Speed	RAM	Best for
Smart	~250ms	0	Auto-routes: fast local for short, API server for long
Speedy (1.5B)	~120ms	1GB	Quick fixes, great for any chip
Fast (3B)	~250ms	2GB	Quick cleanup, everyday use
Balanced (7B)	~350ms	5GB	Longer dictation, formal rewriting
Quality (14B)	~500ms	9GB	Best accuracy for bullet points and rewrites

Times measured on M3 Ultra. The app picks the best default for your chip — Ultra/Max get 3B, everything else gets 1.5B.

The Quality menu only shows models you've downloaded. To add a larger model:

python -c "from mlx_lm import load; load('mlx-community/Qwen2.5-7B-Instruct-4bit')"

Menu Bar

All settings accessible from the waveform icon in your menu bar:

Main menu:

Writing Style — Clean Up, Formal, or Bullet Points
Quality — model size (shows only downloaded models)
Input Device — select microphone
Recent — last 10 transcriptions, click to re-paste

Advanced settings:

STT Engine — Whisper or Parakeet
PTT Key — choose your push-to-talk modifier
Languages — input and output language (12 languages for translation)
Sounds — 6 tones or silent
LLM Endpoint — configure API server
LLM Cleanup — toggle on/off
Personal Dictionary — names, brands, technical terms always spelled correctly
Launch at Login — auto-start on boot

API Server Setup

If you run a local LLM server, Dictate can use it instead of loading its own model — zero additional RAM:

DICTATE_LLM_BACKEND=api DICTATE_LLM_API_URL=http://localhost:8005/v1/chat/completions dictate

Works with any OpenAI-compatible server: vllm-mlx, LM Studio, Ollama.

Smart Routing

The Smart preset auto-routes based on message length:

Short (15 words or fewer) → fast local model (~120ms)
Long (16+ words) → your API server for higher quality

Environment Variables

Variable	Description	Default
`DICTATE_AUDIO_DEVICE`	Microphone device index	System default
`DICTATE_OUTPUT_MODE`	`type` or `clipboard`	`type`
`DICTATE_INPUT_LANGUAGE`	`auto`, `en`, `ja`, `ko`, etc.	`auto`
`DICTATE_OUTPUT_LANGUAGE`	Translation target (`auto` = same)	`auto`
`DICTATE_LLM_CLEANUP`	Enable LLM text cleanup	`true`
`DICTATE_LLM_MODEL`	`qwen-1.5b`, `qwen`, `qwen-7b`, `qwen-14b`	`qwen`
`DICTATE_LLM_BACKEND`	`local` or `api`	`local`
`DICTATE_LLM_API_URL`	OpenAI-compatible endpoint	`http://localhost:8005/v1/chat/completions`
`DICTATE_ALLOW_REMOTE_API`	Allow non-localhost API URLs	unset

Agent Integration

Dictate works well as a voice input layer for AI assistants and agent frameworks. If you're building with tools like Claude Code, OpenClaw, or similar — Dictate gives your setup a local, private voice interface with zero cloud dependency.

Debugging

Run in the foreground to see logs:

dictate --foreground

Or check the background log:

tail -f ~/Library/Logs/Dictate/dictate.log

Security

All processing is local. Audio and text never leave your machine.
LLM endpoints are restricted to localhost by default. Set DICTATE_ALLOW_REMOTE_API=1 to override.
Preferences stored with 0o600 permissions (owner-only read/write).
No API keys, tokens, or accounts required.

License

MIT — See LICENSES.md for dependency licenses.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

2.5.0

Mar 3, 2026

2.4.1

Feb 11, 2026

2.4.0

Feb 11, 2026

2.3.1

Feb 11, 2026

2.3.0

Feb 11, 2026

2.2.3

Feb 11, 2026

This version

2.2.2

Feb 11, 2026

2.2.1

Feb 11, 2026

2.2.0

Feb 11, 2026

2.1.0

Feb 11, 2026

2.0.0

Feb 11, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dictate_mlx-2.2.2.tar.gz (63.6 kB view details)

Uploaded Feb 11, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

dictate_mlx-2.2.2-py3-none-any.whl (43.3 kB view details)

Uploaded Feb 11, 2026 Python 3

File details

Details for the file dictate_mlx-2.2.2.tar.gz.

File metadata

Download URL: dictate_mlx-2.2.2.tar.gz
Upload date: Feb 11, 2026
Size: 63.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for dictate_mlx-2.2.2.tar.gz
Algorithm	Hash digest
SHA256	`5732cdac316609ab157b63d5ac665fdcdb1b0ae241d996c117300300be361206`
MD5	`2ad8110897d25aa296e142121a30994f`
BLAKE2b-256	`e959a1fa9ff61dc15123f48eaf330eb7ba5bcf5e5c4e1746dba04d8a7618ad7c`

See more details on using hashes here.

File details

Details for the file dictate_mlx-2.2.2-py3-none-any.whl.

File metadata

Download URL: dictate_mlx-2.2.2-py3-none-any.whl
Upload date: Feb 11, 2026
Size: 43.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for dictate_mlx-2.2.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6599002a33f7fbef00a102f1d4c5935a2437b02624bab94a2e099491612f10ab`
MD5	`cdd64a2333004536ca211d785bfc2c09`
BLAKE2b-256	`774c884cde3fbea68b53e2cd77b41e77f0153b09ad6085f7e4756db77961b0a5`

See more details on using hashes here.

dictate-mlx 2.2.2

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Dictate

Install

Install from source

Requirements

How It Works

Controls

STT Engines

Writing Styles

Quality Presets

Menu Bar

API Server Setup

Smart Routing

Environment Variables

Agent Integration

Debugging

Security

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes