Voice-first control for AI coding agents
DICTAre™
Voice layer for AI coding agents.
Speak to your agent. No window focus required. 100% local.
If you want to know how a poker game turned into a voice interaction system for coding agents... watch this →
Why dictare
Most voice tools (Wispr Flow, Superwhisper, etc.) simulate keystrokes: they type into whatever window has focus. Switch to your browser and your dictation lands there instead of reaching your agent.
Dictare uses a protocol. Your agent listens via SSE and receives transcriptions regardless of window focus. Your coding agent can be behind 3 other windows — it still gets your words.
Features
- No focus required — agent receives voice even when its window is in the background
- Agent-native — transcriptions go to the agent protocol, not a text field
- 100% local — STT runs on-device, zero data leaves your machine
- Multi-agent — switch agents by voice: "agent coding", "agent review"
- Open protocol — OpenVIP — any tool can implement the SSE endpoint
- Bidirectional — STT (voice in) + TTS (voice out)
Install
macOS — full guide
brew install dragfly/tap/dictare
Linux — full guide
curl -fsSL https://raw.githubusercontent.com/dragfly/dictare/main/install.sh | bash
sudo usermod -aG input $USER # required for hotkey (log out/in after)
Permissions
macOS — grant when prompted:
- Microphone — prompted on first launch
- Input Monitoring — System Settings → Privacy & Security → enable Dictare
- Accessibility — needed for keyboard mode (typing into other apps)
After granting all three:
dictare service restart
Linux — two steps:
- Input group (hotkey, X11 + Wayland):
sudo usermod -aG input $USER # log out and back in afterwards
- ydotool (keyboard mode on Wayland):
sudo apt install ydotool
Quick Start
dictare agent freddie # starts the default profile (Claude Code)
That's it. The service starts automatically. Speak — your agent receives the transcription.
If you prefer a different coding agent:
dictare agent ozzy --profile codex # OpenAI Codex
dictare agent gilmour --profile gemini # Google Gemini CLI
dictare agent bowie --profile aider # Aider
How It Works
Microphone
│
▼
STT Module Whisper (MLX / CTranslate2) or Parakeet (ONNX)
│ all local, zero cold-start
▼
Pipeline submit detection, mute control, agent switching
│
▼
OpenVIP HTTP / SSE — open protocol
│
▼
Agent receives transcription, no window focus needed
The engine runs as a background service (launchd on macOS, systemd on Linux). STT models are preloaded at startup. Each agent connects in its own terminal.
Agent Profiles
Profiles are predefined in ~/.config/dictare/config.toml:
[agent_profiles]
default = "claude"
[agent_profiles.claude]
command = ["claude"]
description = "Claude Code"
[agent_profiles.codex]
command = ["codex"]
description = "OpenAI Codex"
[agent_profiles.pi]
command = ["pi", "--provider", "ollama", "--model", "qwen3:8b"]
continue_args = ["-c"]
description = "Pi + Ollama local, agentic with tools"
Then connect:
dictare agent freddie # default profile (claude)
dictare agent ozzy --profile codex # use codex profile
dictare agent -- claude --model opus # explicit command override
Voice Commands
| Say | Action |
|---|---|
| "ok, submit" / "ok, send" / "ok, invia" / "ja, senden" | Submit to agent (Enter) |
| "ok, mute" / "ok, hold on" | Mute (stop listening) |
| "ok, listen" / "ok, listen up" | Unmute (resume listening) |
| "agent coding" / "agent review" | Switch active agent |
Submit triggers are multilingual (en, de, es, it, fr) and fully configurable.
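One way to picture the matching step is sketched below: normalize the transcription, then compare it against trigger phrases. The trigger tuples are copied from the table above. The function names, the normalization rule, and the returned action labels are assumptions for illustration and may differ from what dictare actually does.

```python
import re

# Phrases from the table above, stored without punctuation.
SUBMIT_TRIGGERS = ("ok submit", "ok send", "ok invia", "ja senden")
MUTE_TRIGGERS = ("ok mute", "ok hold on")
UNMUTE_TRIGGERS = ("ok listen", "ok listen up")


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.findall(r"\w+", text.lower()))


def classify(transcription):
    """Hypothetical classifier: return (action, argument) for one utterance."""
    spoken = normalize(transcription)
    if spoken in MUTE_TRIGGERS:
        return ("mute", None)
    if spoken in UNMUTE_TRIGGERS:
        return ("unmute", None)
    words = spoken.split()
    if len(words) == 2 and words[0] == "agent":
        return ("switch_agent", words[1])
    for trigger in SUBMIT_TRIGGERS:
        # A trigger counts when it is the whole utterance or its final words.
        if spoken == trigger or spoken.endswith(" " + trigger):
            return ("submit", None)
    # Anything else is ordinary dictated text, passed through unchanged.
    return ("text", transcription)
```

Stripping punctuation before matching means that "OK, submit." and "ok submit" are treated the same, which matters because STT engines often add their own punctuation.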
Hotkey Cheat Sheet
Default hotkey: Right ⌘ (macOS) / Scroll Lock (Linux).
| Gesture | Action |
|---|---|
| Single tap | Toggle listening on/off |
| Double tap | Submit (send Enter to agent) |
| Right Alt + hotkey | Switch mode: agents ↔ keyboard |
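Telling a single tap from a double tap comes down to timing. The sketch below groups hotkey press timestamps into gestures. The 0.35 second window and the function name are assumptions chosen for illustration, not dictare's actual values.

```python
DOUBLE_TAP_WINDOW = 0.35  # seconds; assumed threshold, not dictare's real setting


def classify_taps(timestamps, window=DOUBLE_TAP_WINDOW):
    """Group ascending hotkey press times (in seconds) into gestures."""
    gestures = []
    i = 0
    while i < len(timestamps):
        is_double = (
            i + 1 < len(timestamps)
            and timestamps[i + 1] - timestamps[i] <= window
        )
        if is_double:
            gestures.append("submit")  # double tap
            i += 2
        else:
            gestures.append("toggle_listening")  # single tap
            i += 1
    return gestures
```

A live implementation would also have to wait out the window after the first press before acting on a single tap, which adds a small delay to toggling.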
Service Management
dictare service install # Install + enable (auto-starts at login)
dictare service start # Start the service
dictare service stop # Stop the service
dictare service restart # Restart the service
dictare service status # Show service and engine status
dictare service logs # View recent logs
dictare service uninstall # Remove the service
Keyboard Mode
No agent? Use dictare as a dictation tool — voice to keystrokes in any app.
dictare config set output.mode keyboard
Hotkey to toggle listening (configurable):
- macOS: Right ⌘ by default
- Linux: Scroll Lock by default
dictare config set hotkey.key KEY_RIGHTALT # change hotkey
Text-to-Speech
dictare speak "Hello world"
dictare speak --engine piper "Hello"
echo "Hello" | dictare speak
Engines: espeak, say (macOS), piper, kokoro
Configuration
dictare config edit # Open config in editor
dictare config list # Show all settings
dictare config get stt.model
dictare config set stt.language it
Full configuration reference at dictare.io/docs/configuration.
Development
git clone https://github.com/dragfly/dictare && cd dictare
# macOS Apple Silicon (MLX GPU acceleration)
uv sync --python 3.11 --extra mlx
# macOS Intel / Linux
uv sync --python 3.11
# Run engine in foreground
uv run --python 3.11 dictare serve
# Tests
uv run --python 3.11 pytest tests/ -x
# Tests (parallel)
uv run --python 3.11 pytest tests/ -x -n auto
Ghostty users: add the following line to your Ghostty config. See TERMINAL_COMPATIBILITY.md.
keybind = shift+enter=text:\n
Protocol
dictare is the reference implementation of OpenVIP — an open protocol for voice input to AI agents. Any tool can implement the SSE endpoint and receive voice transcriptions from dictare.
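For readers unfamiliar with SSE, the sketch below parses the standard `text/event-stream` wire format into events. The parsing rules follow the general SSE format in simplified form. The `transcription` event name and the JSON keys in the sample stream are invented for illustration; the real message schema is defined by the OpenVIP specification.

```python
import json


def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space after the colon is dropped
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
            # Other fields (id, retry) are ignored in this sketch.


# Hypothetical stream: event name and payload keys are assumptions.
SAMPLE_STREAM = [
    ": keep-alive\n",
    "event: transcription\n",
    'data: {"text": "rename the function", "submit": true}\n',
    "\n",
]

events = [(name, json.loads(data)) for name, data in parse_sse(SAMPLE_STREAM)]
```

In practice the lines would come from a long-lived HTTP response rather than a list, but the parsing step stays the same.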
License
MIT