Voice-first control for AI coding agents
DICTAre
Voice layer for AI coding agents.
Speak to your agent. No window focus required. 100% local.
Why dictare
Most voice tools (Wispr Flow, Superwhisper, etc.) simulate keystrokes: they type into whatever window has focus. Switch to your browser and your dictation lands there instead of in your coding agent.
Dictare uses a protocol. Your agent listens via SSE and receives transcriptions regardless of window focus. Your coding agent can be behind 3 other windows — it still gets your words.
Features
- No focus required — agent receives voice even when its window is in the background
- Agent-native — transcriptions go to the agent protocol, not a text field
- 100% local — STT runs on-device, zero data leaves your machine
- Multi-agent — switch agents by voice: "agent coding", "agent review"
- Open protocol — OpenVIP — any tool can implement the SSE endpoint
- Bidirectional — STT (voice in) + TTS (voice out)
Install
macOS:
git clone https://github.com/dragfly/dictare && cd dictare
./scripts/install.sh
Linux:
pip install dictare
Quick Start
# 1. Install as system service (auto-starts at login)
dictare service install
# 2. Connect your agent
dictare agent myproject --type coding
The service starts automatically. Speak — your agent receives the transcription.
How It Works
Microphone
    │
    ▼
STT Module    Whisper (MLX / CTranslate2) or Parakeet (ONNX)
    │         all local, zero cold-start
    ▼
Pipeline      submit detection, agent switching, language filter
    │
    ▼
OpenVIP       HTTP / SSE, open protocol
    │
    ▼
Agent         receives transcription, no window focus needed
The engine runs as a background service (launchd on macOS, systemd on Linux). STT models are preloaded at startup. Each agent connects in its own terminal.
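
The routing step in the diagram above can be pictured with a small sketch. This is an illustration only, not dictare's internals: the `Router` class, its `handle` method, and the in-memory inboxes are hypothetical names, and a real engine would deliver over SSE rather than append to a list.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy stand-in for the pipeline stage: routes text to the active agent."""

    active: str
    # One inbox per agent type (hypothetical; stands in for an SSE stream).
    inboxes: dict[str, list[str]] = field(default_factory=lambda: defaultdict(list))

    def handle(self, transcription: str) -> None:
        words = transcription.strip().lower().split()
        # "agent <name>" switches the target instead of being delivered.
        if len(words) == 2 and words[0] == "agent":
            self.active = words[1]
            return
        self.inboxes[self.active].append(transcription.strip())


router = Router(active="coding")
for utterance in [
    "add a retry to the fetch call",
    "agent review",
    "check the diff for typos",
]:
    router.handle(utterance)

print(dict(router.inboxes))
```

Delivery here depends only on which agent is active, never on which window has focus, which is the property the design is built around.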
Agent Templates
Define agent types in ~/.config/dictare/config.toml:
[agent_types.coding]
command = ["claude"]
description = "AI coding assistant"
[agent_types.review]
command = ["aider", "--model", "claude-sonnet-4-6"]
description = "Code review"
[agent_types.writing]
command = ["claude", "--model", "claude-opus-4-6"]
description = "Writing and documentation"
Then connect using --type:
dictare agent myproject --type coding # session "myproject", type "coding"
dictare agent frontend --type review # session "frontend", type "review"
dictare agent -- claude --model opus # explicit command override
Voice Commands
| Say | Action |
|---|---|
| "ok, submit" / "ok, send" / "ok, invia" / "ja, senden" | Submit to agent (Enter) |
| "agent coding" / "agent review" | Switch active agent type |
Submit triggers are multilingual (en, it, es, de, fr) and fully configurable.
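
As a rough illustration of submit detection, the sketch below strips a trailing trigger phrase from a transcription and reports whether to submit. The trigger phrases come from the table above; the matching rules (case-insensitive, punctuation ignored, trigger must end the utterance) and the `split_submit` helper are assumptions, not dictare's documented behavior.

```python
import re

# Trigger phrases taken from the table above, stored without punctuation.
TRIGGERS = ("ok submit", "ok send", "ok invia", "ja senden")


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so "OK, submit." matches "ok submit"."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def split_submit(transcription: str) -> tuple[str, bool]:
    """Return (text_without_trigger, should_submit)."""
    words = transcription.split()
    for trigger in TRIGGERS:
        n = len(trigger.split())
        # Only a trigger at the very end of the utterance counts (assumed rule).
        if len(words) >= n and normalize(" ".join(words[-n:])) == trigger:
            return " ".join(words[:-n]).rstrip(" ,"), True
    return transcription.strip(), False


print(split_submit("Rename the helper, ok, submit."))
print(split_submit("please submit the form"))
```

Requiring the trigger at the end of the utterance keeps ordinary sentences that merely contain "submit" or "send" from firing by accident.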
Hotkey Cheat Sheet
Default hotkey: Right ⌘ (macOS) / Scroll Lock (Linux).
| Gesture | Action |
|---|---|
| Single tap | Toggle listening on/off |
| Double tap | Submit (send Enter to agent) |
| Long press (≥0.8s) | Switch mode: agents ↔ keyboard |
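
The gesture table can be read as a small classifier over key-down and key-up timestamps. The sketch below is one way to express it, under stated assumptions: the 0.8 s long-press threshold is from the table, while the 0.3 s double-tap gap and the `classify` function are hypothetical.

```python
LONG_PRESS_S = 0.8      # threshold from the table above
DOUBLE_TAP_GAP_S = 0.3  # assumed; the source does not state a value


def classify(presses: list[tuple[float, float]]) -> list[str]:
    """Map (down, up) timestamp pairs, in seconds, to gesture names."""
    gestures: list[str] = []
    i = 0
    while i < len(presses):
        down, up = presses[i]
        if up - down >= LONG_PRESS_S:
            gestures.append("long_press")
            i += 1
            continue
        nxt = presses[i + 1] if i + 1 < len(presses) else None
        # A second short press soon after the first release makes a double tap.
        is_short_follow_up = (
            nxt is not None
            and nxt[0] - up <= DOUBLE_TAP_GAP_S
            and nxt[1] - nxt[0] < LONG_PRESS_S
        )
        if is_short_follow_up:
            gestures.append("double_tap")
            i += 2
        else:
            gestures.append("single_tap")
            i += 1
    return gestures


events = [(0.0, 0.1), (2.0, 2.1), (2.25, 2.35), (5.0, 6.0)]
print(classify(events))
```

This version works on a recorded batch of presses. A live implementation would have to wait out the double-tap gap before committing to a single tap.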
Service Management
dictare service install # Install + enable (auto-starts at login)
dictare service start # Start the service
dictare service stop # Stop the service
dictare service restart # Restart the service
dictare service status # Show service and engine status
dictare service logs # View recent logs
dictare service uninstall # Remove the service
Keyboard Mode
No agent? Use dictare as a dictation tool — voice to keystrokes in any app.
dictare config set output.mode keyboard
Hotkey to toggle listening (configurable):
- macOS: Right ⌘ by default
- Linux: Scroll Lock by default
dictare config set hotkey.key KEY_RIGHTALT # change hotkey
Text-to-Speech
dictare speak "Hello world"
dictare speak --engine piper "Hello"
echo "Hello" | dictare speak
Engines: espeak, say (macOS), piper, kokoro
Configuration
dictare config edit # Open config in editor
dictare config list # Show all settings
dictare config get stt.model
dictare config set stt.language it
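
The dotted keys used by `config get` and `config set` presumably map onto nested tables in config.toml. A minimal sketch of that mapping follows, assuming plain nested dictionaries; `get_setting` and `set_setting` are hypothetical helpers, not part of dictare.

```python
from typing import Any


def get_setting(config: dict[str, Any], dotted_key: str) -> Any:
    """Walk nested dicts following a key such as "stt.language"."""
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(dotted_key)
        node = node[part]
    return node


def set_setting(config: dict[str, Any], dotted_key: str, value: Any) -> None:
    """Set a nested value, creating intermediate tables as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


config: dict[str, Any] = {}
set_setting(config, "stt.language", "it")
set_setting(config, "output.mode", "keyboard")
print(get_setting(config, "stt.language"), config)
```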
Requirements
- Python 3.11
- macOS or Linux
macOS: Grant Input Monitoring permission when prompted during dictare service install.
System Settings → Privacy & Security → Input Monitoring → enable Dictare.
Linux: Add your user to the input group with sudo usermod -aG input $USER, then log out and back in.
Development
git clone https://github.com/dragfly/dictare && cd dictare
# macOS Apple Silicon (MLX GPU acceleration)
uv sync --python 3.11 --extra mlx
# macOS Intel / Linux
uv sync --python 3.11
# Run engine in foreground
uv run --python 3.11 dictare serve
# Tests
uv run --python 3.11 pytest tests/ -x
# Tests (parallel)
uv run --python 3.11 pytest tests/ -x -n auto
Ghostty users: add the following line to your Ghostty config (see TERMINAL_COMPATIBILITY.md):
keybind = shift+enter=text:\n
Roadmap
- Plugin architecture: pipeline filters loadable as plugins, each declaring its model dependencies (STT, TTS, LLM, Vision).
- Realtime partial transcription: stream partial results while speaking using a fast small model.
- Cloud relay (Phase 2): E2E encrypted relay connecting web clients to local engines.
Protocol
dictare is the reference implementation of OpenVIP — an open protocol for voice input to AI agents. Any tool can implement the SSE endpoint and receive voice transcriptions from dictare.
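
For readers who have not worked with SSE, the sketch below parses the standard `event:` / `data:` line format into events. The framing rules follow the server-sent events format in general. The `transcription` event name and the `{"text": ...}` payload are placeholders for illustration, since the OpenVIP message schema is not described here.

```python
import json
from collections.abc import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict[str, str]]:
    """Yield one {"event", "data"} dict per SSE event (blank line = dispatch)."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        else:
            name, _, value = line.partition(":")
            value = value.removeprefix(" ")  # a single leading space is dropped
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)


# Hypothetical stream; the event name and JSON fields are assumptions.
stream = [
    ": keep-alive\n",
    "event: transcription\n",
    'data: {"text": "run the tests"}\n',
    "\n",
]
for evt in parse_sse(stream):
    print(evt["event"], json.loads(evt["data"])["text"])
```

In practice the lines would come from a streaming HTTP response instead of a list, but the parsing loop stays the same.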
License
MIT