Voice-first control for AI coding agents

DICTAre

Voice layer for AI coding agents.

Speak to your agent. No window focus required. 100% local.

dictare.io · OpenVIP Protocol


Why dictare

Most voice tools (Wispr Flow, Superwhisper, etc.) simulate keystrokes: they type into whatever window has focus. Switch to your browser, and your dictation lands there instead of in your coding agent.

Dictare uses a protocol instead. Your agent subscribes over SSE and receives transcriptions regardless of window focus. Your coding agent can be behind three other windows; it still gets your words.

Features

  • No focus required — agent receives voice even when its window is in the background
  • Agent-native — transcriptions go to the agent protocol, not a text field
  • 100% local — STT runs on-device, zero data leaves your machine
  • Multi-agent — switch agents by voice: "agent coding", "agent review"
  • Open protocol (OpenVIP) — any tool can implement the SSE endpoint
  • Bidirectional — STT (voice in) + TTS (voice out)

Install

macOS:

git clone https://github.com/dragfly/dictare && cd dictare
./scripts/install.sh

Linux:

pip install dictare

Quick Start

# 1. Install as system service (auto-starts at login)
dictare service install

# 2. Connect your agent
dictare agent myproject --type coding

The service starts automatically. Speak — your agent receives the transcription.

How It Works

  Microphone
      │
      ▼
  STT Module       Whisper (MLX / CTranslate2) or Parakeet (ONNX)
      │             all local, zero cold-start
      ▼
  Pipeline         submit detection, agent switching, language filter
      │
      ▼
  OpenVIP          HTTP / SSE — open protocol
      │
      ▼
  Agent            receives transcription, no window focus needed

The engine runs as a background service (launchd on macOS, systemd on Linux). STT models are preloaded at startup. Each agent connects in its own terminal.

Agent Templates

Define agent types in ~/.config/dictare/config.toml:

[agent_types.coding]
command = ["claude"]
description = "AI coding assistant"

[agent_types.review]
command = ["aider", "--model", "claude-sonnet-4-6"]
description = "Code review"

[agent_types.writing]
command = ["claude", "--model", "claude-opus-4-6"]
description = "Writing and documentation"

Then connect using --type:

dictare agent myproject --type coding     # session "myproject", type "coding"
dictare agent frontend --type review      # session "frontend", type "review"
dictare agent -- claude --model opus      # explicit command override

Voice Commands

Say                                                       Action
"ok, submit" / "ok, send" / "ok, invia" / "ja, senden"    Submit to agent (Enter)
"agent coding" / "agent review"                           Switch active agent type

Submit triggers are multilingual (en, it, es, de, fr) and fully configurable.
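
The voice commands above can be pictured as a small classifier that runs on each transcription. The following Python sketch is an illustration only, not dictare's pipeline: the `classify` function, the trigger list (copied from the examples above), and the rule that a trigger must end the utterance are all assumptions.

```python
import re

# Assumed trigger phrases, copied from the examples above.
SUBMIT_TRIGGERS = ("ok, submit", "ok, send", "ok, invia", "ja, senden")


def _trigger_pattern(trigger: str) -> re.Pattern[str]:
    """Match the trigger words at the end of an utterance, ignoring case,
    commas between the words, and trailing punctuation."""
    words = re.findall(r"\w+", trigger)
    body = r"[\s,]+".join(re.escape(word) for word in words)
    return re.compile(rf"(?:^|(?<=\s)){body}[\s.!?]*$", re.IGNORECASE)


SUBMIT_PATTERNS = [_trigger_pattern(trigger) for trigger in SUBMIT_TRIGGERS]
# Assumption: a switch command is the whole utterance, "agent <name>".
SWITCH_PATTERN = re.compile(r"^\s*agent\s+(\w+)[\s.!?]*$", re.IGNORECASE)


def classify(utterance: str) -> tuple[str, str]:
    """Return (kind, payload), where kind is 'switch', 'submit', or 'text'."""
    switch = SWITCH_PATTERN.match(utterance)
    if switch:
        return ("switch", switch.group(1).lower())
    for pattern in SUBMIT_PATTERNS:
        match = pattern.search(utterance)
        if match:
            # Payload is whatever was dictated before the trigger phrase.
            return ("submit", utterance[: match.start()].rstrip())
    return ("text", utterance.strip())


if __name__ == "__main__":
    for sample in ("fix the failing test ok, submit", "agent review", "hello"):
        print(sample, "->", classify(sample))
```

Anchoring triggers to the end of the utterance keeps a phrase such as "the agent review is done" from being mistaken for a command.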

Hotkey Cheat Sheet

Default hotkey: Right ⌘ (macOS) / Scroll Lock (Linux).

Gesture               Action
Single tap            Toggle listening on/off
Double tap            Submit (send Enter to agent)
Long press (≥0.8s)    Switch mode: agents ↔ keyboard
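
As a rough illustration of how the three gestures above can be told apart from raw key events, the sketch below classifies a list of (press, release) timestamps. Only the 0.8 s long-press threshold comes from this document; the 0.35 s double-tap window and every name in the code are assumptions, not dictare's implementation.

```python
LONG_PRESS_SECONDS = 0.8   # threshold stated in the cheat sheet above
DOUBLE_TAP_WINDOW = 0.35   # assumption: max gap between two taps, in seconds

# Actions as described in the cheat sheet above.
ACTIONS = {
    "single_tap": "toggle listening",
    "double_tap": "submit",
    "long_press": "switch mode",
}


def classify_gestures(taps: list[tuple[float, float]]) -> list[str]:
    """Classify (press_time, release_time) pairs, given in chronological order."""
    gestures: list[str] = []
    i = 0
    while i < len(taps):
        press, release = taps[i]
        if release - press >= LONG_PRESS_SECONDS:
            gestures.append("long_press")
            i += 1
            continue
        following = taps[i + 1] if i + 1 < len(taps) else None
        is_double = (
            following is not None
            and following[0] - release <= DOUBLE_TAP_WINDOW
            and following[1] - following[0] < LONG_PRESS_SECONDS
        )
        if is_double:
            gestures.append("double_tap")
            i += 2  # both taps are consumed by the double tap
        else:
            gestures.append("single_tap")
            i += 1
    return gestures


if __name__ == "__main__":
    for gesture in classify_gestures([(0.0, 0.1), (0.3, 0.4), (3.0, 4.0)]):
        print(gesture, "->", ACTIONS[gesture])
```

A live implementation would also have to wait out the double-tap window before reporting a single tap; this offline version sidesteps that by seeing all events at once.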

Service Management

dictare service install     # Install + enable (auto-starts at login)
dictare service start       # Start the service
dictare service stop        # Stop the service
dictare service restart     # Restart the service
dictare service status      # Show service and engine status
dictare service logs        # View recent logs
dictare service uninstall   # Remove the service

Keyboard Mode

No agent? Use dictare as a dictation tool — voice to keystrokes in any app.

dictare config set output.mode keyboard

Hotkey to toggle listening (configurable):

  • macOS: Right ⌘ by default
  • Linux: Scroll Lock by default

dictare config set hotkey.key KEY_RIGHTALT   # change hotkey

Text-to-Speech

dictare speak "Hello world"
dictare speak --engine piper "Hello"
echo "Hello" | dictare speak

Engines: espeak, say (macOS), piper, kokoro

Configuration

dictare config edit           # Open config in editor
dictare config list           # Show all settings
dictare config get stt.model
dictare config set stt.language it

Requirements

  • Python 3.11
  • macOS or Linux

macOS: Grant Input Monitoring permission when prompted during dictare service install. System Settings → Privacy & Security → Input Monitoring → enable Dictare.

Linux: Add your user to the input group with sudo usermod -aG input $USER, then log out and back in for the change to take effect.

Development

git clone https://github.com/dragfly/dictare && cd dictare

# macOS Apple Silicon (MLX GPU acceleration)
uv sync --python 3.11 --extra mlx

# macOS Intel / Linux
uv sync --python 3.11

# Run engine in foreground
uv run --python 3.11 dictare serve

# Tests
uv run --python 3.11 pytest tests/ -x

# Tests (parallel)
uv run --python 3.11 pytest tests/ -x -n auto

Ghostty users: add keybind = shift+enter=text:\n to config. See TERMINAL_COMPATIBILITY.md.

Roadmap

  • Plugin architecture: pipeline filters loadable as plugins, each declaring its model dependencies (STT, TTS, LLM, Vision).
  • Realtime partial transcription: stream partial results while speaking using a fast small model.
  • Cloud relay (Phase 2): E2E encrypted relay connecting web clients to local engines.

Protocol

dictare is the reference implementation of OpenVIP — an open protocol for voice input to AI agents. Any tool can implement the SSE endpoint and receive voice transcriptions from dictare.
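
For a tool that wants to consume such a stream, the client side mostly comes down to parsing Server-Sent Events. The sketch below implements the standard SSE framing (`event:` and `data:` fields, comment lines starting with `:`, a blank line ending each event) over any iterable of lines. The endpoint URL and the payload schema are not given in this document, so the `transcription` event name and the JSON body in the sample are purely hypothetical.

```python
from collections.abc import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict[str, str]]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event_type = "message"  # default event type in the SSE format
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event, if any data was collected.
            if data_lines:
                yield {"event": event_type, "data": "\n".join(data_lines)}
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data_lines.append(value)
        elif field == "event":
            event_type = value
    # An event not terminated by a blank line is discarded.


# Hypothetical stream: event name and JSON fields are illustrative only.
SAMPLE_STREAM = [
    ": keep-alive\n",
    "event: transcription\n",
    'data: {"text": "run the tests"}\n',
    "\n",
    "data: line one\n",
    "data: line two\n",
    "\n",
]

if __name__ == "__main__":
    for event in iter_sse_events(SAMPLE_STREAM):
        print(event)
```

In practice the lines would come from a streaming HTTP response rather than a list; consult the OpenVIP specification for the actual endpoint path and message schema.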

License

MIT
