
Voice dictation daemon using NVIDIA Parakeet on Apple Silicon


🦜 Wordbird

Contextual voice dictation for macOS. Powered by NVIDIA Parakeet running locally on Apple Silicon via MLX.

Press a hotkey, speak, and your words are transcribed and pasted into whatever app is focused. A small LLM post-processes the transcription to fix errors, using project-specific context from a WORDBIRD.md file.

Getting started

Requires macOS on Apple Silicon (M1+) and Python 3.10+.

# Run with uvx (no install needed)
uvx wordbird

# Or run in the background
uvx wordbird start
uvx wordbird stop
uvx wordbird status

Context-aware correction

The key idea behind wordbird is contextual transcription correction. When you dictate into Terminal.app, wordbird detects the focused tab's working directory and searches that directory and its parents for a WORDBIRD.md file. This file lets you teach wordbird your project's domain.

Context detection works with:

  • Terminal.app — detects the focused tab's shell working directory
  • VS Code / VS Code Insiders — via the Wordbird extension, which works with local and remote (SSH) workspaces

Transcription and pasting work in any app.
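The upward search for WORDBIRD.md can be pictured with a short Python sketch. This is not wordbird's actual code: the function name find_context_file and its parameters are hypothetical, and the only behavior taken from the description above is "start at the working directory and walk up until a WORDBIRD.md is found".

```python
from __future__ import annotations

from pathlib import Path


def find_context_file(start: Path, name: str = "WORDBIRD.md") -> Path | None:
    """Return the nearest `name` in `start` or one of its ancestors, else None.

    Hypothetical helper for illustration; wordbird's real lookup may differ.
    """
    start = start.resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (start, *start.parents):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

Because the starting directory is checked before its parents, a WORDBIRD.md in a subproject would take precedence over one at the repository root under this sketch.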

To create a starter file, run:

uvx wordbird init

This creates a WORDBIRD.md with the default prompt template. Edit it to add your project's terms, names, and jargon:

---
transcription_model: mlx-community/parakeet-tdt-0.6b-v2
fix_model: mlx-community/Qwen2.5-1.5B-Instruct-4bit
---

Fix transcription errors. Output only the corrected text.

Example 1:
Input: "the java script function isnt working"
Output: "The JavaScript function isn't working."

Example 2:
Input: "check the get ignore file for the repo"
Output: "Check the .gitignore file for the repo."

Example 3:
Input: "we need to refactor the a p i endpoint"
Output: "We need to refactor the API endpoint."

Key terms: MyClass, some_function, PostgreSQL
Names: Alice, Bob

Input: "{{ transcript }}"
Output:

The file is a Jinja template: {{ transcript }} is replaced with the raw transcription. If the template omits {{ transcript }}, the transcript is appended automatically.
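The substitute-or-append rule can be illustrated with a standard-library sketch. It deliberately uses a regular expression as a stand-in for real Jinja rendering, so it handles only the {{ transcript }} placeholder and no other Jinja syntax. The name build_prompt and the blank-line separator used when appending are assumptions, not wordbird's documented behavior.

```python
import re

# Matches the `{{ transcript }}` placeholder, tolerating optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*transcript\s*\}\}")


def build_prompt(template: str, transcript: str) -> str:
    """Fill the placeholder, or append the transcript when none is present.

    Hypothetical stand-in for Jinja rendering; the append format is assumed.
    """
    if PLACEHOLDER.search(template):
        # A function replacement inserts `transcript` literally, so backslashes
        # in dictated text are not treated as regex escapes.
        return PLACEHOLDER.sub(lambda _match: transcript, template)
    return f"{template.rstrip()}\n\n{transcript}"
```

For example, build_prompt('Input: "{{ transcript }}"\nOutput:', "hello world") fills the placeholder in place, while a template without the placeholder gets the transcript added at the end.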

The YAML front matter lets you override models per project. When you dictate into a Terminal tab whose shell is in the directory containing the file (or any directory below it), wordbird picks up the nearest WORDBIRD.md and uses it.
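As a rough sketch of how per-project overrides might combine with the defaults, the following splits the front matter from the prompt body and merges the two. It is a simplified illustration: it reads only flat "key: value" lines rather than full YAML, and the names split_front_matter, effective_settings, and DEFAULTS are hypothetical. The default model names are the ones listed in this document.

```python
def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a WORDBIRD.md-style document into (settings, prompt body).

    Simplified: handles flat `key: value` lines only, not general YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # No front matter: the whole file is the prompt.
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        return {}, text  # Unterminated front matter: treat as plain text.
    settings: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return settings, body


# Defaults as documented for the --model and --fix-model options.
DEFAULTS = {
    "transcription_model": "mlx-community/parakeet-tdt-0.6b-v2",
    "fix_model": "mlx-community/Qwen2.5-1.5B-Instruct-4bit",
}


def effective_settings(text: str) -> dict[str, str]:
    """Project overrides win over defaults; unspecified keys keep the default."""
    overrides, _body = split_front_matter(text)
    return {**DEFAULTS, **overrides}
```

With this approach a project file that sets only fix_model would still use the default transcription model.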

Hotkeys

Action             Default
Toggle recording   Right ⌘ + Space
Hold to record     Hold Right ⌘ for >1s, release to transcribe

Hotkeys are configurable:

--hold-key KEY       Hold key (default: rcmd). Options: rcmd, lcmd, ralt, lalt, rshift, lshift, rctrl, lctrl
--toggle-key KEY     Toggle key (default: space). Options: space, return, tab, escape

Options

--model MODEL        Transcription model (default: mlx-community/parakeet-tdt-0.6b-v2)
--fix-model MODEL    Post-processor model (default: mlx-community/Qwen2.5-1.5B-Instruct-4bit)
--no-fix             Disable LLM post-processing
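To make the flags and their defaults concrete, here is how an equivalent command-line interface could be declared with Python's argparse. This is an illustrative sketch only and does not claim to mirror wordbird's real argument parser. The flag names, choices, and defaults are copied from the lists above, and build_parser is a hypothetical name.

```python
import argparse

# Allowed values as listed in the hotkey options above.
HOLD_KEYS = ["rcmd", "lcmd", "ralt", "lalt", "rshift", "lshift", "rctrl", "lctrl"]
TOGGLE_KEYS = ["space", "return", "tab", "escape"]


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags; a sketch, not wordbird's actual parser."""
    parser = argparse.ArgumentParser(prog="wordbird")
    parser.add_argument("--hold-key", choices=HOLD_KEYS, default="rcmd")
    parser.add_argument("--toggle-key", choices=TOGGLE_KEYS, default="space")
    parser.add_argument("--model", default="mlx-community/parakeet-tdt-0.6b-v2")
    parser.add_argument("--fix-model", default="mlx-community/Qwen2.5-1.5B-Instruct-4bit")
    parser.add_argument("--no-fix", action="store_true")
    return parser
```

Parsing ["--hold-key", "ralt", "--no-fix"] with this parser leaves the toggle key and both models at their defaults while switching the hold key to the right Option key and disabling post-processing.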

Dashboard

Wordbird runs a local web dashboard at localhost:7870. Click the bird in the menu bar → Dashboard… to open it.

  • 📝 History — browse all your transcriptions with timestamps, app name, working directory, and duration. See both the original and corrected text.
  • ⚙️ Settings — configure hotkeys, models, and post-processing. Changes take effect immediately without restarting.

You can also view history from the command line:

uvx wordbird history

Menu bar

Wordbird shows a bird icon in the menu bar:

  • White — idle
  • 🟡 Yellow — connecting mic
  • 🔴 Red — listening
  • Sparkles — transcribing

Permissions

Wordbird needs three macOS permissions, granted to your terminal app:

  • 🎤 Microphone — to record your voice
  • 🔐 Accessibility — to paste text and intercept the hotkey
  • ⌨️ Input Monitoring — to detect the global hotkey

Wordbird checks these on startup and tells you what's missing.

License

MIT

Download files

Source Distribution

wordbird-0.4.1.tar.gz (609.2 kB)

Uploaded Source

Built Distribution

wordbird-0.4.1-py3-none-any.whl (243.0 kB)

Uploaded Python 3

File details

Details for the file wordbird-0.4.1.tar.gz.

File metadata

  • Download URL: wordbird-0.4.1.tar.gz
  • Upload date:
  • Size: 609.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for wordbird-0.4.1.tar.gz
Algorithm     Hash digest
SHA256        e02509ba00513f55c561b3b458d646229577b96298ca3f130de0b4ae3bce14be
MD5           a1c8c160949935dcfc9df2906386e9f8
BLAKE2b-256   48a9f92bf216e8f785f9421d43c67ff85b6910b484d9e9dffaf4f947783060eb
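A downloaded file can be checked against the published SHA256 digest with Python's standard hashlib module. The sketch below is a generic verification routine, not part of wordbird; the helper names sha256_of and matches_published_hash are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

To check the source distribution, pass the path of the downloaded wordbird-0.4.1.tar.gz together with the SHA256 value listed above.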

Provenance

The following attestation bundles were made for wordbird-0.4.1.tar.gz:

Publisher: main.yaml on tillahoffmann/wordbird

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file wordbird-0.4.1-py3-none-any.whl.

File metadata

  • Download URL: wordbird-0.4.1-py3-none-any.whl
  • Upload date:
  • Size: 243.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for wordbird-0.4.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        05f1778b790a60b6f4e3fc72c2a3cf7c4dc9c51e9353f82f93d728c9789f1ae7
MD5           65726e09d76d41984fd83f4b7e6b86c2
BLAKE2b-256   0f73305196bb7efca24b0317892df268433c465864767a572e9614ffdca81ae6

Provenance

The following attestation bundles were made for wordbird-0.4.1-py3-none-any.whl:

Publisher: main.yaml on tillahoffmann/wordbird
