Voice dictation daemon using NVIDIA Parakeet on Apple Silicon
🦜 Wordbird
Contextual voice dictation for macOS. Powered by NVIDIA Parakeet running locally on Apple Silicon via MLX.
Press a hotkey, speak, and your words are transcribed and pasted into whatever app is focused. A small LLM post-processes the transcription to fix errors, using project-specific context from a WORDBIRD.md file.
Getting started
Requires macOS on Apple Silicon (M1+) and Python 3.10+.
# Run with uvx (no install needed)
uvx wordbird
# Or run in the background
uvx wordbird start
uvx wordbird stop
uvx wordbird status
Architecture
Wordbird runs as two sibling processes managed by a thin CLI:
- Server (wordbird-server): FastAPI app handling transcription, post-processing, history, config, and serving the React dashboard
- Daemon (wordbird-daemon): macOS-native process handling hotkeys, mic recording, overlay HUD, menu bar, and clipboard pasting
The daemon sends recorded audio to the server via HTTP. The server runs ML inference in a thread pool so the dashboard stays responsive during transcription.
uvx wordbird # starts both (recommended)
uvx wordbird-server # just the API server
uvx wordbird-daemon # just the daemon (expects server running)
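The thread-pool idea can be illustrated with a minimal sketch. This is not Wordbird's server code: `transcribe_blocking`, `handle_transcription`, and `handle_dashboard_ping` are hypothetical names, and a `time.sleep` stands in for model inference. It only shows why offloading a blocking call keeps other requests responsive.

```python
import asyncio
import time


def transcribe_blocking(audio: bytes) -> str:
    """Hypothetical stand-in for a slow, blocking ML inference call."""
    time.sleep(0.2)  # simulate model latency
    return f"transcribed {len(audio)} bytes"


async def handle_transcription(audio: bytes) -> str:
    # Offload the blocking call to a worker thread so the event loop stays free.
    return await asyncio.to_thread(transcribe_blocking, audio)


async def handle_dashboard_ping() -> str:
    # A cheap request that should not have to wait for inference to finish.
    return "pong"


async def demo() -> list[str]:
    """Run both handlers concurrently and record the order they finish in."""
    order: list[str] = []

    async def run(name: str, coro) -> None:
        await coro
        order.append(name)

    await asyncio.gather(
        run("transcription", handle_transcription(b"\x00" * 16000)),
        run("ping", handle_dashboard_ping()),
    )
    return order


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The ping finishes before the transcription even though it was scheduled second, because the slow work runs off the event loop.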
Context-aware correction
When dictating into Terminal.app, Wordbird detects the focused tab's working directory and searches that directory and its parents for a WORDBIRD.md file. This lets you teach Wordbird your project's terms.
Context detection works with:
- Terminal.app — detects the focused tab's shell working directory
- VS Code / VS Code Insiders — via the Wordbird extension, which works with local and remote (SSH) workspaces
Transcription and pasting work in any app.
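The upward search for WORDBIRD.md can be sketched as follows. The function name `find_wordbird_md` is hypothetical and Wordbird's own lookup may differ, for example in how it treats symlinks or where it stops.

```python
from pathlib import Path
from typing import Optional


def find_wordbird_md(start: Path, filename: str = "WORDBIRD.md") -> Optional[Path]:
    """Return the nearest `filename` in `start` or any of its parents, else None."""
    start = start.resolve()
    # Check the starting directory first, then walk towards the filesystem root.
    for directory in (start, *start.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_wordbird_md(Path.cwd()))
```

Because the nearest file wins, a WORDBIRD.md in a subproject would take precedence over one at the repository root under this scheme.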
uvx wordbird init
This creates a WORDBIRD.md with the default prompt template. Edit it to add your project's terms:
---
transcription_model: mlx-community/parakeet-tdt-0.6b-v2
fix_model: mlx-community/Qwen2.5-1.5B-Instruct-4bit
---
{# Your correction prompt and examples here #}
{# Key terms: MyApp, some_function, PostgreSQL #}
{# Names: Alice, Bob #}
{# Misheard words: "bird word" should be "Birdword" #}
Input: "{{ transcript }}"
Output:
The file is a Jinja template. {{ transcript }} is replaced with the raw transcription. The YAML front matter lets you override models per-project.
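To make the file layout concrete, here is a rough standard-library sketch that splits the front matter from the body and fills in the transcript. It is a simplified stand-in, not Jinja or a YAML parser: it only understands flat `key: value` pairs, `{# ... #}` comments, and the single `{{ transcript }}` placeholder. The helper names are hypothetical.

```python
import re


def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split '---' delimited front matter from the template body.

    Only flat 'key: value' pairs are handled; real YAML supports far more.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing '---' delimiter after the opening one.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])


def render(body: str, transcript: str) -> str:
    """Drop {# ... #} comments and substitute {{ transcript }}."""
    body = re.sub(r"\{#.*?#\}\n?", "", body, flags=re.DOTALL)
    # A function replacement avoids treating backslashes in the transcript as escapes.
    return re.sub(r"\{\{\s*transcript\s*\}\}", lambda _: transcript, body)


EXAMPLE = """---
transcription_model: mlx-community/parakeet-tdt-0.6b-v2
fix_model: mlx-community/Qwen2.5-1.5B-Instruct-4bit
---
{# Key terms: MyApp #}
Input: "{{ transcript }}"
Output:"""

if __name__ == "__main__":
    meta, body = split_front_matter(EXAMPLE)
    print(meta["fix_model"])
    print(render(body, "hello bird word"))
```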
Hotkey
| Action | Default |
|---|---|
| Toggle recording | Right ⌘ + Space |
| Transcribe and submit | Right ⌘ + Return (opt-in) |
The submit shortcut transcribes, pastes, and presses Return — useful for chat and terminal workflows. Enable it in the dashboard settings.
The hotkey is configurable via CLI flags or the dashboard settings:
--modifier-key KEY Modifier key (default: rcmd). Options: rcmd, lcmd, ralt, lalt, rshift, lshift, rctrl, lctrl, fn
--toggle-key KEY Toggle key (default: space). Options: space, return, tab, escape
Options
--model MODEL Transcription model (default: mlx-community/parakeet-tdt-0.6b-v2)
--fix-model MODEL Post-processor model (default: mlx-community/Qwen2.5-1.5B-Instruct-4bit)
--no-fix Disable LLM post-processing
--no-server Don't spawn the API server (run it separately)
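For reference, the documented flags and defaults map onto an argument parser roughly like the one below. This is an illustrative argparse sketch that mirrors the lists above; it is not Wordbird's actual CLI code, and `build_parser` and the `wordbird-sketch` program name are made up.

```python
import argparse

# Choices and defaults copied from the flag descriptions above.
MODIFIER_KEYS = ["rcmd", "lcmd", "ralt", "lalt", "rshift", "lshift", "rctrl", "lctrl", "fn"]
TOGGLE_KEYS = ["space", "return", "tab", "escape"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="wordbird-sketch")
    parser.add_argument("--modifier-key", choices=MODIFIER_KEYS, default="rcmd")
    parser.add_argument("--toggle-key", choices=TOGGLE_KEYS, default="space")
    parser.add_argument("--model", default="mlx-community/parakeet-tdt-0.6b-v2")
    parser.add_argument("--fix-model", default="mlx-community/Qwen2.5-1.5B-Instruct-4bit")
    parser.add_argument("--no-fix", action="store_true", help="Disable LLM post-processing")
    parser.add_argument("--no-server", action="store_true", help="Don't spawn the API server")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--modifier-key", "ralt", "--no-fix"])
    print(args.modifier_key, args.toggle_key, args.no_fix, args.no_server)
```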
Dashboard
Wordbird runs a local web dashboard (default localhost:7870). Click the bird in the menu bar → Dashboard… to open it.
- History — browse transcriptions with timestamps, app name, working directory, and duration. See both original and corrected text.
- Settings — configure hotkey, models, and post-processing. Changes take effect within seconds.
- Stats — words dictated, recording time, WPM, session count.
uvx wordbird history # view history from the CLI
uvx wordbird config # show or create the config file
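As a worked example of the WPM stat, one plausible definition is words in the transcription divided by recording time in minutes. This is an assumption about how the figure is derived, not a description of Wordbird's implementation, and `words_per_minute` is a hypothetical helper.

```python
def words_per_minute(text: str, recording_seconds: float) -> float:
    """Words in `text` per minute of recording; 0.0 for empty recordings."""
    if recording_seconds <= 0:
        return 0.0
    word_count = len(text.split())  # whitespace-separated tokens
    return word_count * 60.0 / recording_seconds


if __name__ == "__main__":
    # Six words dictated in three seconds.
    print(words_per_minute("one two three four five six", 3.0))
```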
Data
All data is stored in ~/.wordbird/:
| File | Purpose |
|---|---|
| wordbird.toml | User configuration |
| wordbird.db | Transcription history (SQLite) |
| server.json | Server port discovery |
| wordbird.pid | Singleton lock |
| wordbird.log | Background mode logs |
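A PID file is a common way to enforce a single running instance. The sketch below shows the general technique on POSIX systems such as macOS. It assumes the file holds a decimal process ID, which the README does not state; Wordbird's lock may work differently (a file lock, for instance), and `is_running` is a hypothetical helper.

```python
import os
from pathlib import Path


def is_running(pid_file: Path) -> bool:
    """Return True if `pid_file` names a process that currently exists."""
    try:
        pid = int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return False  # no file, or contents are not a number
    if pid <= 0:
        return False  # 0 and negatives address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without signalling
    except ProcessLookupError:
        return False  # stale PID file: the process is gone
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


if __name__ == "__main__":
    print(is_running(Path.home() / ".wordbird" / "wordbird.pid"))
```

A stale file left behind by a crash is treated as "not running", so a new instance could safely overwrite it.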
Menu bar
Wordbird shows a bird icon in the menu bar:
- ⚪ White — idle
- 🟡 Yellow — connecting mic
- 🔴 Red — listening
- ✨ Sparkles — transcribing
Permissions
Wordbird needs three macOS permissions, granted to your terminal app:
- 🎤 Microphone — to record your voice
- 🔐 Accessibility — to paste text
- ⌨️ Input Monitoring — to detect the global hotkey
Wordbird checks these on startup and tells you what's missing.
Development
make backend-dev # API server with hot reload
make daemon-dev # daemon only (expects server running)
make frontend-dev # Vite dev server with API proxy
make dev # backend + frontend + daemon (all three)
make wordbird # build frontend + run everything
make backend-test # run pytest
License
MIT