Free, fully-local voice dictation
Project description
yohoho
speak. it types. — free, fully-local voice dictation for developers.
yohoho turns speech into text entirely on your machine. Hit a hotkey, talk, and an on-device model
(NVIDIA Parakeet) transcribes your speech and pastes the text into whatever app is focused. No cloud,
no API key, no subscription — your voice never leaves your laptop.
It's a free, open-source alternative to Wispr Flow and VoiceInk, for people who'd rather own their tools than rent them. (The name is Brook's laugh from One Piece crossed with the "yo ho ho" shanty — a laugh is a voice, after all.)
Status
Working on macOS today. Press the hotkey, speak, press again — your words transcribe on-device and paste at the cursor, with a live dot-matrix panel and on/off chimes. Windows and a one-line installer are next.
| ✅ Working (macOS / Apple Silicon) | global hotkey, on-device transcription (Parakeet int8), live dot-matrix status panel, auto-paste, on/off chimes, run-on-login |
| 🚧 Next | smoother permission setup, background-daemon supervisor, Windows adapter |
Install & set up (macOS)
Install with whichever you have — each puts a yohoho command on your PATH:
npm i -g @by-k4n/yohoho # Node users — bootstraps Python via uv under the hood
uv tool install yohoho # uv users
pipx install yohoho # pipx users
The npm install adds yohoho to your PATH automatically (it lands in npm's global bin) — open a
new shell and you're set, no Python needed. With uv or pipx, if yohoho isn't found afterward,
run uv tool ensurepath (or pipx ensurepath) once to add their bin directory, then restart your shell.
Bleeding edge / no PyPI: uv tool install 'git+https://github.com/by-k4n/yohoho.git@vX.Y.Z'.
Then:
yohoho setup # pick a hotkey, grant permissions, download the model (~660 MB, first run)
yohoho start # press your hotkey anywhere to dictate
yohoho config # interactive settings menu — record a new hotkey, tweak chimes, and more
setup walks you through it, opens the right System Settings panes, and installs a launch-on-login
agent so yohoho is ready whenever you are; the default hotkey is ⌃⌥Space (Control-Option-Space).
start runs the dictation loop in the foreground now (Ctrl-C to quit).
To dictate: press ⌃⌥Space (you'll hear the "on" chime), speak, then press ⌃⌥Space again —
the text transcribes on-device and pastes at your cursor (the "off" chime confirms it). Run
yohoho doctor any time to check permissions and your hotkey.
Permissions (macOS) — please read
macOS gates the hotkey and the paste behind three privacy permissions. Grant them to the terminal app you launch yohoho from — Terminal, iTerm, Warp, Ghostty, … — not to "python":
| Permission | Why it's needed | System Settings ▸ Privacy & Security ▸ |
|---|---|---|
| Microphone | record your voice | Microphone |
| Input Monitoring | detect the global hotkey | Input Monitoring |
| Accessibility | paste into the focused app | Accessibility |
Why your terminal, not python? macOS attributes these grants to the responsible process — the app that launched yohoho — which is your terminal, not the Python interpreter.
yohoho setupopens the right panes; add your terminal app under each one and toggle it on. If you later launch from a different terminal, grant it there too.
Known rough edge: if dictation transcribes but doesn't paste (you have to press ⌘V yourself), your terminal is missing Accessibility — add it there and restart the terminal. This terminal-by-terminal grant is the price of shipping as a dev script today; a future version will ship a small signed app so you grant once and forget it. For now, that's a known trade-off we've chosen on purpose.
Why
- Private — audio is transcribed locally and never touches a server. Transcripts are never written to logs, and history stays on your machine.
- Fast — Parakeet runs several times faster than realtime on CPU; on Apple Silicon it offloads to the Neural Engine via CoreML. Text lands in ~1–2 s for a short clip.
- Free — MIT licensed. No subscription, ever.
Architecture
A portable core (identical on every OS) sits behind six small platform-adapter contracts —
hotkey · clipboard · inject · focus · autostart · permissions — the only OS-specific code, selected at
runtime by platform_factory. Engine: NVIDIA Parakeet TDT 0.6b v2 (int8 ONNX) via onnx-asr. UI: a
Tkinter dot-matrix panel. Output: clipboard paste (lossless, unlike per-key typing).
The full design is in docs/DESIGN.md; the 149-case failure-mode matrix is in
docs/edge-cases.md.
Roadmap
- M1 — portable core (engine, recorder, controller, config, observability, history) +
yohoho dictate - M2 — dot-matrix status panel (Tkinter)
- M3 — macOS adapter: global hotkey, TCC permissions, auto-paste, on/off chimes, run-on-login
- M4 (install) — PyPI + npm wrapper install (this ship); daemon/signed-app/tray are later M4 pieces
- M4 — background-daemon supervisor, smoother permission flow (signed app), full
status/history/logs - M5 — Windows adapter
- M6 — standalone per-OS binaries
Linux is on the map but deferred from v1; the adapter layer is kept Linux-ready.
Development
uv sync --extra dev
uv run pytest # unit suite
uv run pytest -m "gui or not gui" # include the Tk panel tests
uv run pytest -m integration # real-model test (needs the model cached + tests/fixtures/hello.wav)
uv run ruff check .
Design
Terminal / dot-matrix aesthetic — brand color #39BFC6 on near-black,
Doto wordmark, everything rendered in dots.
License
MIT — see LICENSE. If it saves you a subscription, buy yourself a coffee.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file yohoho-0.2.0.tar.gz.
File metadata
- Download URL: yohoho-0.2.0.tar.gz
- Upload date:
- Size: 116.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a5a6b921f0a695026a0687cebabc0987b7d7d8257a83bd29726903bbbea90f3e
|
|
| MD5 |
cdb5311816f93130a753d8e32e98d298
|
|
| BLAKE2b-256 |
9e17e442dacd7f66de709f609e935af18bc1e101b3ea02bbfd457c673171a561
|
File details
Details for the file yohoho-0.2.0-py3-none-any.whl.
File metadata
- Download URL: yohoho-0.2.0-py3-none-any.whl
- Upload date:
- Size: 141.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
01b1435644aded5d5f4e93d585d5ed497dd0b28a114ca90b3b53caff595296d2
|
|
| MD5 |
048288a6fe8c37ffc9c54f04bf5abdb5
|
|
| BLAKE2b-256 |
213c5e465fad6dec298edbf9b50c50b66617565b6ea0b0e1b20f6379d6ecef1b
|