Skip to main content

Free, fully-local voice dictation

Project description

yohoho

speak. it types. — free, fully-local voice dictation for developers.

yohoho turns speech into text entirely on your machine. Hit a hotkey, talk, and an on-device model (NVIDIA Parakeet) transcribes your speech and pastes the text into whatever app is focused. No cloud, no API key, no subscription — your voice never leaves your laptop.

It's a free, open-source alternative to Wispr Flow and VoiceInk, for people who'd rather own their tools than rent them. (The name is Brook's laugh from One Piece crossed with the "yo ho ho" shanty — a laugh is a voice, after all.)

Status

Working on macOS today. Press the hotkey, speak, press again — your words transcribe on-device and paste at the cursor, with a live dot-matrix panel and on/off chimes. Windows and a one-line installer are next.

✅ Working (macOS / Apple Silicon) global hotkey, on-device transcription (Parakeet int8), live dot-matrix status panel, auto-paste, on/off chimes, run-on-login
🚧 Next smoother permission setup, background-daemon supervisor, Windows adapter

Install & set up (macOS)

Install with whichever you have — each puts a yohoho command on your PATH:

npm i -g @by-k4n/yohoho    # Node users — bootstraps Python via uv under the hood
uv tool install yohoho     # uv users
pipx install yohoho        # pipx users

The npm install adds yohoho to your PATH automatically (it lands in npm's global bin) — open a new shell and you're set, no Python needed. With uv or pipx, if yohoho isn't found afterward, run uv tool ensurepath (or pipx ensurepath) once to add their bin directory, then restart your shell.

Bleeding edge / no PyPI: uv tool install 'git+https://github.com/by-k4n/yohoho.git@vX.Y.Z'.

Then:

yohoho setup     # pick a hotkey, grant permissions, download the model (~660 MB, first run)
yohoho start     # press your hotkey anywhere to dictate
yohoho config    # interactive settings menu — record a new hotkey, tweak chimes, and more

setup walks you through it, opens the right System Settings panes, and installs a launch-on-login agent so yohoho is ready whenever you are; the default hotkey is ⌃⌥Space (Control-Option-Space). start runs the dictation loop in the foreground now (Ctrl-C to quit).

To dictate: press ⌃⌥Space (you'll hear the "on" chime), speak, then press ⌃⌥Space again — the text transcribes on-device and pastes at your cursor (the "off" chime confirms it). Run yohoho doctor any time to check permissions and your hotkey.

Permissions (macOS) — please read

macOS gates the hotkey and the paste behind three privacy permissions. Grant them to the terminal app you launch yohoho from — Terminal, iTerm, Warp, Ghostty, … — not to "python":

Permission Why it's needed System Settings ▸ Privacy & Security ▸
Microphone record your voice Microphone
Input Monitoring detect the global hotkey Input Monitoring
Accessibility paste into the focused app Accessibility

Why your terminal, not python? macOS attributes these grants to the responsible process — the app that launched yohoho — which is your terminal, not the Python interpreter. yohoho setup opens the right panes; add your terminal app under each one and toggle it on. If you later launch from a different terminal, grant it there too.

Known rough edge: if dictation transcribes but doesn't paste (you have to press ⌘V yourself), your terminal is missing Accessibility — add it there and restart the terminal. This terminal-by-terminal grant is the price of shipping as a dev script today; a future version will ship a small signed app so you grant once and forget it. For now, that's a known trade-off we've chosen on purpose.

Why

  • Private — audio is transcribed locally and never touches a server. Transcripts are never written to logs, and history stays on your machine.
  • Fast — Parakeet runs several times faster than realtime on CPU; on Apple Silicon it offloads to the Neural Engine via CoreML. Text lands in ~1–2 s for a short clip.
  • Free — MIT licensed. No subscription, ever.

Architecture

A portable core (identical on every OS) sits behind six small platform-adapter contracts — hotkey · clipboard · inject · focus · autostart · permissions — the only OS-specific code, selected at runtime by platform_factory. Engine: NVIDIA Parakeet TDT 0.6b v2 (int8 ONNX) via onnx-asr. UI: a Tkinter dot-matrix panel. Output: clipboard paste (lossless, unlike per-key typing).

The full design is in docs/DESIGN.md; the 149-case failure-mode matrix is in docs/edge-cases.md.

Roadmap

  • M1 — portable core (engine, recorder, controller, config, observability, history) + yohoho dictate
  • M2 — dot-matrix status panel (Tkinter)
  • M3 — macOS adapter: global hotkey, TCC permissions, auto-paste, on/off chimes, run-on-login
  • M4 (install) — PyPI + npm wrapper install (this ship); daemon/signed-app/tray are later M4 pieces
  • M4 — background-daemon supervisor, smoother permission flow (signed app), full status/history/logs
  • M5 — Windows adapter
  • M6 — standalone per-OS binaries

Linux is on the map but deferred from v1; the adapter layer is kept Linux-ready.

Development

uv sync --extra dev
uv run pytest                       # unit suite
uv run pytest -m "gui or not gui"   # include the Tk panel tests
uv run pytest -m integration        # real-model test (needs the model cached + tests/fixtures/hello.wav)
uv run ruff check .

Design

Terminal / dot-matrix aesthetic — brand color #39BFC6 on near-black, Doto wordmark, everything rendered in dots.

License

MIT — see LICENSE. If it saves you a subscription, buy yourself a coffee.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

yohoho-0.4.0.tar.gz (134.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

yohoho-0.4.0-py3-none-any.whl (161.4 kB view details)

Uploaded Python 3

File details

Details for the file yohoho-0.4.0.tar.gz.

File metadata

  • Download URL: yohoho-0.4.0.tar.gz
  • Upload date:
  • Size: 134.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.3

File hashes

Hashes for yohoho-0.4.0.tar.gz
Algorithm Hash digest
SHA256 89b84f8cca51146466f7be2a82769e3fcc9eff51a0a231d134caacf49bcedbde
MD5 b860e85adc0745ef4df96187a05ce1bb
BLAKE2b-256 990674f5ced6c5009fd042fa273e8a73dd29e7db2cc6e7dd87cb3ca2923ffa9c

See more details on using hashes here.

File details

Details for the file yohoho-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: yohoho-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 161.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.3

File hashes

Hashes for yohoho-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 cecad8b7c722400de8be0905a1eaa018eff843b630fa7d51c9fe3e7fe017a9fb
MD5 12f697ca40f4b296a9953b5a3af5c1ea
BLAKE2b-256 90e4521826e0243c68ff97e38c07ea06aa44da03d5d053e096b410aa35352e2e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page