Skip to main content

Free, fully-local voice dictation

Project description

yohoho

speak. it types. — free, fully-local voice dictation for developers.

yohoho turns speech into text entirely on your machine. Hit a hotkey, talk, and an on-device model (NVIDIA Parakeet) transcribes your speech and pastes the text into whatever app is focused. No cloud, no API key, no subscription — your voice never leaves your laptop.

It's a free, open-source alternative to Wispr Flow and VoiceInk, for people who'd rather own their tools than rent them. (The name is Brook's laugh from One Piece crossed with the "yo ho ho" shanty — a laugh is a voice, after all.)

Status

Working on macOS today. Press the hotkey, speak, press again — your words transcribe on-device and paste at the cursor, with a live dot-matrix panel and on/off chimes. Windows and a one-line installer are next.

✅ Working (macOS / Apple Silicon) global hotkey, on-device transcription (Parakeet int8), live dot-matrix status panel, auto-paste, on/off chimes, run-on-login
🚧 Next smoother permission setup, background-daemon supervisor, Windows adapter

Install & set up (macOS)

Install with whichever you have — each puts a yohoho command on your PATH:

npm i -g @by-k4n/yohoho    # Node users — bootstraps Python via uv under the hood
uv tool install yohoho     # uv users
pipx install yohoho        # pipx users

The npm install adds yohoho to your PATH automatically (it lands in npm's global bin) — open a new shell and you're set, no Python needed. With uv or pipx, if yohoho isn't found afterward, run uv tool ensurepath (or pipx ensurepath) once to add their bin directory, then restart your shell.

Bleeding edge / no PyPI: uv tool install 'git+https://github.com/by-k4n/yohoho.git@vX.Y.Z'.

Then:

yohoho setup     # pick a hotkey, grant permissions, download the model (~660 MB, first run)
yohoho start     # press your hotkey anywhere to dictate
yohoho config    # interactive settings menu — record a new hotkey, tweak chimes, and more

setup walks you through it, opens the right System Settings panes, and installs a launch-on-login agent so yohoho is ready whenever you are; the default hotkey is ⌃⌥Space (Control-Option-Space). start runs the dictation loop in the foreground now (Ctrl-C to quit).

To dictate: press ⌃⌥Space (you'll hear the "on" chime), speak, then press ⌃⌥Space again — the text transcribes on-device and pastes at your cursor (the "off" chime confirms it). Run yohoho doctor any time to check permissions and your hotkey.

Permissions (macOS) — please read

macOS gates the hotkey and the paste behind three privacy permissions. Grant them to the terminal app you launch yohoho from — Terminal, iTerm, Warp, Ghostty, … — not to "python":

Permission Why it's needed System Settings ▸ Privacy & Security ▸
Microphone record your voice Microphone
Input Monitoring detect the global hotkey Input Monitoring
Accessibility paste into the focused app Accessibility

Why your terminal, not python? macOS attributes these grants to the responsible process — the app that launched yohoho — which is your terminal, not the Python interpreter. yohoho setup opens the right panes; add your terminal app under each one and toggle it on. If you later launch from a different terminal, grant it there too.

Known rough edge: if dictation transcribes but doesn't paste (you have to press ⌘V yourself), your terminal is missing Accessibility — add it there and restart the terminal. This terminal-by-terminal grant is the price of shipping as a dev script today; a future version will ship a small signed app so you grant once and forget it. For now, that's a known trade-off we've chosen on purpose.

Why

  • Private — audio is transcribed locally and never touches a server. Transcripts are never written to logs, and history stays on your machine.
  • Fast — Parakeet runs several times faster than realtime on CPU; on Apple Silicon it offloads to the Neural Engine via CoreML. Text lands in ~1–2 s for a short clip.
  • Free — MIT licensed. No subscription, ever.

Architecture

A portable core (identical on every OS) sits behind six small platform-adapter contracts — hotkey · clipboard · inject · focus · autostart · permissions — the only OS-specific code, selected at runtime by platform_factory. Engine: NVIDIA Parakeet TDT 0.6b v2 (int8 ONNX) via onnx-asr. UI: a Tkinter dot-matrix panel. Output: clipboard paste (lossless, unlike per-key typing).

The full design is in docs/DESIGN.md; the 149-case failure-mode matrix is in docs/edge-cases.md.

Roadmap

  • M1 — portable core (engine, recorder, controller, config, observability, history) + yohoho dictate
  • M2 — dot-matrix status panel (Tkinter)
  • M3 — macOS adapter: global hotkey, TCC permissions, auto-paste, on/off chimes, run-on-login
  • M4 (install) — PyPI + npm wrapper install (this ship); daemon/signed-app/tray are later M4 pieces
  • M4 — background-daemon supervisor, smoother permission flow (signed app), full status/history/logs
  • M5 — Windows adapter
  • M6 — standalone per-OS binaries

Linux is on the map but deferred from v1; the adapter layer is kept Linux-ready.

Development

uv sync --extra dev
uv run pytest                       # unit suite
uv run pytest -m "gui or not gui"   # include the Tk panel tests
uv run pytest -m integration        # real-model test (needs the model cached + tests/fixtures/hello.wav)
uv run ruff check .

Design

Terminal / dot-matrix aesthetic — brand color #39BFC6 on near-black, Doto wordmark, everything rendered in dots.

License

MIT — see LICENSE. If it saves you a subscription, buy yourself a coffee.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

yohoho-0.2.2.tar.gz (118.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

yohoho-0.2.2-py3-none-any.whl (143.2 kB view details)

Uploaded Python 3

File details

Details for the file yohoho-0.2.2.tar.gz.

File metadata

  • Download URL: yohoho-0.2.2.tar.gz
  • Upload date:
  • Size: 118.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.3

File hashes

Hashes for yohoho-0.2.2.tar.gz
Algorithm Hash digest
SHA256 177121c60d28ef70922a262f98df4b57037f704b004e3b6e3a96050b2fde73c1
MD5 39cbff5c8a45f57eebdacbe4540ba720
BLAKE2b-256 acc687bf1c7edeb2eb02c47557572baaca24a054e8e7f91d0cbcc02de0919922

See more details on using hashes here.

File details

Details for the file yohoho-0.2.2-py3-none-any.whl.

File metadata

  • Download URL: yohoho-0.2.2-py3-none-any.whl
  • Upload date:
  • Size: 143.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.3

File hashes

Hashes for yohoho-0.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 a4c2bda3e2ecbc7dca6c44de5a4301621a1e20368e04a5f541a9ebc043b4872a
MD5 822e4c41610b1805d1577fe0705c013b
BLAKE2b-256 2b34f004f3078b36535966da55d97e221412bc45f8b15669e5ef3fd57014d21d

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page