Skip to main content

Free, fully-local voice dictation

Project description

yohoho

speak. it types. — free, fully-local voice dictation for developers.

yohoho turns speech into text entirely on your machine. Hit a hotkey, talk, and an on-device model (NVIDIA Parakeet) transcribes your speech and pastes the text into whatever app is focused. No cloud, no API key, no subscription — your voice never leaves your laptop.

It's a free, open-source alternative to Wispr Flow and VoiceInk, for people who'd rather own their tools than rent them. (The name is Brook's laugh from One Piece crossed with the "yo ho ho" shanty — a laugh is a voice, after all.)

Status

Working on macOS today. Press the hotkey, speak, press again — your words transcribe on-device and paste at the cursor, with a live dot-matrix panel and on/off chimes. Windows and a one-line installer are next.

✅ Working (macOS / Apple Silicon) global hotkey, on-device transcription (Parakeet int8), live dot-matrix status panel, auto-paste, on/off chimes, run-on-login
🚧 Next smoother permission setup, background-daemon supervisor, Windows adapter

Install & set up (macOS)

Install with whichever you have — each puts a yohoho command on your PATH:

npm i -g @by-k4n/yohoho    # Node users — bootstraps Python via uv under the hood
uv tool install yohoho     # uv users
pipx install yohoho        # pipx users

The npm install adds yohoho to your PATH automatically (it lands in npm's global bin) — open a new shell and you're set, no Python needed. With uv or pipx, if yohoho isn't found afterward, run uv tool ensurepath (or pipx ensurepath) once to add their bin directory, then restart your shell.

Bleeding edge / no PyPI: uv tool install 'git+https://github.com/by-k4n/yohoho.git@vX.Y.Z'.

Then:

yohoho setup     # pick a hotkey, grant permissions, download the model (~660 MB, first run)
yohoho start     # press your hotkey anywhere to dictate
yohoho config    # interactive settings menu — record a new hotkey, tweak chimes, and more

setup walks you through it, opens the right System Settings panes, and installs a launch-on-login agent so yohoho is ready whenever you are; the default hotkey is ⌃⌥Space (Control-Option-Space). start runs the dictation loop in the foreground now (Ctrl-C to quit).

To dictate: press ⌃⌥Space (you'll hear the "on" chime), speak, then press ⌃⌥Space again — the text transcribes on-device and pastes at your cursor (the "off" chime confirms it). Run yohoho doctor any time to check permissions and your hotkey.

Permissions (macOS) — please read

macOS gates the hotkey and the paste behind three privacy permissions. Grant them to the terminal app you launch yohoho from — Terminal, iTerm, Warp, Ghostty, … — not to "python":

Permission Why it's needed System Settings ▸ Privacy & Security ▸
Microphone record your voice Microphone
Input Monitoring detect the global hotkey Input Monitoring
Accessibility paste into the focused app Accessibility

Why your terminal, not python? macOS attributes these grants to the responsible process — the app that launched yohoho — which is your terminal, not the Python interpreter. yohoho setup opens the right panes; add your terminal app under each one and toggle it on. If you later launch from a different terminal, grant it there too.

Known rough edge: if dictation transcribes but doesn't paste (you have to press ⌘V yourself), your terminal is missing Accessibility — add it there and restart the terminal. This terminal-by-terminal grant is the price of shipping as a dev script today; a future version will ship a small signed app so you grant once and forget it. For now, that's a known trade-off we've chosen on purpose.

Why

  • Private — audio is transcribed locally and never touches a server. Transcripts are never written to logs, and history stays on your machine.
  • Fast — Parakeet runs several times faster than realtime on CPU; on Apple Silicon it offloads to the Neural Engine via CoreML. Text lands in ~1–2 s for a short clip.
  • Free — MIT licensed. No subscription, ever.

Architecture

A portable core (identical on every OS) sits behind six small platform-adapter contracts — hotkey · clipboard · inject · focus · autostart · permissions — the only OS-specific code, selected at runtime by platform_factory. Engine: NVIDIA Parakeet TDT 0.6b v2 (int8 ONNX) via onnx-asr. UI: a Tkinter dot-matrix panel. Output: clipboard paste (lossless, unlike per-key typing).

The full design is in docs/DESIGN.md; the 149-case failure-mode matrix is in docs/edge-cases.md.

Roadmap

  • M1 — portable core (engine, recorder, controller, config, observability, history) + yohoho dictate
  • M2 — dot-matrix status panel (Tkinter)
  • M3 — macOS adapter: global hotkey, TCC permissions, auto-paste, on/off chimes, run-on-login
  • M4 (install) — PyPI + npm wrapper install (this ship); daemon/signed-app/tray are later M4 pieces
  • M4 — background-daemon supervisor, smoother permission flow (signed app), full status/history/logs
  • M5 — Windows adapter
  • M6 — standalone per-OS binaries

Linux is on the map but deferred from v1; the adapter layer is kept Linux-ready.

Development

uv sync --extra dev
uv run pytest                       # unit suite
uv run pytest -m "gui or not gui"   # include the Tk panel tests
uv run pytest -m integration        # real-model test (needs the model cached + tests/fixtures/hello.wav)
uv run ruff check .

Design

Terminal / dot-matrix aesthetic — brand color #39BFC6 on near-black, Doto wordmark, everything rendered in dots.

License

MIT — see LICENSE. If it saves you a subscription, buy yourself a coffee.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

yohoho-0.1.3.tar.gz (107.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

yohoho-0.1.3-py3-none-any.whl (130.3 kB view details)

Uploaded Python 3

File details

Details for the file yohoho-0.1.3.tar.gz.

File metadata

  • Download URL: yohoho-0.1.3.tar.gz
  • Upload date:
  • Size: 107.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.3

File hashes

Hashes for yohoho-0.1.3.tar.gz
Algorithm Hash digest
SHA256 2ce3015809aeb1f6624e6cd8b9824b227938a99c65d8800706231206eeb76b94
MD5 5cf329511991bb0a99a85b55977b0bd0
BLAKE2b-256 5fde85ccc38aae568722b7c7f4536bcd0e311878bab223687f1b2a26797da881

See more details on using hashes here.

File details

Details for the file yohoho-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: yohoho-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 130.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.3

File hashes

Hashes for yohoho-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 af42bde2458391ec2ad0280095a57e95a233de837a695938d26031564b30b50b
MD5 82656a3ac5c24d4e4eaaf437a429332f
BLAKE2b-256 9ec8e176febb6e215a2e7611ac5bc3c3ade8e4c19efaa4ec88356b9d76d6012a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page