
Offline push-to-talk voice-to-text CLI powered by whisper.cpp


morvox

An awesome push-to-talk-style voice-to-text widget for everyone.

One command (morvox) that toggles:

  1. First press → starts recording from the default mic, remembers the currently focused window/app, and shows a "Recording…" widget.
  2. While recording → the widget shows a live transcription preview above the VU meter.
  3. Second press → stops the recorder, transcribes the clip with pywhispercpp, and types the transcription into your target app.
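The toggle described above can be sketched as a check on a marker file. This is a minimal illustration, not morvox's actual implementation: it assumes that the presence of rec.pid (listed among the state files later in this document) is what marks an active recording, and the function name toggle is hypothetical.

```python
import tempfile
from pathlib import Path


def toggle(state_dir: Path) -> str:
    """Flip the on-disk state and report which action a press would take.

    Assumption: an existing rec.pid file means a recording is in progress.
    """
    pid_file = state_dir / "rec.pid"
    if pid_file.exists():
        # Second press: a real tool would stop the recorder, transcribe, and type.
        pid_file.unlink()
        return "stop"
    # First press: a real tool would launch the recorder and store its PID.
    state_dir.mkdir(parents=True, exist_ok=True)
    pid_file.write_text("0\n")  # placeholder PID; no recorder is started here
    return "start"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_dir = Path(tmp) / "morvox"
        print(toggle(demo_dir))  # first press
        print(toggle(demo_dir))  # second press
```

Keeping the state on disk rather than in a long-lived process is what lets a single short-lived command act as both "start" and "stop".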

On first use, morvox auto-downloads its built-in base Whisper model if it is missing: English uses ggml-base.en.bin, while non-English languages such as --lang es use the multilingual ggml-base.bin.

Note: Windows 11 has a built-in dictation tool — press Win+H to open it. macOS has System Dictation built in too, accessible via System Settings → Keyboard → Dictation (typically triggered by double-pressing Fn). morvox is an alternative: it runs a local whisper.cpp model entirely offline through pywhispercpp, gives you a visual VU-meter widget, and wires into any hotkey manager you already use.

morvox auto-selects a platform backend:

  • Linux — uses parecord for capture and xdotool for window control + keystroke injection. Wayland sessions are also supported (see the Wayland entry under Troubleshooting).
  • macOS — uses ffmpeg (avfoundation) for capture and osascript (System Events) for window focus + keystrokes.
  • Windows 11 — uses ffmpeg (WASAPI) for capture and Win32 APIs for keystroke injection. On Windows, morvox inserts into the window that is focused when transcription finishes: it tries several automatic clipboard paste methods first, then direct Unicode typing, and only leaves the transcript on the clipboard if all insertion methods are blocked.

You can force a backend with MORVOX_BACKEND=x11, MORVOX_BACKEND=macos, or MORVOX_BACKEND=windows.
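As a rough sketch of how such a selection could work, the function below honours MORVOX_BACKEND first and otherwise falls back to the running platform. The function name select_backend and the mapping of every non-macOS, non-Windows platform to "x11" are assumptions made for illustration; morvox's real detection logic (including how it handles Wayland) may differ.

```python
import os
import sys
from typing import Mapping, Optional

# The three values documented for MORVOX_BACKEND.
KNOWN_BACKENDS = ("x11", "macos", "windows")


def select_backend(
    env: Optional[Mapping[str, str]] = None,
    platform: Optional[str] = None,
) -> str:
    """Pick a backend name: explicit override first, then platform detection."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform

    forced = env.get("MORVOX_BACKEND", "").strip().lower()
    if forced:
        if forced not in KNOWN_BACKENDS:
            raise ValueError(f"unknown MORVOX_BACKEND: {forced!r}")
        return forced

    if platform == "darwin":
        return "macos"
    if platform.startswith("win"):
        return "windows"
    # Assumption: anything else is treated as Linux and mapped to "x11".
    return "x11"


if __name__ == "__main__":
    print(select_backend())
```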


Etymology

The name is based on morhook and voice. mor-vox. I know, if I explain the joke, it's not funny. Don't judge me.

Screenshots

[Screenshots: capturing on a terminal; capturing on VS Code; capturing on opencode; morvox recording inside opencode]

What it does

  • It embeds whisper.cpp via pywhispercpp and shows a live widget with a VU meter plus a rolling transcription preview. morvox does not register a hotkey itself: bind the morvox command to a key in your OS or desktop environment.
  • The built-in default model is cached under $XDG_CACHE_HOME/morvox/models/ or ~/.cache/morvox/models/ and is downloaded automatically on first use. en uses ggml-base.en.bin; other languages use ggml-base.bin.

Setup & installation

Setup, dependencies, install steps, and hotkey configuration are in INSTALLATION.md.

Usage

# print the installed or checkout version
morvox --version

# toggle (start, then stop+transcribe+type)
morvox

# fallback if you prefer module execution
python -m morvox

# status (for i3blocks / polybar)
morvox --status        # prints "recording" or "idle"

# abort an in-flight recording without transcribing
morvox --cancel

# keep the wav/txt around for debugging
morvox --keep-temp

# use a different model / language / source / thread count / typing delay
morvox --model /path/to/ggml-tiny.en.bin
morvox --lang es
morvox --source alsa_input.usb-Maono_Maonocaster…
morvox --threads 8
morvox --type-delay 5

# disable the floating widget (headless / SSH / debugging)
morvox --no-widget

When you use toggle-time options such as --lang es, invoke morvox with the same flags on both presses.

From a source checkout, you can still run ./morvox before installing, including ./morvox --version.
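The --status output listed above ("recording" or "idle") can feed a status bar. The following is a small sketch of a wrapper script; the labels "REC" and "?" and the function names are illustrative choices, not part of morvox.

```python
import subprocess


def format_status(raw: str) -> str:
    """Map the text printed by `morvox --status` to a short bar label."""
    state = raw.strip()
    if state == "recording":
        return "REC"
    if state == "idle":
        return ""
    return "?"  # unexpected or missing output; shown rather than hidden


def read_status() -> str:
    """Run `morvox --status` and return its stdout ("" if it cannot be run)."""
    try:
        result = subprocess.run(
            ["morvox", "--status"], capture_output=True, text=True, timeout=2
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return ""
    return result.stdout


if __name__ == "__main__":
    print(format_status(read_status()))
```

Separating the formatting from the subprocess call keeps the label logic easy to check without morvox installed.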

If you use the built-in managed model, morvox downloads it on first use to $XDG_CACHE_HOME/morvox/models/ or ~/.cache/morvox/models/. English uses ggml-base.en.bin; non-English languages such as --lang es use ggml-base.bin. Custom --model paths are not auto-downloaded and must already exist.
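The model-selection and cache-location rules in this paragraph can be written down as a short sketch. The function names are hypothetical, and the code only computes the expected path; it does not download anything or reflect how morvox implements the lookup internally.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def default_model_name(lang: str) -> str:
    """English gets the English-only model; everything else the multilingual one."""
    return "ggml-base.en.bin" if lang == "en" else "ggml-base.bin"


def model_cache_dir(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    """Return $XDG_CACHE_HOME/morvox/models, or ~/.cache/morvox/models if unset."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    xdg = env.get("XDG_CACHE_HOME")
    base = Path(xdg) if xdg else home / ".cache"
    return base / "morvox" / "models"


def default_model_path(
    lang: str,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    """Combine the cache directory and the per-language model file name."""
    return model_cache_dir(env, home) / default_model_name(lang)


if __name__ == "__main__":
    print(default_model_path("en"))
    print(default_model_path("es"))
```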

State files live in $XDG_RUNTIME_DIR/morvox/ on Linux, falling back to /tmp/morvox-$UID/ when $XDG_RUNTIME_DIR is unset; ~/Library/Caches/morvox/ on macOS; and %LOCALAPPDATA%\morvox\ on Windows. Set the MORVOX_STATE_DIR environment variable to override the location. The directory contains:

  • rec.pid — recorder PID
  • target_window — saved focused window id
  • rec.wav / rec.txt — audio + transcript
  • parecord.log / whisper.log — diagnostic logs

By default these files are deleted once the transcript has been typed successfully. Pass --keep-temp to keep them.
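The per-platform state directory rules above can be summarised in a sketch like the one below. The function name state_dir is hypothetical, and the fallback used when %LOCALAPPDATA% is missing on Windows is an assumption that the text above does not cover.

```python
import os
import sys
from pathlib import Path
from typing import Mapping


def state_dir(env: Mapping[str, str], platform: str, uid: int, home: Path) -> Path:
    """Resolve the state directory following the documented precedence."""
    override = env.get("MORVOX_STATE_DIR")
    if override:
        return Path(override)

    if platform == "darwin":
        return home / "Library" / "Caches" / "morvox"

    if platform.startswith("win"):
        # Assumption: fall back to ~/AppData/Local when LOCALAPPDATA is unset.
        local = env.get("LOCALAPPDATA") or str(home / "AppData" / "Local")
        return Path(local) / "morvox"

    # Linux: prefer $XDG_RUNTIME_DIR, else /tmp/morvox-$UID.
    runtime = env.get("XDG_RUNTIME_DIR")
    if runtime:
        return Path(runtime) / "morvox"
    return Path(f"/tmp/morvox-{uid}")


if __name__ == "__main__":
    # os.getuid does not exist on Windows, so default to 0 there.
    current_uid = getattr(os, "getuid", lambda: 0)()
    print(state_dir(os.environ, sys.platform, current_uid, Path.home()))
```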

The widget

While recording, morvox shows a small borderless window centred near the bottom of the screen. It contains:

  • a pulsing red dot (recording indicator),
  • a rolling live transcription preview, shown above the VU meter,
  • a live VU meter that reacts to your microphone level,
  • an elapsed-time counter.
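For readers curious how a VU meter like the one in the list above can derive a level from captured audio, here is a generic sketch. It is not morvox's implementation: it assumes 16-bit little-endian mono PCM, uses a plain RMS-to-dBFS conversion, and the -60 dB floor and both function names are illustrative choices.

```python
import math
import struct


def rms_dbfs(pcm: bytes) -> float:
    """Return the RMS level of 16-bit little-endian mono PCM in dBFS."""
    count = len(pcm) // 2
    if count == 0:
        return float("-inf")
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    if mean_square == 0:
        return float("-inf")  # digital silence
    # 32768 is full scale for signed 16-bit samples.
    return 20.0 * math.log10(math.sqrt(mean_square) / 32768.0)


def meter_fraction(dbfs: float, floor_db: float = -60.0) -> float:
    """Map a dBFS value onto 0.0..1.0 so it can be drawn as a bar."""
    if dbfs <= floor_db:
        return 0.0
    return min(1.0, 1.0 - dbfs / floor_db)


if __name__ == "__main__":
    chunk = struct.pack("<4h", 1000, -1000, 1000, -1000)
    level = rms_dbfs(chunk)
    print(round(level, 1), round(meter_fraction(level), 2))
```

A logarithmic scale is the usual choice here because perceived loudness tracks decibels more closely than raw amplitude.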

When you stop recording, the meter is replaced by a "Transcribing…" spinner that stays visible until whisper finishes and the transcript has been typed. If whisper produced only silence, the widget briefly shows "No speech detected" instead.

The widget is a self-spawned subprocess of morvox (uses Python's stdlib tkinter). Its stderr is written to the platform state dir's widget.log for debugging. On Linux/X11 it uses _NET_WM_WINDOW_TYPE_DOCK so i3 won't try to tile it. On Wayland-only sessions without XWayland, or on hosts without $DISPLAY, the widget is skipped silently.

To disable the widget entirely (e.g. on a headless machine or over SSH), pass --no-widget.
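The conditions under which the widget is skipped can be approximated with a check like the one below. This is a sketch under assumptions: the function names are hypothetical, and the DISPLAY test is only a rough stand-in for however morvox actually detects a usable X11 or XWayland display.

```python
import os
import sys
from typing import Mapping, Optional


def has_tkinter() -> bool:
    """Report whether this Python build can import tkinter."""
    try:
        import tkinter  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


def widget_available(
    env: Optional[Mapping[str, str]] = None,
    platform: Optional[str] = None,
    no_widget: bool = False,
) -> bool:
    """Decide whether a tkinter widget could be shown in this environment."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    if no_widget:
        return False  # mirrors passing --no-widget
    if not has_tkinter():
        return False
    # Assumption: on Linux an empty or missing DISPLAY means no X11/XWayland.
    if platform.startswith("linux") and not env.get("DISPLAY"):
        return False
    return True


if __name__ == "__main__":
    print(widget_available())
```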

Troubleshooting

  • No audio recorded / empty wav (Linux)
    Check the active sources with pactl list short sources, then pass an explicit source with --source <NAME>. Inspect $XDG_RUNTIME_DIR/morvox/parecord.log or /tmp/morvox-$UID/parecord.log.

  • No audio recorded / empty wav (macOS)
    List devices with ffmpeg -f avfoundation -list_devices true -i "" and pass an explicit --source :<idx>. Inspect ~/Library/Caches/morvox/parecord.log. If ffmpeg complains about permissions, grant the terminal Microphone access.

  • No audio recorded / empty wav (Windows)
    List audio devices with ffmpeg -list_devices true -f wasapi -i dummy (or ffmpeg -list_devices true -f dshow -i dummy if your ffmpeg build lacks WASAPI) and pass an explicit --source "<device name>". Inspect %LOCALAPPDATA%\morvox\parecord.log. If ffmpeg cannot access the microphone, check Settings → Privacy & security → Microphone.

  • Text typed into wrong window
    On Linux and macOS, the originally focused window/app may have been destroyed before you stopped recording. morvox falls back to typing into whatever is currently focused and prints a warning to stderr. On Windows, morvox intentionally types into the window that is focused when transcription finishes.

  • Linux Wayland: nothing is typed (GNOME/Ubuntu)
    GNOME/Mutter doesn't implement the Wayland virtual-keyboard protocol that wtype relies on, and xdotool is a no-op against native Wayland windows. Either set up ydotoold (sudo systemctl enable --now ydotoold and add your user to the input group), or install wl-clipboard so the transcript lands on your clipboard and you can paste it manually with Ctrl+Shift+V. If you launch morvox from a GNOME custom shortcut, prefer /bin/sh -lc 'morvox >/dev/null 2>/dev/null' to avoid occasional transcription hangs; replace morvox with your checkout path if needed. See INSTALLATION.md (Linux / Wayland).

  • Linux: widget never appears (asdf/pyenv/conda Python)
    The widget runs as a Python subprocess and needs tkinter. Many third-party Python builds ship without it. Check $XDG_RUNTIME_DIR/morvox/widget.log or /tmp/morvox-$UID/widget.log for No module named 'tkinter'. Install python3-tk and run morvox under the system Python, rebuild your managed Python with Tk support, or use --no-widget to silence the warning.

  • macOS: keystrokes silently do nothing
    Accessibility permission isn't granted. Go to System Settings → Privacy & Security → Accessibility and enable your terminal app.

  • Windows: text does not type into an elevated app
    Windows blocks lower-integrity processes from injecting keystrokes into elevated/admin windows. Run morvox from an elevated terminal too, or type into a non-elevated app.

  • Windows: transcript only appears on the clipboard
    On Windows 11, morvox first tries several automatic paste methods into the currently focused window and then falls back to direct typing. If all of those are blocked by the app or OS policy, morvox leaves the transcript on the clipboard so you can paste it manually. Inspect %LOCALAPPDATA%\morvox\whisper.log for a windows-insert: trace showing which insertion path ran and what failed.

  • Whisper too slow
    Use a smaller model: ggml-tiny.en.bin is roughly 5× faster than base.en with a small accuracy hit. Increase --threads up to your physical core count.

  • Default model keeps re-downloading unexpectedly
    The built-in model cache lives under $XDG_CACHE_HOME/morvox/models/ or ~/.cache/morvox/models/. If your environment sets XDG_CACHE_HOME to a temporary location, point it at a persistent cache directory.

  • Nothing is typed and the notification says "Empty recording"
    Whisper produced only a noise token (e.g. [BLANK_AUDIO]). Speak closer to the mic or check input gain.
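To illustrate the "Empty recording" case from the last entry, the sketch below treats a transcript as empty when nothing but square-bracketed markers and whitespace remains. The rule that every bracketed token counts as noise is an assumption made for this example, and is_empty_transcript is a hypothetical name; morvox may use a different check.

```python
import re

# Matches tokens such as [BLANK_AUDIO]; assumption: all bracketed tokens are noise.
_BRACKETED = re.compile(r"\[[^\]]*\]")


def is_empty_transcript(text: str) -> bool:
    """Return True when the transcript holds no words besides noise markers."""
    return _BRACKETED.sub("", text).strip() == ""


if __name__ == "__main__":
    for sample in ("[BLANK_AUDIO]", "hello world", " [BLANK_AUDIO] \n"):
        print(repr(sample), is_empty_transcript(sample))
```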

License

MIT — see LICENSE.

