Speak → text, locally, instantly.

voiceio

 ██╗   ██╗ ██████╗ ██╗ ██████╗███████╗██╗ ██████╗
 ██║   ██║██╔═══██╗██║██╔════╝██╔════╝██║██╔═══██╗
 ██║   ██║██║   ██║██║██║     █████╗  ██║██║   ██║
 ╚██╗ ██╔╝██║   ██║██║██║     ██╔══╝  ██║██║   ██║
  ╚████╔╝ ╚██████╔╝██║╚██████╗███████╗██║╚██████╔╝
   ╚═══╝   ╚═════╝ ╚═╝ ╚═════╝╚══════╝╚═╝ ╚═════╝

https://github.com/user-attachments/assets/9cf5d1ac-b4bb-4cf8-b775-7a66dc16b376

Quick start

# 1. Install system dependencies (Ubuntu/Debian)
sudo apt install pipx ibus gir1.2-ibus-1.0 python3-gi python3-dev portaudio19-dev

# 2. Install voiceio
pipx install python-voiceio

# 3. Run the setup wizard
voiceio setup

That's it. Press Ctrl+Alt+V (or your chosen hotkey) to start dictating.

Fedora

sudo dnf install pipx ibus python3-gobject python3-devel portaudio-devel
pipx install python-voiceio
voiceio setup

Arch Linux

sudo pacman -S python-pipx ibus python-gobject portaudio
# Note: Arch includes Python headers by default with the python package
pipx install python-voiceio
voiceio setup

Windows

# Option A: Install with pip (requires Python 3.11+)
pip install python-voiceio
voiceio setup

# Option B: Download the installer from GitHub Releases (no Python needed)
# https://github.com/Hugo0/voiceio/releases
# Also available as a portable .zip if you prefer no installation.

Windows uses pynput for hotkeys and text injection. No extra system dependencies required.

macOS

pipx install python-voiceio
voiceio setup

Build from source

Build from source if you want the code locally to hack on or customize for personal use. PRs are welcome!

git clone https://github.com/Hugo0/voiceio
cd voiceio
uv pip install -e ".[linux,dev]"

# Bootstrap CLI commands onto PATH (creates ~/.local/bin/voiceio)
uv run voiceio setup

Note: Source installs live inside a virtualenv, so voiceio isn't on PATH until setup creates symlinks in ~/.local/bin/. If voiceio isn't found after setup, restart your terminal or run export PATH="$HOME/.local/bin:$PATH".

You can also install with uv tool install python-voiceio or pip install python-voiceio.

How it works

hotkey → mic capture → whisper (local) → text at cursor
          pre-buffered   streaming        IBus / clipboard

Press your hotkey to start recording; a 1 s pre-buffer catches the first syllable. Text streams into the focused app as an underlined preview. Press the hotkey again to commit. Transcription runs locally via faster-whisper, and text is injected through IBus (any GTK/Qt app), with a clipboard fallback for terminals.
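
The pre-buffer can be pictured as a fixed-size ring buffer that records continuously and is prepended to the capture when the hotkey fires. The following is a minimal sketch of that idea, not voiceio's actual implementation: the class name PreBuffer, the 16 kHz sample rate, and the 100 ms chunk size are all assumptions chosen for illustration.

```python
from collections import deque


class PreBuffer:
    """Keep only the most recent `seconds` of audio (hypothetical helper)."""

    def __init__(self, seconds: float = 1.0, sample_rate: int = 16_000,
                 chunk_size: int = 1_600):
        # Number of whole chunks that fit in the pre-buffer window.
        max_chunks = max(1, int(seconds * sample_rate / chunk_size))
        self._chunks = deque(maxlen=max_chunks)

    def push(self, chunk: list[int]) -> None:
        # deque(maxlen=...) drops the oldest chunk automatically once full.
        self._chunks.append(chunk)

    def drain(self) -> list[int]:
        # Flatten the buffered chunks (oldest first) and reset the buffer.
        samples = [s for chunk in self._chunks for s in chunk]
        self._chunks.clear()
        return samples


# Simulate 3 seconds of idle microphone input in 100 ms chunks.
buf = PreBuffer()
for i in range(30):
    buf.push([i] * 1_600)

# On hotkey press, the last second of audio is prepended to the recording.
head = buf.drain()
print(len(head), head[0], head[-1])
```

Because the buffer is bounded, memory use stays constant no matter how long the daemon idles between dictations.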

Features

  • Streaming: text appears as you speak, not after you stop
  • Works everywhere: IBus input method for GUI apps, clipboard for terminals
  • Wayland + X11: evdev hotkeys work on both, no root required
  • Pre-buffer: never miss the first syllable
  • Voice commands: "new line", "comma", "scratch that", punctuation by name
  • Autocorrect: LLM-powered review of recurring Whisper mistakes (voiceio correct)
  • Text-to-speech: hear selected text spoken back (Piper, eSpeak, Edge TTS)
  • Smart post-processing: numbers ("twenty five" → "25"), punctuation, capitalization
  • Auto-healing: falls back to the next working backend if one fails
  • Autostart: optional systemd service, restarts on crash
  • Self-diagnosing: voiceio doctor checks everything, --fix repairs it
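
To make the voice-command feature concrete, here is a small sketch of how spoken phrases could be rewritten after transcription. It is an illustration under assumptions, not voiceio's code: the REPLACEMENTS table, the apply_commands function, and the rule that "scratch that" discards the previous segment are all hypothetical.

```python
import re

# Hypothetical spoken-phrase -> symbol table; voiceio's real list may differ.
REPLACEMENTS = {
    "new line": "\n",
    "comma": ",",
    "period": ".",
    "question mark": "?",
}


def apply_commands(segments: list[str]) -> str:
    """Turn transcribed segments into text, honouring simple voice commands."""
    kept: list[str] = []
    for segment in segments:
        text = segment.strip()
        # Assumption: "scratch that" removes the previously dictated segment.
        if text.lower().rstrip(".!") == "scratch that":
            if kept:
                kept.pop()
            continue
        for phrase, symbol in REPLACEMENTS.items():
            # Swallow the space before the phrase so "hello comma" -> "hello,".
            pattern = rf"\s*\b{re.escape(phrase)}\b"
            text = re.sub(pattern, lambda _m, s=symbol: s, text,
                          flags=re.IGNORECASE)
        kept.append(text)
    # Join segments and trim stray spaces around inserted line breaks.
    return re.sub(r" *\n *", "\n", " ".join(kept))


result = apply_commands([
    "hello comma world",
    "this is wrong",
    "scratch that",
    "new line second line period",
])
print(result)
```

A table-driven design like this keeps the command vocabulary easy to extend, at the cost of occasionally rewriting a word the speaker meant literally.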

Models

Model      Size     Speed           Accuracy   Good for
tiny       75 MB    ~10x realtime   Basic      Quick notes, low-end hardware
base       150 MB   ~7x realtime    Good       Daily use (default)
small      500 MB   ~4x realtime    Better     Longer dictation
medium     1.5 GB   ~2x realtime    Great      Accuracy-sensitive work
large-v3   3 GB     ~1x realtime    Best       Maximum quality, GPU recommended

Models download automatically on first use. Switch anytime: voiceio --model small.
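
As a rough guide to reading the Speed column, "~Nx realtime" can be taken to mean that N seconds of audio are processed per second of compute. The sketch below applies that reading to the approximate figures from the table; the function name estimated_seconds is hypothetical, and real throughput depends on your hardware.

```python
# Approximate speed factors copied from the models table ("~Nx realtime").
REALTIME_FACTOR = {"tiny": 10, "base": 7, "small": 4, "medium": 2, "large-v3": 1}


def estimated_seconds(model: str, audio_seconds: float) -> float:
    """Estimate transcription time for a clip, assuming the table's factors."""
    return audio_seconds / REALTIME_FACTOR[model]


for name in REALTIME_FACTOR:
    print(f"{name:>8}: ~{estimated_seconds(name, 60):.1f} s for 60 s of speech")
```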

Commands

voiceio                  Start the daemon
voiceio setup            Interactive setup wizard
voiceio doctor           Health check (--fix to auto-repair)
voiceio test             Test microphone + live transcription
voiceio demo             Interactive guided tour of all features
voiceio toggle           Toggle recording on a running daemon
voiceio correct          Review and fix recurring transcription errors
voiceio history          View transcription history
voiceio update           Update to latest version
voiceio service install  Autostart on login (systemd / Windows Startup)
voiceio logs             View recent logs
voiceio uninstall        Remove all system integrations

Configuration

voiceio setup handles everything interactively. To tweak later, edit the config file or override at runtime:

  • Linux/macOS: ~/.config/voiceio/config.toml
  • Windows: %LOCALAPPDATA%\voiceio\config\config.toml

voiceio --model large-v3 --language auto -v

See config.example.toml for all options.
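
If you script around voiceio, the two documented config locations can be resolved as in the sketch below. The helper config_path is hypothetical and only mirrors the paths listed above; it does not consult XDG_CONFIG_HOME or any other override that voiceio itself may or may not honour.

```python
import os
import sys
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def config_path(platform: str, home: str, local_app_data: str = "") -> PurePath:
    """Return the documented config.toml location for the given platform."""
    if platform.startswith("win"):
        # Windows: %LOCALAPPDATA%\voiceio\config\config.toml
        return PureWindowsPath(local_app_data, "voiceio", "config", "config.toml")
    # Linux/macOS: ~/.config/voiceio/config.toml
    return PurePosixPath(home, ".config", "voiceio", "config.toml")


# Resolve the path for the machine this script runs on.
print(config_path(sys.platform, os.path.expanduser("~"),
                  os.environ.get("LOCALAPPDATA", "")))
```

Passing the platform and directories in as arguments keeps the function easy to check for both layouts from a single machine.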

Troubleshooting

voiceio doctor           # see what's working
voiceio doctor --fix     # auto-fix issues
voiceio logs             # check debug output

Problem                             Fix
No text appears                     voiceio doctor --fix (usually a missing IBus component or GNOME input source)
Hotkey doesn't work on Wayland      sudo usermod -aG input $USER, then log out and back in
Transcription too slow              Use a smaller model: voiceio --model tiny
Want to start fresh                 voiceio uninstall, then voiceio setup
Windows: antivirus blocks hotkeys   pynput uses global keyboard hooks; add an exception for voiceio
Windows: no sound feedback          Check voiceio logs for audio device info
macOS issues                        Experimental; consider aquavoice.com or contribute a PR

Platform support

Platform                           Status         Text injection           Hotkeys                  Streaming preview
Ubuntu / Debian (GNOME, Wayland)   Tested daily   IBus                     evdev / GNOME shortcut   Yes
Ubuntu / Debian (GNOME, X11)       Supported      IBus                     evdev / pynput           Yes
Fedora (GNOME)                     Supported      IBus                     evdev / GNOME shortcut   Yes
Arch Linux                         Supported      IBus                     evdev                    Yes
KDE / Sway / Hyprland              Should work    IBus / ydotool / wtype   evdev                    Yes
Windows 10/11                      Experimental   pynput / clipboard       pynput                   Type-and-correct (no preedit)
macOS                              Experimental   pynput / clipboard       pynput                   Type-and-correct (no preedit)

voiceio auto-detects your platform and picks the best available backends. Run voiceio doctor to see what's working on your system.
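
Backend auto-detection with fallback can be sketched as "try each candidate in order of preference and keep the first one whose probe succeeds". The code below illustrates that pattern only. The function pick_backend, the candidate order, and the use of shutil.which as a probe are assumptions; voiceio doctor presumably performs more thorough checks.

```python
import shutil
from typing import Callable


def pick_backend(candidates: list[tuple[str, Callable[[], bool]]]) -> str:
    """Return the name of the first candidate whose probe reports success."""
    for name, probe in candidates:
        try:
            if probe():
                return name
        except Exception:
            # A probe that crashes is treated as "backend unavailable".
            continue
    raise RuntimeError("no working backend found")


# Hypothetical preference order; a binary on PATH is only a crude availability proxy.
candidates = [
    ("ibus", lambda: shutil.which("ibus") is not None),
    ("ydotool", lambda: shutil.which("ydotool") is not None),
    ("wtype", lambda: shutil.which("wtype") is not None),
    ("clipboard", lambda: True),  # last resort, always accepted in this sketch
]

print(pick_backend(candidates))
```

Re-running the same selection after a backend fails at runtime gives the "auto-healing" behaviour: the failed backend's probe returns False and the next candidate takes over.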

Uninstall

voiceio uninstall        # removes service, IBus, shortcuts, symlinks
pipx uninstall python-voiceio   # removes the package

Roadmap

Contributions welcome! See CONTRIBUTING.md and open issues.

Now

  • macOS polish (IMKit for native preedit, Accessibility API for text injection)

Soon

  • Per-app context awareness (detect focused app, adapt formatting/behavior)
  • File/audio transcription mode (voiceio transcribe recording.mp3)

Backlog

  • Multiple engine backends (whisper.cpp for Vulkan/AMD, VOSK for low-end hardware)
  • Echo cancellation (filter system audio for meeting use)
  • Wake word activation ("Hey voiceio")

Done

  • Text-to-speech output (Piper/eSpeak/Edge TTS, completes the "io")
  • LLM auto-audit dictionary (voiceio correct --auto — scan history with LLM, interactive correction)
  • LLM post-processing via Ollama (grammar cleanup, spelling fixes on final pass)
  • Corrections dictionary — auto-replace misheard words, "correct that" voice command
  • Transcription history — searchable log of everything you've dictated
  • Number-to-digit conversion ("three hundred forty two" → "342")
  • VAD-based silence filtering (Silero VAD, prevents Whisper hallucinations)
  • Voice commands — "new line", "new paragraph", "scratch that", punctuation by name
  • Custom vocabulary / personal dictionary (bias Whisper via initial_prompt)
  • Smart punctuation & capitalization post-processing
  • Windows support
  • System tray icon with animated states
  • Auto-stop on silence

License

MIT

