voiceio

Speak → text, locally, instantly.

Quick start

# 1. Install system dependencies (Ubuntu/Debian)
sudo apt install pipx ibus gir1.2-ibus-1.0 python3-gi portaudio19-dev

# 2. Install voiceio
pipx install python-voiceio

# 3. Run the setup wizard
voiceio setup

That's it. Press Ctrl+Alt+V (or your chosen hotkey) to start dictating.

Fedora

sudo dnf install pipx ibus python3-gobject portaudio-devel
pipx install python-voiceio
voiceio setup

Arch Linux

sudo pacman -S python-pipx ibus python-gobject portaudio
pipx install python-voiceio
voiceio setup

Windows

# Option A: Install with pip (requires Python 3.11+)
pip install python-voiceio
voiceio setup

# Option B: Download the installer from GitHub Releases (no Python needed)
# https://github.com/Hugo0/voiceio/releases
# Also available as a portable .zip if you prefer no installation.

Windows uses pynput for hotkeys and text injection. No extra system dependencies required.

macOS

pipx install python-voiceio
voiceio setup

Build from source

Build from source if you want the code locally to hack on or customize for personal use. PRs are welcome!

git clone https://github.com/Hugo0/voiceio
cd voiceio
uv pip install -e ".[linux,dev]"

# Bootstrap CLI commands onto PATH (creates ~/.local/bin/voiceio)
uv run voiceio setup

Note: Source installs live inside a virtualenv, so voiceio isn't on PATH until setup creates symlinks in ~/.local/bin/. If voiceio isn't found after setup, restart your terminal or run export PATH="$HOME/.local/bin:$PATH".

You can also install with uv tool install python-voiceio or pip install python-voiceio.

How it works

hotkey → mic capture → whisper (local) → text at cursor
          pre-buffered   streaming        IBus / clipboard

Press your hotkey to start recording; a 1 s pre-buffer catches the first syllable. Text streams into the focused app as an underlined preview. Press the hotkey again to commit. Transcription runs locally via faster-whisper. Text is injected through IBus (any GTK/Qt app), with a clipboard fallback for terminals.
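The pre-buffer idea can be illustrated with a small ring buffer: audio chunks are captured continuously, only the most recent second is kept, and that second is prepended to the recording when the hotkey fires. This is a minimal sketch, not voiceio's actual code; the class name PreBuffer, the 16 kHz sample rate, and the chunk size are assumptions chosen for illustration.

```python
from collections import deque


class PreBuffer:
    """Keep only the most recent `seconds` of audio chunks (hypothetical helper)."""

    def __init__(self, seconds: float, sample_rate: int, chunk_size: int) -> None:
        # Number of fixed-size chunks needed to cover the requested duration.
        max_chunks = max(1, round(seconds * sample_rate / chunk_size))
        # A deque with maxlen silently drops the oldest chunk when full.
        self._chunks: deque[bytes] = deque(maxlen=max_chunks)

    def push(self, chunk: bytes) -> None:
        """Called for every captured chunk, even while not recording."""
        self._chunks.append(chunk)

    def drain(self) -> list[bytes]:
        """Return the buffered chunks (oldest first) and empty the buffer."""
        out = list(self._chunks)
        self._chunks.clear()
        return out


# Assumed capture settings: 16 kHz audio in 0.1 s chunks, so 1 s is 10 chunks.
buf = PreBuffer(seconds=1.0, sample_rate=16_000, chunk_size=1_600)
for i in range(25):
    buf.push(bytes([i]))       # stand-in for real audio data
head = buf.drain()             # the last 10 chunks, ready to prepend
```

When the hotkey is pressed, the drained chunks would go in front of the live stream so that speech started slightly before the key press is not lost.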

Features

Streaming: text appears as you speak, not after you stop
Works everywhere: IBus input method for GUI apps, clipboard for terminals
Wayland + X11: evdev hotkeys work on both, no root required
Pre-buffer: never miss the first syllable
Voice commands: "new line", "comma", "scratch that", punctuation by name
Autocorrect: LLM-powered review of recurring Whisper mistakes (voiceio correct)
Text-to-speech: hear selected text spoken back (Piper, eSpeak, Edge TTS)
Smart post-processing: numbers ("twenty five" → "25"), punctuation, capitalization
Auto-healing: falls back to the next working backend if one fails
Autostart: optional systemd service, restarts on crash
Self-diagnosing: voiceio doctor checks everything, --fix repairs it
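To make the voice-command item above concrete, here is a minimal sketch of how spoken phrases could be turned into punctuation and edits. The phrase table and the function name apply_commands are assumptions for illustration; voiceio's real command set and implementation may differ.

```python
import re

# Hypothetical phrase table; only a few commands are shown.
PUNCTUATION = {"comma": ",", "period": ".", "question mark": "?"}


def apply_commands(utterances: list[str]) -> str:
    """Apply spoken commands to a sequence of utterances and join the result."""
    kept: list[str] = []
    for utterance in utterances:
        # "scratch that" discards the previous utterance instead of being typed.
        if utterance.strip().lower() == "scratch that":
            if kept:
                kept.pop()
            continue
        text = utterance
        for phrase, symbol in PUNCTUATION.items():
            # Swallow the space before the phrase so "hello comma" becomes "hello,".
            text = re.sub(rf"\s*\b{re.escape(phrase)}\b", symbol, text, flags=re.IGNORECASE)
        # "new line" becomes a line break with no surrounding spaces.
        text = re.sub(r"\s*\bnew line\b\s*", "\n", text, flags=re.IGNORECASE)
        kept.append(text.strip(" "))
    return " ".join(kept)


result = apply_commands(
    ["hello comma world", "oops", "scratch that", "how are you question mark"]
)
# result == "hello, world how are you?"
```

A table-driven design keeps the command list easy to extend, at the cost of treating every occurrence of a phrase as a command, even when the speaker meant the literal word.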

Models

Model	Size	Speed	Accuracy	Good for
`tiny`	75 MB	~10x realtime	Basic	Quick notes, low-end hardware
`base`	150 MB	~7x realtime	Good	Daily use (default)
`small`	500 MB	~4x realtime	Better	Longer dictation
`medium`	1.5 GB	~2x realtime	Great	Accuracy-sensitive work
`large-v3`	3 GB	~1x realtime	Best	Maximum quality, GPU recommended

Models download automatically on first use. Switch anytime: voiceio --model small.

Commands

voiceio                  Start the daemon
voiceio setup            Interactive setup wizard
voiceio doctor           Health check (--fix to auto-repair)
voiceio test             Test microphone + live transcription
voiceio demo             Interactive guided tour of all features
voiceio toggle           Toggle recording on a running daemon
voiceio correct          Review and fix recurring transcription errors
voiceio history          View transcription history
voiceio update           Update to latest version
voiceio service install  Autostart on login (systemd / Windows Startup)
voiceio logs             View recent logs
voiceio uninstall        Remove all system integrations

Configuration

voiceio setup handles everything interactively. To tweak later, edit the config file or override at runtime:

Linux/macOS: ~/.config/voiceio/config.toml
Windows: %LOCALAPPDATA%\voiceio\config\config.toml

voiceio --model large-v3 --language auto -v

See config.example.toml for all options.

Troubleshooting

voiceio doctor           # see what's working
voiceio doctor --fix     # auto-fix issues
voiceio logs             # check debug output

Problem	Fix
No text appears	`voiceio doctor --fix` - usually a missing IBus component or GNOME input source
Hotkey doesn't work on Wayland	`sudo usermod -aG input $USER` then log out and back in
Transcription too slow	Use a smaller model: `voiceio --model tiny`
Want to start fresh	`voiceio uninstall` then `voiceio setup`
Windows: antivirus blocks hotkeys	pynput uses global keyboard hooks — add an exception for voiceio
Windows: no sound feedback	Check `voiceio logs` for audio device info
macOS issues	Experimental — consider aquavoice.com or contribute a PR

Platform support

Platform	Status	Text injection	Hotkeys	Streaming preview
Ubuntu / Debian (GNOME, Wayland)	Tested daily	IBus	evdev / GNOME shortcut	Yes
Ubuntu / Debian (GNOME, X11)	Supported	IBus	evdev / pynput	Yes
Fedora (GNOME)	Supported	IBus	evdev / GNOME shortcut	Yes
Arch Linux	Supported	IBus	evdev	Yes
KDE / Sway / Hyprland	Should work	IBus / ydotool / wtype	evdev	Yes
Windows 10/11	Experimental	pynput / clipboard	pynput	Type-and-correct (no preedit)
macOS	Experimental	pynput / clipboard	pynput	Type-and-correct (no preedit)

voiceio auto-detects your platform and picks the best available backends. Run voiceio doctor to see what's working on your system.
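Backend auto-detection with fallback can be pictured as trying candidates in priority order and keeping the first one that passes a probe. The sketch below is illustrative only: the probe functions, the exception name, and the error text are assumptions, and voiceio's real detection logic is not shown here.

```python
from typing import Callable


class BackendUnavailable(Exception):
    """Raised when no candidate backend passes its probe (hypothetical)."""


def first_working(backends: list[tuple[str, Callable[[], None]]]) -> str:
    """Probe backends in priority order and return the first working name."""
    errors: list[str] = []
    for name, probe in backends:
        try:
            probe()  # a probe raises if the backend cannot be used
        except Exception as exc:
            errors.append(f"{name}: {exc}")
            continue
        return name
    raise BackendUnavailable("; ".join(errors))


def probe_ibus() -> None:
    # Stand-in for a real check; here it always fails to show the fallback.
    raise RuntimeError("IBus component not registered")


def probe_clipboard() -> None:
    # Stand-in for a real check; succeeds by returning normally.
    return None


chosen = first_working([("ibus", probe_ibus), ("clipboard", probe_clipboard)])
# chosen == "clipboard"
```

Collecting the error from each failed probe makes the final exception useful for a doctor-style report of what was tried and why it failed.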

Uninstall

voiceio uninstall        # removes service, IBus, shortcuts, symlinks
pipx uninstall python-voiceio   # removes the package

Roadmap

Contributions welcome! See CONTRIBUTING.md and open issues.

Now

macOS polish (IMKit for native preedit, Accessibility API for text injection)

Soon

Per-app context awareness (detect focused app, adapt formatting/behavior)
File/audio transcription mode (voiceio transcribe recording.mp3)

Backlog

Multiple engine backends (whisper.cpp for Vulkan/AMD, VOSK for low-end hardware)
Echo cancellation (filter system audio for meeting use)
Wake word activation ("Hey voiceio")

Done

Text-to-speech output (Piper/eSpeak/Edge TTS — completes the "io")
LLM auto-audit dictionary (voiceio correct --auto — scan history with LLM, interactive correction)
LLM post-processing via Ollama (grammar cleanup, spelling fixes on final pass)
Corrections dictionary — auto-replace misheard words, "correct that" voice command
Transcription history — searchable log of everything you've dictated
Number-to-digit conversion ("three hundred forty two" → "342")
VAD-based silence filtering (Silero VAD, prevents Whisper hallucinations)
Voice commands — "new line", "new paragraph", "scratch that", punctuation by name
Custom vocabulary / personal dictionary (bias Whisper via initial_prompt)
Smart punctuation & capitalization post-processing
Windows support
System tray icon with animated states
Auto-stop on silence

License

MIT

Release history

Version	Released
0.3.12	Apr 24, 2026
0.3.11	Apr 13, 2026
0.3.10	Apr 10, 2026
0.3.9	Apr 9, 2026
0.3.8 (this version)	Apr 9, 2026
0.3.7	Apr 8, 2026
0.3.6	Apr 8, 2026
0.3.5	Mar 15, 2026
0.3.4	Mar 14, 2026
0.3.3	Mar 14, 2026
0.3.2	Mar 12, 2026
0.3.1	Mar 11, 2026
0.3.0	Mar 11, 2026
0.2.4	Mar 9, 2026
0.2.3	Mar 9, 2026
0.2.1	Mar 9, 2026
0.2.0	Mar 9, 2026

Download files

Source Distribution

python_voiceio-0.3.8.tar.gz (2.1 MB)

Uploaded Apr 9, 2026 Source

Built Distribution

python_voiceio-0.3.8-py3-none-any.whl (2.1 MB)

Uploaded Apr 9, 2026 Python 3

File details

Details for the file python_voiceio-0.3.8.tar.gz.

File metadata

Download URL: python_voiceio-0.3.8.tar.gz
Upload date: Apr 9, 2026
Size: 2.1 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for python_voiceio-0.3.8.tar.gz
Algorithm	Hash digest
SHA256	`04262531f07c75db3affdbdb8525ed935394dcce3b747e76bc9de17cd3f340ed`
MD5	`a082868fc06bfa2fcc39b85ac07828ad`
BLAKE2b-256	`c0bc4f417ab6a416a0ff09285c86a5620bda1edf6d3c0d5334a025539acaf284`
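A downloaded file can be checked against the SHA256 digest in the table with the standard hashlib module. This is a small sketch: the helper names are hypothetical, and the expected value is the digest listed above for python_voiceio-0.3.8.tar.gz.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest listed above for python_voiceio-0.3.8.tar.gz.
EXPECTED_SHA256 = "04262531f07c75db3affdbdb8525ed935394dcce3b747e76bc9de17cd3f340ed"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest matches the expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected.lower())
```

Calling verify(Path("python_voiceio-0.3.8.tar.gz")) in the download directory returns True only if the file matches the published digest.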

Provenance

The following attestation bundles were made for python_voiceio-0.3.8.tar.gz:

Publisher: publish.yml on Hugo0/voiceio

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: python_voiceio-0.3.8.tar.gz
- Subject digest: 04262531f07c75db3affdbdb8525ed935394dcce3b747e76bc9de17cd3f340ed
- Sigstore transparency entry: 1261977390
- Sigstore integration time: Apr 9, 2026
Source repository:
- Permalink: Hugo0/voiceio@ed4a92cfb8f9f5498f4ec542a980cb52486902a7
- Branch / Tag: refs/tags/v0.3.8
- Owner: https://github.com/Hugo0
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@ed4a92cfb8f9f5498f4ec542a980cb52486902a7
- Trigger Event: push

File details

Details for the file python_voiceio-0.3.8-py3-none-any.whl.

File metadata

Download URL: python_voiceio-0.3.8-py3-none-any.whl
Upload date: Apr 9, 2026
Size: 2.1 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for python_voiceio-0.3.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d423b76091a499d995e4427c38cbe147cea87ffd02267bb0a27c50c4c6d8e2e5`
MD5	`3b6ba7f1e5f01592307c604a2d9ec210`
BLAKE2b-256	`82871161a749740f363d5edfd5f53341daae98cc3c631f30782016ab38a480f6`

Provenance

The following attestation bundles were made for python_voiceio-0.3.8-py3-none-any.whl:

Publisher: publish.yml on Hugo0/voiceio

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: python_voiceio-0.3.8-py3-none-any.whl
- Subject digest: d423b76091a499d995e4427c38cbe147cea87ffd02267bb0a27c50c4c6d8e2e5
- Sigstore transparency entry: 1261977410
- Sigstore integration time: Apr 9, 2026
Source repository:
- Permalink: Hugo0/voiceio@ed4a92cfb8f9f5498f4ec542a980cb52486902a7
- Branch / Tag: refs/tags/v0.3.8
- Owner: https://github.com/Hugo0
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@ed4a92cfb8f9f5498f4ec542a980cb52486902a7
- Trigger Event: push
