Voice input layer: captures spoken intent, transcribes locally, injects into active workflow or agent pipeline.
Vox
Vox is the voice input layer for the system. It captures speech via push-to-talk, transcribes it locally with faster-whisper, and injects the text into the clipboard (and optionally into the focused window). No cloud calls; no silent failures.
Install
From PyPI (no clone required; package name is vox-core because vox is taken on PyPI):
uvx vox-core
Or install the tool, then run it (the CLI command is still vox):
pip install vox-core
vox
From source (development or latest):
git clone https://github.com/jeffrichley/vox.git && cd vox
uv sync
uv run vox
Or install in editable mode inside a virtual environment with pip install -e ., then run vox.
Pre-built binaries (GitHub Releases)
Packaged binaries for Windows, macOS, and Linux are built on each GitHub Release. Download the archive for your platform (e.g. vox-<version>-windows-amd64.zip, vox-<version>-macos-arm64.zip, vox-<version>-linux-x86_64.tar.gz), unpack it, then run the binary:
- Windows: Unzip the archive, then run vox.exe from inside the vox folder (e.g. .\vox\vox.exe --help).
- macOS / Linux: Unzip or untar the archive, then run ./vox/vox from the extracted directory (e.g. ./vox/vox --help).
On first run, the Whisper model is downloaded automatically (see Transcription model). Binaries are built with PyInstaller and are not signed; you may see a security or Gatekeeper prompt on Windows or macOS.
Configuration
- Config file: ~/.vox/vox.toml. Create the directory if needed: mkdir -p ~/.vox.
- Override path: set VOX_CONFIG to the full path of your config file.
- Env overrides: VOX_HOTKEY, VOX_DEVICE_ID, VOX_MODEL_SIZE, VOX_COMPUTE_TYPE, VOX_COMPUTE_DEVICE, VOX_INJECTION_MODE, VOX_TRAY override the same keys from the file.
Copy the example config and edit:
cp vox.toml.example ~/.vox/vox.toml
# Edit hotkey and optionally device_id, model_size, etc.
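The precedence described above (VOX_CONFIG picks the file, then VOX_* variables override individual keys) can be sketched in Python as follows. This is an illustration under stated assumptions, not Vox's actual loader: the function names resolve_config_path and apply_env_overrides are hypothetical, the env-to-key mapping is inferred from the variable names (with VOX_TRAY mapped to use_tray as in the Commands section), and values are left as strings.

```python
import os
from pathlib import Path

# Env var -> config key. VOX_TRAY -> use_tray follows the Commands section;
# the other keys are assumed to be the lowercased suffix of the variable name.
ENV_TO_KEY = {
    "VOX_HOTKEY": "hotkey",
    "VOX_DEVICE_ID": "device_id",
    "VOX_MODEL_SIZE": "model_size",
    "VOX_COMPUTE_TYPE": "compute_type",
    "VOX_COMPUTE_DEVICE": "compute_device",
    "VOX_INJECTION_MODE": "injection_mode",
    "VOX_TRAY": "use_tray",
}


def resolve_config_path(env):
    """Return the VOX_CONFIG path if set, else the default ~/.vox/vox.toml."""
    override = env.get("VOX_CONFIG")
    return Path(override) if override else Path.home() / ".vox" / "vox.toml"


def apply_env_overrides(file_config, env):
    """Return a copy of file_config with any set VOX_* variables applied on top."""
    merged = dict(file_config)
    for var, key in ENV_TO_KEY.items():
        if var in env:
            merged[key] = env[var]  # values stay strings in this sketch
    return merged


if __name__ == "__main__":
    print(resolve_config_path(os.environ))
    print(apply_env_overrides({"hotkey": "ctrl+shift+v"}, os.environ))
```

Passing the environment in as a plain mapping keeps both helpers easy to exercise without touching the real process environment.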
Commands
- vox or vox run: Start push-to-talk. By default a small window with a Stop button appears; with use_tray = true in config (or VOX_TRAY=1), a system tray icon is shown instead (click the icon and choose Quit to stop). Press and hold your configured hotkey, speak, release; the audio is transcribed and placed on the clipboard (and optionally pasted into the focused window).
- vox devices: List audio input devices (ID, name, host API). Use this to choose device_id in config.
- vox test-mic [--device ID] [--seconds N]: Record for N seconds, play back the recording, then transcribe and print the text. The default is 2 seconds. Use it to verify the mic and model before using vox.
Transcription model (faster-whisper)
- First run: The model is downloaded automatically from Hugging Face (size from config, default base). No system FFmpeg required (PyAV is used).
- CPU: Use compute_type = "int8" in config for lower memory and faster inference.
- GPU: Set compute_device = "cuda" and compute_type = "float16" (or int8) in config. Requires CUDA 12 and cuDNN 9.
- Model size: Set model_size in config (e.g. tiny, base, small, medium, large-v3) for speed vs accuracy.
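As a rough illustration of how these settings reach faster-whisper, the sketch below maps compute_device and compute_type onto the WhisperModel constructor. The helper names (whisper_kwargs, load_model, transcribe_file) are hypothetical, and the fallback of int8 on CPU and float16 on CUDA when no compute_type is given is an assumption based on the recommendations above, not a statement about Vox's defaults.

```python
def whisper_kwargs(compute_device="cpu", compute_type=None):
    """Build keyword arguments for faster_whisper.WhisperModel from config values."""
    if compute_type is None:
        # Assumed fallback, following the guidance above: int8 on CPU, float16 on CUDA.
        compute_type = "float16" if compute_device == "cuda" else "int8"
    return {"device": compute_device, "compute_type": compute_type}


def load_model(model_size="base", compute_device="cpu", compute_type=None):
    """Load a faster-whisper model; the first call downloads the weights."""
    from faster_whisper import WhisperModel  # imported lazily: heavy dependency

    return WhisperModel(model_size, **whisper_kwargs(compute_device, compute_type))


def transcribe_file(model, path):
    """Join the text of every segment returned by model.transcribe."""
    segments, _info = model.transcribe(path)
    return " ".join(segment.text.strip() for segment in segments)
```

For example, load_model("small", "cuda") would request the small model on the GPU with float16, while load_model() stays on the CPU with int8.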
OS permissions
- Microphone: Required for capture. On Windows, allow app access to the microphone. On macOS, grant Microphone access when prompted.
- Accessibility / input injection: Only needed if you use injection_mode = "clipboard_and_paste" (paste into focused window). On Windows, run the app with normal privileges; on macOS, grant Accessibility permission to Terminal (or the app running vox run) so it can simulate paste.
Definition of Visible Done
A human can verify the MVP by:
- Install: From repo run uv sync (or pip install -e .).
- Configure: Copy vox.toml.example to ~/.vox/vox.toml; set hotkey (e.g. ctrl+shift+v) and optionally device_id, model_size, compute_type, injection_mode.
- Run: Execute uv run vox or vox (or vox run). A small “Vox” window with a Stop button appears, or a tray icon if use_tray is enabled.
- Trigger: Focus any text field (or leave focus anywhere). Press and hold the configured hotkey, speak a short phrase, release the key.
- Verify: Paste from clipboard (Ctrl+V / Cmd+V) and see the transcribed phrase. If injection_mode = "clipboard_and_paste", the text also appears in the focused field.
- Stop: Click Stop in the Vox window (or close the window) to exit.
- Errors: If mic or model is missing, a clear error message appears (no silent failure).
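The Verify step implies a simple delivery rule: the transcript always goes to the clipboard, and it is additionally pasted only in clipboard_and_paste mode. A minimal sketch of that rule follows; the function name deliver and the two callables are hypothetical stand-ins for whatever clipboard and key-simulation code Vox actually uses.

```python
def deliver(text, injection_mode, copy_to_clipboard, simulate_paste):
    """Copy text to the clipboard; paste it too in clipboard_and_paste mode.

    Returns True if a paste was simulated, False otherwise.
    """
    copy_to_clipboard(text)  # the clipboard is always updated
    pasted = injection_mode == "clipboard_and_paste"
    if pasted:
        simulate_paste()  # needs Accessibility permission on macOS
    return pasted


if __name__ == "__main__":
    clipboard = []
    deliver("hello world", "clipboard_and_paste", clipboard.append, lambda: print("paste"))
    print(clipboard)
```

Taking the clipboard and paste actions as parameters keeps the rule testable without a real desktop session.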
Development
- Quality gate: just test quality (tests, format, lint, types, docstrings, security checks).
- Tests: just test (pytest). Unit tests under tests/unit/, integration under tests/integration/.