Voice dictation daemon for Linux with Sarvam AI STT
voxd
Voice dictation daemon for Linux with modular architecture:
- core: recording, Sarvam AI transcription, and clipboard integration
- wrapper.dwm: X11-focused wrapper for dwm
- wrapper.hyprland: Wayland-focused wrapper for hyprland
Press the keybind once to start recording and again to stop. The transcribed text is copied to the clipboard.
Install
pip install voxd
System dependencies:
- ffmpeg (audio capture)
- xclip or xsel (clipboard on X11)
- wl-clipboard (clipboard on Wayland)
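A quick way to confirm these tools are on PATH before starting the daemon is sketched below. The helper `missing_tools` and its `session` argument are illustrative and not part of voxd; the executable names assume that the wl-clipboard package provides `wl-copy`.

```python
import shutil

# Executables assumed for each session type. Each tuple lists alternatives:
# finding any one of them satisfies that requirement.
REQUIRED = {
    "x11": [("ffmpeg",), ("xclip", "xsel")],
    "wayland": [("ffmpeg",), ("wl-copy",)],
}


def missing_tools(session, which=shutil.which):
    """Return a description of every requirement with no executable on PATH."""
    missing = []
    for group in REQUIRED[session]:
        if not any(which(name) for name in group):
            missing.append(" or ".join(group))
    return missing
```

Calling `missing_tools("x11")` returns an empty list when everything needed for an X11 session is installed.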
Setup
1. Configure API key
voxd config set api_key YOUR_SARVAM_API_KEY
2. (Optional) Set language
voxd config set language hi-IN # Hindi
voxd config set language en-IN # English (default)
3. Start daemon at boot
dwm / X11 - Add to ~/.xinitrc (before exec dwm):
voxd-daemon &
hyprland / Wayland - Add to ~/.config/hypr/hyprland.conf:
exec-once = voxd-daemon
4. Setup keybind
dwm - Edit your config.h:
{ MODKEY, XK_semicolon, spawn, SHCMD("voxd-dwm toggle") },
Then recompile: sudo make clean install
hyprland - Add to ~/.config/hypr/hyprland.conf:
bind = SUPER, semicolon, exec, voxd-hypr toggle
Reload: hyprctl reload
Usage
Press your keybind once to start recording and again to stop. The transcribed text is copied to the clipboard; paste it with Ctrl+V.
Terminal commands:
voxd toggle # Toggle recording
voxd status # Check status
voxd quit # Stop daemon
voxd config list # Show config
voxd config set key val # Set config value
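The toggle behaviour amounts to a two-state machine: idle, then recording, then idle again. The sketch below illustrates that idea only; the class `ToggleRecorder` and its `start`/`stop` callbacks are hypothetical and do not reflect how voxd's service is actually written.

```python
class ToggleRecorder:
    """Two-state toggle: idle -> recording -> idle."""

    def __init__(self, start, stop):
        self._start = start  # called when recording begins
        self._stop = stop    # called when recording ends; returns the text
        self.recording = False

    def toggle(self):
        """Flip the state; return the transcript on stop, otherwise None."""
        if not self.recording:
            self._start()
            self.recording = True
            return None
        self.recording = False
        return self._stop()
```

The first call to `toggle()` starts a recording and returns `None`; the second call stops it and returns whatever the `stop` callback produced.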
Configuration
Config is stored in ~/.config/voxd/config.json
Available settings:
- api_key - Sarvam AI API key
- model - STT model (default: saaras:v3)
- language - Language code (default: en-IN)
- output_mode - Output mode (auto/x11/wayland)
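For scripts that want to read these settings directly, a minimal sketch follows. It assumes config.json is a flat JSON object keyed by the setting names above, and that `output_mode` defaults to `auto` (the default is not stated here). The function names are illustrative, not voxd's API.

```python
import json
from pathlib import Path

# Defaults from the settings list; the output_mode default is an assumption.
DEFAULTS = {
    "api_key": None,
    "model": "saaras:v3",
    "language": "en-IN",
    "output_mode": "auto",
}

CONFIG_PATH = Path.home() / ".config" / "voxd" / "config.json"


def merge_config(text):
    """Overlay the values in a JSON object string onto the defaults."""
    return {**DEFAULTS, **json.loads(text)}


def load_config(path=CONFIG_PATH):
    """Read the config file, falling back to defaults if it is missing."""
    path = Path(path)
    if not path.exists():
        return dict(DEFAULTS)
    return merge_config(path.read_text())
```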
Architecture
src/voxd/
    core/
        config.py           # env + runtime paths
        config_manager.py   # persistent config
        recorder.py         # ffmpeg process lifecycle
        sarvam_client.py    # Sarvam SDK integration
        injector.py         # clipboard integration
        service.py          # unix socket daemon + toggle logic
    wrappers/
        dwm.py              # X11 wrapper (output mode x11)
        hyprland.py         # Wayland wrapper (output mode wayland)
    cli.py                  # daemon server + client commands
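Per the comments above, each wrapper pins one output mode. How the `auto` mode picks a backend is not documented here; one plausible approach, sketched below, checks the standard `WAYLAND_DISPLAY` and `DISPLAY` environment variables. The function names and the choice of clipboard commands are assumptions, not voxd's code.

```python
import os


def resolve_output_mode(mode, env=None):
    """Map auto/x11/wayland to a concrete backend name."""
    env = os.environ if env is None else env
    if mode in ("x11", "wayland"):
        return mode
    if mode != "auto":
        raise ValueError(f"unknown output mode: {mode!r}")
    # Prefer Wayland: XWayland sessions usually set DISPLAY as well.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    raise RuntimeError("no graphical session detected")


def clipboard_command(mode):
    """Return a command that copies its stdin to the clipboard."""
    commands = {
        "x11": ["xclip", "-selection", "clipboard"],
        "wayland": ["wl-copy"],
    }
    return commands[mode]
```

The returned command can be run with `subprocess.run(cmd, input=text.encode(), check=True)` to place the text on the clipboard.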
Notes
- Socket path: ${XDG_RUNTIME_DIR}/voxd/control.sock
- Captured .wav files: ${XDG_RUNTIME_DIR}/voxd/
- Status file: ${XDG_RUNTIME_DIR}/voxd/status
- Notifications show recording state via notify-send
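External scripts, such as a status bar module, can derive the same locations from the environment. The sketch below only builds the paths listed above; the helper name `runtime_paths` is illustrative, and the format of the status file's contents is not specified here, so it is left unparsed.

```python
import os
from pathlib import Path


def runtime_paths(env=None):
    """Build the voxd runtime paths under XDG_RUNTIME_DIR."""
    env = os.environ if env is None else env
    base = Path(env["XDG_RUNTIME_DIR"]) / "voxd"
    return {
        "dir": base,                      # captured .wav files live here
        "socket": base / "control.sock",  # daemon control socket
        "status": base / "status",        # status file
    }
```

A `KeyError` is raised when `XDG_RUNTIME_DIR` is unset, which is usually a sign that the script is running outside a user session.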