Voice dictation daemon for Linux with Sarvam AI STT

voxd

A voice dictation daemon for Linux with a modular architecture:

  • core: recording, Sarvam AI transcription, and clipboard integration
  • wrapper.dwm: X11-focused wrapper for dwm
  • wrapper.hyprland: Wayland-focused wrapper for hyprland

Press the keybind once to start recording and again to stop. The transcribed text is copied to the clipboard.

Install

pip install voxd

System dependencies:

  • ffmpeg (audio capture)
  • xclip or xsel (clipboard on X11)
  • wl-clipboard (clipboard on Wayland)
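
Before starting the daemon it can help to confirm that these tools are on PATH. The following is a minimal sketch, not part of voxd: the function name `missing_dependencies` and its `which` parameter are hypothetical, and it assumes that either xclip or xsel is sufficient on X11 and that wl-clipboard provides the `wl-copy` command.

```python
import shutil

# Executable names provided by the packages listed above.
# wl-clipboard installs the wl-copy and wl-paste commands.
COMMON = ["ffmpeg"]
X11_CLIPBOARD = ["xclip", "xsel"]  # assumption: either one is enough
WAYLAND_CLIPBOARD = ["wl-copy"]


def missing_dependencies(session, which=shutil.which):
    """Return the names of required executables not found on PATH.

    session is "x11" or "wayland"; `which` is injectable for testing.
    """
    missing = [name for name in COMMON if which(name) is None]
    clipboard = X11_CLIPBOARD if session == "x11" else WAYLAND_CLIPBOARD
    # A clipboard group is satisfied as soon as one of its tools is present.
    if not any(which(name) is not None for name in clipboard):
        missing.append(" or ".join(clipboard))
    return missing


if __name__ == "__main__":
    for session in ("x11", "wayland"):
        print(session, "missing:", missing_dependencies(session) or "nothing")
```

An empty list means everything needed for that session type was found.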

Setup

1. Configure API key

voxd config set api_key YOUR_SARVAM_API_KEY

2. (Optional) Set language

voxd config set language hi-IN  # Hindi
voxd config set language en-IN  # English (default)

3. Start daemon at boot

dwm / X11 - Add to ~/.xinitrc (before exec dwm):

voxd-daemon &

hyprland / Wayland - Add to ~/.config/hypr/hyprland.conf:

exec-once = voxd-daemon

4. Setup keybind

dwm - Edit your config.h:

{ MODKEY, XK_semicolon, spawn, SHCMD("voxd-dwm toggle") },

Then recompile: sudo make clean install

hyprland - Add to ~/.config/hypr/hyprland.conf:

bind = SUPER, semicolon, exec, voxd-hypr toggle

Reload: hyprctl reload

Usage

Press your keybind once to start recording and again to stop. The transcribed text is copied to the clipboard; paste it with Ctrl+V.

Terminal commands:

voxd toggle              # Toggle recording
voxd status              # Check status
voxd quit                # Stop daemon
voxd config list         # Show config
voxd config set key val  # Set config value

Configuration

Settings are stored in ~/.config/voxd/config.json

Available settings:

  • api_key - Sarvam AI API key
  • model - STT model (default: saaras:v3)
  • language - Language code (default: en-IN)
  • output_mode - Output mode (auto/x11/wayland)
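
To illustrate how these settings might be read from another script, here is a rough sketch. It assumes the file is a flat JSON object whose keys match the setting names above, which the README does not confirm. The `"auto"` fallback for `output_mode` and the `WAYLAND_DISPLAY` check in `resolve_output_mode` are also assumptions, and both function names are hypothetical rather than voxd's API.

```python
import json
import os
from pathlib import Path

CONFIG_PATH = Path.home() / ".config" / "voxd" / "config.json"

# model and language defaults are documented above;
# the "auto" default for output_mode is an assumption.
DEFAULTS = {"model": "saaras:v3", "language": "en-IN", "output_mode": "auto"}


def load_config(path=CONFIG_PATH):
    """Merge stored settings over the defaults (assumes a flat JSON object)."""
    config = dict(DEFAULTS)
    try:
        config.update(json.loads(Path(path).read_text(encoding="utf-8")))
    except FileNotFoundError:
        pass  # no config written yet; defaults only
    return config


def resolve_output_mode(mode, env=os.environ):
    """Map "auto" to "wayland" or "x11" from the session environment.

    Hypothetical rule: a non-empty WAYLAND_DISPLAY means a Wayland session.
    """
    if mode != "auto":
        return mode
    return "wayland" if env.get("WAYLAND_DISPLAY") else "x11"


if __name__ == "__main__":
    cfg = load_config()
    print(cfg["language"], resolve_output_mode(cfg["output_mode"]))
```

To change a value, use `voxd config set` rather than editing the file by hand.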

Architecture

src/voxd/
  core/
    config.py          # env + runtime paths
    config_manager.py  # persistent config
    recorder.py        # ffmpeg process lifecycle
    sarvam_client.py   # Sarvam SDK integration
    injector.py        # clipboard integration
    service.py         # unix socket daemon + toggle logic
  wrappers/
    dwm.py             # X11 wrapper (output mode x11)
    hyprland.py        # Wayland wrapper (output mode wayland)
  cli.py               # daemon server + client commands
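
The toggle flow these modules implement (start recording, then stop, transcribe, and copy) can be pictured as a two-state machine. The sketch below is purely illustrative: `ToggleService` and its four callables are hypothetical stand-ins for the recorder, Sarvam client, and injector, not the actual classes in `service.py`.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ToggleService:
    """Hypothetical stand-in for the toggle logic; not voxd's actual class."""

    start_recording: Callable[[], None]
    stop_recording: Callable[[], str]         # returns the captured .wav path
    transcribe: Callable[[str], str]          # wav path -> transcribed text
    copy_to_clipboard: Callable[[str], None]
    recording: bool = field(default=False, init=False)

    def toggle(self) -> Optional[str]:
        """First call starts recording; the next stops, transcribes, and copies."""
        if not self.recording:
            self.start_recording()
            self.recording = True
            return None
        self.recording = False
        wav_path = self.stop_recording()
        text = self.transcribe(wav_path)
        self.copy_to_clipboard(text)
        return text


if __name__ == "__main__":
    clipboard = []  # fake clipboard for demonstration
    service = ToggleService(
        start_recording=lambda: None,
        stop_recording=lambda: "/tmp/example.wav",
        transcribe=lambda path: "hello world",
        copy_to_clipboard=clipboard.append,
    )
    service.toggle()          # starts recording
    print(service.toggle())   # stops and returns the transcript
```

Passing the collaborators in as callables keeps the state machine testable without ffmpeg, network access, or a display server.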

Notes

  • Socket path: ${XDG_RUNTIME_DIR}/voxd/control.sock
  • Captured .wav files: ${XDG_RUNTIME_DIR}/voxd/
  • Status file: ${XDG_RUNTIME_DIR}/voxd/status
  • Notifications show recording state via notify-send
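
For scripts that want to inspect the daemon from outside, for example a status bar, the paths above can be resolved as in the sketch below. The helper names are hypothetical. The format of the status file is not documented here, so the code returns its text unchanged, and it makes no attempt to talk to the control socket because the protocol is not described.

```python
import os
from pathlib import Path


def runtime_dir(env=os.environ):
    """Return ${XDG_RUNTIME_DIR}/voxd; raises KeyError if the variable is unset."""
    return Path(env["XDG_RUNTIME_DIR"]) / "voxd"


def socket_path(env=os.environ):
    """Return the control socket path listed in the notes above."""
    return runtime_dir(env) / "control.sock"


def read_status(env=os.environ):
    """Return the status file's text, or None if the file does not exist."""
    try:
        return (runtime_dir(env) / "status").read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print("socket:", socket_path())
    print("status:", read_status())
```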
