Termivox

Voice Recognition Bridge for Linux - Speak naturally, control your system, type hands-free.


🎯 Overview

Termivox is a Linux-based voice recognition system that transforms your speech into text and system commands. Using offline voice recognition (Vosk), it provides:

  • Hands-free dictation - Speak and watch your words appear
  • Voice-controlled system commands - Copy, paste, click, scroll by voice
  • Multi-language support - English and French recognition
  • Toggle control - Pause/resume recognition instantly like a guitar pedal
  • Privacy-first - All processing happens locally, no cloud required

✨ Features

🎤 Voice Recognition

  • Offline speech-to-text powered by Vosk
  • Bilingual support: English (en) and French (fr)
  • Punctuation by voice - Say "comma", "period", "question mark"
  • Edit commands - "new line", "tab", "new paragraph"
  • System commands - "copy", "paste", "click", "scroll up/down"
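As a rough illustration of what an offline speech-to-text loop looks like, here is a minimal sketch built on the public Vosk and PyAudio APIs. It is not Termivox's actual recognizer: the function names, the 16 kHz sample rate, and the 4000-frame buffer are assumptions chosen for the example.

```python
import json
from typing import Iterator


def extract_text(result_json: str) -> str:
    """Pull the recognized text out of a Vosk result (a JSON string)."""
    return json.loads(result_json).get("text", "").strip()


def listen(model_dir: str, sample_rate: int = 16000) -> Iterator[str]:
    """Yield one recognized phrase at a time from the default microphone.

    Hypothetical helper: the heavy imports are local so the module can be
    loaded on machines without audio hardware or the Vosk model.
    """
    import pyaudio
    from vosk import KaldiRecognizer, Model

    recognizer = KaldiRecognizer(Model(model_dir), sample_rate)
    audio = pyaudio.PyAudio()
    stream = audio.open(format=pyaudio.paInt16, channels=1, rate=sample_rate,
                        input=True, frames_per_buffer=4000)
    try:
        while True:
            data = stream.read(4000, exception_on_overflow=False)
            # AcceptWaveform returns True once a full utterance is decoded.
            if recognizer.AcceptWaveform(data):
                text = extract_text(recognizer.Result())
                if text:
                    yield text
    finally:
        stream.stop_stream()
        stream.close()
        audio.terminate()
```

Everything runs locally: the model is read from disk and audio never leaves the machine.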

🎛️ Toggle Control (NEW!)

Control voice recognition ON/OFF with multiple interfaces:

⌨️ Global Hotkey

  • Press Ctrl+Alt+V from anywhere to toggle
  • Customizable key combination
  • Works across all applications

🖱️ Desktop Widget

  • Minimal floating window (160×70px)
  • One-click toggle button
  • Visual status: "LISTENING" (green) / "MUTED" (gray)
  • Draggable, always-on-top
  • Never steals cursor focus

🎛️ System Tray Icon

  • Green/red status indicator
  • Click to toggle
  • Right-click menu

🎮 Hardware Support (Coming Soon)

  • USB foot pedal support
  • MIDI controller integration
  • Custom button devices

📦 Installation

Prerequisites

System Requirements:

  • Linux (tested on Ubuntu 24.04)
  • Python 3.8+
  • Microphone input

System Dependencies:

sudo apt install python3-pyaudio xdotool sox portaudio19-dev -y

Setup

  1. Clone the repository:

    git clone https://github.com/Gerico1007/termivox.git
    cd termivox
    
  2. Create virtual environment:

    python3 -m venv termivox-env
    source termivox-env/bin/activate
    
  3. Install Python dependencies:

    pip install -r requirements.txt
    
  4. Download voice model (if not already present):

    python download_model.py
    
  5. Run Termivox:

    ./run.sh
    

🚀 Usage

Quick Start

Launch with toggle control:

./run.sh

Original mode (no toggle):

source termivox-env/bin/activate
python src/main.py --no-toggle

Test voice recognition only:

source termivox-env/bin/activate
python src/test_voice_script.py --lang en

Toggle Control

Once Termivox is running, control it using:

Hotkey:

  • Press Ctrl+Alt+V → Pauses/resumes voice recognition
  • Works from any window, keeps cursor position

Widget:

  • Click the floating "LISTENING" or "MUTED" button
  • Drag the title bar to reposition
  • Right-click to close widget

Indicator:

  • Green = Voice recognition ACTIVE (listening)
  • Gray/Red = Voice recognition MUTED (paused)
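One way to keep the hotkey, widget, and tray icon in agreement is to route every toggle through a single shared state object that notifies its listeners. The sketch below illustrates that idea only; `ToggleState` and `indicator_label` are hypothetical names, not Termivox's real classes.

```python
import threading
from typing import Callable, List


class ToggleState:
    """Thread-safe ON/OFF flag that notifies subscribers on every change."""

    def __init__(self, active: bool = True) -> None:
        self._active = active
        self._lock = threading.Lock()
        self._listeners: List[Callable[[bool], None]] = []

    @property
    def active(self) -> bool:
        with self._lock:
            return self._active

    def subscribe(self, callback: Callable[[bool], None]) -> None:
        """Register a callback that receives the new state after each toggle."""
        with self._lock:
            self._listeners.append(callback)

    def toggle(self) -> bool:
        """Flip the state, then notify listeners outside the lock."""
        with self._lock:
            self._active = not self._active
            state = self._active
            listeners = list(self._listeners)
        for callback in listeners:
            callback(state)
        return state


def indicator_label(active: bool) -> str:
    """Text shown by the widget for a given state."""
    return "LISTENING" if active else "MUTED"


if __name__ == "__main__":
    state = ToggleState()
    state.subscribe(lambda on: print(indicator_label(on)))
    state.toggle()  # pause
    state.toggle()  # resume
```

Calling listeners after releasing the lock lets a callback read `state.active` without deadlocking.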

Voice Commands

Dictation:

"Hello world" → types: Hello world

Punctuation:

"Hello comma world period" → types: Hello, world.

Available punctuation:

  • comma, period, question mark, exclamation mark
  • colon, semicolon, dash, quote, apostrophe
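The substitution from spoken words to symbols can be sketched as a small lookup plus a regular expression. This is an illustrative sketch, not the project's implementation: the table, the helper name `apply_punctuation`, and the rule that only sentence punctuation attaches to the previous word are all assumptions.

```python
import re

# Spoken phrase -> symbol, taken from the list above.
PUNCTUATION = {
    "comma": ",",
    "period": ".",
    "question mark": "?",
    "exclamation mark": "!",
    "colon": ":",
    "semicolon": ";",
    "dash": "-",
    "quote": '"',
    "apostrophe": "'",
}

# Symbols that should hug the word before them ("Hello ," -> "Hello,").
ATTACH_LEFT = set(",.?!:;")

# Longest phrases first so multi-word names are tried before shorter ones.
_ALTERNATIVES = "|".join(
    re.escape(phrase) for phrase in sorted(PUNCTUATION, key=len, reverse=True)
)
_PATTERN = re.compile(r"(\s*)\b(" + _ALTERNATIVES + r")\b", re.IGNORECASE)


def apply_punctuation(text: str) -> str:
    """Replace spoken punctuation names with their symbols."""

    def replace(match: "re.Match[str]") -> str:
        symbol = PUNCTUATION[match.group(2).lower()]
        if symbol in ATTACH_LEFT:
            return symbol  # drop the whitespace before the symbol
        return match.group(1) + symbol  # keep the spacing as spoken

    return _PATTERN.sub(replace, text)


if __name__ == "__main__":
    print(apply_punctuation("Hello comma world period"))
```

A real recognizer would also need a way to type the literal word "period"; this sketch always converts it.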

Editing:

"new line"       → ↵
"new paragraph"  → ↵↵
"tab"            → ⇥

System Commands:

"copy"           → Ctrl+C
"paste"          → Ctrl+V
"select all"     → Ctrl+A
"click"          → Mouse click
"scroll up"      → Scroll wheel up
"scroll down"    → Scroll wheel down
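Since Termivox lists xdotool as its automation backend, the table above could plausibly be turned into xdotool invocations along these lines. This is a sketch under assumptions: the mapping and the helper names `command_for` and `run_phrase` are hypothetical, and the real bridge may use different key names or options. Mouse buttons 4 and 5 are the X11 scroll-wheel up and down buttons.

```python
import subprocess
from typing import Dict, List, Optional

# Spoken phrase -> xdotool arguments (assumed mapping for illustration).
XDOTOOL_ARGS: Dict[str, List[str]] = {
    "copy": ["key", "ctrl+c"],
    "paste": ["key", "ctrl+v"],
    "select all": ["key", "ctrl+a"],
    "click": ["click", "1"],        # left mouse button
    "scroll up": ["click", "4"],    # wheel up
    "scroll down": ["click", "5"],  # wheel down
}


def command_for(phrase: str) -> Optional[List[str]]:
    """Return the xdotool argv for a spoken phrase, or None if unknown."""
    args = XDOTOOL_ARGS.get(phrase.strip().lower())
    return ["xdotool", *args] if args else None


def run_phrase(phrase: str, dry_run: bool = False) -> bool:
    """Execute the command for a phrase; return False if it is not a command."""
    argv = command_for(phrase)
    if argv is None:
        return False
    if dry_run:
        print(" ".join(argv))
    else:
        subprocess.run(argv, check=True)
    return True


if __name__ == "__main__":
    run_phrase("copy", dry_run=True)
```

Passing an argument list to `subprocess.run` avoids shell quoting problems.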

Language Selection

English (default):

./run.sh
# or
python src/main.py --lang en

French:

python src/main.py --lang fr
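A sketch of how the `--lang` and `--no-toggle` flags might be parsed and mapped to a Vosk model directory. The English directory name comes from the project structure; the French directory name and the helper names are assumptions for illustration, so check what `download_model.py` actually fetches.

```python
import argparse
from pathlib import Path
from typing import List, Optional

MODEL_DIRS = {
    "en": "vosk-model-small-en-us-0.15",  # listed under voice_models/
    "fr": "vosk-model-small-fr-0.22",     # assumption: not confirmed by the README
}


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the command-line flags shown in this README."""
    parser = argparse.ArgumentParser(description="Voice recognition bridge")
    parser.add_argument("--lang", choices=sorted(MODEL_DIRS), default="en",
                        help="recognition language (default: en)")
    parser.add_argument("--no-toggle", action="store_true",
                        help="run without the toggle interfaces")
    return parser.parse_args(argv)


def model_path(lang: str, root: Path = Path("voice_models")) -> Path:
    """Return the directory that should hold the model for a language."""
    return root / MODEL_DIRS[lang]


if __name__ == "__main__":
    args = parse_args()
    print(f"language={args.lang} model={model_path(args.lang)}")
```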

⚙️ Configuration

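Edit config/settings.json to customize behavior. The // comments in the example are annotations only: standard JSON does not allow comments, so leave them out of the real file unless the loader is known to accept them.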

{
  "interfaces": {
    "hotkey": {
      "enabled": true,
      "key": "ctrl+alt+v"        // Change hotkey here
    },
    "tray": {
      "enabled": false            // Enable system tray icon
    },
    "widget": {
      "enabled": true,            // Desktop widget
      "position": {"x": 100, "y": 100},
      "size": {"width": 160, "height": 70},
      "always_on_top": true
    }
  },
  "voice": {
    "language": "en",             // Default language
    "auto_space": true            // Auto-add spaces
  }
}
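The example above annotates values with // comments, which standard JSON parsers reject. If you want to keep such annotations in a settings file, one option is to strip them before parsing. The sketch below shows that approach; `strip_line_comments` and `load_settings` are hypothetical helpers, and whether Termivox's own loader does anything similar is not stated here.

```python
import json
from typing import Any, Dict


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def load_settings(text: str) -> Dict[str, Any]:
    """Parse settings text that may contain // annotations."""
    return json.loads(strip_line_comments(text))


if __name__ == "__main__":
    sample = '{"voice": {"language": "en"  // Default language\n}}'
    print(load_settings(sample))
```

Tracking whether the scanner is inside a string keeps values such as URLs containing `//` intact.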

Custom Hotkey Examples:

  • "ctrl+shift+v"
  • "ctrl+alt+t"
  • "super+v"
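pynput, which this project lists for global hotkey support, writes key combinations as `<ctrl>+<alt>+v` and calls the Super key `cmd`. A converter from the settings format could look like the sketch below. The helper names are hypothetical, and the assumption that Termivox translates its config strings this way is mine, not the README's.

```python
from typing import Callable

# Config spelling -> pynput key name (only the alias that differs).
ALIASES = {"super": "cmd"}


def to_pynput_hotkey(combo: str) -> str:
    """Convert e.g. "ctrl+alt+v" into pynput's "<ctrl>+<alt>+v" form."""
    tokens = []
    for part in combo.split("+"):
        name = part.strip().lower()
        if not name:
            raise ValueError(f"empty key in hotkey: {combo!r}")
        name = ALIASES.get(name, name)
        # Single characters stay bare; named keys go in angle brackets.
        tokens.append(name if len(name) == 1 else f"<{name}>")
    return "+".join(tokens)


def listen_for_toggle(combo: str, on_toggle: Callable[[], None]) -> None:
    """Block while listening for the hotkey (requires pynput and a display)."""
    from pynput import keyboard

    hotkeys = {to_pynput_hotkey(combo): on_toggle}
    with keyboard.GlobalHotKeys(hotkeys) as listener:
        listener.join()


if __name__ == "__main__":
    for example in ("ctrl+shift+v", "ctrl+alt+t", "super+v"):
        print(example, "->", to_pynput_hotkey(example))
```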

📁 Project Structure

termivox/
├── src/
│   ├── main.py                    # Main entry point with toggle support
│   ├── test_voice_script.py       # Standalone testing utility
│   ├── voice/
│   │   ├── recognizer.py          # Vosk voice recognition engine
│   │   └── __init__.py
│   ├── bridge/
│   │   ├── xdotool_bridge.py      # System command executor
│   │   └── __init__.py
│   ├── ui/                        # Toggle control interfaces
│   │   ├── toggle_controller.py   # Central state management
│   │   ├── hotkey_interface.py    # Global hotkey listener
│   │   ├── tray_interface.py      # System tray icon
│   │   ├── widget_interface.py    # Desktop widget
│   │   ├── hardware_interface.py  # Hardware button stub
│   │   ├── config_loader.py       # Configuration system
│   │   └── __init__.py
│   └── utils/
│       └── __init__.py
├── config/
│   └── settings.json              # User configuration
├── voice_models/                  # Vosk language models
│   └── vosk-model-small-en-us-0.15/
├── requirements.txt               # Python dependencies
├── run.sh                         # Launch script
├── download_model.py              # Model downloader
└── README.md

🛠️ Dependencies

Python Packages:

  • Vosk - Offline speech recognition
  • pyaudio - Microphone input
  • numpy - Audio processing
  • pynput - Global hotkey support
  • pystray - System tray icon
  • Pillow - Icon generation
  • xdotool - System command execution

System Packages:

  • python3-pyaudio - PyAudio bindings
  • xdotool - Keyboard/mouse automation
  • sox - Audio utilities
  • portaudio19-dev - Audio development headers

🎨 Toggle Widget Design

Minimal Professional Aesthetic:

┌─────────────────────┐
│ TERMIVOX         ● │  ← Dark title bar (draggable)
├─────────────────────┤
│                     │
│    LISTENING        │  ← Green button (active state)
│                     │
└─────────────────────┘

Features:

  • Compact: 160×70 pixels
  • Unfocusable: Never steals cursor
  • Draggable: Reposition anywhere
  • Color-coded: Green (ON) / Gray (OFF)
  • Always-on-top: Stays visible
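A few of these properties map directly onto tkinter calls. The following is a simplified sketch, not the project's widget: it covers size, position, always-on-top, and the color-coded button, but not the unfocusable window or custom title-bar dragging. The colors and function names are assumptions.

```python
from typing import Any, Callable, Dict, Tuple


def widget_geometry(widget_cfg: Dict[str, Any]) -> str:
    """Build a Tk geometry string such as "160x70+100+100" from the config."""
    size = widget_cfg.get("size", {})
    pos = widget_cfg.get("position", {})
    return "{}x{}+{}+{}".format(size.get("width", 160), size.get("height", 70),
                                pos.get("x", 100), pos.get("y", 100))


def status_style(active: bool) -> Tuple[str, str]:
    """Return (label, background color) for a state; colors are assumed."""
    return ("LISTENING", "#2e7d32") if active else ("MUTED", "#616161")


def build_widget(widget_cfg: Dict[str, Any], on_click: Callable[[bool], None]):
    """Create the window (requires tkinter and a display); call mainloop() on it."""
    import tkinter as tk

    root = tk.Tk()
    root.title("TERMIVOX")
    root.geometry(widget_geometry(widget_cfg))
    root.attributes("-topmost", bool(widget_cfg.get("always_on_top", True)))

    state = {"active": True}
    label, color = status_style(True)
    button = tk.Button(root, text=label, bg=color, fg="white", relief="flat")

    def handle_click() -> None:
        # Flip the state, restyle the button, then notify the caller.
        state["active"] = not state["active"]
        text, background = status_style(state["active"])
        button.config(text=text, bg=background)
        on_click(state["active"])

    button.config(command=handle_click)
    button.pack(fill="both", expand=True)
    return root
```

To try it: `build_widget({"size": {"width": 160, "height": 70}}, print).mainloop()`.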

🧪 Testing

Test voice recognition without typing:

source termivox-env/bin/activate
python src/test_voice_script.py --lang en

Test with toggle control:

./run.sh
# Then try:
# 1. Speak something
# 2. Press Ctrl+Alt+V
# 3. Speak again (should not type)
# 4. Press Ctrl+Alt+V
# 5. Speak (should type again)

Test different languages:

python src/test_voice_script.py --lang fr  # French
python src/test_voice_script.py --lang en  # English

🐛 Troubleshooting

Hotkey doesn't work:

  • Check terminal for errors
  • Try different hotkey in config/settings.json
  • Ensure pynput is installed: pip list | grep pynput

No voice recognition:

  • Check microphone: arecord -l
  • Test PyAudio: python -c "import pyaudio; print('OK')"
  • Verify Vosk model downloaded in voice_models/

Widget not visible:

  • Enable in config: "widget": {"enabled": true}
  • Check if tkinter available: python -c "import tkinter"

System tray icon missing:

  • Desktop environment may not support system tray
  • Use widget or hotkey instead
  • Try enabling: "tray": {"enabled": true}
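The individual checks above can be bundled into one script. This is a convenience sketch rather than a tool shipped with Termivox; the `run_checks` name and the default `voice_models` path are assumptions based on the project layout.

```python
import importlib.util
import shutil
from pathlib import Path
from typing import Dict

PYTHON_MODULES = ("pynput", "pyaudio", "vosk", "tkinter", "pystray")


def run_checks(model_root: str = "voice_models") -> Dict[str, bool]:
    """Return a name -> passed mapping for the common failure points."""
    checks: Dict[str, bool] = {}
    for module in PYTHON_MODULES:
        # find_spec returns None when a top-level module is not installed.
        checks[f"python module {module}"] = (
            importlib.util.find_spec(module) is not None
        )
    checks["xdotool on PATH"] = shutil.which("xdotool") is not None
    root = Path(model_root)
    checks["Vosk model present"] = root.is_dir() and any(
        entry.is_dir() for entry in root.iterdir()
    )
    return checks


if __name__ == "__main__":
    for name, passed in run_checks().items():
        print(f"[{'ok' if passed else 'MISSING'}] {name}")
```

Run it from the repository root with the virtual environment active so that it checks the same interpreter Termivox uses.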

🤝 Contributing

Contributions welcome! Areas for enhancement:

  • Additional language models
  • Custom wake word detection
  • Audio feedback on toggle
  • Hardware button integration
  • Voice command macros
  • GUI configuration tool

To contribute:

  1. Fork the repository
  2. Create feature branch: git checkout -b feature/amazing-feature
  3. Commit changes: git commit -m 'Add amazing feature'
  4. Push to branch: git push origin feature/amazing-feature
  5. Open Pull Request

📄 License

MIT License - See LICENSE file for details


🙏 Acknowledgments

  • Vosk - Offline speech recognition engine
  • pynput - Cross-platform input control
  • pystray - System tray integration
  • xdotool - X11 automation

🔮 Roadmap

  • Voice command macros
  • Custom wake word support
  • GUI settings editor
  • Hardware button integration (foot pedal, MIDI)
  • Audio feedback options
  • Additional language models
  • Plugin system for custom commands
  • Cloud sync for settings (optional)

♠️ Nyro - Structural foundation, modular architecture
🌿 Aureon - Flow preservation, accessibility focus
🎸 JamAI - Musical encoding, harmonic design

Built with recursive intention. Speak, toggle, flow.
