
A seamless voice dictation system for Linux


Vocalinux

Voice-to-text for Linux, finally done right!


A seamless, free, open-source, and private voice dictation system for Linux, comparable to the built-in solutions on macOS and Windows.

🎉 Alpha Release!

We're excited to share Vocalinux with the community. Try it out and let us know what you think!


✨ Features

  • 🎤 Double-tap Ctrl to start/stop voice dictation
  • ⚡ Real-time transcription with minimal latency
  • 🌎 Universal compatibility across all Linux applications
  • 🔒 Offline operation for privacy and reliability (with VOSK)
  • 🤖 Optional Whisper AI support for enhanced accuracy
  • 🎨 System tray integration with visual status indicators
  • 🔊 Audio feedback for recording status
  • ⚙️ Graphical settings dialog for easy configuration

🚀 Quick Install

One-liner Installation (Recommended)

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh | bash -s -- --tag=v0.4.1-alpha

Note: Installs v0.4.1-alpha. For the most recent version, check GitHub Releases.

This will:

  • Clone the repository to ~/.local/share/vocalinux-install
  • Install all system dependencies
  • Set up a virtual environment in ~/.local/share/vocalinux/venv
  • Install both VOSK and Whisper AI speech engines:
    • VOSK: installs the vosk Python package from PyPI
    • Whisper: installs the openai-whisper package from PyPI, which also pulls in PyTorch (the ML framework Whisper requires)
  • Create a symlink at ~/.local/bin/vocalinux
  • Download the default Whisper tiny speech model (~75MB)
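
To confirm that the installer left the files listed above in place, a small check along these lines may help. It is a minimal sketch, not part of Vocalinux: the three paths come from the list above, while the `check_install` helper and its labels are hypothetical.

```python
from pathlib import Path

# Paths relative to the home directory, copied from the installer description.
# The labels and the helper below are illustrative, not part of Vocalinux.
EXPECTED_PATHS = {
    "install clone": Path(".local/share/vocalinux-install"),
    "virtual environment": Path(".local/share/vocalinux/venv"),
    "launcher symlink": Path(".local/bin/vocalinux"),
}


def check_install(home: Path) -> dict:
    """Return {label: bool} for each path the installer is documented to create."""
    results = {}
    for label, relative in EXPECTED_PATHS.items():
        target = home / relative
        # is_symlink() also catches a dangling symlink, which exists() reports as False.
        results[label] = target.exists() or target.is_symlink()
    return results


if __name__ == "__main__":
    for label, present in check_install(Path.home()).items():
        print(f"{'ok' if present else 'missing':8} {label}")
```

A "missing" entry only means the path was not found under the current user's home directory; it does not say why.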

โฑ๏ธ Note: Installation takes ~5-10 minutes due to Whisper AI dependencies (PyTorch with CUDA support, ~2.3GB).

Whisper with CPU-only PyTorch (no NVIDIA GPU needed):

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh | bash -s -- --tag=v0.4.1-alpha --whisper-cpu

This installs Whisper with CPU-only PyTorch (~200MB instead of ~2.3GB), which suits systems without an NVIDIA GPU.

For low-RAM systems (8GB or less) - VOSK only:

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh | bash -s -- --tag=v0.4.1-alpha --no-whisper

This skips Whisper installation entirely and configures VOSK as the default engine.

Alternative: Install from Source

# Clone the repository
git clone https://github.com/jatinkrmalik/vocalinux.git
cd vocalinux

# Run the installer (will prompt for Whisper)
./install.sh

# Or with Whisper support
./install.sh --with-whisper

The installer handles everything: system dependencies, Python environment, speech models, and desktop integration.

After Installation

# If ~/.local/bin is in your PATH (recommended):
vocalinux

# Or activate the virtual environment first:
source ~/.local/bin/activate-vocalinux.sh
vocalinux

# Or run directly:
~/.local/share/vocalinux/venv/bin/vocalinux

Or launch it from your application menu!
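
Which of the launch options above applies depends on whether ~/.local/bin is on your PATH. The following is a minimal sketch of that check; the function name `local_bin_on_path` is hypothetical and the script is not shipped with Vocalinux.

```python
import os
from pathlib import Path


def local_bin_on_path(path_value: str, home: Path) -> bool:
    """Return True if <home>/.local/bin is one of the entries of a PATH-style string."""
    wanted = (home / ".local" / "bin").resolve()
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue  # an empty entry means "current directory", which is not a match
        if Path(entry).expanduser().resolve() == wanted:
            return True
    return False


if __name__ == "__main__":
    if local_bin_on_path(os.environ.get("PATH", ""), Path.home()):
        print("~/.local/bin is on PATH; running `vocalinux` directly should work.")
    else:
        print("~/.local/bin is not on PATH; use ~/.local/share/vocalinux/venv/bin/vocalinux.")
```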

Uninstall

# If installed via curl:
curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/uninstall.sh | bash

# If installed from source:
./uninstall.sh

📋 Requirements

  • OS: Ubuntu 22.04+ (other Linux distros may work)
  • Python: 3.8 or newer
  • Display: X11 or Wayland
  • Hardware: Microphone for voice input

🎙️ Usage

Voice Dictation

  1. Double-tap Ctrl to start recording
  2. Speak clearly into your microphone
  3. Double-tap Ctrl again (or pause speaking) to stop
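
The double-tap toggle in the steps above can be illustrated with a small timing sketch. This is not Vocalinux's implementation: the class names and the 0.3-second threshold are assumptions chosen for illustration, and timestamps are passed in explicitly so the logic can be tried without a keyboard hook.

```python
class DoubleTapDetector:
    """Report a double tap when two taps arrive within `max_gap` seconds."""

    def __init__(self, max_gap: float = 0.3):
        self.max_gap = max_gap   # assumed threshold, not a documented Vocalinux value
        self._last_tap = None    # timestamp of the previous unmatched tap

    def tap(self, timestamp: float) -> bool:
        """Register a key tap; return True if it completes a double tap."""
        if self._last_tap is not None and timestamp - self._last_tap <= self.max_gap:
            self._last_tap = None  # consume both taps so a third tap starts over
            return True
        self._last_tap = timestamp
        return False


class DictationToggle:
    """Flip a recording flag each time a double tap is detected."""

    def __init__(self, detector: DoubleTapDetector):
        self.detector = detector
        self.recording = False

    def on_ctrl_tap(self, timestamp: float) -> bool:
        if self.detector.tap(timestamp):
            self.recording = not self.recording
        return self.recording
```

Feeding taps at 0.0 s and 0.2 s starts recording; a single tap at 1.0 s changes nothing, and a second tap at 1.1 s stops recording again.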

Voice Commands

| Command | Action |
|---|---|
| "new line" | Inserts a line break |
| "period" / "full stop" | Types a period (.) |
| "comma" | Types a comma (,) |
| "question mark" | Types a question mark (?) |
| "exclamation mark" | Types an exclamation mark (!) |
| "delete that" | Deletes the last sentence |
| "capitalize" | Capitalizes the next word |

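To make the command table concrete, here is a minimal sketch of how spoken commands could be turned into text. It is an illustration only, not the project's command processor: `apply_commands` and `INSERTIONS` are hypothetical names, and "delete that" is left out to keep the example short.

```python
# Spoken phrase -> text to insert, taken from the command table.
INSERTIONS = {
    ("new", "line"): "\n",
    ("period",): ".",
    ("full", "stop"): ".",
    ("comma",): ",",
    ("question", "mark"): "?",
    ("exclamation", "mark"): "!",
}


def apply_commands(transcript: str) -> str:
    """Replace spoken punctuation commands in a transcript with the symbols they name."""
    words = transcript.split()
    text = ""
    capitalize_next = False
    i = 0
    while i < len(words):
        # Prefer two-word phrases ("new line") over one-word ones ("period").
        for size in (2, 1):
            phrase = tuple(w.lower() for w in words[i:i + size])
            if phrase in INSERTIONS:
                # Punctuation attaches to the previous word without a space.
                text = text.rstrip(" ") + INSERTIONS[phrase]
                i += size
                break
        else:
            word = words[i]
            i += 1
            if word.lower() == "capitalize":
                capitalize_next = True
                continue
            if capitalize_next:
                word = word.capitalize()
                capitalize_next = False
            if text and not text.endswith("\n"):
                text += " "
            text += word
    return text
```

A word such as "question" that is not followed by "mark" is treated as ordinary text, because only complete phrases from the table are matched.
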
Command Line Options

vocalinux --help              # Show all options
vocalinux --debug             # Enable debug logging
vocalinux --engine whisper    # Use Whisper AI engine
vocalinux --model medium      # Use medium-sized model
vocalinux --wayland           # Force Wayland mode
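
For readers who want to see how flags like these are typically parsed in Python, the sketch below mirrors the options listed above with `argparse`. It is not the project's actual parser: the program name `vocalinux-sketch` and the default values (copied from the example configuration in this README) are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the flags shown in the options list."""
    parser = argparse.ArgumentParser(prog="vocalinux-sketch")
    parser.add_argument("--debug", action="store_true", help="Enable debug logging")
    # Defaults below are assumptions, not documented Vocalinux behavior.
    parser.add_argument("--engine", default="vosk", help="Speech engine, e.g. vosk or whisper")
    parser.add_argument("--model", default="small", help="Model size, e.g. small or medium")
    parser.add_argument("--wayland", action="store_true", help="Force Wayland mode")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(["--engine", "whisper", "--model", "medium", "--debug"])
print(args.engine, args.model, args.debug, args.wayland)
```

`--help` is provided by `argparse` automatically, so it needs no explicit definition.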

⚙️ Configuration

Configuration is stored in ~/.config/vocalinux/config.json:

{
  "speech_recognition": {
    "engine": "vosk",
    "model_size": "small",
    "vad_sensitivity": 3,
    "silence_timeout": 2.0
  }
}

You can also configure settings through the graphical Settings dialog (right-click the tray icon).
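
If you want to read this file from your own scripts, a minimal sketch follows. The path and keys come from the example above; the `load_config` helper is hypothetical, and treating the example values as fallbacks is an assumption rather than documented Vocalinux behavior.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".config" / "vocalinux" / "config.json"

# Fallback values copied from the example config; assumed, not confirmed defaults.
DEFAULTS = {
    "speech_recognition": {
        "engine": "vosk",
        "model_size": "small",
        "vad_sensitivity": 3,
        "silence_timeout": 2.0,
    }
}


def load_config(path=CONFIG_PATH) -> dict:
    """Load config.json and fill in any missing keys from DEFAULTS."""
    config = {section: dict(values) for section, values in DEFAULTS.items()}
    try:
        user = json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return config  # no config file yet: use the fallbacks only
    for section, values in user.items():
        if isinstance(values, dict):
            config.setdefault(section, {}).update(values)
        else:
            config[section] = values
    return config


if __name__ == "__main__":
    settings = load_config()["speech_recognition"]
    print(f"engine={settings['engine']} model_size={settings['model_size']}")
```

Merging section by section means a file that sets only `"engine"` still yields a complete `speech_recognition` block.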

🔧 Development Setup

# Clone and install in dev mode
git clone https://github.com/jatinkrmalik/vocalinux.git
cd vocalinux
./install.sh --dev

# Activate environment
source venv/bin/activate

# Run tests
pytest

# Run from source with debug
python -m vocalinux.main --debug

📁 Project Structure

vocalinux/
├── src/vocalinux/           # Main application code
│   ├── speech_recognition/  # Speech recognition engines
│   ├── text_injection/      # Text injection (X11/Wayland)
│   ├── ui/                  # GTK UI components
│   └── utils/               # Utility functions
├── tests/                   # Test suite
├── resources/               # Icons and sounds
├── docs/                    # Documentation
└── web/                     # Website source

📖 Documentation

🗺️ Roadmap

  • Custom icon design ✅
  • Graphical settings dialog ✅
  • Whisper AI support ✅
  • Multi-language support (FR, DE, RU) ✅
  • In-app update mechanism
  • Application-specific commands
  • Debian/Ubuntu package (.deb)
  • Improved Wayland support
  • Voice command customization

🤝 Contributing

We welcome contributions! Whether it's bug reports, feature requests, or code contributions, please check out our Contributing Guide.

Quick Links

⭐ Support

If you find Vocalinux useful, please consider:

  • ⭐ Starring this repository
  • 🐛 Reporting bugs you encounter
  • 📖 Improving documentation
  • 🔀 Contributing code

📜 License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.


Made with ❤️ for the Linux community

