A seamless voice dictation system for Linux

These details have not been verified by PyPI

Vocalinux

Voice-to-text for Linux, finally done right!

A seamless, free, open-source, and private voice dictation system for Linux, comparable to the built-in solutions on macOS and Windows.

📚 What's New in v0.6.0-beta

We're excited to announce our first Beta release! Vocalinux has evolved significantly with the biggest change yet: whisper.cpp is now the default speech recognition engine, bringing massive improvements to speed, compatibility, and ease of installation.

🚀 Major Change: whisper.cpp is Now Default!

Why this is a game-changer:

⚡ Up to 10x faster installation - No more 2.3GB PyTorch downloads (was ~5-10 min, now ~1-2 min)
🎮 Universal GPU support - Works with AMD, Intel, and NVIDIA via Vulkan (not just NVIDIA CUDA)
💾 Smaller footprint - The tiny model is only ~39MB, vs ~75MB for Whisper
🔥 Better performance - C++ optimized inference with multi-threading
🐍 No Python GIL - True parallel processing for faster transcription

🐛 Critical Bug Fixes

Fixed text escaping issues (apostrophes and quotes no longer have backslash escapes)
Fixed stop sound being transcribed as "thank you"
Fixed missing spaces after punctuation when dictation resumes

✨ Key New Features

🤖 whisper.cpp integration - High-performance C++ speech recognition with Vulkan GPU acceleration
📦 Interactive installer - Choose between 3 engines: whisper.cpp (recommended), Whisper, or VOSK
🔧 Hardware auto-detection - Automatically detects your GPU and recommends optimal settings
Customizable keyboard shortcuts - Configure your own activation shortcuts via GUI
Modern GNOME HIG settings dialog - Complete UI overhaul following GNOME design guidelines
Pleasant audio feedback - Gliding tones replace harsh beeps
Better Wayland support - Native keyboard shortcuts without XWayland

🔧 Quality Improvements

80%+ test coverage - Comprehensive test suite across all modules
Microphone reconnection - Automatic recovery when microphone disconnects
Audio buffer management - Prevents memory issues during long recordings
Enhanced logging - Comprehensive debug info for whisper.cpp troubleshooting

🎉 Beta Release with whisper.cpp!

We're excited to share Vocalinux Beta with the community! whisper.cpp brings up to 10x faster installation and universal GPU support. See "What's New" above for details.

✨ Features

🎤 Double-tap Ctrl to start/stop voice dictation
⚡ Real-time transcription with minimal latency
🌎 Universal compatibility across all Linux applications
🔒 100% Offline operation for privacy and reliability
🤖 whisper.cpp by default - High-performance C++ speech recognition
🎮 Universal GPU support - Vulkan acceleration for AMD, Intel, and NVIDIA
🎨 System tray integration with visual status indicators
🔊 Pleasant audio feedback - smooth gliding tones, headphone-friendly
⚙️ Graphical settings dialog for easy configuration
📦 3 engine choices - whisper.cpp (default), OpenAI Whisper, or VOSK

📸 Screenshots

Here are some screenshots showcasing Vocalinux in action:

Real-time voice-to-text transcription
System tray with listening indicator
About view with version info
Log viewer for debugging
Overview of key features and configuration options with annotations

🚀 Quick Install

Interactive Install (Recommended)

Our new interactive installer guides you through setup with intelligent hardware detection:

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh -o /tmp/vl.sh && bash /tmp/vl.sh --tag=v0.6.0-beta

Choose your engine:

whisper.cpp ⭐ (Recommended) - Fast, works with any GPU via Vulkan
Whisper (OpenAI) - PyTorch-based, NVIDIA GPU only
VOSK - Lightweight, works on older systems

The installer will:

Auto-detect your hardware (GPU, RAM, Vulkan support)
Recommend the best engine for your system
Download the appropriate model (~39MB for whisper.cpp tiny)
Install in ~1-2 minutes (vs 5-10 min with old Whisper)
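
The recommendation step can be pictured as a small decision function. The following is only a sketch under stated assumptions: the `Hardware` fields, the `recommend_engine` name, and the 4 GB threshold are hypothetical and are not taken from the actual `install.sh`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Hardware:
    ram_gb: float      # total system memory in GB
    has_vulkan: bool   # True if a Vulkan-capable GPU driver was found


def recommend_engine(hw: Hardware, low_ram_gb: float = 4.0) -> Tuple[str, bool]:
    """Return (engine, use_gpu) for the detected hardware.

    Hypothetical rule, not the installer's real logic:
    low-RAM machines get VOSK, everything else gets whisper.cpp,
    with GPU acceleration only when Vulkan is available.
    """
    if hw.ram_gb <= low_ram_gb:
        return ("vosk", False)
    return ("whisper_cpp", hw.has_vulkan)
```

For example, `recommend_engine(Hardware(ram_gb=16, has_vulkan=True))` would pick whisper.cpp with GPU acceleration, while a 4 GB machine would fall back to VOSK.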

Note: This command installs v0.6.0-beta. For other versions, check GitHub Releases.

Installation Options

Default (whisper.cpp - recommended):

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh -o /tmp/vl.sh && bash /tmp/vl.sh --tag=v0.6.0-beta

Fastest installation (~1-2 min), universal GPU support via Vulkan.

Whisper (OpenAI) - if you prefer PyTorch:

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh -o /tmp/vl.sh && bash /tmp/vl.sh --tag=v0.6.0-beta --engine=whisper

NVIDIA GPU only (~5-10 min, downloads PyTorch + CUDA).

VOSK only - for low-RAM systems:

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh -o /tmp/vl.sh && bash /tmp/vl.sh --tag=v0.6.0-beta --engine=vosk

Lightweight option (~40MB), works on systems with 4GB RAM.

Alternative: Install from Source

# Clone the repository
git clone https://github.com/jatinkrmalik/vocalinux.git
cd vocalinux

# Run the installer (will prompt for Whisper)
./install.sh

# Or with Whisper support
./install.sh --with-whisper

The installer handles everything: system dependencies, Python environment, speech models, and desktop integration.

After Installation

# If ~/.local/bin is in your PATH (recommended):
vocalinux

# Or activate the virtual environment first:
source ~/.local/bin/activate-vocalinux.sh
vocalinux

# Or run directly:
~/.local/share/vocalinux/venv/bin/vocalinux

Or launch it from your application menu!

📋 Requirements

OS: Linux (tested on Ubuntu 22.04+, Debian 11+, Fedora 39+, Arch Linux, openSUSE Tumbleweed)
Python: 3.8 or newer
Display: X11 or Wayland
Hardware: Microphone for voice input

Note: See Distribution Compatibility for distribution-specific information and experimental support for Gentoo, Alpine, Void, Solus, and more.

🎙️ Usage

Voice Dictation

Double-tap Ctrl to start recording
Speak clearly into your microphone
Double-tap Ctrl again (or pause speaking) to stop
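
For readers curious how a double-tap trigger can work, here is a minimal sketch. It assumes a 0.3 second window between taps; that value and the `DoubleTapDetector` class are illustrative and are not Vocalinux's actual implementation.

```python
import time
from typing import Optional


class DoubleTapDetector:
    """Reports a double-tap when two taps arrive within `max_gap` seconds."""

    def __init__(self, max_gap: float = 0.3) -> None:
        self.max_gap = max_gap  # assumed threshold, not taken from Vocalinux
        self._last_tap: Optional[float] = None

    def tap(self, now: Optional[float] = None) -> bool:
        """Register one key tap; return True if it completes a double-tap."""
        if now is None:
            now = time.monotonic()
        if self._last_tap is not None and now - self._last_tap <= self.max_gap:
            self._last_tap = None  # reset so a third tap starts a new pair
            return True
        self._last_tap = now
        return False
```

A keyboard listener would call `tap()` on each Ctrl release and toggle recording whenever it returns `True`. Using a monotonic clock avoids false triggers if the system time changes.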

Voice Commands

Command	Action
"new line"	Inserts a line break
"period" / "full stop"	Types a period (.)
"comma"	Types a comma (,)
"question mark"	Types a question mark (?)
"exclamation mark"	Types an exclamation mark (!)
"delete that"	Deletes the last sentence
"capitalize"	Capitalizes the next word

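The punctuation commands in the table can be thought of as a post-processing pass over the transcript. The sketch below covers only the commands that insert literal text; "delete that" and "capitalize" need editing state and are left out. The `apply_commands` function is hypothetical and does not reflect how Vocalinux processes commands internally.

```python
import re

# Spoken phrase -> literal text, taken from the command table.
PUNCTUATION = {
    "new line": "\n",
    "full stop": ".",
    "period": ".",
    "comma": ",",
    "question mark": "?",
    "exclamation mark": "!",
}

# Match any command as a whole phrase, along with the whitespace before it,
# so "hello comma" becomes "hello," rather than "hello ,".
_PATTERN = re.compile(
    r"\s*\b(" + "|".join(re.escape(p) for p in PUNCTUATION) + r")\b",
    re.IGNORECASE,
)


def apply_commands(transcript: str) -> str:
    """Replace spoken punctuation commands with their characters."""
    text = _PATTERN.sub(lambda m: PUNCTUATION[m.group(1).lower()], transcript)
    # Drop spaces left at the start of a line after "new line".
    return re.sub(r"\n[ \t]+", "\n", text)
```

Note that this naive approach also rewrites ordinary uses of a word such as "period", so a real implementation would need more context than a plain substitution.
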
Command Line Options

vocalinux --help                  # Show all options
vocalinux --debug                 # Enable debug logging
vocalinux --engine whisper_cpp    # Use whisper.cpp engine (default)
vocalinux --engine whisper        # Use OpenAI Whisper engine
vocalinux --engine vosk           # Use VOSK engine
vocalinux --model medium          # Use medium-sized model
vocalinux --wayland               # Force Wayland mode
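
As an illustration of how these options fit together, the sketch below mirrors them with `argparse`. The defaults shown (`whisper_cpp` and `tiny`) are assumptions based on the configuration example in this README, and `build_parser` is a hypothetical name, not the project's real entry point.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command line options."""
    parser = argparse.ArgumentParser(prog="vocalinux")
    parser.add_argument("--debug", action="store_true", help="Enable debug logging")
    parser.add_argument(
        "--engine",
        choices=["whisper_cpp", "whisper", "vosk"],
        default="whisper_cpp",  # assumed default, per the README
        help="Speech recognition engine",
    )
    parser.add_argument(
        "--model",
        default="tiny",  # assumed default, per the config example
        help="Model size, e.g. tiny or medium",
    )
    parser.add_argument("--wayland", action="store_true", help="Force Wayland mode")
    return parser
```

Calling `build_parser().parse_args(["--engine", "vosk", "--model", "medium"])` yields a namespace with `engine="vosk"` and `model="medium"`.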

⚙️ Configuration

Configuration is stored in ~/.config/vocalinux/config.json:

{
  "speech_recognition": {
    "engine": "whisper_cpp",
    "model_size": "tiny",
    "vad_sensitivity": 3,
    "silence_timeout": 2.0
  }
}

You can also configure settings through the graphical Settings dialog (right-click the tray icon).
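
If you want to read this file from your own scripts, a loader along the following lines may help. It is a sketch only: the `load_config` function and the merge-with-defaults behavior are assumptions, and the default values simply repeat the example above.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".config" / "vocalinux" / "config.json"

# Values copied from the example config; treated here as fallbacks.
DEFAULTS = {
    "speech_recognition": {
        "engine": "whisper_cpp",
        "model_size": "tiny",
        "vad_sensitivity": 3,
        "silence_timeout": 2.0,
    }
}


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Read the config file and fill in missing keys from DEFAULTS."""
    try:
        user = json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        user = {}  # no config yet: fall back to defaults entirely
    merged = {}
    for section, defaults in DEFAULTS.items():
        merged[section] = {**defaults, **user.get(section, {})}
    return merged
```

For instance, `load_config()["speech_recognition"]["silence_timeout"]` returns the configured timeout, or `2.0` when the key or the file is absent.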

🔧 Development Setup

# Clone and install in dev mode
git clone https://github.com/jatinkrmalik/vocalinux.git
cd vocalinux
./install.sh --dev

# Activate environment
source venv/bin/activate

# Run tests
pytest

# Run from source with debug
python -m vocalinux.main --debug

📁 Project Structure

vocalinux/
├── src/vocalinux/                 # Main application code
│   ├── speech_recognition/        # Speech recognition engines (VOSK, Whisper, whisper.cpp)
│   │   └── recognition_manager.py # Unified engine interface
│   ├── text_injection/            # Text injection (X11/Wayland)
│   ├── ui/                        # GTK UI components
│   └── utils/                     # Utility functions
│       ├── whispercpp_model_info.py   # whisper.cpp model metadata & hardware detection
│       └── vosk_model_info.py         # VOSK model metadata
├── tests/                         # Test suite
├── scripts/                       # Development utilities
│   └── generate_sounds.py         # Sound generation script
├── resources/                     # Icons and sounds
├── docs/                          # Documentation
└── web/                           # Website source

📖 Documentation

Installation Guide - Detailed installation instructions
Update Guide - How to update Vocalinux
User Guide - Complete user documentation
Contributing - Development setup and contribution guidelines

🔊 Sound Customization

Vocalinux uses smooth, pleasant gliding tones for audio feedback:

Start: Ascending F4→A4 (0.6s) - positive, uplifting
Stop: Descending A4→F4 (0.6s) - signals completion
Error: Lower descending E4→C4 (0.7s) - gentle but noticeable

All sounds use pure sine waves with smoothstep interpolation for buttery smooth pitch transitions - perfect for headphone use!
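
To make the idea concrete, here is a rough sketch of a sine glide with smoothstep interpolation, using only the Python standard library. It is not the code in `scripts/generate_sounds.py`; the sample rate, amplitude, and function names are assumptions chosen for illustration.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # assumed; the README does not state a sample rate
F4, A4 = 349.23, 440.00  # equal-temperament pitches in Hz


def smoothstep(t: float) -> float:
    """Cubic ease 3t^2 - 2t^3, clamped to the range [0, 1]."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def glide(f_start: float, f_end: float, duration: float, amplitude: float = 0.3) -> list:
    """Return float samples of a sine tone gliding from f_start to f_end."""
    n = int(SAMPLE_RATE * duration)
    samples, phase = [], 0.0
    for i in range(n):
        freq = f_start + (f_end - f_start) * smoothstep(i / max(n - 1, 1))
        # Accumulate phase so the pitch can change without discontinuities.
        phase += 2.0 * math.pi * freq / SAMPLE_RATE
        samples.append(amplitude * math.sin(phase))
    return samples


def write_wav(path: str, samples: list) -> None:
    """Write the samples as mono 16-bit PCM."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))
```

Under these assumptions, `write_wav("start.wav", glide(F4, A4, 0.6))` would produce something like the start tone, and swapping the two frequencies gives the descending stop tone. The smoothstep curve eases the pitch in and out, which is what keeps the transition from sounding abrupt.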

Regenerate Sounds

To modify or regenerate the notification sounds:

python scripts/generate_sounds.py

This script generates all three sounds using the same smooth glide algorithm. You can edit the frequencies, durations, and amplitudes in the script to customize the sounds to your preference.

🗺️ Roadmap

~~Custom icon design~~ ✅
~~Graphical settings dialog~~ ✅
~~Whisper AI support~~ ✅
~~Multi-language support (FR, DE, RU)~~ ✅
~~whisper.cpp integration (default engine)~~ ✅
~~Vulkan GPU support~~ ✅
In-app update mechanism
Application-specific commands
Debian/Ubuntu package (.deb)
~~Improved Wayland support~~ ✅
Voice command customization

🤝 Contributing

We welcome contributions! Whether it's bug reports, feature requests, or code contributions, please check out our Contributing Guide.

⭐ Support

If you find Vocalinux useful, please consider:

⭐ Starring this repository
🐛 Reporting bugs you encounter
📖 Improving documentation
🔀 Contributing code

📜 License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.

Made with ❤️ for the Linux community

Release history

0.10.1b0 pre-release

Mar 30, 2026

0.10.0b0 pre-release

Mar 26, 2026

0.9.0b0 pre-release

Mar 14, 2026

0.8.0b0 pre-release

Mar 1, 2026

0.7.0b0 pre-release

Feb 23, 2026

0.6.3b0 pre-release

Feb 19, 2026

0.6.2b0 pre-release

Feb 18, 2026

0.6.1b0 pre-release

Feb 12, 2026

This version

0.6.0b0 pre-release

Feb 12, 2026

0.5.0b0 pre-release

Feb 6, 2026

0.4.1a0 pre-release

Jan 29, 2026

0.4.0a0 pre-release

Jan 29, 2026

0.3.0a0 pre-release

Jan 21, 2026

Download files

Source Distribution

vocalinux-0.6.0b0.tar.gz (397.1 kB)

Uploaded Feb 12, 2026 Source

Built Distribution

vocalinux-0.6.0b0-py3-none-any.whl (470.1 kB)

Uploaded Feb 12, 2026 Python 3

File details

Details for the file vocalinux-0.6.0b0.tar.gz.

File metadata

Download URL: vocalinux-0.6.0b0.tar.gz
Upload date: Feb 12, 2026
Size: 397.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for vocalinux-0.6.0b0.tar.gz
Algorithm	Hash digest
SHA256	`27861fcaf8f89f1ab4ed44b78edefce7247357e84476bf15b7bab70425c5a001`
MD5	`5aa69d413171aaf946dfc88f8c29d94e`
BLAKE2b-256	`8a75b703aa11f305e3e25669b10f88c78b81bdce0ce9b793a70ce8f72a8cc233`

File details

Details for the file vocalinux-0.6.0b0-py3-none-any.whl.

File metadata

Download URL: vocalinux-0.6.0b0-py3-none-any.whl
Upload date: Feb 12, 2026
Size: 470.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for vocalinux-0.6.0b0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c3d3d329996bde6c253483aba8f53054b4b2503b818695c10770e6a1eb506338`
MD5	`4cfe74e139f1008046ab3e30afaf78ce`
BLAKE2b-256	`6e6c86710922ff1239e52f4665e18d276b27bafbb0d438ae54f68535ea786fe1`
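
The digests listed for both files can be checked against a downloaded copy. The sketch below uses `hashlib` from the standard library; the `sha256_of` and `verify` helpers are illustrative names, and only the SHA256 values are checked.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the hash tables for this release.
EXPECTED_SHA256 = {
    "vocalinux-0.6.0b0.tar.gz":
        "27861fcaf8f89f1ab4ed44b78edefce7247357e84476bf15b7bab70425c5a001",
    "vocalinux-0.6.0b0-py3-none-any.whl":
        "c3d3d329996bde6c253483aba8f53054b4b2503b818695c10770e6a1eb506338",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large downloads need little memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path) -> bool:
    """Return True only if the file name is known and its digest matches."""
    expected = EXPECTED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

For example, `verify(Path("vocalinux-0.6.0b0.tar.gz"))` should return `True` for an intact download and `False` for a corrupted or unknown file.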
