A seamless voice dictation system for Linux
Vocalinux
Voice-to-text for Linux, finally done right!
A seamless, free, open-source, and private voice dictation system for Linux, comparable to the built-in solutions on macOS and Windows.
Alpha Release!
We're excited to share Vocalinux with the community. Try it out and let us know what you think!
Features
- Double-tap Ctrl to start/stop voice dictation
- Real-time transcription with minimal latency
- Universal compatibility across all Linux applications
- Offline operation for privacy and reliability (with VOSK)
- Optional Whisper AI support for enhanced accuracy
- System tray integration with visual status indicators
- Audio feedback for recording status
- Graphical settings dialog for easy configuration
Quick Install
One-liner Installation (Recommended)
curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh | bash -s -- --tag=v0.3.0-alpha
Note: This command installs the release pinned by --tag (v0.3.0-alpha). For the most recent version, check GitHub Releases.
This will:
- Clone the repository to ~/.local/share/vocalinux-install
- Install all system dependencies
- Set up a virtual environment in ~/.local/share/vocalinux/venv
- Install both VOSK and Whisper AI speech engines:
  - VOSK: installs the vosk Python package from PyPI
  - Whisper: installs the openai-whisper package from PyPI, which also pulls in PyTorch (the ML framework Whisper requires)
- Create a symlink at ~/.local/bin/vocalinux
- Download the default Whisper tiny speech model (~75MB)
Note: Installation takes ~5-10 minutes due to Whisper AI dependencies (PyTorch with CUDA support, ~2.3GB).
Whisper with CPU-only PyTorch (no NVIDIA GPU needed):
curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh | bash -s -- --tag=v0.3.0-alpha --whisper-cpu
This installs Whisper with CPU-only PyTorch (~200MB instead of ~2.3GB). It works well on systems without an NVIDIA GPU.
For low-RAM systems (8GB or less) - VOSK only:
curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh | bash -s -- --tag=v0.3.0-alpha --no-whisper
This skips Whisper installation entirely and configures VOSK as the default engine.
Alternative: Install from Source
# Clone the repository
git clone https://github.com/jatinkrmalik/vocalinux.git
cd vocalinux
# Run the installer (will prompt for Whisper)
./install.sh
# Or with Whisper support
./install.sh --with-whisper
The installer handles everything: system dependencies, Python environment, speech models, and desktop integration.
After Installation
# If ~/.local/bin is in your PATH (recommended):
vocalinux
# Or activate the virtual environment first:
source ~/.local/bin/activate-vocalinux.sh
vocalinux
# Or run directly:
~/.local/share/vocalinux/venv/bin/vocalinux
Or launch it from your application menu!
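The launch options above amount to "use the copy on PATH if there is one, otherwise use the launcher inside the virtual environment". A minimal Python sketch of that lookup follows. The function name `find_vocalinux` is hypothetical and not part of Vocalinux; only the two paths come from the notes above.

```python
import os
import shutil
from pathlib import Path
from typing import Optional

# Fallback location taken from the install notes above.
VENV_LAUNCHER = Path.home() / ".local/share/vocalinux/venv/bin/vocalinux"


def find_vocalinux(path_env: Optional[str] = None,
                   fallback: Path = VENV_LAUNCHER) -> Optional[str]:
    """Return a path to the vocalinux launcher, or None if it is not found.

    Hypothetical helper for illustration; not shipped with Vocalinux.
    """
    # shutil.which searches the given PATH string (or the process PATH if None).
    found = shutil.which("vocalinux", path=path_env)
    if found:
        return found
    # Otherwise fall back to the virtual environment's own launcher.
    if fallback.is_file() and os.access(fallback, os.X_OK):
        return str(fallback)
    return None
```

If this returns None, the most likely causes are that ~/.local/bin is not on PATH or that the installer did not finish.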
Uninstall
# If installed via curl:
curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/uninstall.sh | bash
# If installed from source:
./uninstall.sh
Requirements
- OS: Ubuntu 22.04+ (other Linux distros may work)
- Python: 3.8 or newer
- Display: X11 or Wayland
- Hardware: Microphone for voice input
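Two of these requirements can be checked from Python before installing. The sketch below is an illustration, not part of Vocalinux: the helper names are hypothetical, and it assumes the session exposes the usual `XDG_SESSION_TYPE`, `WAYLAND_DISPLAY`, or `DISPLAY` environment variables.

```python
import sys
from typing import Mapping, Tuple

MIN_PYTHON = (3, 8)  # from the Requirements list above


def python_ok(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """True if the given (major, minor) version meets the minimum."""
    return tuple(version) >= MIN_PYTHON


def display_server(env: Mapping[str, str]) -> str:
    """Classify the session as 'wayland', 'x11', or 'unknown'."""
    session = env.get("XDG_SESSION_TYPE", "").lower()
    if session in ("wayland", "x11"):
        return session
    # Fall back to the display variables each server usually sets.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"
```

Calling `display_server(os.environ)` after `import os` reports the current session.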
Usage
Voice Dictation
- Double-tap Ctrl to start recording
- Speak clearly into your microphone
- Double-tap Ctrl again (or pause speaking) to stop
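One way to picture the double-tap trigger is a detector that compares the timestamps of consecutive key taps. This is a sketch under assumptions, not the project's implementation: the class name and the 0.3-second window are hypothetical, and keyboard capture is left out.

```python
from typing import Optional


class DoubleTapDetector:
    """Reports a double tap when two taps arrive within `window` seconds."""

    def __init__(self, window: float = 0.3) -> None:
        self.window = window  # hypothetical threshold, not taken from the project
        self._last_tap: Optional[float] = None

    def tap(self, timestamp: float) -> bool:
        """Record a key tap; return True if it completes a double tap."""
        last = self._last_tap
        if last is not None and 0 <= timestamp - last <= self.window:
            self._last_tap = None  # reset so a third tap starts a new pair
            return True
        self._last_tap = timestamp
        return False
```

A caller would feed it `time.monotonic()` on each Ctrl release and toggle recording whenever `tap` returns True.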
Voice Commands
| Command | Action |
|---|---|
| "new line" | Inserts a line break |
| "period" / "full stop" | Types a period (.) |
| "comma" | Types a comma (,) |
| "question mark" | Types a question mark (?) |
| "exclamation mark" | Types an exclamation mark (!) |
| "delete that" | Deletes the last sentence |
| "capitalize" | Capitalizes the next word |
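To make the table concrete, here is a small sketch of how such commands could be applied to a finished transcript. It is not how Vocalinux processes commands internally. The function name is hypothetical, "delete that" is omitted because it needs editing state, and a plain word such as "period" used in its ordinary sense would also be replaced.

```python
import re

# Spoken phrase -> replacement text, taken from the table above.
PUNCTUATION = {
    "new line": "\n",
    "period": ".",
    "full stop": ".",
    "comma": ",",
    "question mark": "?",
    "exclamation mark": "!",
}

# Match any phrase as a whole word, together with the spaces before it.
_PATTERN = re.compile(
    r"\s*\b(" + "|".join(re.escape(p) for p in PUNCTUATION) + r")\b",
    re.IGNORECASE,
)
_CAPITALIZE = re.compile(r"\bcapitalize\s+(\w)", re.IGNORECASE)


def apply_commands(transcript: str) -> str:
    """Replace spoken commands in a transcript with the text they stand for."""
    # "capitalize <word>" upper-cases the first letter of the next word.
    text = _CAPITALIZE.sub(lambda m: m.group(1).upper(), transcript)
    # Punctuation attaches to the preceding word, so leading spaces are dropped.
    text = _PATTERN.sub(lambda m: PUNCTUATION[m.group(1).lower()], text)
    # Remove spaces left directly after an inserted line break.
    return re.sub(r"\n[ \t]+", "\n", text)
```

For example, `apply_commands("hello comma world period")` returns `"hello, world."`.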
Command Line Options
vocalinux --help # Show all options
vocalinux --debug # Enable debug logging
vocalinux --engine whisper # Use Whisper AI engine
vocalinux --model medium # Use medium-sized model
vocalinux --wayland # Force Wayland mode
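When launching Vocalinux from a script, the flags above can be assembled into an argument list. The helper below is a hypothetical convenience wrapper, not part of the project, and it uses only the flags shown in this section.

```python
from typing import List, Optional


def build_command(engine: Optional[str] = None, model: Optional[str] = None,
                  debug: bool = False, wayland: bool = False,
                  executable: str = "vocalinux") -> List[str]:
    """Build an argument list using only the flags listed above."""
    cmd = [executable]
    if debug:
        cmd.append("--debug")
    if engine:
        cmd += ["--engine", engine]
    if model:
        cmd += ["--model", model]
    if wayland:
        cmd.append("--wayland")
    return cmd
```

The result can be passed to `subprocess.run`, for example `subprocess.run(build_command(engine="whisper", model="medium"))`.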
Configuration
Configuration is stored in ~/.config/vocalinux/config.json:
{
"speech_recognition": {
"engine": "vosk",
"model_size": "small",
"vad_sensitivity": 3,
"silence_timeout": 2.0
}
}
You can also configure settings through the graphical Settings dialog (right-click the tray icon).
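The file can also be read or changed from a script. The sketch below assumes the file contains only the keys shown in the example above; the real application may store more settings, and the helper names `load_config` and `set_engine` are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict

CONFIG_PATH = Path.home() / ".config/vocalinux/config.json"

# Defaults mirror the example above; the real application may define others.
DEFAULTS: Dict[str, Any] = {
    "speech_recognition": {
        "engine": "vosk",
        "model_size": "small",
        "vad_sensitivity": 3,
        "silence_timeout": 2.0,
    }
}


def load_config(path: Path = CONFIG_PATH) -> Dict[str, Any]:
    """Read the config file and fill in any missing keys from DEFAULTS."""
    try:
        user = json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        user = {}
    merged = {section: dict(values) for section, values in DEFAULTS.items()}
    for section, values in user.items():
        if isinstance(values, dict):
            merged.setdefault(section, {}).update(values)
        else:
            merged[section] = values
    return merged


def set_engine(engine: str, path: Path = CONFIG_PATH) -> None:
    """Persist a different engine name, keeping the other settings."""
    config = load_config(path)
    config["speech_recognition"]["engine"] = engine
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    Path(path).write_text(json.dumps(config, indent=2), encoding="utf-8")
```

Merging with defaults means a partially written file still yields a complete `speech_recognition` section.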
Development Setup
# Clone and install in dev mode
git clone https://github.com/jatinkrmalik/vocalinux.git
cd vocalinux
./install.sh --dev
# Activate environment
source venv/bin/activate
# Run tests
pytest
# Run from source with debug
python -m vocalinux.main --debug
Project Structure
vocalinux/
├── src/vocalinux/ # Main application code
│   ├── speech_recognition/ # Speech recognition engines
│   ├── text_injection/ # Text injection (X11/Wayland)
│   ├── ui/ # GTK UI components
│   └── utils/ # Utility functions
├── tests/ # Test suite
├── resources/ # Icons and sounds
├── docs/ # Documentation
└── web/ # Website source
Documentation
- Installation Guide - Detailed installation instructions
- User Guide - Complete user documentation
- Contributing - Development setup and contribution guidelines
Roadmap
- Custom icon design (done)
- Graphical settings dialog (done)
- Whisper AI support (done)
- Multi-language support
- Application-specific commands
- Debian/Ubuntu package (.deb)
- Improved Wayland support
- Voice command customization
Contributing
We welcome contributions! Whether it's bug reports, feature requests, or code contributions, please check out our Contributing Guide.
Quick Links
- Report a Bug
- Request a Feature
- Discussions
Support
If you find Vocalinux useful, please consider:
- Starring this repository
- Reporting bugs you encounter
- Improving documentation
- Contributing code
License
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
Made with ❤️ for the Linux community
File details
Details for the file vocalinux-0.4.0a0.tar.gz.
File metadata
- Download URL: vocalinux-0.4.0a0.tar.gz
- Upload date:
- Size: 773.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aec985f69e070df15445c4a8ae8c3bd2081eecb7aa2028d34536f749f334822a |
| MD5 | 8bcc97562e6bd89bdb34161c8fdc1215 |
| BLAKE2b-256 | 5b135739424f6bdf53e7959b1f53459b219c4a6e0a05b52f078df0bbeb122585 |
File details
Details for the file vocalinux-0.4.0a0-py3-none-any.whl.
File metadata
- Download URL: vocalinux-0.4.0a0-py3-none-any.whl
- Upload date:
- Size: 869.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 49bc851c0cada6d96156cb0c48cc4e922dab082a127fac0ec589343d3148ca5e |
| MD5 | 4a566f629209d3a4985621cae81404f3 |
| BLAKE2b-256 | 31b9e123663132a78e7f1aa1781e286fedd57beec9bcddeb05b40024dd2ff786 |
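The digests listed for each file can be checked against a downloaded copy with Python's standard `hashlib` module. This is a minimal sketch: the function names are hypothetical, and it treats BLAKE2b-256 as BLAKE2b with a 32-byte digest.

```python
import hashlib
from pathlib import Path
from typing import Dict


def file_digests(path: Path, chunk_size: int = 1 << 20) -> Dict[str, str]:
    """Compute SHA256, MD5, and BLAKE2b-256 for a file, reading it in chunks."""
    hashers = {
        "sha256": hashlib.sha256(),
        "md5": hashlib.md5(),
        # BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest.
        "blake2b_256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches(path: Path, expected_sha256: str) -> bool:
    """Compare a file's SHA256 with an expected hex digest."""
    return file_digests(path)["sha256"] == expected_sha256.strip().lower()
```

For example, `matches(Path("vocalinux-0.4.0a0.tar.gz"), "aec985f6...")` with the full SHA256 value listed for that file should return True for an intact download.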