Voice transcription with global hotkeys and LLM correction
Project description
UltraWhisper
Open-source, context-aware voice transcription for Linux
An open-source alternative to SuperWhisper (Mac-only), combining OpenAI's Whisper speech-to-text with LLM-powered intelligence for smart, accurate transcriptions that adapt to your workflow.
What Makes UltraWhisper Different?
UltraWhisper goes beyond basic speech-to-text by understanding what you're working on and adapting its transcription accordingly. Whether you're coding in VS Code, browsing GitHub, or working in a terminal, it delivers transcriptions that fit seamlessly into your context.
Key Features
Context-Aware Transcription
- Automatically detects your active application (VS Code, Chrome, terminal, etc.)
- Adapts transcription to preserve code syntax, technical terms, and domain-specific language
LLM-Powered Correction
- Cleans up Whisper transcription using GPT-4, Claude, or local models
- Applies application-specific prompts for better accuracy
- Gracefully degrades to raw Whisper output if LLM is unavailable
Multi-Provider LLM Support
- OpenAI, Anthropic, Local Models (OpenAI-compatible)
Flexible Input Methods
- Double-tap: Quickly tap a key twice to toggle recording
- Push-to-talk: Hold to record, release to transcribe
Beautiful Terminal Interface
- Interactive TUI built with prompt-toolkit
- Real-time status display showing LLM connection, context, and system state
- Live logs and configuration visibility
Chat Mode (Conversational AI)
- Voice conversations with your AI assistant
- Maintains conversation history across questions
- Context-aware responses based on your active application
- TTS support for spoken responses
- MCP (Model Context Protocol) integration for extended capabilities
- Web search enabled by default
Privacy-First
- Use local LLMs for complete offline operation
- No data leaves your machine when using local models
Quick Start
Installation
# Clone the repository
git clone https://github.com/casonclagg/ultrawhisper.git
cd ultrawhisper
# Install dependencies
uv sync
# Run interactive setup to set API keys etc
uv run ultrawhisper setup
Basic Usage
# Run it!
uv run ultrawhisper
Configuration
Configuration is stored at ~/.config/ultrawhisper/config.yml. See config.example.yml for a complete example with all options.
Features in Detail
Context-Aware Prompts
UltraWhisper dynamically builds LLM prompts by combining:
- Base prompt from your configuration
- Application-specific prompts (VS Code, Chrome, terminals, etc.)
- Pattern matching against window titles (GitHub, Stack Overflow, etc.)
This ensures your transcriptions are corrected appropriately for your current context.
Mode Switching
Switch between Transcription Mode and Question Mode (soon to be called Chat Mode):
System Requirements
- Python: 3.10 or higher
- Operating System: Linux (X11) for full context detection
- Optional Dependencies:
xdotool- For advanced context detectionx11-utils- For window property detectionespeakorfestival- For system TTS (question mode)
Installing System Dependencies
# Ubuntu/Debian
sudo apt install xdotool x11-utils espeak
# Arch Linux
sudo pacman -S xdotool xorg-xprop espeak
# Fedora
sudo dnf install xdotool xorg-x11-utils espeak
Development
# Code formatting
uv run black src/
# Type checking
uv run mypy src/
# Linting
uv run flake8 src/
# Build package
uv build
# Run from source
uv run ultrawhisper
Architecture
UltraWhisper uses an orchestrator pattern where TranscriptionApp coordinates:
- Audio recording via configurable backends
- Whisper transcription (local or API)
- Context detection from active window
- LLM correction with context-aware prompts
- Text output to clipboard or active window
License
MIT License - See LICENSE for details
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Author
Cason Clagg - GitHub
Acknowledgments
- Built with OpenAI Whisper
- Uses faster-whisper for optimized inference
- Powered by OpenAI and Anthropic LLMs
- Terminal UI built with prompt-toolkit
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file ultrawhisper-0.1.0.tar.gz.
File metadata
- Download URL: ultrawhisper-0.1.0.tar.gz
- Upload date:
- Size: 240.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
921a4a96bdbe1e29ef1e9d082d21b9964d907033bb91c0025becda80771f7f8c
|
|
| MD5 |
a335ce111c250a8c4b465a25c6b03bdc
|
|
| BLAKE2b-256 |
6b22e94dd2bf9932ebf67a0432df598ce04305b247bf6f1d808d26c78ccc4bc9
|
File details
Details for the file ultrawhisper-0.1.0-py3-none-any.whl.
File metadata
- Download URL: ultrawhisper-0.1.0-py3-none-any.whl
- Upload date:
- Size: 82.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
fb4c18b3fd1ec4e039db3dfa5fac18fa80a6d0c3695dd161f461dde102f83adb
|
|
| MD5 |
cbca038792a55050907593b7207699d7
|
|
| BLAKE2b-256 |
8c9b031b87eb2192467d186fe837f4d814e48f16bdd37dba8cf02c5912e1d93a
|