Voice transcription with global hotkeys and LLM correction

UltraWhisper

Open-source, context-aware voice transcription for Linux

An open-source alternative to SuperWhisper (macOS-only) that combines OpenAI's Whisper speech-to-text with LLM-based correction to produce accurate transcriptions that adapt to your workflow.

What Makes UltraWhisper Different?

UltraWhisper goes beyond basic speech-to-text by understanding what you're working on and adapting its transcription accordingly. Whether you're coding in VS Code, browsing GitHub, or working in a terminal, it delivers transcriptions that fit seamlessly into your context.

Quick Start

Try It (No Installation Required)

# Run directly with uvx - no installation needed!

# Set up your config
uvx ultrawhisper setup

# Run it
uvx ultrawhisper

Key Features

Context-Aware Transcription

  • Automatically detects your active application (VS Code, Chrome, terminal, etc.)
  • Adapts transcription to preserve code syntax, technical terms, and domain-specific language

LLM-Powered Correction

  • Cleans up Whisper transcription using GPT-4, Claude, or local models
  • Applies application-specific prompts for better accuracy
  • Gracefully degrades to raw Whisper output if LLM is unavailable
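
The fallback behavior in the last bullet can be pictured with a minimal sketch. This is not UltraWhisper's actual code: the function name `correct_with_fallback` and the `llm_correct` callable are hypothetical, chosen only to illustrate returning the raw Whisper text whenever the correction step is missing or fails.

```python
from typing import Callable, Optional


def correct_with_fallback(
    raw_text: str,
    llm_correct: Optional[Callable[[str], str]],
) -> str:
    """Return LLM-corrected text, or the raw transcript if correction fails.

    `llm_correct` is a hypothetical callable standing in for whichever
    provider is configured; it is not part of UltraWhisper's API.
    """
    if llm_correct is None:
        # No LLM configured: pass the Whisper output through unchanged.
        return raw_text
    try:
        corrected = llm_correct(raw_text)
    except Exception:
        # Network error, timeout, bad credentials, etc.: keep the raw text.
        return raw_text
    # Treat an empty or whitespace-only reply as a failed correction.
    return corrected.strip() or raw_text
```

Catching a broad `Exception` is deliberate in this sketch: a dictation tool that drops the user's words because a provider is down is worse than one that pastes slightly rough text.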

Multi-Provider LLM Support

  • OpenAI, Anthropic, and local models (via an OpenAI-compatible API)

Flexible Input Methods

  • Double-tap: Quickly tap a key twice to toggle recording
  • Push-to-talk: Hold to record, release to transcribe
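
As a rough illustration of the double-tap method, the sketch below treats two presses of the same key within a short window as one toggle. The class name `DoubleTapDetector`, the 0.4-second window, and the injectable `clock` are all assumptions made for this example; UltraWhisper's real detection logic and defaults may differ.

```python
import time
from typing import Callable, Optional


class DoubleTapDetector:
    """Report a double-tap when two presses land within `window_seconds`."""

    def __init__(
        self,
        window_seconds: float = 0.4,  # assumed value, not a documented default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.window_seconds = window_seconds
        self._clock = clock
        self._last_tap: Optional[float] = None

    def on_key_press(self) -> bool:
        """Call on every press of the hotkey; True means 'toggle recording'."""
        now = self._clock()
        if self._last_tap is not None and now - self._last_tap <= self.window_seconds:
            # Second tap arrived in time. Reset so a third tap starts a new pair.
            self._last_tap = None
            return True
        # First tap, or the previous tap was too long ago.
        self._last_tap = now
        return False
```

Using a monotonic clock keeps the timing immune to system clock adjustments, and passing the clock in makes the detector easy to test without sleeping.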

Beautiful Terminal Interface

  • Interactive TUI built with prompt-toolkit
  • Real-time status display showing LLM connection, context, and system state
  • Live logs and configuration visibility

Chat Mode (Conversational AI)

  • Voice conversations with your AI assistant
  • Maintains conversation history across questions
  • Context-aware responses based on your active application
  • TTS support for spoken responses
  • MCP (Model Context Protocol) integration for extended capabilities
  • Web search enabled by default

Privacy-First

  • Use local LLMs for complete offline operation
  • No data leaves your machine when using local models

Installation

For regular use, install from PyPI:

# Install with uv
uv pip install ultrawhisper

# Or with pip
pip install ultrawhisper

# Run interactive setup
ultrawhisper setup

# Run it
ultrawhisper

Configuration

Configuration is stored at ~/.config/ultrawhisper/config.yml. See config.example.yml for a complete example with all options.

Features in Detail

Context-Aware Prompts

UltraWhisper dynamically builds LLM prompts by combining:

  • Base prompt from your configuration
  • Application-specific prompts (VS Code, Chrome, terminals, etc.)
  • Pattern matching against window titles (GitHub, Stack Overflow, etc.)

This ensures your transcriptions are corrected appropriately for your current context.
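
A minimal sketch of how the three sources above could be combined is shown here. The function `build_prompt`, its parameters, and the sample prompt strings are hypothetical and exist only for illustration; the real configuration keys and matching rules live in UltraWhisper's config file and source.

```python
import re
from typing import Dict, List


def build_prompt(
    base_prompt: str,
    app_prompts: Dict[str, str],
    title_patterns: Dict[str, str],
    app_name: str,
    window_title: str,
) -> str:
    """Combine base, per-application, and window-title prompts into one string."""
    parts: List[str] = [base_prompt]

    # Application-specific prompt, looked up by lowercased application name.
    app_prompt = app_prompts.get(app_name.lower())
    if app_prompt:
        parts.append(app_prompt)

    # Every regex that matches the window title contributes its prompt.
    for pattern, prompt in title_patterns.items():
        if re.search(pattern, window_title, flags=re.IGNORECASE):
            parts.append(prompt)

    return "\n\n".join(parts)


prompt = build_prompt(
    base_prompt="Fix transcription errors without changing the meaning.",
    app_prompts={"code": "Preserve identifiers and code syntax."},
    title_patterns={r"github": "Expect pull request and issue terminology."},
    app_name="Code",
    window_title="Pull requests - GitHub - Mozilla Firefox",
)
print(prompt)
```

In this sketch the pieces are layered from general to specific, so the most context-dependent instruction comes last in the prompt.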

Mode Switching

Switch between Transcription Mode and Question Mode (soon to be renamed Chat Mode).

System Requirements

  • Python: 3.10 or higher
  • Operating System: Linux (X11) for full context detection
  • Optional Dependencies:
    • xdotool - For advanced context detection
    • x11-utils - For window property detection
    • espeak or festival - For system TTS (question mode)

Installing System Dependencies

# Ubuntu/Debian
sudo apt install xdotool x11-utils espeak

# Arch Linux
sudo pacman -S xdotool xorg-xprop espeak

# Fedora
sudo dnf install xdotool xorg-x11-utils espeak
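
To confirm which optional tools are actually on your PATH after installing, a small check like the following may help. It is a sketch, not part of UltraWhisper: the helper `find_tools` is hypothetical, and the tool list assumes that the window-property packages above provide the `xprop` binary.

```python
import shutil
from typing import Callable, Dict, Iterable, Optional


def find_tools(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


if __name__ == "__main__":
    # Assumed binary names for the optional dependencies listed above.
    for tool, path in find_tools(["xdotool", "xprop", "espeak", "festival"]).items():
        print(f"{tool}: {path or 'not found'}")
```

Only one of `espeak` or `festival` is needed for system TTS, so a single "not found" among those two is not a problem.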

Development

Want to contribute or modify UltraWhisper? Here's how to set up a development environment:

# Clone the repository
git clone https://github.com/casonclagg/ultrawhisper.git
cd ultrawhisper

# Install dependencies
uv sync

# Run from source
uv run ultrawhisper

# Code formatting
uv run black src/

# Type checking
uv run mypy src/

# Linting
uv run flake8 src/

# Build package
uv build

Architecture

UltraWhisper uses an orchestrator pattern where TranscriptionApp coordinates:

  1. Audio recording via configurable backends
  2. Whisper transcription (local or API)
  3. Context detection from active window
  4. LLM correction with context-aware prompts
  5. Text output to clipboard or active window
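
The five steps above can be sketched as a small orchestrator that wires independent stages together. This is an illustration of the pattern only: the `Pipeline` class and its field names are hypothetical and do not reflect the real `TranscriptionApp` interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Hypothetical orchestrator: each stage is a plain callable."""

    record: Callable[[], bytes]            # 1. audio recording
    transcribe: Callable[[bytes], str]     # 2. Whisper transcription
    detect_context: Callable[[], str]      # 3. active-window context
    correct: Callable[[str, str], str]     # 4. LLM correction (text, context)
    output: Callable[[str], None]          # 5. clipboard or active window

    def run_once(self) -> str:
        """Run one record-to-output cycle and return the final text."""
        audio = self.record()
        raw_text = self.transcribe(audio)
        context = self.detect_context()
        final_text = self.correct(raw_text, context)
        self.output(final_text)
        return final_text
```

Keeping each stage behind a callable means a backend can be swapped (for example, local Whisper for an API) without touching the coordination logic.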

License

MIT License - See LICENSE for details

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Author

Cason Clagg - GitHub
