Voice-controlled AI assistant interface for Claude Code using Apple MLX Whisper

Claude Whisper

Voice-controlled interface for Claude Code using Apple MLX Whisper for speech recognition.

Overview

Claude Whisper enables voice interaction with Claude Code through push-to-talk functionality. Hold a configurable key (default: ESC), speak your wake word followed by a command, and Claude Whisper will transcribe and execute it using the Claude Agent SDK.

Features

  • Push-to-talk interface with configurable hotkey
  • Real-time speech recognition using MLX Whisper (optimized for Apple Silicon)
  • Desktop notifications for task status
  • Direct integration with Claude Agent SDK
  • Configurable wake word to trigger Claude commands
  • TOML-based configuration with environment variable overrides

Requirements

  • macOS with Apple Silicon (for MLX acceleration)
  • Python 3.10+
  • PortAudio (for microphone input)
  • Anthropic API key (for Claude Agent SDK)
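A quick preflight check for the requirements above can be sketched as follows. This is not part of Claude Whisper: `check_requirements` is a hypothetical helper, and the `ANTHROPIC_API_KEY` variable name is an assumption based on the usual Anthropic SDK convention. PortAudio is not checked here.

```python
import os
import platform
import sys


def check_requirements(system, machine, version, environ):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Apple Silicon Macs report "Darwin" / "arm64" when Python runs natively.
    if system != "Darwin" or machine != "arm64":
        problems.append("macOS on Apple Silicon is required for MLX acceleration")
    if tuple(version[:2]) < (3, 10):
        problems.append("Python 3.10 or newer is required")
    # Assumption: the API key is supplied through ANTHROPIC_API_KEY.
    if not environ.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_requirements(
        platform.system(), platform.machine(), sys.version_info, os.environ
    ):
        print("warning:", problem)
```

Passing the platform values in as arguments keeps the check easy to test on any machine.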

macOS Permissions

Your terminal application needs the following permissions enabled in System Settings > Privacy & Security:

  • Input Monitoring - Required for detecting push-to-talk key presses
  • Accessibility - Required for keyboard event monitoring
  • Screen Recording - Required for the screenshot tool to capture your screen

Installation

Prerequisites

Install PortAudio (required for microphone input):

brew install portaudio

Using uvx (Recommended)

uvx claude-whisper /path/to/your/project

Using pipx

pipx install claude-whisper
claude-whisper /path/to/your/project

Using pip

pip install claude-whisper
claude-whisper /path/to/your/project

From Source

  1. Install system dependencies:
make install-deps
  2. Install the package:
uv sync

Usage

Audio Mode (Push-to-Talk)

Run Claude Whisper with a working directory:

claude-whisper /path/to/your/project

Once running:

  1. Hold the push-to-talk key (default: ESC)
  2. Say your wake word followed by your command (e.g., "Jarvis, create a README for this project")
  3. Release the key when done speaking
  4. The audio will be transcribed and sent to Claude
  5. Desktop notifications will alert you when tasks start and finish
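Step 2 relies on finding the wake word in the transcript and treating the rest as the command. A minimal sketch of that split is shown here; `extract_command` is a hypothetical name, and the real matching rules in Claude Whisper may differ.

```python
import re


def extract_command(transcript, wake_word="jarvis"):
    """Return the text after the wake word, or None if no command follows it."""
    # Match the wake word as a whole word, ignoring case, then skip any
    # punctuation or whitespace the transcription placed after it.
    pattern = re.compile(
        r"\b" + re.escape(wake_word) + r"\b[\s,.:;!?-]*", re.IGNORECASE
    )
    match = pattern.search(transcript)
    if match is None:
        return None
    command = transcript[match.end():].strip()
    return command or None


print(extract_command("Jarvis, create a README for this project"))
```

Speech that does not contain the wake word yields `None`, so it could simply be ignored.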

Configuration

Configure Claude Whisper with a TOML config file at ~/.config/claude-whisper/config.toml or with environment variables that use the CLAUDE_WHISPER_ prefix. Environment variables override values from the file.

Configuration Options

| Variable | Default | Description |
| --- | --- | --- |
| CLAUDE_WHISPER_MODEL_NAME | mlx-community/whisper-medium-mlx-8bit | Whisper model for transcription |
| CLAUDE_WHISPER_FORMAT | paInt16 | Audio format (16-bit int) |
| CLAUDE_WHISPER_CHANNELS | 1 | Number of audio channels (mono) |
| CLAUDE_WHISPER_RATE | 16000 | Sampling rate in Hz |
| CLAUDE_WHISPER_CHUNK | 1024 | Audio buffer size |
| CLAUDE_WHISPER_SILENCE_THRESHOLD | 500 | Amplitude threshold for detecting silence |
| CLAUDE_WHISPER_SILENCE_CHUNKS | 30 | Consecutive silent chunks before stopping |
| CLAUDE_WHISPER_COMMAND | jarvis | Wake word to trigger Claude |
| CLAUDE_WHISPER_PERMISSION_MODE | acceptEdits | Claude permission mode |
| CLAUDE_WHISPER_PUSH_TO_TALK_KEY | esc | Key to hold for recording |
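To show how the audio settings might fit together, here is a rough sketch of amplitude-based silence detection using the default values. It assumes the peak sample of each chunk is compared against the threshold. Claude Whisper may use a different measure, such as RMS, and `is_silent` and `chunks_until_stop` are hypothetical names.

```python
import array

CHUNK = 1024             # frames per buffer (CLAUDE_WHISPER_CHUNK)
RATE = 16000             # samples per second (CLAUDE_WHISPER_RATE)
SILENCE_THRESHOLD = 500  # CLAUDE_WHISPER_SILENCE_THRESHOLD
SILENCE_CHUNKS = 30      # CLAUDE_WHISPER_SILENCE_CHUNKS


def is_silent(chunk_bytes, threshold=SILENCE_THRESHOLD):
    """True if the loudest 16-bit sample in the chunk is below the threshold."""
    samples = array.array("h", chunk_bytes)  # "h" = signed 16-bit, native byte order
    return max((abs(s) for s in samples), default=0) < threshold


def chunks_until_stop(chunks, silence_chunks=SILENCE_CHUNKS):
    """Return how many chunks are read before a long enough silent run, else None."""
    silent_run = 0
    for index, chunk in enumerate(chunks, start=1):
        silent_run = silent_run + 1 if is_silent(chunk) else 0
        if silent_run >= silence_chunks:
            return index
    return None  # the stream ended before enough silence accumulated


# Seconds of continuous silence needed before stopping, with the defaults.
silence_seconds = SILENCE_CHUNKS * CHUNK / RATE
```

With the defaults, 30 chunks of 1024 frames at 16000 Hz come to 30 × 1024 / 16000 = 1.92 seconds of silence.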

Example TOML Configuration

Create ~/.config/claude-whisper/config.toml:

model_name = "mlx-community/whisper-medium-mlx-8bit"
command = "jarvis"
push_to_talk_key = "esc"
permission_mode = "acceptEdits"
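The precedence between defaults, the TOML file, and environment variables can be pictured with this sketch. It illustrates the layering only and is not the project's actual loader: `resolve_settings` is a hypothetical helper, and it covers just the four keys from the example file.

```python
DEFAULTS = {
    "model_name": "mlx-community/whisper-medium-mlx-8bit",
    "command": "jarvis",
    "push_to_talk_key": "esc",
    "permission_mode": "acceptEdits",
}
ENV_PREFIX = "CLAUDE_WHISPER_"


def resolve_settings(file_values, environ):
    """Merge defaults, TOML values, and environment variables, in rising priority."""
    settings = dict(DEFAULTS)
    # Values from the TOML file replace the defaults (unknown keys are ignored here).
    settings.update({k: v for k, v in file_values.items() if k in DEFAULTS})
    # Environment variables such as CLAUDE_WHISPER_COMMAND win over both.
    for key in DEFAULTS:
        env_name = ENV_PREFIX + key.upper()
        if env_name in environ:
            settings[key] = environ[env_name]
    return settings


print(resolve_settings({"command": "computer"}, {"CLAUDE_WHISPER_PUSH_TO_TALK_KEY": "space"}))
```

On Python 3.11 or newer, `file_values` could come from `tomllib.load` applied to the config file opened in binary mode, with `os.environ` passed as `environ`.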

Available Push-to-Talk Keys

  • esc, escape - Escape key
  • space - Space bar
  • enter - Enter key
  • tab - Tab key
  • ctrl, shift, alt, cmd - Modifier keys
  • Any single character (e.g., a, z, 1)
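One way to validate a configured key name against the list above is sketched below. `normalize_key` is a hypothetical helper, and treating names as case-insensitive is an assumption rather than documented behavior.

```python
NAMED_KEYS = {
    "esc": "esc",
    "escape": "esc",  # alias for the Escape key
    "space": "space",
    "enter": "enter",
    "tab": "tab",
    "ctrl": "ctrl",
    "shift": "shift",
    "alt": "alt",
    "cmd": "cmd",
}


def normalize_key(name):
    """Map a configured key name to a canonical name, or raise ValueError."""
    key = name.strip().lower()
    if key in NAMED_KEYS:
        return NAMED_KEYS[key]
    if len(key) == 1:
        return key  # any single character, e.g. "a", "z", "1"
    raise ValueError(f"unsupported push-to-talk key: {name!r}")


print(normalize_key("escape"))
```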

Development

Testing

Run the test suite:

make test

Run tests with coverage:

pytest --cov=src/claude_whisper --cov-report=term-missing

Run specific test files:

pytest tests/test_config.py
pytest tests/test_main.py
pytest tests/test_integration.py

Code Quality

Format code:

make format

Run linter:

make lint

Fix linting issues:

make lint-fix

Check formatting and linting:

make check

License

See LICENSE file for details.

Author

Ashton Sidhu (ashton@sidhulabs.ca)
