Voice-controlled AI assistant interface for Claude Code using Apple MLX Whisper
Claude Whisper
Voice-controlled interface for Claude Code using Apple MLX Whisper for speech recognition.
Overview
Claude Whisper enables voice interaction with Claude Code through push-to-talk functionality. Hold a configurable key (default: ESC), speak your wake word followed by a command, and Claude Whisper will transcribe and execute it using the Claude Agent SDK.
Features
- Push-to-talk interface with configurable hotkey
- Real-time speech recognition using MLX Whisper (optimized for Apple Silicon)
- Desktop notifications for task status
- Direct integration with Claude Agent SDK
- Configurable wake word to trigger Claude commands
- TOML-based configuration with environment variable overrides
Requirements
- macOS with Apple Silicon (for MLX acceleration)
- Python 3.10+
- PortAudio (for microphone input)
- Anthropic API key (for Claude Agent SDK)
macOS Permissions
Your terminal application needs the following permissions enabled in System Settings > Privacy & Security:
- Input Monitoring - Required for detecting push-to-talk key presses
- Accessibility - Required for keyboard event monitoring
Installation
Prerequisites
Install PortAudio (required for microphone input):
brew install portaudio
Using uvx (Recommended)
uvx claude-whisper /path/to/your/project
Using pipx
pipx install claude-whisper
claude-whisper /path/to/your/project
Using pip
pip install claude-whisper
claude-whisper /path/to/your/project
From Source
- Install system dependencies:
make install-deps
- Install the package:
uv sync
Usage
Audio Mode (Push-to-Talk)
Run Claude Whisper with a working directory:
claude-whisper /path/to/your/project
Once running:
- Hold the push-to-talk key (default: ESC)
- Say your wake word followed by your command (e.g., "Jarvis, create a README for this project")
- Release the key when done speaking
- The audio will be transcribed and sent to Claude
- Desktop notifications will alert you when tasks start and finish
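The wake-word step above can be sketched as follows. This is a minimal illustration, not the package's actual code: the function name `extract_command` is hypothetical, and it assumes the wake word must lead the transcript and that trailing punctuation after it (as in "Jarvis, ...") should be dropped.

```python
import re
from typing import Optional


def extract_command(transcript: str, wake_word: str = "jarvis") -> Optional[str]:
    """Return the text after the wake word, or None if the transcript does not start with it.

    Hypothetical helper: assumes the wake word comes first in the transcript.
    """
    # Match the wake word at the start (case-insensitive, leading whitespace allowed),
    # then skip any punctuation or spaces separating it from the command.
    pattern = rf"^\s*{re.escape(wake_word)}\b[\s,.:;!?-]*(.*)$"
    match = re.match(pattern, transcript, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    command = match.group(1).strip()
    # A bare wake word with nothing after it is treated as "no command".
    return command or None


if __name__ == "__main__":
    print(extract_command("Jarvis, create a README for this project"))
```

Transcripts that do not begin with the wake word return `None`, so a caller could ignore them instead of forwarding them to Claude.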
Configuration
Configure Claude Whisper with a TOML config file at ~/.config/claude-whisper/config.toml, with environment variables that use the CLAUDE_WHISPER_ prefix, or with both. Environment variables override values from the file.
Configuration Options
| Variable | Default | Description |
|---|---|---|
| CLAUDE_WHISPER_MODEL_NAME | mlx-community/whisper-medium-mlx-8bit | Whisper model for transcription |
| CLAUDE_WHISPER_FORMAT | paInt16 | Audio format (16-bit int) |
| CLAUDE_WHISPER_CHANNELS | 1 | Number of audio channels (mono) |
| CLAUDE_WHISPER_RATE | 16000 | Sampling rate in Hz |
| CLAUDE_WHISPER_CHUNK | 1024 | Audio buffer size |
| CLAUDE_WHISPER_SILENCE_THRESHOLD | 500 | Amplitude threshold for detecting silence |
| CLAUDE_WHISPER_SILENCE_CHUNKS | 30 | Consecutive silent chunks before stopping |
| CLAUDE_WHISPER_COMMAND | jarvis | Wake word to trigger Claude |
| CLAUDE_WHISPER_PERMISSION_MODE | acceptEdits | Claude permission mode |
| CLAUDE_WHISPER_PUSH_TO_TALK_KEY | esc | Key to hold for recording |
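To give a feel for how the silence options interact, here is a rough sketch using the default values from the table. It is an assumption-laden illustration: the README does not say how amplitude is measured, so peak absolute sample value is used here, the buffer size is assumed to count samples, and the helper names (`peak_amplitude`, `chunks_until_stop`, `silence_timeout_seconds`) are hypothetical.

```python
import array

# Defaults taken from the configuration table.
RATE = 16000              # CLAUDE_WHISPER_RATE, samples per second
CHUNK = 1024              # CLAUDE_WHISPER_CHUNK, assumed to be samples per buffer
SILENCE_THRESHOLD = 500   # CLAUDE_WHISPER_SILENCE_THRESHOLD
SILENCE_CHUNKS = 30       # CLAUDE_WHISPER_SILENCE_CHUNKS


def peak_amplitude(chunk: bytes) -> int:
    """Largest absolute sample value in a buffer of 16-bit signed mono PCM."""
    samples = array.array("h")  # "h" = signed 16-bit integers, native byte order
    samples.frombytes(chunk)
    return max((abs(s) for s in samples), default=0)


def chunks_until_stop(chunks, threshold=SILENCE_THRESHOLD, limit=SILENCE_CHUNKS):
    """Return how many chunks are read before `limit` consecutive silent ones, else None."""
    silent_run = 0
    for index, chunk in enumerate(chunks, start=1):
        if peak_amplitude(chunk) < threshold:
            silent_run += 1
        else:
            silent_run = 0  # any loud chunk resets the run
        if silent_run >= limit:
            return index
    return None


def silence_timeout_seconds(rate=RATE, chunk=CHUNK, limit=SILENCE_CHUNKS) -> float:
    """Length of silence, in seconds, that `limit` consecutive chunks represent."""
    return limit * chunk / rate


if __name__ == "__main__":
    print(f"{silence_timeout_seconds():.2f} s of silence")
```

Under these assumptions, one 1024-sample buffer at 16000 Hz covers 64 ms, so 30 consecutive silent buffers correspond to about 1.92 seconds of silence.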
Example TOML Configuration
Create ~/.config/claude-whisper/config.toml:
model_name = "mlx-community/whisper-medium-mlx-8bit"
command = "jarvis"
push_to_talk_key = "esc"
permission_mode = "acceptEdits"
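The layering described in this section (defaults, then the TOML file, then environment variables) might look roughly like the sketch below. It is not the package's implementation: `resolve_settings`, `DEFAULTS`, and `ENV_PREFIX` are hypothetical names, and the mapping from a TOML key to its environment variable (uppercase the key, add the prefix) is inferred from the table and the example file.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API for Python 3.10

# A subset of the documented defaults, keyed by TOML name.
DEFAULTS = {
    "model_name": "mlx-community/whisper-medium-mlx-8bit",
    "command": "jarvis",
    "permission_mode": "acceptEdits",
    "push_to_talk_key": "esc",
}
ENV_PREFIX = "CLAUDE_WHISPER_"


def resolve_settings(toml_text: str, environ: dict) -> dict:
    """Merge defaults, TOML values, and environment overrides, in that order."""
    settings = dict(DEFAULTS)
    settings.update(tomllib.loads(toml_text))  # file values replace defaults
    for key in list(settings):
        env_value = environ.get(ENV_PREFIX + key.upper())
        if env_value is not None:
            settings[key] = env_value  # environment wins over the file
    return settings


if __name__ == "__main__":
    import os

    print(resolve_settings('command = "friday"', dict(os.environ)))
```

Environment values arrive as strings, so numeric options such as the sampling rate would need an explicit conversion that this sketch leaves out.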
Available Push-to-Talk Keys
- esc, escape - Escape key
- space - Space bar
- enter - Enter key
- tab - Tab key
- ctrl, shift, alt, cmd - Modifier keys
- Any single character (e.g., a, z, 1)
Development
Testing
Run the test suite:
make test
Run tests with coverage:
pytest --cov=src/claude_whisper --cov-report=term-missing
Run specific test files:
pytest tests/test_config.py
pytest tests/test_main.py
pytest tests/test_integration.py
Code Quality
Format code:
make format
Run linter:
make lint
Fix linting issues:
make lint-fix
Check formatting and linting:
make check
License
See LICENSE file for details.
Author
Ashton Sidhu (ashton@sidhulabs.ca)