VoiceMode - Voice interaction capabilities for AI assistants (formerly voice-mcp)

Maintainers

mbailey

Project description

VoiceMode

Install via: uv tool install voice-mode | getvoicemode.com

Natural voice conversations for AI assistants. VoiceMode brings human-like voice interactions to Claude Code and other AI code editors through the Model Context Protocol (MCP).

🖥️ Compatibility

Runs on: Linux • macOS • Windows (WSL) • NixOS | Python: 3.10+

✨ Features

🎙️ Natural Voice Conversations with Claude Code - ask questions and hear responses
🗣️ Supports local voice models - works with any OpenAI API-compatible STT/TTS service
⚡ Real-time - low-latency voice interactions with automatic transport selection
🔧 MCP Integration - seamless with Claude Code (and other MCP clients)
🎯 Silence detection - automatically stops recording when you stop speaking (no more waiting!)
🔄 Multiple transports - local microphone or LiveKit room-based communication

🎯 Simple Requirements

All you need to get started:

🎤 Computer with microphone and speakers
🔑 OpenAI API Key (optional) - VoiceMode can install free, open-source transcription and text-to-speech services locally

Optional for enhanced performance:

🍎 Xcode (macOS only) - Required for Core ML acceleration of Whisper models (2-3x faster inference). Install it from the Mac App Store, then run sudo xcode-select -s /Applications/Xcode.app/Contents/Developer

Quick Start

Automatic Installation (Recommended)

Install Claude Code with VoiceMode configured and ready to run on Linux, macOS, and Windows WSL:

# Download and run the installer
curl -O https://getvoicemode.com/install.sh && bash install.sh

# While local voice services can be installed automatically, we recommend
# providing an OpenAI API key as a fallback in case local services are unavailable
export OPENAI_API_KEY=your-openai-key  # Optional but recommended

# Start a voice conversation
claude converse

This installer will:

Install all system dependencies (Node.js, audio libraries, etc.)
Install Claude Code if not already installed
Configure VoiceMode as an MCP server
Set up your system for voice conversations
Offer to install free local STT/TTS services if no API key is provided

Manual Installation

For manual setup steps, see the Getting Started Guide.

🎬 Demo

The converse function makes voice interactions natural - it automatically waits for your response by default, creating a real conversation flow.

Installation

Prerequisites

Python >= 3.10
Astral UV - Package manager (install with curl -LsSf https://astral.sh/uv/install.sh | sh)
OpenAI API Key (or compatible service)
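
The prerequisites above can be checked before installing. The following is a minimal sketch, not part of VoiceMode: the function name missing_prerequisites is hypothetical, and it assumes that an unset OPENAI_API_KEY is only a warning, since local services can be used instead.

```python
import os
import shutil
import sys


def missing_prerequisites(version_info=None, which=shutil.which, env=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    version_info = sys.version_info if version_info is None else version_info
    env = os.environ if env is None else env
    problems = []
    # Python >= 3.10 is the documented minimum.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10 or newer is required")
    # Astral UV provides the `uv` executable.
    if which("uv") is None:
        problems.append("uv was not found on PATH")
    # Assumption: a missing key is acceptable when a compatible local service is used.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set (fine if you use local services)")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("warning:", problem)
```

The parameters exist only so the checks can be exercised without touching the real environment; calling the function with no arguments inspects the current interpreter, PATH, and environment.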

System Dependencies

Ubuntu/Debian

sudo apt update
sudo apt install -y python3-dev libasound2-dev libasound2-plugins libportaudio2 portaudio19-dev ffmpeg pulseaudio pulseaudio-utils

Note for WSL2 users: WSL2 requires additional audio packages (pulseaudio, libasound2-plugins) for microphone access.
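
To decide whether the extra WSL2 packages apply to your machine, a script can look at the kernel release string. This is a heuristic sketch, not something VoiceMode is confirmed to do: it assumes WSL kernels report "microsoft" in their release string, and the function name looks_like_wsl is hypothetical.

```python
import platform


def looks_like_wsl(release=None):
    """Heuristic: WSL kernels include 'microsoft' in the kernel release string."""
    release = platform.release() if release is None else release
    return "microsoft" in release.lower()


if __name__ == "__main__":
    if looks_like_wsl():
        # Package names taken from the note above.
        print("WSL detected: install pulseaudio and libasound2-plugins for microphone access")
```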

Fedora/RHEL

sudo dnf install python3-devel alsa-lib-devel portaudio-devel ffmpeg

macOS

# Install Homebrew if not already installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install dependencies
brew install portaudio ffmpeg cmake

Windows (WSL)

Follow the Ubuntu/Debian instructions above within WSL.

NixOS

VoiceMode includes a flake.nix with all required dependencies. You can either:

Use the development shell (temporary):

nix develop github:mbailey/voicemode

Install system-wide (see NixOS Installation Options below)

Quick Install

# Using Claude Code (recommended)
claude mcp add --scope user voicemode -- uvx --refresh voice-mode

Configuration for AI Coding Assistants

📖 Looking for detailed setup instructions? See the Getting Started Guide for a step-by-step walkthrough.

Below are quick configuration snippets.

Claude Code (CLI)

claude mcp add voicemode -- uvx --refresh voice-mode

Or with environment variables:

claude mcp add voicemode --env OPENAI_API_KEY=your-openai-key -- uvx --refresh voice-mode
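
If you script this setup, it can help to assemble the command as an argument list rather than a string. The sketch below is illustrative only: the helper name claude_mcp_add_command is hypothetical, and it uses only the options that appear in this README (--scope, --env, and the -- separator).

```python
import shlex


def claude_mcp_add_command(env=None, scope=None):
    """Build the `claude mcp add` argument list for the voicemode server."""
    args = ["claude", "mcp", "add"]
    if scope:
        args += ["--scope", scope]
    args.append("voicemode")
    for key, value in (env or {}).items():
        args += ["--env", f"{key}={value}"]
    # Everything after `--` is the server command, not an option to `claude`.
    args += ["--", "uvx", "--refresh", "voice-mode"]
    return args


# Print the command for review; pass the list to subprocess.run to execute it.
print(shlex.join(claude_mcp_add_command(env={"OPENAI_API_KEY": "your-openai-key"})))
```

Keeping the arguments in a list avoids shell quoting problems if a value contains spaces or special characters.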

Alternative Installation Options

From source

git clone https://github.com/mbailey/voicemode.git
cd voicemode
pip install -e .

NixOS Installation Options

1. Install with nix profile (user-wide):

nix profile install github:mbailey/voicemode

2. Add to NixOS configuration (system-wide):

# In /etc/nixos/configuration.nix
environment.systemPackages = [
  (builtins.getFlake "github:mbailey/voicemode").packages.${pkgs.system}.default
];

3. Add to home-manager:

# In home-manager configuration
home.packages = [
  (builtins.getFlake "github:mbailey/voicemode").packages.${pkgs.system}.default
];

4. Run without installing:

nix run github:mbailey/voicemode

Configuration

📖 Getting Started - Step-by-step setup guide
🔧 Configuration Reference - All environment variables

Quick Setup

If you use OpenAI's hosted services, the only required configuration is your OpenAI API key:

export OPENAI_API_KEY="your-key"

Local STT/TTS Services

For privacy-focused or offline usage, VoiceMode supports local speech services:

Whisper.cpp - Local speech-to-text with OpenAI-compatible API
Kokoro - Local text-to-speech with multiple voice options

These services provide the same API interface as OpenAI, allowing seamless switching between cloud and local processing.
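
Because the interface is shared, switching between cloud and local processing can come down to changing a base URL. The sketch below illustrates that idea under stated assumptions: the routes are OpenAI's audio endpoints, while EXAMPLE_VOICE_BASE_URL and the local address are hypothetical placeholders, not VoiceMode settings. See the Configuration Reference for the real variable names.

```python
import os

# OpenAI's hosted API root; the audio routes below are OpenAI's documented paths.
OPENAI_BASE_URL = "https://api.openai.com/v1"
AUDIO_ROUTES = {"stt": "/audio/transcriptions", "tts": "/audio/speech"}


def audio_endpoint(kind, base_url=None, env=None):
    """Return the full URL for speech-to-text ('stt') or text-to-speech ('tts')."""
    env = os.environ if env is None else env
    # EXAMPLE_VOICE_BASE_URL is a hypothetical variable used only in this sketch.
    base = base_url or env.get("EXAMPLE_VOICE_BASE_URL") or OPENAI_BASE_URL
    return base.rstrip("/") + AUDIO_ROUTES[kind]


# Placeholder local address; substitute whatever your local service listens on.
print(audio_endpoint("tts", base_url="http://127.0.0.1:8880/v1"))
print(audio_endpoint("stt", env={}))
```

Only the base URL differs between the two calls; the route and request shape stay the same, which is what makes the switch seamless for the client.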

Troubleshooting

Common Issues

No microphone access: Check system permissions for your terminal/application. On WSL2, additional audio packages (pulseaudio, libasound2-plugins) are required for microphone access
UV not found: Install with curl -LsSf https://astral.sh/uv/install.sh | sh
OpenAI API error: Verify your OPENAI_API_KEY is set correctly
No audio output: Check system audio settings and available devices

Audio Saving

To save all audio files (both TTS output and STT input):

export VOICEMODE_SAVE_AUDIO=true

Audio files are saved to: ~/.voicemode/audio/YYYY/MM/ with timestamps in the filename.
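
To find saved recordings from a script, you can rebuild the year/month directory described above. This is a minimal sketch: the helper names are hypothetical, the exact filename format is not specified here and is therefore not assumed, and treating only the value "true" as enabled is an assumption.

```python
import os
from datetime import datetime
from pathlib import Path


def save_audio_enabled(env=None):
    """True when VOICEMODE_SAVE_AUDIO is 'true' (assumption: no other value enables it)."""
    env = os.environ if env is None else env
    return env.get("VOICEMODE_SAVE_AUDIO", "").strip().lower() == "true"


def audio_dir_for(moment, root=None):
    """Return <root>/YYYY/MM for the given datetime; root defaults to ~/.voicemode/audio."""
    root = Path.home() / ".voicemode" / "audio" if root is None else Path(root)
    return root / f"{moment.year:04d}" / f"{moment.month:02d}"


if __name__ == "__main__":
    folder = audio_dir_for(datetime.now())
    if folder.is_dir():
        # List whatever was saved this month, oldest name first.
        for item in sorted(folder.iterdir()):
            print(item.name)
```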

Documentation

📚 Read the full documentation at voice-mode.readthedocs.io

Getting Started

Getting Started - Step-by-step setup for all supported tools
Configuration Guide - Complete environment variable reference

Development

Development Setup - Local development guide

Service Guides

Whisper.cpp Setup - Local speech-to-text configuration
Kokoro Setup - Local text-to-speech configuration
LiveKit Integration - Real-time voice communication

License

MIT - A Failmode Project

mcp-name: com.failmode/voicemode

Release history

8.6.1

Apr 21, 2026

8.6.0

Apr 16, 2026

8.5.1

Mar 13, 2026

8.5.0

Mar 7, 2026

8.4.0

Mar 5, 2026

8.3.0

Feb 24, 2026

8.2.1

Feb 19, 2026

8.2.0

Feb 13, 2026

8.1.0

Feb 2, 2026

8.0.8

Jan 28, 2026

8.0.7

Jan 28, 2026

8.0.6

Jan 28, 2026

8.0.5

Jan 28, 2026

8.0.4

Jan 28, 2026

8.0.3

Jan 28, 2026

8.0.2

Jan 24, 2026

8.0.1

Jan 24, 2026

8.0.0

Jan 24, 2026

7.4.2

Jan 16, 2026

7.4.1

Jan 16, 2026

7.4.0

Jan 6, 2026

7.3.0

Jan 6, 2026

7.2.0

Jan 5, 2026

7.1.2

Dec 26, 2025

7.1.1

Dec 25, 2025

7.1.0

Dec 24, 2025

7.0.1

Dec 2, 2025

7.0.0

Nov 26, 2025

6.2.0

Nov 24, 2025

6.1.1

Nov 11, 2025

6.1.0

Nov 10, 2025

6.0.5

Oct 26, 2025

6.0.4

Oct 26, 2025

6.0.3

Oct 26, 2025

6.0.2

Oct 26, 2025

6.0.1

Oct 19, 2025

6.0.0

Oct 15, 2025

5.1.9

Oct 13, 2025

5.1.8

Oct 12, 2025

5.1.7

Oct 12, 2025

5.1.6

Oct 12, 2025

5.1.5

Oct 12, 2025

5.1.4

Oct 12, 2025

5.1.3

Oct 12, 2025

5.1.2

Oct 12, 2025

5.1.1

Oct 12, 2025

5.1.0

Oct 11, 2025

5.0.3

Oct 4, 2025

5.0.2

Oct 4, 2025

5.0.1

Oct 3, 2025

5.0.0

Oct 3, 2025

4.8.0

Oct 3, 2025

4.7.1

Sep 22, 2025

4.7.0

Sep 22, 2025

This version

4.6.0

Sep 21, 2025

4.5.0

Sep 17, 2025

4.4.0

Sep 10, 2025

4.3.2

Sep 2, 2025

4.3.1

Sep 2, 2025

4.3.0

Sep 2, 2025

4.2.0

Sep 2, 2025

4.1.0

Aug 31, 2025

4.0.1

Aug 31, 2025

2.34.3

Aug 26, 2025

2.34.2

Aug 26, 2025

2.34.1

Aug 26, 2025

2.34.0

Aug 26, 2025

2.33.4

Aug 25, 2025

2.33.3

Aug 25, 2025

2.33.2

Aug 25, 2025

2.33.0

Aug 25, 2025

2.32.0

Aug 24, 2025

2.31.0

Aug 24, 2025

2.30.0

Aug 24, 2025

2.29.0

Aug 24, 2025

2.28.3

Aug 24, 2025

2.28.2

Aug 24, 2025

2.28.1

Aug 24, 2025

2.28.0

Aug 23, 2025

2.27.0

Aug 20, 2025

2.26.0

Aug 18, 2025

2.25.1

Aug 17, 2025

2.25.0

Aug 17, 2025

2.24.0

Aug 16, 2025

2.23.0

Aug 16, 2025

2.22.3

Aug 16, 2025

2.22.2

Aug 16, 2025

2.22.1

Aug 16, 2025

2.22.0

Aug 16, 2025

2.21.1

Aug 12, 2025

2.21.0

Aug 12, 2025

2.20.1

Aug 11, 2025

2.20.0

Aug 10, 2025

2.19.0

Aug 9, 2025

2.18.0

Aug 9, 2025

2.17.3

Aug 6, 2025

2.17.2

Jul 28, 2025

2.17.1

Jul 28, 2025

2.17.0

Jul 28, 2025

2.16.0 yanked

Jul 27, 2025

Reason this release was yanked:

bug

2.15.0

Jul 22, 2025

2.14.0

Jul 20, 2025

2.13.0

Jul 14, 2025

2.12.0

Jul 6, 2025

2.11.0

Jul 5, 2025

2.10.0

Jul 5, 2025

2.9.0

Jul 3, 2025

2.8.0

Jul 3, 2025

2.7.1

Jul 2, 2025

2.7.0

Jul 2, 2025

2.6.0

Jun 29, 2025

2.5.1

Jun 27, 2025

2.4.1

Jun 25, 2025

2.4.0

Jun 24, 2025

2.3.0

Jun 23, 2025

2.2.0

Jun 22, 2025

2.1.3

Jun 20, 2025

2.1.1

Jun 20, 2025

2.1.0

Jun 20, 2025

2.0.3

Jun 19, 2025

0.1.26

Jun 17, 2025

0.1.25

Jun 17, 2025

0.1.24

Jun 17, 2025

0.1.22

Jun 17, 2025

Download files

Source Distribution

voice_mode-4.6.0.tar.gz (302.1 kB)

Uploaded Sep 21, 2025 Source

Built Distribution

voice_mode-4.6.0-py3-none-any.whl (1.5 MB)

Uploaded Sep 21, 2025 Python 3

File details

Details for the file voice_mode-4.6.0.tar.gz.

File metadata

Download URL: voice_mode-4.6.0.tar.gz
Upload date: Sep 21, 2025
Size: 302.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for voice_mode-4.6.0.tar.gz
Algorithm	Hash digest
SHA256	`9633db5b5fd9780d17f528f09732991d41f4868b8025dd4187e9b310a969d549`
MD5	`3bd6ee1ecdf2755ea5cd1c79af6bebae`
BLAKE2b-256	`6e1e5926157f7ef6c6e1656fb703fcce56ae30c52e3011bddaed2d767d0232c2`
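
A downloaded file can be checked against the SHA256 digest listed above using Python's standard library. This is a generic sketch rather than a PyPI tool: the function name sha256_matches is hypothetical, and the file is assumed to be in the current directory.

```python
import hashlib
import hmac


def sha256_matches(path, expected_hex, chunk_size=1 << 20):
    """Hash the file in chunks and compare against the expected hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded fully into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return hmac.compare_digest(digest.hexdigest(), expected_hex.strip().lower())


if __name__ == "__main__":
    expected = "9633db5b5fd9780d17f528f09732991d41f4868b8025dd4187e9b310a969d549"
    print(sha256_matches("voice_mode-4.6.0.tar.gz", expected))
```

Prefer SHA256 over MD5 for this check; MD5 is listed for completeness but is not collision resistant.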

Provenance

The following attestation bundles were made for voice_mode-4.6.0.tar.gz:

Publisher: publish-pypi-and-mcp.yml on mbailey/voicemode

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: voice_mode-4.6.0.tar.gz
- Subject digest: 9633db5b5fd9780d17f528f09732991d41f4868b8025dd4187e9b310a969d549
- Sigstore transparency entry: 543713322
- Sigstore integration time: Sep 21, 2025
Source repository:
- Permalink: mbailey/voicemode@e1552e7bcd42ca17ba5a83f346a05b7ec37b536c
- Branch / Tag: refs/tags/v4.6.0
- Owner: https://github.com/mbailey
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi-and-mcp.yml@e1552e7bcd42ca17ba5a83f346a05b7ec37b536c
- Trigger Event: push

File details

Details for the file voice_mode-4.6.0-py3-none-any.whl.

File metadata

Download URL: voice_mode-4.6.0-py3-none-any.whl
Upload date: Sep 21, 2025
Size: 1.5 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for voice_mode-4.6.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4c66a04179cc587505d9f5363945e579292000806e9ce58180a8e0a8ee443b70`
MD5	`eae12c656bb7d4536f255ec2fd3b22c3`
BLAKE2b-256	`cd110ad3adebb6362ebf4c40b6d6e30f2f0a806f83b4f4e395a461750395ed0d`

Provenance

The following attestation bundles were made for voice_mode-4.6.0-py3-none-any.whl:

Publisher: publish-pypi-and-mcp.yml on mbailey/voicemode

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: voice_mode-4.6.0-py3-none-any.whl
- Subject digest: 4c66a04179cc587505d9f5363945e579292000806e9ce58180a8e0a8ee443b70
- Sigstore transparency entry: 543713323
- Sigstore integration time: Sep 21, 2025
Source repository:
- Permalink: mbailey/voicemode@e1552e7bcd42ca17ba5a83f346a05b7ec37b536c
- Branch / Tag: refs/tags/v4.6.0
- Owner: https://github.com/mbailey
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi-and-mcp.yml@e1552e7bcd42ca17ba5a83f346a05b7ec37b536c
- Trigger Event: push

voice-mode 4.6.0
