VoiceMode - Voice interaction capabilities for AI assistants (formerly voice-mcp)

VoiceMode

Natural voice conversations with Claude Code (and other MCP capable agents)

VoiceMode enables natural voice conversations with Claude Code. Voice isn't about replacing typing - it's about being available when typing isn't.

Perfect for:

  • Walking to your next meeting
  • Cooking while debugging
  • Giving your eyes a break after hours of screen time
  • Holding a coffee (or a dog)
  • Any moment when your hands or eyes are busy

Quick Start

Requirements: Computer with microphone and speakers

Option 1: Claude Code Plugin (Recommended)

The fastest way for Claude Code users to get started:

# Add the VoiceMode marketplace
claude plugin marketplace add mbailey/voicemode

# Install VoiceMode plugin
claude plugin install voicemode@voicemode

# Install dependencies (CLI, Local Voice Services)
/voicemode:install

# Start talking!
/voicemode:converse

Option 2: Python installer package

Installs dependencies and the VoiceMode Python package.

# Install UV package manager (if needed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Run the installer (sets up dependencies and local voice services)
uvx voice-mode-install

# Add to Claude Code
claude mcp add --scope user voicemode -- uvx --refresh voice-mode

# Optional: Add OpenAI API key as fallback for local services
export OPENAI_API_KEY=your-openai-key

# Start a conversation
claude converse

For manual setup, see the Getting Started Guide.

Features

  • Natural conversations - speak naturally, hear responses immediately
  • Works offline - optional local voice services (Whisper STT, Kokoro TTS)
  • Low latency - fast enough to feel like a real conversation
  • Smart silence detection - stops recording when you stop speaking
  • Privacy options - run entirely locally or use cloud services

Compatibility

Platforms: Linux, macOS, Windows (WSL), NixOS

Python: 3.10-3.14

Configuration

VoiceMode works out of the box. For customization:

# Set OpenAI API key (if using cloud services)
export OPENAI_API_KEY="your-key"

# Or configure via file
voicemode config edit

See the Configuration Guide for all options.

Remote Agent (Operator)

VoiceMode includes agent management for running headless Claude Code instances that can be woken remotely from the iOS app or web interface.

Quick Start

# Start the operator agent in a tmux session
voicemode agent start

# Check if it's running
voicemode agent status

# Send a message to the operator
voicemode agent send "Hello, please check my calendar"

# Stop the operator
voicemode agent stop

The Operator Concept

The operator is a headless Claude Code instance running in tmux that:

  • Listens for remote connections from voicemode.dev
  • Can be woken by the iOS app or web interface
  • Responds via voice using VoiceMode's TTS/STT capabilities

Think of it like a phone operator - always there to help when called.

Agent Commands

Command                                  Description
voicemode agent start                    Start operator in tmux session
voicemode agent stop                     Send Ctrl-C to stop Claude gracefully
voicemode agent stop --kill              Kill the tmux window
voicemode agent status                   Show running/stopped status
voicemode agent send "msg"               Send message (auto-starts if needed)
voicemode agent send --no-start "msg"    Send message (fail if not running)

Agent Directory Structure

Agent configuration lives in ~/.voicemode/agents/:

~/.voicemode/agents/
├── voicemode.env       # Shared settings for all agents
├── AGENT.md            # AI entry point
├── CLAUDE.md           # Claude-specific instructions
├── SKILL.md            # Shared behavior
└── operator/           # Default agent
    ├── voicemode.env   # Operator-specific settings
    ├── AGENT.md
    ├── CLAUDE.md
    └── SKILL.md        # Operator behavior

Configuration (voicemode.env)

Agent-specific settings override base settings. Available options:

# Base settings (~/.voicemode/agents/voicemode.env)
VOICEMODE_VOICE=nova           # Default TTS voice
VOICEMODE_SPEED=1.0            # Speech rate

# Operator settings (~/.voicemode/agents/operator/voicemode.env)
VOICEMODE_AGENT_REMOTE=true    # Enable remote connections
VOICEMODE_AGENT_STARTUP_MESSAGE=  # Message sent on startup
VOICEMODE_AGENT_CLAUDE_ARGS=   # Extra args for Claude Code
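The override order can be sketched in shell: load the shared file first, then the agent-specific file, so that later assignments win. This is only an illustration of the layering, not necessarily how VoiceMode reads these files. The function name `load_agent_env` is hypothetical, and the sketch assumes both files contain plain KEY=value lines that are valid shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the voicemode CLI): load the shared
# settings first, then the agent-specific file. Later assignments win,
# so agent values override base values.
load_agent_env() {
  local agents_dir="$1" agent="$2" file
  for file in "$agents_dir/voicemode.env" "$agents_dir/$agent/voicemode.env"; do
    if [ -f "$file" ]; then
      set -a     # export every variable assigned while sourcing
      . "$file"
      set +a
    fi
  done
}
```

For example, `load_agent_env ~/.voicemode/agents operator` followed by `echo "$VOICEMODE_VOICE"` would print the shared value unless operator/voicemode.env sets its own. Sourcing executes the files as shell code, so only use this approach with files you trust.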

Permissions Setup (Optional)

To use VoiceMode without permission prompts, add to ~/.claude/settings.json:

{
  "permissions": {
    "allow": [
      "mcp__voicemode__converse",
      "mcp__voicemode__service"
    ]
  }
}

See the Permissions Guide for more options.

Local Voice Services

For privacy or offline use, install local speech services:

  • Whisper.cpp - Local speech-to-text
  • Kokoro - Local text-to-speech with multiple voices

Both expose the same API as OpenAI, so VoiceMode can switch between local and cloud services seamlessly.
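Because the local services and OpenAI share one API shape, choosing between them comes down to picking a base URL. The sketch below shows a local-first fallback: it prints the first URL in a list that answers an HTTP request. It is not VoiceMode's actual selection logic. The names `first_reachable`, `probe_with_curl`, and `PROBE` are hypothetical, and the local URL and port in the usage note are assumptions to replace with wherever your services listen.

```shell
#!/usr/bin/env bash
# Hypothetical probe: succeeds if the server returns any HTTP response
# within two seconds (curl exits 0 even on 4xx without --fail).
probe_with_curl() {
  curl --silent --max-time 2 --output /dev/null "$1"
}

# Hypothetical helper: print the first base URL whose probe succeeds.
# Set PROBE to another function name to swap the reachability check.
first_reachable() {
  local url
  for url in "$@"; do
    if "${PROBE:-probe_with_curl}" "$url"; then
      printf '%s\n' "$url"
      return 0
    fi
  done
  return 1   # nothing answered
}
```

A call such as `first_reachable http://127.0.0.1:8880/v1 https://api.openai.com/v1` (local port assumed) prefers the local service and falls back to the cloud endpoint only when the local one does not respond.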

Installation Details

System Dependencies by Platform

Ubuntu/Debian

sudo apt update
sudo apt install -y ffmpeg gcc libasound2-dev libasound2-plugins libportaudio2 portaudio19-dev pulseaudio pulseaudio-utils python3-dev

WSL2 users: The pulseaudio packages above are required for microphone access.

Fedora/RHEL

sudo dnf install alsa-lib-devel ffmpeg gcc portaudio portaudio-devel python3-devel

macOS

brew install ffmpeg node portaudio

NixOS

# Use development shell
nix develop github:mbailey/voicemode

# Or install system-wide
nix profile install github:mbailey/voicemode

Alternative Installation Methods

From source

git clone https://github.com/mbailey/voicemode.git
cd voicemode
uv tool install -e .

NixOS system-wide

# In /etc/nixos/configuration.nix
environment.systemPackages = [
  (builtins.getFlake "github:mbailey/voicemode").packages.${pkgs.system}.default
];

Troubleshooting

Problem                 Solution
No microphone access    Check terminal/app permissions. WSL2 needs pulseaudio packages.
UV not found            Run curl -LsSf https://astral.sh/uv/install.sh | sh
OpenAI API error        Verify OPENAI_API_KEY is set correctly
No audio output         Check system audio settings and available devices
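Two of these checks are easy to script. The following is a minimal sketch of a preflight function that looks for uv on the PATH and reports whether OPENAI_API_KEY is set. The function name `preflight` is hypothetical and not part of the voicemode CLI. A missing key is only reported, not treated as a failure, since it is needed only for cloud services.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check (not part of the voicemode CLI).
# Returns non-zero only when uv is missing.
preflight() {
  local status=0
  if ! command -v uv >/dev/null 2>&1; then
    echo "uv not found: run curl -LsSf https://astral.sh/uv/install.sh | sh"
    status=1
  fi
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set (only needed for cloud services)"
  fi
  return "$status"
}
```

Running `preflight` before `uvx voice-mode-install` surfaces both problems in one pass.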

Save Audio for Debugging

export VOICEMODE_SAVE_AUDIO=true
# Files saved to ~/.voicemode/audio/YYYY/MM/
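To find recordings quickly, the dated layout can be turned into a path. This is a small sketch under the assumption that the directory structure is exactly ~/.voicemode/audio/YYYY/MM/ as described in the comment above. The function name `audio_dir_for` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the saved-audio directory for a year and
# a two-digit month. Defaults to the current year and month.
audio_dir_for() {
  local year="${1:-$(date +%Y)}" month="${2:-$(date +%m)}"
  printf '%s/.voicemode/audio/%s/%s\n' "$HOME" "$year" "$month"
}
```

For example, `ls -lt "$(audio_dir_for)"` lists this month's recordings, newest first.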

Documentation

Full documentation: voice-mode.readthedocs.io

License

MIT - A Failmode Project


mcp-name: com.failmode/voicemode
