Voice interaction capabilities for Model Context Protocol (MCP) servers

voice-mcp

MCP servers that enable voice interactions between LLMs and users through LiveKit.

Quick Start with Python Package

The easiest way to use voice-mcp is through our Python package:

# Install with pip
pip install livekit-voice-mcp

# Or use with uvx (no installation needed)
uvx livekit-voice-mcp

Configure Claude Desktop

Add to your Claude Desktop configuration file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "livekit-voice": {
      "command": "uvx",
      "args": ["livekit-voice-mcp"],
      "env": {
        "LIVEKIT_URL": "wss://your-app.livekit.cloud",
        "LIVEKIT_API_KEY": "your-api-key",
        "LIVEKIT_API_SECRET": "your-api-secret",
        "OPENAI_API_KEY": "your-openai-key"
      }
    }
  }
}

Restart Claude Desktop so it loads the new server. The voice tools are then available in your conversations.
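
If the configuration file already lists other MCP servers, the entry above has to be merged in rather than pasted over them. The following is a minimal sketch of that merge; `add_voice_server` and `update_config_file` are hypothetical helper names, not part of the package, and the sketch assumes the file holds plain JSON.

```python
import json
from pathlib import Path


def add_voice_server(config: dict, env: dict) -> dict:
    """Return a copy of a Claude Desktop config with the livekit-voice entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep servers that are already there
    servers["livekit-voice"] = {
        "command": "uvx",
        "args": ["livekit-voice-mcp"],
        "env": dict(env),
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, env: dict) -> None:
    """Read the config file if it exists, merge the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_voice_server(existing, env), indent=2) + "\n")
```

Calling `update_config_file` with the path for your platform and a dict of the four environment variables should leave any existing server entries untouched.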

Overview

voice-mcp provides Model Context Protocol (MCP) servers that allow LLMs to communicate via voice, enabling natural spoken conversations with AI assistants.

Architecture

┌─────────────────────┐     ┌──────────────────┐     ┌─────────────────────┐
│   Claude/LLM        │     │  LiveKit Server  │     │  Voice Frontend     │
│   (MCP Client)      │◄────►│  (Port 7880)     │◄────►│  (Port 3001)        │
└─────────────────────┘     └──────────────────┘     └─────────────────────┘
         │                            │
         │                            │
         ▼                            ▼
┌─────────────────────┐     ┌──────────────────┐
│  Voice MCP Server   │     │   Agent.py       │
│  (ask_voice_question│     │  (Voice Logic)   │
│   check_room_status)│     └──────────────────┘
└─────────────────────┘              │
                                     │
                    ┌────────────────┴────────────────┐
                    │                                 │
                    ▼                                 ▼
         ┌──────────────────┐             ┌──────────────────┐
         │  Whisper.cpp     │             │  Kokoro TTS      │
         │  (Port 2022)     │             │  (Port 8880)     │
         │  Local STT       │             │  Local TTS       │
         └──────────────────┘             └──────────────────┘

Features

Voice Input/Output: Bidirectional voice communication through LiveKit
Speech-to-Text: Local whisper.cpp or OpenAI Whisper API
Text-to-Speech: Multiple TTS providers (OpenAI TTS + local Kokoro-FastAPI)
Local STT/TTS: Cost-free local speech recognition and voice generation
Real-time Streaming: Low-latency voice interactions
MCP Integration: Works with Claude and other MCP-compatible clients

Installation Options

Option 1: Python Package (Recommended for Users)

# Install globally
pip install livekit-voice-mcp

# Or use without installation
uvx livekit-voice-mcp

# Or use pipx for isolated installation  
pipx install livekit-voice-mcp

Option 2: Container Image

# Pull and run the container
docker pull ghcr.io/mbailey/voice-mcp:latest

# Run with environment variables
docker run -e OPENAI_API_KEY=your_key_here \
  -e VOICE_MCP_DEBUG=true \
  ghcr.io/mbailey/voice-mcp:latest

See CONTAINER.md for detailed container usage instructions.

Option 3: Local Development Setup

# Clone the repository
git clone https://github.com/mbailey/voice-mcp-public.git
cd voice-mcp-public

# Build container image
make build-container

# Or install development environment
make install

Configuration

Python Package Configuration

Set environment variables before running:

export LIVEKIT_URL="wss://your-app.livekit.cloud"
export LIVEKIT_API_KEY="your-api-key"
export LIVEKIT_API_SECRET="your-api-secret"
export OPENAI_API_KEY="your-openai-key"  # For STT/TTS
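
A small pre-flight check can catch a missing variable before the server starts. This sketch assumes the three LiveKit variables are mandatory and treats OPENAI_API_KEY as optional (the Requirements section lists it as optional); `missing_vars` is a hypothetical helper, not something the package provides.

```python
import os

# Assumption: these three are required; OPENAI_API_KEY is treated as optional.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

An empty list from `missing_vars()` means all three LiveKit variables are set in the current environment.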

Local Development Configuration

Copy the example configuration and customize:

cp .env.example .env.local
# Edit .env.local with your settings

Provider Selection

voice-mcp supports multiple STT/TTS providers and falls back automatically when one is unavailable:

TTS Providers

TTS_PROVIDER=auto (default): Try Kokoro → OpenAI → LiveKit
TTS_PROVIDER=kokoro: Use only local Kokoro TTS
TTS_PROVIDER=openai: Use only OpenAI TTS
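
The fallback order can be pictured as a list of candidates tried in turn. The sketch below is an illustration of that idea only, not the package's implementation; `tts_candidates`, `synthesize`, and the `backends` mapping of provider name to callable are hypothetical.

```python
import os

# Order taken from the list above: auto tries Kokoro, then OpenAI, then LiveKit.
AUTO_ORDER = ["kokoro", "openai", "livekit"]


def tts_candidates(setting=None):
    """Return the providers to try, in order, for a TTS_PROVIDER value."""
    value = (setting or os.environ.get("TTS_PROVIDER", "auto")).strip().lower()
    if value == "auto":
        return list(AUTO_ORDER)
    if value in ("kokoro", "openai"):
        return [value]
    raise ValueError(f"unsupported TTS_PROVIDER: {value!r}")


def synthesize(text, backends, setting=None):
    """Try each candidate backend until one returns audio bytes."""
    errors = {}
    for name in tts_candidates(setting):
        backend = backends.get(name)
        if backend is None:
            continue  # provider not configured in this sketch
        try:
            return name, backend(text)
        except Exception as exc:  # any failure moves on to the next provider
            errors[name] = exc
    raise RuntimeError(f"all TTS providers failed: {errors}")
```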

STT Configuration

Local Whisper: Automatically used when available at http://localhost:2022
OpenAI Whisper: Fallback when local whisper is not running
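
One way to decide between the two is to probe the local port and fall back when nothing answers. This is a sketch under assumptions: the package may detect availability differently, `is_listening` and `choose_stt_url` are hypothetical names, and the OpenAI base URL is assumed to be the public API endpoint.

```python
import socket
from urllib.parse import urlparse

LOCAL_WHISPER_URL = "http://localhost:2022"
OPENAI_URL = "https://api.openai.com/v1"  # assumed cloud endpoint


def is_listening(url, timeout=0.5):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_stt_url(local_url=LOCAL_WHISPER_URL, probe=is_listening):
    """Prefer local whisper when reachable, otherwise fall back to OpenAI."""
    return local_url if probe(local_url) else OPENAI_URL
```

The `probe` parameter is injectable so the choice can be exercised without a running server.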

Key Configuration Options

# TTS Provider (auto/kokoro/openai)
TTS_PROVIDER=auto

# Kokoro TTS (local)
KOKORO_URL=http://127.0.0.1:8880
KOKORO_ENABLED=true

# Whisper STT (local)
WHISPER_BASE_URL=http://localhost:2022

# OpenAI (fallback for both STT and TTS)
OPENAI_API_KEY=your_key_here

# LiveKit
LIVEKIT_URL=ws://localhost:7880
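
To show how these options fit together, here is a sketch that reads them into one settings object. It assumes the values shown above double as defaults, which only the `auto` provider setting is documented to be; `VoiceSettings` and `load_settings` are hypothetical names.

```python
import os
from dataclasses import dataclass


def _as_bool(value):
    """Interpret common truthy strings; anything else counts as False."""
    return value.strip().lower() in ("1", "true", "yes", "on")


@dataclass(frozen=True)
class VoiceSettings:
    tts_provider: str
    kokoro_url: str
    kokoro_enabled: bool
    whisper_base_url: str
    livekit_url: str
    openai_api_key: str


def load_settings(environ=None):
    """Build settings from a mapping, defaulting to the process environment."""
    env = os.environ if environ is None else environ
    return VoiceSettings(
        tts_provider=env.get("TTS_PROVIDER", "auto"),
        kokoro_url=env.get("KOKORO_URL", "http://127.0.0.1:8880"),
        kokoro_enabled=_as_bool(env.get("KOKORO_ENABLED", "true")),
        whisper_base_url=env.get("WHISPER_BASE_URL", "http://localhost:2022"),
        livekit_url=env.get("LIVEKIT_URL", "ws://localhost:7880"),
        openai_api_key=env.get("OPENAI_API_KEY", ""),
    )
```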

Usage

Using the Python Package

Once installed and configured in Claude Desktop, you can use voice commands:

Ask Claude: "Can you help me with voice?"
Claude will use the voice MCP tools to communicate
Speak your questions and hear responses

Available MCP tools:

ask_voice_question: Ask a question via voice and get a text response
check_room_status: Check active voice rooms and participants
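
Under the hood, an MCP client invokes these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds such messages for illustration; the `question` argument name is an assumption, since the tools' input schemas are not documented here, and `tool_call_request` is a hypothetical helper.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, increasing per request


def tool_call_request(name, arguments=None):
    """Build a JSON-RPC 2.0 `tools/call` request in the shape MCP defines."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# "question" is a hypothetical argument name used only for illustration.
ask = tool_call_request("ask_voice_question", {"question": "Which file should I open?"})
status = tool_call_request("check_room_status")
print(json.dumps(ask, indent=2))
```

In practice Claude Desktop sends these requests for you; the sketch only shows what travels over the wire.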

Local Development Usage

Download external repositories:
```
mt sync
```
Install and build all dependencies:
```
make install
```
Start the development environment:
```
make dev
```

This will start:

LiveKit server (port 7880)
Kokoro TTS (port 8880)
Whisper STT (port 2022)
Voice assistant frontend (port 3001)
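
After `make dev`, a quick way to confirm that each service came up is to check whether its port accepts connections. A minimal sketch, assuming all four services listen on 127.0.0.1; `port_open` and `service_report` are hypothetical helpers.

```python
import socket

# Ports as listed above for `make dev`.
SERVICES = {
    "LiveKit server": 7880,
    "Kokoro TTS": 8880,
    "Whisper STT": 2022,
    "Voice assistant frontend": 3001,
}


def port_open(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def service_report(probe=port_open):
    """Map each service name to "up" or "down" according to the probe."""
    return {name: ("up" if probe(port) else "down") for name, port in SERVICES.items()}
```

Printing the items of `service_report()` gives a one-line status per service.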

Individual components:

make livekit-server   # Start LiveKit server
make frontend         # Start voice frontend
make kokoro-start     # Start Kokoro TTS
make whisper-start    # Start Whisper STT

Architecture

livekit-voice-mcp: MCP server for voice interactions
livekit-admin-mcp: Administrative tools for LiveKit management
livekit-agent: Python agent handling voice processing
kokoro-fastapi: Local TTS server providing OpenAI-compatible API
whisper.cpp: Local STT server providing OpenAI-compatible API

Kokoro-FastAPI (Local TTS)

voice-mcp includes Kokoro-FastAPI for cost-free local text-to-speech generation:

70+ Voice Options: Multiple languages and voice styles
OpenAI Compatible: Drop-in replacement for OpenAI TTS API
Web Interface: Interactive voice testing at http://127.0.0.1:8880/web/
Browser Support: Chrome/Chromium recommended (Firefox has streaming limitations)

Kokoro Commands

make kokoro-start     # Start Kokoro TTS service
make kokoro-stop      # Stop Kokoro TTS service  
make kokoro-build     # Build Kokoro container
make test-kokoro      # Test Kokoro functionality

Quick Test

# Generate speech using Kokoro API
curl -X POST http://127.0.0.1:8880/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello from Kokoro!", "voice": "nova"}' \
  --output test.mp3
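
The same request can be made from Python with only the standard library. This sketch mirrors the curl command above (same URL, payload, and output file); `build_speech_request` and `save_speech` are hypothetical helper names, and the response is assumed to be raw audio bytes.

```python
import json
import urllib.request

KOKORO_SPEECH_URL = "http://127.0.0.1:8880/v1/audio/speech"


def build_speech_request(text, voice="nova", url=KOKORO_SPEECH_URL):
    """Build the POST request that mirrors the curl command above."""
    payload = json.dumps({"input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text, path="test.mp3", timeout=30):
    """Send the request and write the returned audio bytes to a file."""
    with urllib.request.urlopen(build_speech_request(text), timeout=timeout) as resp:
        audio = resp.read()
    with open(path, "wb") as fh:
        fh.write(audio)
    return len(audio)
```

With Kokoro running, `save_speech("Hello from Kokoro!")` should write `test.mp3` and return its size in bytes.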

Whisper.cpp (Local STT)

voice-mcp includes whisper.cpp for cost-free local speech-to-text:

Hardware Optimization: Automatically selects best model for your hardware
OpenAI Compatible: Drop-in replacement for OpenAI Whisper API
Multiple Models: From tiny to large-v3-turbo
GPU Support: CUDA, Metal, and Vulkan acceleration

Whisper Commands

make whisper-build    # Build Whisper container
make whisper-start    # Start Whisper STT service
make whisper-stop     # Stop Whisper STT service

Quick Test

# Test whisper API (OpenAI-compatible)
curl -X POST http://localhost:2022/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F "file=@audio.wav"
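
For scripts that cannot shell out to curl, the upload can be built by hand as multipart/form-data. The sketch below uses only the standard library and mirrors the command above; `encode_multipart` and `build_transcription_request` are hypothetical helpers, and the `audio/wav` part type is an assumption.

```python
import uuid
import urllib.request

WHISPER_URL = "http://localhost:2022/v1/audio/transcriptions"


def encode_multipart(field, filename, content, content_type="audio/wav"):
    """Encode one file field as multipart/form-data; return (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_transcription_request(audio_bytes, filename="audio.wav", url=WHISPER_URL):
    """Build the POST request carrying the audio file in the `file` field."""
    body, content_type = encode_multipart("file", filename, audio_bytes)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
```

Passing the result to `urllib.request.urlopen` sends the file; the boundary in the header must match the one in the body, which is why both come from the same call.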

Requirements

Python 3.8+
LiveKit server
Podman or Docker (for Kokoro TTS only)
Build tools (cmake, make, gcc/g++) for Whisper.cpp
OpenAI API key (optional, for cloud fallback)
mt command for managing external repos

Development

See TASKS.md for development roadmap and technical tasks.

License

MIT

Release history

2.4.1 (Jun 25, 2025)
2.4.0 (Jun 24, 2025)
2.3.0 (Jun 23, 2025)
2.2.0 (Jun 22, 2025)
2.1.3 (Jun 20, 2025)
2.1.1 (Jun 20, 2025)
2.1.0 (Jun 20, 2025)
2.0.3 (Jun 19, 2025)
2.0.2 (Jun 19, 2025)
0.1.30 (Jun 19, 2025)
0.1.29 (Jun 17, 2025)
0.1.28 (Jun 17, 2025)
0.1.27 (Jun 17, 2025)
0.1.26 (Jun 17, 2025)
0.1.25 (Jun 17, 2025)
0.1.24 (Jun 17, 2025)
0.1.23 (Jun 17, 2025)
0.1.22 (Jun 16, 2025)
0.1.21 (Jun 15, 2025)
0.1.20 (Jun 15, 2025)
0.1.18 (Jun 15, 2025)
0.1.17 (Jun 15, 2025)
0.1.15 (Jun 14, 2025)
0.1.14 (Jun 14, 2025)
0.1.12 (Jun 13, 2025)
0.1.9 (Jun 13, 2025)
0.1.2 (Jun 8, 2025, this version)
0.1.1 (Jun 8, 2025)
0.1.0 (Jun 8, 2025)

Download files

Source Distribution

voice_mcp-0.1.2.tar.gz (11.7 kB)

Uploaded Jun 8, 2025 Source

Built Distribution

voice_mcp-0.1.2-py3-none-any.whl (20.1 kB)

Uploaded Jun 8, 2025 Python 3

File details

Details for the file voice_mcp-0.1.2.tar.gz.

File metadata

Download URL: voice_mcp-0.1.2.tar.gz
Upload date: Jun 8, 2025
Size: 11.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for voice_mcp-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`39e7014f27f478503b26fe9b80e0ea418471ef90000d560928e52430a074a3b2`
MD5	`4c5135a09cbee1d9254ea1646ac1802f`
BLAKE2b-256	`fe4a75494aea0649e5368a54ce9ca1674132a438908722203f9a01f48fdda676`

File details

Details for the file voice_mcp-0.1.2-py3-none-any.whl.

File metadata

Download URL: voice_mcp-0.1.2-py3-none-any.whl
Upload date: Jun 8, 2025
Size: 20.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for voice_mcp-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1695ade97ee94b596ea0cc5dc98abbdb01ac5c90f2c6e4931562f1be3eda7dbe`
MD5	`50f639c4e0ebb56171d12efe28bc71d0`
BLAKE2b-256	`dcff13d715eebf5798d2b075ee7631a2b15ca4cc168d27d243e2c0daaf43fd2a`
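
A downloaded file can be checked against the SHA256 digests listed above before installing it. This is a minimal sketch using `hashlib`; `sha256_of` and `matches_published_hash` are hypothetical helper names, and the local path of the download is whatever you saved it as.

```python
import hashlib

# SHA256 digests copied from the tables above.
EXPECTED_SHA256 = {
    "voice_mcp-0.1.2.tar.gz": "39e7014f27f478503b26fe9b80e0ea418471ef90000d560928e52430a074a3b2",
    "voice_mcp-0.1.2-py3-none-any.whl": "1695ade97ee94b596ea0cc5dc98abbdb01ac5c90f2c6e4931562f1be3eda7dbe",
}


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, filename):
    """Compare a local file's digest with the published one for that filename."""
    return sha256_of(path) == EXPECTED_SHA256[filename]
```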
