Voice interaction capabilities for Model Context Protocol (MCP) servers
voice-mcp
MCP servers that enable voice interactions between LLMs and users through LiveKit.
Quick Start with Python Package
The easiest way to use voice-mcp is through our Python package:
# Install with pip
pip install livekit-voice-mcp
# Or use with uvx (no installation needed)
uvx livekit-voice-mcp
Configure Claude Desktop
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"livekit-voice": {
"command": "uvx",
"args": ["livekit-voice-mcp"],
"env": {
"LIVEKIT_URL": "wss://your-app.livekit.cloud",
"LIVEKIT_API_KEY": "your-api-key",
"LIVEKIT_API_SECRET": "your-api-secret",
"OPENAI_API_KEY": "your-openai-key"
}
}
}
}
Restart Claude Desktop and you can now use voice commands!
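If the configuration file already lists other MCP servers, the entry has to be merged in rather than pasted over the whole file. The following is a minimal sketch of that merge, not part of voice-mcp itself: the helper names `add_voice_server` and `update_config_file` are hypothetical, and the caller is assumed to pass in the correct path for their platform.

```python
import json
from pathlib import Path


def add_voice_server(config: dict, env: dict) -> dict:
    """Return a copy of a Claude Desktop config with the livekit-voice entry added.

    Existing entries under "mcpServers" are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["livekit-voice"] = {
        "command": "uvx",
        "args": ["livekit-voice-mcp"],
        "env": dict(env),
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, env: dict) -> None:
    """Hypothetical helper: read the config file (if any), merge, write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_voice_server(existing, env), indent=2))
```

Keeping the merge as a pure function makes it easy to check the result before anything is written to disk.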
Overview
voice-mcp provides Model Context Protocol (MCP) servers that allow LLMs to communicate via voice, enabling natural spoken conversations with AI assistants.
Architecture
┌─────────────────────┐      ┌──────────────────┐      ┌─────────────────────┐
│   Claude/LLM        │      │  LiveKit Server  │      │  Voice Frontend     │
│   (MCP Client)      │◄────►│  (Port 7880)     │◄────►│  (Port 3001)        │
└─────────────────────┘      └──────────────────┘      └─────────────────────┘
           │                          │
           │                          │
           ▼                          ▼
┌─────────────────────┐      ┌──────────────────┐
│  Voice MCP Server   │      │   Agent.py       │
│  (ask_voice_question│      │  (Voice Logic)   │
│   check_room_status)│      └──────────────────┘
└─────────────────────┘               │
                                      │
                     ┌────────────────┴────────────────┐
                     │                                 │
                     ▼                                 ▼
            ┌──────────────────┐              ┌──────────────────┐
            │   Whisper.cpp    │              │   Kokoro TTS     │
            │   (Port 2022)    │              │   (Port 8880)    │
            │   Local STT      │              │   Local TTS      │
            └──────────────────┘              └──────────────────┘
Features
- Voice Input/Output: Bidirectional voice communication through LiveKit
- Speech-to-Text: Local whisper.cpp or OpenAI Whisper API
- Text-to-Speech: Multiple TTS providers (OpenAI TTS + local Kokoro-FastAPI)
- Local STT/TTS: Cost-free local speech recognition and voice generation
- Real-time Streaming: Low-latency voice interactions
- MCP Integration: Works seamlessly with Claude and other MCP-compatible clients
Installation Options
Option 1: Python Package (Recommended for Users)
# Install globally
pip install livekit-voice-mcp
# Or use without installation
uvx livekit-voice-mcp
# Or use pipx for isolated installation
pipx install livekit-voice-mcp
Option 2: Container Image
# Pull and run the container
docker pull ghcr.io/mbailey/voice-mcp:latest
# Run with environment variables
docker run -e OPENAI_API_KEY=your_key_here \
-e VOICE_MCP_DEBUG=true \
ghcr.io/mbailey/voice-mcp:latest
See CONTAINER.md for detailed container usage instructions.
Option 3: Local Development Setup
# Clone the repository
git clone https://github.com/mbailey/voice-mcp-public.git
cd voice-mcp-public
# Build container image
make build-container
# Or install development environment
make install
Configuration
Python Package Configuration
Set environment variables before running:
export LIVEKIT_URL="wss://your-app.livekit.cloud"
export LIVEKIT_API_KEY="your-api-key"
export LIVEKIT_API_SECRET="your-api-secret"
export OPENAI_API_KEY="your-openai-key" # For STT/TTS
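A quick pre-flight check can catch a missing variable before the server starts. This sketch assumes the three LiveKit variables are mandatory and treats `OPENAI_API_KEY` as optional (the Requirements section lists it as an optional cloud fallback); `missing_settings` is a hypothetical helper, not something the package exports.

```python
import os
from typing import List, Mapping, Optional

# Assumption: these three are required; OPENAI_API_KEY is only a cloud fallback.
REQUIRED = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_settings(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED if not env.get(name)]
```

Calling `missing_settings()` with no argument inspects the current process environment; passing a dict makes the check easy to exercise in isolation.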
Local Development Configuration
Copy the example configuration and customize:
cp .env.example .env.local
# Edit .env.local with your settings
Provider Selection
voice-mcp supports multiple STT/TTS providers with automatic fallback:
TTS Providers
- TTS_PROVIDER=auto (default): Try Kokoro → OpenAI → LiveKit
- TTS_PROVIDER=kokoro: Use only local Kokoro TTS
- TTS_PROVIDER=openai: Use only OpenAI TTS
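The fallback order for `TTS_PROVIDER` can be pictured as an ordered list of candidates that is walked until one is usable. The sketch below illustrates that idea only; it is not the project's actual selection code, and the `is_available` callback and the function names are assumptions introduced for the example.

```python
from typing import Callable, List, Optional

# Candidate order per setting, taken from the provider list above.
FALLBACK_ORDER = {
    "auto": ["kokoro", "openai", "livekit"],
    "kokoro": ["kokoro"],
    "openai": ["openai"],
}


def candidate_providers(setting: str = "auto") -> List[str]:
    """Return the providers to try, in order, for a TTS_PROVIDER value."""
    key = setting.strip().lower()
    if key not in FALLBACK_ORDER:
        raise ValueError(f"unknown TTS_PROVIDER: {setting!r}")
    return list(FALLBACK_ORDER[key])


def pick_provider(setting: str, is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the first available provider, or None if every candidate fails."""
    for name in candidate_providers(setting):
        if is_available(name):
            return name
    return None
```

With `auto`, a machine where only OpenAI is reachable resolves to `openai`; with `kokoro`, the same machine resolves to `None` because no fallback is permitted.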
STT Configuration
- Local Whisper: Automatically used when available at http://localhost:2022
- OpenAI Whisper: Fallback when local Whisper is not running
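One way to decide whether the local Whisper server is "available" is to try opening a TCP connection to its port. The sketch below assumes that a successful connection is a good enough signal; how voice-mcp actually detects the server is not documented here, and the names `is_listening` and `choose_stt` are hypothetical.

```python
import socket
from typing import Callable
from urllib.parse import urlparse


def is_listening(base_url: str, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(base_url)
    host = parsed.hostname or "localhost"
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_stt(
    local_url: str = "http://localhost:2022",
    probe: Callable[[str], bool] = is_listening,
) -> str:
    """Prefer local Whisper when the probe succeeds, else fall back to OpenAI."""
    return "local-whisper" if probe(local_url) else "openai-whisper"
```

The probe is injectable so the decision logic can be exercised without a running server.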
Key Configuration Options
# TTS Provider (auto/kokoro/openai)
TTS_PROVIDER=auto
# Kokoro TTS (local)
KOKORO_URL=http://127.0.0.1:8880
KOKORO_ENABLED=true
# Whisper STT (local)
WHISPER_BASE_URL=http://localhost:2022
# OpenAI (fallback for both STT and TTS)
OPENAI_API_KEY=your_key_here
# LiveKit
LIVEKIT_URL=ws://localhost:7880
Usage
Using the Python Package
Once installed and configured in Claude Desktop, you can use voice commands:
- Ask Claude: "Can you help me with voice?"
- Claude will use the voice MCP tools to communicate
- Speak your questions and hear responses
Available MCP tools:
- ask_voice_question: Ask a question via voice and get a text response
- check_room_status: Check active voice rooms and participants
Local Development Usage
- Download external repositories:
mt sync
- Install and build all dependencies:
make install
- Start the development environment:
make dev
This will start:
- LiveKit server (port 7880)
- Kokoro TTS (port 8880)
- Whisper STT (port 2022)
- Voice assistant frontend (port 3001)
Individual components:
make livekit-server # Start LiveKit server
make frontend # Start voice frontend
make kokoro-start # Start Kokoro TTS
make whisper-start # Start Whisper STT
Components
- livekit-voice-mcp: MCP server for voice interactions
- livekit-admin-mcp: Administrative tools for LiveKit management
- livekit-agent: Python agent handling voice processing
- kokoro-fastapi: Local TTS server providing OpenAI-compatible API
- whisper.cpp: Local STT server providing OpenAI-compatible API
Kokoro-FastAPI (Local TTS)
voice-mcp includes Kokoro-FastAPI for cost-free local text-to-speech generation:
- 70+ Voice Options: Multiple languages and voice styles
- OpenAI Compatible: Drop-in replacement for OpenAI TTS API
- Web Interface: Interactive voice testing at http://127.0.0.1:8880/web/
- Browser Support: Chrome/Chromium recommended (Firefox has streaming limitations)
Kokoro Commands
make kokoro-start # Start Kokoro TTS service
make kokoro-stop # Stop Kokoro TTS service
make kokoro-build # Build Kokoro container
make test-kokoro # Test Kokoro functionality
Quick Test
# Generate speech using Kokoro API
curl -X POST http://127.0.0.1:8880/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"input": "Hello from Kokoro!", "voice": "nova"}' \
--output test.mp3
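The same request can be issued from Python using only the standard library. This is a sketch equivalent to the curl command above, assuming Kokoro is running at the default address; the function names are hypothetical, and the `nova` voice is simply carried over from the curl example.

```python
import json
import urllib.request


def build_speech_request(
    text: str,
    voice: str = "nova",
    base_url: str = "http://127.0.0.1:8880",
) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-compatible speech endpoint."""
    payload = json.dumps({"input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text: str, out_path: str = "test.mp3", **kwargs) -> str:
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(build_speech_request(text, **kwargs)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(audio)
    return out_path
```

With the service running, `save_speech("Hello from Kokoro!")` should produce the same file as the curl command.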
Whisper.cpp (Local STT)
voice-mcp includes whisper.cpp for cost-free local speech-to-text:
- Hardware Optimization: Automatically selects best model for your hardware
- OpenAI Compatible: Drop-in replacement for OpenAI Whisper API
- Multiple Models: From tiny to large-v3-turbo
- GPU Support: CUDA, Metal, and Vulkan acceleration
Whisper Commands
make whisper-build # Build Whisper container
make whisper-start # Start Whisper STT service
make whisper-stop # Stop Whisper STT service
Quick Test
# Test whisper API (OpenAI-compatible)
curl -X POST http://localhost:2022/v1/audio/transcriptions \
-H "Content-Type: multipart/form-data" \
-F "file=@audio.wav"
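For scripted checks, the upload can also be built in Python without third-party packages. The sketch below hand-encodes a single-file multipart body, which is what `curl -F` does for the command above; the helper names are hypothetical, and the response format is left unparsed because it is not specified here.

```python
import uuid
import urllib.request
from typing import Tuple


def encode_multipart(
    field: str, filename: str, content: bytes, content_type: str = "audio/wav"
) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_transcription_request(
    audio_bytes: bytes,
    filename: str = "audio.wav",
    base_url: str = "http://localhost:2022",
) -> urllib.request.Request:
    """Build (but do not send) a POST to the transcription endpoint."""
    body, header = encode_multipart("file", filename, audio_bytes)
    return urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
```

Passing the request to `urllib.request.urlopen` sends it; reading the audio with `open("audio.wav", "rb").read()` supplies the bytes.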
Requirements
- Python 3.8+
- LiveKit server
- Podman or Docker (for Kokoro TTS only)
- Build tools (cmake, make, gcc/g++) for Whisper.cpp
- OpenAI API key (optional, for cloud fallback)
- mt command for managing external repos
Development
See TASKS.md for development roadmap and technical tasks.
License
MIT