Multi-engine TTS for AI coding assistants — speak responses aloud
Voice Bridge
Multi-engine text-to-speech for AI coding assistants. Speaks responses aloud from Claude Code, Cursor, or VS Code via a Claude Code plugin, MCP server, or CLI pipe.
pip install ai-voice-bridge[edge] # Install with free edge-tts engine
voice-bridge test # Verify audio output
voice-bridge on # Enable always-on mode (optional)
Features
- Free by default -- edge-tts uses Microsoft Neural voices, no API key needed
- 5 engines -- edge-tts, ElevenLabs, Kokoro (local ONNX), macOS say, espeak-ng
- Text safety filter -- strips code blocks, secrets, file paths, URLs, and markdown before speaking
- Claude Code plugin -- Stop hook speaks responses automatically, MCP server for tool-based control
- Voice discovery -- browse, filter by gender/locale, and preview voices interactively
Prerequisites
Voice Bridge is a Python package that plays audio on your local machine. It requires:
| Requirement | Details |
|---|---|
| Python 3.10+ | python3 --version to check |
| pip | Usually bundled with Python. On some Linux distros: sudo apt install python3-pip |
| Audio output | Speakers or headphones — audio plays locally, not over a network |
| Audio player (Linux/Windows) | macOS: built-in (afplay). Linux: mpv (preferred) or ffplay. Windows: ffplay (preferred) or mpv |
Audio player fallback order: macOS uses afplay (always available). Linux tries mpv then ffplay. Windows tries ffplay then mpv.
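The fallback order above can be sketched in a few lines of Python. This is an illustrative sketch only, not Voice Bridge's own code: the `PLAYER_ORDER` table and the `pick_player` helper are hypothetical names, and the lookup relies on the standard library's `shutil.which`.

```python
import shutil
import sys

# Preference order per platform, copied from the fallback rule above.
# PLAYER_ORDER and pick_player are hypothetical names for illustration.
PLAYER_ORDER = {
    "darwin": ["afplay"],
    "linux": ["mpv", "ffplay"],
    "win32": ["ffplay", "mpv"],
}


def pick_player(platform=sys.platform, which=shutil.which):
    """Return the first player found on PATH for the platform, or None."""
    for name in PLAYER_ORDER.get(platform, []):
        if which(name):
            return name
    return None


if __name__ == "__main__":
    print(pick_player() or "no audio player found")
```

Running the snippet directly is a quick way to check whether a supported player is on your PATH before installing.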
Not supported: headless servers, Docker containers, SSH sessions, and CI runners typically lack audio output. Voice Bridge will install and run the MCP server, but speak commands will fail silently without an audio player and sound hardware.
Quick Start
Install
# Recommended: edge-tts (free, 400+ voices)
pip install ai-voice-bridge[edge]
# Or with all engines
pip install ai-voice-bridge[all]
# Or minimal (macOS say / Linux espeak only)
pip install ai-voice-bridge
Test
voice-bridge test # Speak a test phrase
voice-bridge engines # List available engines
voice-bridge setup # Interactive setup wizard
The setup wizard walks you through: detecting installed engines, testing audio output, optionally entering an ElevenLabs API key (if the SDK is installed), writing default state, and showing Claude Code integration options.
Use
# Pipe text to speech
echo "Hello world" | vb-speak
# Choose an engine
echo "Hello" | vb-speak --engine edge-tts
echo "Hello" | vb-speak --engine say
# Modes
voice-bridge on # Always-on: every AI response spoken
voice-bridge off # Off: use "speak" keyword for single responses
voice-bridge status # Show current mode and engine
Claude Code Integration
Option 1: Install as a Plugin (Recommended)
claude plugin marketplace add Tomorrow-You/voice-bridge
claude plugin install voice-bridge@voice-bridge
This installs the plugin with a Stop hook (auto-speaks responses), the /speak skill, and the MCP server. Auto-installs ai-voice-bridge[edge,mcp] on first session.
Option 2: Manual Hook Setup
Add to your .claude/settings.json:
{
"hooks": {
"Stop": [
{
"hooks": [
{
"type": "command",
"command": "bash -c 'VB_HOOK=$(python3 -c \"import voice_bridge; import pathlib; print(pathlib.Path(voice_bridge.__file__).parent / \\\"integrations\\\" / \\\"claude_hook.sh\\\")\" 2>/dev/null) && [ -f \"$VB_HOOK\" ] && bash \"$VB_HOOK\"'",
"timeout": 5
}
]
}
]
}
}
Then add to your CLAUDE.md:
## Voice Bridge (TTS)
- **NEVER** use `<speak>` tags unless the user's message starts with "speak"
- When user starts with "speak", wrap your ENTIRE response in `<speak>...</speak>` tags
- Strip the "speak" keyword before processing
- Inside tags, write naturally -- no markdown, code blocks, or file paths
Option 3: Always-On Mode
Skip the <speak> tag convention entirely:
voice-bridge on
Now every Claude response is spoken automatically. Toggle off with voice-bridge off.
Hook Details
The Stop hook runs in the background so it doesn't block Claude Code. It:
- Extracts text from <speak> tags (single-turn mode) or the full response (always-on mode)
- Truncates to 2,000 characters before speaking
- Uses a fallback chain: configured engine > espeak > say
- Logs to ~/.voice-bridge/voice-bridge.log (auto-rotated at 1MB, keeps 2 backups)
- Runs vb-speak --stream for sentence-by-sentence playback
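The first two steps of the hook (text selection and truncation) can be illustrated in Python. The real hook is a shell script (claude_hook.sh), so this is only a sketch of the described behavior; `text_for_hook`, `SPEAK_TAG`, and the choice to join multiple tagged spans with a space are assumptions.

```python
import re

HOOK_CHAR_LIMIT = 2000  # limit stated in the hook description

# Non-greedy match so several <speak> blocks in one response stay separate.
SPEAK_TAG = re.compile(r"<speak>(.*?)</speak>", re.DOTALL)


def text_for_hook(response: str, mode: str) -> str:
    """Pick the text to speak. mode is 'always' or 'off' (single-turn)."""
    if mode == "always":
        text = response
    else:
        # Assumption: multiple tagged spans are joined with a space.
        text = " ".join(m.strip() for m in SPEAK_TAG.findall(response))
    return text[:HOOK_CHAR_LIMIT]
```

In single-turn mode a response without tags yields an empty string, which matches the rule that nothing is spoken unless the user asked for it.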
Engines
| Engine | Cost | Quality | Setup | Platform | Default voice |
|---|---|---|---|---|---|
| edge-tts | Free | High (neural) | pip install ai-voice-bridge[edge] | All | en-US-GuyNeural |
| ElevenLabs | Paid | Highest | pip install ai-voice-bridge[elevenlabs] + API key | All | George (JBFqnCBsd6RMkjVDRZzb), model eleven_flash_v2_5 |
| Kokoro | Free | Good | pip install ai-voice-bridge[kokoro] + model download | All (English only) | bm_lewis |
| say | Free | Basic | Built-in | macOS | Samantha |
| espeak | Free | Basic | apt install espeak-ng | Linux | en |
When engine is set to auto (default), Voice Bridge picks the first available in this order: edge-tts > say > espeak > kokoro > elevenlabs. ElevenLabs is only considered "available" if the SDK is installed AND a valid API key is configured -- it will never be auto-selected without credentials.
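The auto-selection rule can be sketched as a simple ordered scan. This is a minimal illustration assuming availability is already known; `pick_engine` and its parameters are hypothetical and do not reflect Voice Bridge's internal API.

```python
# Order taken from the auto-selection rule above.
AUTO_ORDER = ["edge-tts", "say", "espeak", "kokoro", "elevenlabs"]


def pick_engine(installed, has_elevenlabs_key=False):
    """Return the first usable engine name, or None if nothing qualifies.

    installed: set of engine names whose dependencies are present.
    has_elevenlabs_key: True only when a valid API key is configured.
    """
    for name in AUTO_ORDER:
        if name not in installed:
            continue
        # ElevenLabs needs both the SDK and credentials to count as available.
        if name == "elevenlabs" and not has_elevenlabs_key:
            continue
        return name
    return None
```

For example, `pick_engine({"espeak", "kokoro"})` returns `"espeak"`, because espeak comes before kokoro in the order.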
Discovering Voices
voice-bridge voices # List voices for current engine
voice-bridge voices edge-tts # List voices for a specific engine
# Filter by gender and/or locale
voice-bridge voices edge-tts --gender Female --locale en-US
# Preview a specific voice
voice-bridge voices edge-tts --preview en-US-AriaNeural
# Interactively audition voices (next/select/quit after each)
voice-bridge voices edge-tts --gender Female --locale en-US --preview
# Random sample of 3 voices
voice-bridge voices edge-tts --sample 3 --preview
Filtering options: --gender (Male/Female) works with edge-tts and kokoro. --locale (e.g. en-US, en-GB) works with edge-tts and say. --sample N picks N random voices. All combine with --preview for interactive audition.
ElevenLabs preview uses free pre-recorded samples when available (no API credits consumed).
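As a rough illustration of how --gender, --locale, and --sample combine, the sketch below filters a list of voice records. The record shape (dicts with `name`, `gender`, and `locale` keys) and the `filter_voices` helper are assumptions made for the example, not the format Voice Bridge uses internally.

```python
import random


def filter_voices(voices, gender=None, locale=None, sample=None, rng=None):
    """Filter voice dicts by gender and locale, then optionally sample N.

    Each voice is assumed to look like:
    {"name": "en-US-AriaNeural", "gender": "Female", "locale": "en-US"}
    """
    matches = [
        v for v in voices
        if (gender is None or v["gender"].lower() == gender.lower())
        and (locale is None or v["locale"].lower() == locale.lower())
    ]
    # Only sample when there are more matches than requested.
    if sample is not None and sample < len(matches):
        matches = (rng or random).sample(matches, sample)
    return matches
```

Passing a seeded `random.Random` instance as `rng` makes the sampling reproducible, which is handy in tests.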
Switching Engines
voice-bridge engine edge-tts # Free neural voices
voice-bridge engine elevenlabs # Premium cloud
voice-bridge engine kokoro # Local offline
voice-bridge engine say # macOS built-in
voice-bridge engine espeak # Linux built-in
voice-bridge engine auto # Best available (default)
ElevenLabs Setup
pip install ai-voice-bridge[elevenlabs]
voice-bridge setup # Prompts for your ElevenLabs API key
# Or manually: create ~/.voice-bridge/.env with ELEVENLABS_API_KEY=your-key
voice-bridge engine elevenlabs
voice-bridge test
Kokoro Setup (Offline)
pip install ai-voice-bridge[kokoro]
# Download model files (~200MB) from:
# https://github.com/thewh1teagle/kokoro-onnx/releases/tag/model-files-v1.0
# Place in: ~/.voice-bridge/models/ (or $VOICE_BRIDGE_HOME/models/)
voice-bridge engine kokoro
voice-bridge test
Configuration
Voice Bridge stores configuration in ~/.voice-bridge/ (macOS), ~/.local/share/voice-bridge/ (Linux, respects XDG_DATA_HOME), or %APPDATA%\voice-bridge\ (Windows).
Override with the VOICE_BRIDGE_HOME environment variable.
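The directory rules above can be written as a small resolver. This is a sketch under stated assumptions: `config_dir` is a hypothetical helper, and the fallbacks used when APPDATA or XDG_DATA_HOME is unset (`~/AppData/Roaming` and `~/.local/share`) are the conventional defaults rather than something this document confirms.

```python
import os
import sys
from pathlib import Path


def config_dir(platform=sys.platform, env=os.environ, home=None) -> Path:
    """Resolve the Voice Bridge config directory as described above."""
    home = Path(home) if home else Path.home()
    override = env.get("VOICE_BRIDGE_HOME")
    if override:
        return Path(override)  # explicit override wins on every platform
    if platform == "darwin":
        return home / ".voice-bridge"
    if platform == "win32":
        appdata = env.get("APPDATA")
        # Assumption: fall back to the usual roaming profile path.
        base = Path(appdata) if appdata else home / "AppData" / "Roaming"
        return base / "voice-bridge"
    data_home = env.get("XDG_DATA_HOME")
    # Assumption: XDG default of ~/.local/share when the variable is unset.
    base = Path(data_home) if data_home else home / ".local" / "share"
    return base / "voice-bridge"
```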
| File | Purpose |
|---|---|
| .env | API keys (ElevenLabs) |
| .state | Runtime state (mode, engine, speed, voice) |
| models/ | Kokoro ONNX model files |
| voice-bridge.log | Hook execution logs (auto-rotated at 1MB, 2 backups) |
State Variables
The .state file is a shell-sourceable key-value file. All values are optional — defaults apply if unset.
| Variable | Default | Description |
|---|---|---|
| VOICE_BRIDGE_MODE | off | Mode: off (single-turn) or always (always-on) |
| VOICE_BRIDGE_ENGINE | auto | Engine name or auto |
| VOICE_BRIDGE_EDGE_VOICE | en-US-GuyNeural | edge-tts voice |
| VOICE_BRIDGE_EDGE_RATE | +0% | edge-tts rate (e.g. +30%, -10%) |
| VOICE_BRIDGE_ELEVENLABS_SPEED | 1.0 | ElevenLabs speed (0.7–1.2) |
| VOICE_BRIDGE_KOKORO_VOICE | bm_lewis | Kokoro voice name |
| VOICE_BRIDGE_KOKORO_SPEED | 1.4 | Kokoro speed multiplier |
| VOICE_BRIDGE_SAY_RATE | 200 | macOS say words per minute |
| VOICE_BRIDGE_ESPEAK_RATE | 175 | espeak words per minute |
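Because the .state file is plain KEY=value text, it can be read without a shell. The sketch below shows one way to do that while applying the defaults from the table; `parse_state` is a hypothetical helper, and the handling of quotes, comments, and an optional `export` prefix is an assumption about what a shell-sourceable file may contain.

```python
# Defaults copied from the State Variables table.
DEFAULTS = {
    "VOICE_BRIDGE_MODE": "off",
    "VOICE_BRIDGE_ENGINE": "auto",
    "VOICE_BRIDGE_EDGE_VOICE": "en-US-GuyNeural",
    "VOICE_BRIDGE_EDGE_RATE": "+0%",
    "VOICE_BRIDGE_ELEVENLABS_SPEED": "1.0",
    "VOICE_BRIDGE_KOKORO_VOICE": "bm_lewis",
    "VOICE_BRIDGE_KOKORO_SPEED": "1.4",
    "VOICE_BRIDGE_SAY_RATE": "200",
    "VOICE_BRIDGE_ESPEAK_RATE": "175",
}


def parse_state(text: str) -> dict:
    """Parse KEY=value lines; unset keys keep their documented defaults."""
    state = dict(DEFAULTS)
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip().removeprefix("export ").strip()
        state[key] = value.strip().strip("'\"")
    return state
```

All values stay strings here; converting rates and speeds to numbers is left to whichever engine consumes them.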
Text Safety Filter
Before any text reaches the TTS engine, Voice Bridge strips:
- Code blocks (fenced ``` and inline `)
- Secrets: OpenAI/Anthropic keys (sk-...), GitHub tokens (ghp_, gho_), AWS keys (AKIA...), PEM private keys, 64+ char hex strings
- File paths: Unix (/Users/..., /home/...) and Windows (C:\Users\...)
- URLs: http:// and https://
- Markdown: headers, bold/italic markers, list bullets, table rows
Text is truncated to 4,000 characters at the nearest sentence boundary (. ). The Claude hook applies a separate 2,000 character limit before passing text to vb-speak.
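To make the filtering and truncation steps concrete, here is a reduced sketch covering a subset of the categories listed above. The regular expressions are simplified assumptions written for this example, not the patterns Voice Bridge ships, and `filter_for_speech` is a hypothetical name.

```python
import re

MAX_CHARS = 4000  # limit stated above

# Simplified, illustrative patterns; the real filter covers more cases.
PATTERNS = [
    re.compile(r"```.*?```", re.DOTALL),                  # fenced code blocks
    re.compile(r"`[^`\n]*`"),                             # inline code
    re.compile(r"https?://\S+"),                          # URLs
    re.compile(r"\b(?:sk-|ghp_|gho_)[A-Za-z0-9_-]{8,}"),  # key-like tokens
    re.compile(r"\bAKIA[A-Z0-9]{16}\b"),                  # AWS access key IDs
    re.compile(r"(?:/Users|/home)/\S+"),                  # Unix home paths
    re.compile(r"^#{1,6}\s+", re.MULTILINE),              # markdown headers
    re.compile(r"\*{1,3}"),                               # bold/italic markers
]


def filter_for_speech(text: str, limit: int = MAX_CHARS) -> str:
    """Strip unsafe or unspeakable spans, then cut at a sentence boundary."""
    for pattern in PATTERNS:
        text = pattern.sub("", text)
    text = re.sub(r"[ \t]+", " ", text).strip()
    if len(text) > limit:
        cut = text.rfind(". ", 0, limit)
        # Keep the period; fall back to a hard cut if no boundary exists.
        text = text[: cut + 1] if cut != -1 else text[:limit]
    return text
```

To see what the actual filter produces for a given input, pipe the text through vb-speak --dry-run.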
CLI Reference
# Control
voice-bridge on # Always-on mode
voice-bridge off # Single-turn mode (default)
voice-bridge status # Show mode, engine, config
voice-bridge test # Test audio output
voice-bridge engines # List all engines with install status
voice-bridge setup # Interactive setup wizard
# Engine config
voice-bridge engine [name] # Get/set engine
voice-bridge voice [id] # Set voice for current engine
voice-bridge voices [engine] # List available voices
voice-bridge speed [val] # Set engine speed (see below)
# Voice discovery
voice-bridge voices edge-tts --gender Female --locale en-US # Filter
voice-bridge voices edge-tts --preview en-US-AriaNeural # Preview one
voice-bridge voices edge-tts --gender Female --preview # Interactive
voice-bridge voices edge-tts --sample 3 --preview # Random sample
# Pipe to speech
echo "text" | vb-speak # Default engine
echo "text" | vb-speak --engine edge-tts # Specific engine
echo "text" | vb-speak --voice Aria # Override voice for this call
echo "text" | vb-speak --stream # Stream sentence-by-sentence
echo "text" | vb-speak --dry-run # Print filtered text only
Streaming mode (--stream): reads stdin, splits text at sentence boundaries (. , ! , ? ), and speaks each sentence as it completes. For edge-tts, sentences are queued so the next one generates while the current one plays.
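The sentence-by-sentence behavior can be approximated with a buffering generator. This is a sketch of the general technique only; `sentences` is a hypothetical helper, and it treats a period, exclamation mark, or question mark followed by whitespace as the boundary, mirroring the description above.

```python
import re
from typing import Iterable, Iterator

# Split after ., !, or ? when followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a text stream."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = SENTENCE_END.split(buffer)
        # Everything except the last part is a finished sentence.
        for part in parts[:-1]:
            if part.strip():
                yield part.strip()
        buffer = parts[-1]
    # Flush whatever remains once the input ends.
    if buffer.strip():
        yield buffer.strip()
```

Feeding it `sys.stdin` would yield sentences line by line as input arrives, and each yielded sentence could then be handed to a TTS engine while the next one is still being read.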
Speed Control
Each engine accepts a different speed format:
| Engine | Format | Default | Example |
|---|---|---|---|
| edge-tts | Percentage string | +0% | voice-bridge speed +30% |
| elevenlabs | Float (0.7–1.2) | 1.0 | voice-bridge speed 1.1 |
| kokoro | Positive float | 1.4 | voice-bridge speed 1.8 |
| say | Words per minute | 200 | voice-bridge speed 250 |
| espeak | Words per minute | 175 | voice-bridge speed 220 |
Speed applies to whichever engine is currently active. Check with voice-bridge speed (no value).
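Since each engine expects a different format, a validator along these lines shows how the table translates into checks. It is an illustrative sketch: `validate_speed` is a hypothetical function, and requiring an explicit sign on the edge-tts percentage is an assumption based on the examples shown.

```python
import re


def validate_speed(engine: str, value: str) -> str:
    """Check a speed string against the engine's format; return it normalized."""
    if engine == "edge-tts":
        # Assumption: a signed percentage such as +30% or -10%.
        if not re.fullmatch(r"[+-]\d+%", value):
            raise ValueError("edge-tts speed must look like +30% or -10%")
        return value
    if engine == "elevenlabs":
        speed = float(value)
        if not 0.7 <= speed <= 1.2:
            raise ValueError("elevenlabs speed must be between 0.7 and 1.2")
        return str(speed)
    if engine == "kokoro":
        speed = float(value)
        if speed <= 0:
            raise ValueError("kokoro speed must be a positive number")
        return str(speed)
    if engine in ("say", "espeak"):
        wpm = int(value)
        if wpm <= 0:
            raise ValueError("words per minute must be a positive integer")
        return str(wpm)
    raise ValueError(f"unknown engine: {engine}")
```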
MCP Server
Voice Bridge includes an MCP (Model Context Protocol) server so any MCP-compatible tool can speak text aloud.
npm shim (npx ai-voice-bridge): The npm package is a thin wrapper that auto-installs the Python package. It requires Python 3.10+ and pip on your PATH. On startup it checks for audio players and warns if none are found. See Prerequisites for full requirements.
Setup with Claude Desktop
Add to your claude_desktop_config.json:
{
"mcpServers": {
"voice-bridge": {
"command": "python3",
"args": ["-m", "voice_bridge.mcp.server"]
}
}
}
Or, once the npm package is published, use the npm shim:
{
"mcpServers": {
"voice-bridge": {
"command": "npx",
"args": ["ai-voice-bridge"]
}
}
}
Setup with Cursor / VS Code
Add the same MCP server config in your editor's MCP settings. The config format is the same as Claude Desktop.
Setup with Claude Code
claude mcp add voice-bridge -- python3 -m voice_bridge.mcp.server
MCP Tools
| Tool | Parameters | Description |
|---|---|---|
| speak | text (required), engine (optional) | Speak text aloud. Optionally override engine for this call. |
| set_engine | name (required) | Switch the default TTS engine (auto, edge-tts, elevenlabs, kokoro, say, espeak) |
| get_status | (none) | Show current mode, engine, and available engines |
| list_voices | engine (optional) | List available voices. Defaults to current engine if omitted. |
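MCP clients invoke tools through JSON-RPC 2.0 messages using the `tools/call` method. As a hedged illustration of what a call to the speak tool could look like on the wire, the sketch below builds such a request; `speak_request` is a hypothetical helper, and in practice your MCP client constructs this message for you.

```python
import json


def speak_request(text, engine=None, request_id=1):
    """Build a JSON-RPC 2.0 tools/call request for the speak tool."""
    arguments = {"text": text}
    if engine is not None:
        arguments["engine"] = engine  # optional per-call engine override
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "speak", "arguments": arguments},
    }


if __name__ == "__main__":
    print(json.dumps(speak_request("Build finished", engine="edge-tts"), indent=2))
```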
Install with MCP support
pip install ai-voice-bridge[mcp]
This installs the voice-bridge-mcp command as an alternative to python3 -m voice_bridge.mcp.server.
Development
git clone https://github.com/Tomorrow-You/voice-bridge.git
cd voice-bridge
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[all,dev]"
pytest
License
MIT. See LICENSE for details.
The edge-tts optional dependency is licensed under GPL-3.0. It is not included in the base install.