Text-to-speech CLI, MCP server, and Claude Code plugin (ElevenLabs, AWS Polly, OpenAI)
punt-vox
Voice for your AI coding assistant.
When Claude Code finishes a task, hits an error, or needs your approval --- you hear it. No need to watch the terminal. Keep working; your assistant will tell you what happened.
Platforms: macOS, Linux
Hear It
Real samples generated by vox with ElevenLabs v3. The first three are the same recap with different /vibe moods --- expressive tags change how the voice sounds without changing the words.
| Sample | Vibe | Voice | |
|---|---|---|---|
| Task recap | neutral | sarah | listen |
| Same recap | [excited] | sarah | listen |
| Same recap | [weary] [sighs] | sarah | listen |
| Task complete | neutral | matilda | listen |
Quick Start
curl -fsSL https://raw.githubusercontent.com/punt-labs/vox/368327b/install.sh | sh
Restart Claude Code, then:
/vox y # hear when tasks complete or need input
/recap # spoken summary of what just happened
Manual install (if you already have uv)
uv tool install punt-vox
vox install
vox doctor
Verify before running
curl -fsSL https://raw.githubusercontent.com/punt-labs/vox/368327b/install.sh -o install.sh
shasum -a 256 install.sh
cat install.sh
sh install.sh
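The `shasum` step only helps if the digest is compared against a value obtained from a source you trust; this README does not publish one for `install.sh`. As a minimal sketch of that comparison, the helper names `sha256_of` and `matches` below are illustrative and not part of punt-vox:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest to an expected value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling `matches(Path("install.sh"), expected)` before `sh install.sh` would refuse a script whose contents differ from the one you reviewed, assuming `expected` came from a trusted channel.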
Features
- Notification layer --- spoken summaries when tasks finish, chimes when Claude needs input
- Session vibe --- /vibe sets the mood for all speech. Auto-mode reads session signals (test results, lint, git ops) and adapts the voice. Manual mode lets you set it yourself. ElevenLabs expressive tags ([weary], [excited], [sighs]) color every utterance.
- Five providers --- ElevenLabs, OpenAI, AWS Polly, macOS say, and Linux espeak-ng. The full experience (natural voice, expressive tags, /vibe) requires ElevenLabs.
- Opt-in only --- no audio until you enable it, no surprises
- Voice or chime --- /mute switches to audio tones, no TTS API calls
- Graceful absence --- if punt-vox isn't installed, Claude Code works exactly as before
- MCP-native --- runs as a Claude Code plugin with slash commands and hooks
- Daemon mode --- optional single-process daemon (vox serve) fronted by mcp-proxy. Eliminates per-session overhead, deduplicates audio across sessions, and drops hook latency from ~500ms to ~15ms
What It Looks Like
Enable notifications
> /vox y
Vox enabled. You'll hear when tasks finish or need approval.
Pick a voice with /unmute @<name>.
Get a recap
> /recap
Speaking: "I refactored the authentication module into three files, added
comprehensive tests for the token refresh flow, and fixed a race condition
in the session middleware. All 47 tests pass."
Set the vibe
> /vibe banging my head against the wall
Vibe: banging my head against the wall → [frustrated] [sighs] [manual]
Auto-mode (default) reads session signals and adapts automatically --- after a string of test failures the voice sounds [weary], after a successful release it sounds [excited].
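One way to picture auto-mode is as a small accumulator over recent session signals. The sketch below is an illustration only, not the punt-vox implementation: the class name, the signal strings (`tests_fail`, `release_ok`), the window size, and the three-failure threshold are all assumptions.

```python
from collections import deque
from typing import Deque, List


class VibeAccumulator:
    """Keeps recent session signals and suggests expressive tags (illustrative)."""

    def __init__(self, window: int = 10, failure_streak: int = 3) -> None:
        # Only the most recent `window` signals are remembered.
        self._signals: Deque[str] = deque(maxlen=window)
        self._failure_streak = failure_streak

    def record(self, signal: str) -> None:
        """Append one session signal, e.g. 'tests_fail' (hypothetical name)."""
        self._signals.append(signal)

    def tags(self) -> List[str]:
        """Return tags for the current mood, or an empty list for neutral."""
        # Count consecutive test failures, newest first.
        streak = 0
        for signal in reversed(self._signals):
            if signal != "tests_fail":
                break
            streak += 1
        if streak >= self._failure_streak:
            return ["[weary]"]
        if self._signals and self._signals[-1] == "release_ok":
            return ["[excited]"]
        return []
```

Under these assumptions, three `tests_fail` signals in a row yield `[weary]`, and a following `release_ok` switches the result to `[excited]`.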
Switch to chime-only
> /mute
Muted — chimes only.
Chimes are mood-aware: when a vibe is active, chimes pitch-shift to match (bright for happy sessions, dark for frustrated ones). Eight distinct signals (tests pass/fail, lint pass/fail, git push, merge conflict, done, prompt) × three mood variants = 24 chime assets.
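The 8 × 3 arithmetic can be spelled out as a cross product. In this sketch the signal identifiers paraphrase the list above, while the mood names (`neutral` in particular), the file extension, and the naming scheme are assumptions rather than the actual asset layout:

```python
from itertools import product
from typing import List

# Eight signals, paraphrased from the README's list.
SIGNALS = [
    "tests_pass", "tests_fail",
    "lint_pass", "lint_fail",
    "git_push", "merge_conflict",
    "done", "prompt",
]

# Three mood variants; "bright" and "dark" come from the text, "neutral" is assumed.
MOODS = ["bright", "neutral", "dark"]


def chime_assets(ext: str = "mp3") -> List[str]:
    """Enumerate one hypothetical asset filename per (signal, mood) pair."""
    return [f"{signal}_{mood}.{ext}" for signal, mood in product(SIGNALS, MOODS)]
```

`len(chime_assets())` is 24, matching the count stated above.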
Commands
| Command | Purpose |
|---|---|
| /vox y | Enable vox (chime notifications) |
| /vox n | Disable vox |
| /vox c | Continuous mode (spoken summaries on task completion) |
| /unmute | Enable voice mode (spoken notifications) |
| /unmute @matilda | Set session voice + enable voice |
| /unmute @ | Browse voice roster |
| /mute | Chimes only --- no voice |
| /recap | Spoken summary of Claude's last response |
| /vibe <mood> | Set session mood --- voice adapts to match |
| /vibe auto | Auto-detect mood from session signals (default) |
| /vibe off | Disable vibe --- neutral voice |
Providers
The full experience --- natural voice with expressive tags that respond to /vibe --- requires ElevenLabs. The other providers are fallbacks for environments where ElevenLabs isn't available.
| Provider | API Key | Default Voice | Best For |
|---|---|---|---|
| ElevenLabs | ELEVENLABS_API_KEY | matilda | Recommended. Natural voice, expressive tags via /vibe |
| OpenAI | OPENAI_API_KEY | nova | Fast notifications, low latency |
| AWS Polly | AWS credentials | joanna | Natural voice, cost-effective |
| macOS say | — | samantha | Zero-config on macOS, offline |
| espeak-ng | — | en | Zero-config on Linux, offline |
Auto-detection order: ElevenLabs > OpenAI > Polly (if AWS credentials are valid) > say (macOS) / espeak-ng (Linux).
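The detection order reads as a simple fall-through. The following is a rough sketch of that logic, not the code punt-vox ships: the function name, the returned provider strings, and the `aws_credentials_valid` flag (standing in for whatever credential check the tool performs) are assumptions.

```python
import platform
from typing import Mapping, Optional


def detect_provider(
    env: Mapping[str, str],
    aws_credentials_valid: bool = False,
    system: Optional[str] = None,
) -> str:
    """Pick a provider following the documented priority order."""
    if env.get("ELEVENLABS_API_KEY"):
        return "elevenlabs"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if aws_credentials_valid:
        return "polly"
    # Offline fallbacks: platform.system() reports "Darwin" on macOS.
    system = system or platform.system()
    if system == "Darwin":
        return "say"
    return "espeak-ng"
```

For example, with only `OPENAI_API_KEY` set the sketch returns `"openai"` even when AWS credentials are valid, because OpenAI precedes Polly in the order.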
CLI
punt-vox is also a standalone TTS tool, independent of Claude Code.
vox unmute "Hello world" # Synthesize + play
vox record "Hello world" -o hello.mp3 # Synthesize + save
vox record --from segments.json # From JSON segments file
vox vibe excited # Set session mood
vox notify y # Enable notifications
vox notify c # Continuous spoken mode
vox speak n # Chimes only
vox voice matilda # Set session voice
vox status # Current state
vox version # Print version
vox doctor # Check setup
vox install # Install Claude Code plugin
vox mcp # Start MCP server (stdio)
vox serve # Start daemon (HTTP + WebSocket)
vox daemon install # Register as system service
vox daemon status # Check if daemon is running
Environment Variables
| Variable | Description | Default |
|---|---|---|
| TTS_PROVIDER | Force a specific provider | auto-detect |
| TTS_MODEL | Model override | provider default |
| VOX_OUTPUT_DIR | Output directory | ~/vox-output |
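To make the defaults concrete, here is a minimal sketch of how a wrapper script might read these three variables. The `VoxSettings` dataclass and `load_settings` function are hypothetical names; only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class VoxSettings:
    provider: Optional[str]  # None means auto-detect
    model: Optional[str]     # None means the provider's default model
    output_dir: Path


def load_settings(env: Mapping[str, str] = os.environ) -> VoxSettings:
    """Read the documented variables, treating unset or empty values as defaults."""
    return VoxSettings(
        provider=env.get("TTS_PROVIDER") or None,
        model=env.get("TTS_MODEL") or None,
        output_dir=Path(env.get("VOX_OUTPUT_DIR") or "~/vox-output").expanduser(),
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise without touching the real environment.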
Roadmap
Shipped
- Mic API: unified unmute/record/vibe/who MCP tools with segment-based input
- Notification layer: /vox y|n|c, /mute, /unmute, /recap, Stop + Notification hooks
- Multi-provider TTS engine: ElevenLabs, AWS Polly, OpenAI, macOS say, Linux espeak-ng
- Claude Code plugin: marketplace install, MCP server, slash commands
- CLI: unmute, record, vibe, on/off, mute, version, status, doctor
- Ephemeral output mode (.vox/ in cwd)
- Two-channel display: ♪ panel summaries with voice/provider context
- Audio playback serialization via flock --- concurrent utterances queue instead of overlapping
- ElevenLabs streaming API for lower time-to-first-audio
- /vibe with auto, manual, and off modes --- ElevenLabs expressive tags color every utterance
- Auto-vibe signal accumulator: test pass/fail, lint, git ops feed mood detection
- Per-signal chime assets and vibe-driven chimes with mood-aware pitch shifting
- Daemon mode: single vox serve process with mcp-proxy, audio deduplication, launchd/systemd service management
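The flock-based serialization in the list above can be illustrated with Python's standard `fcntl` module. This is a sketch of the general technique on macOS and Linux, not the project's code; the `playback_lock` name and the lock file path are assumptions.

```python
import fcntl
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def playback_lock(lock_path: str) -> Iterator[None]:
    """Hold an exclusive advisory lock for the duration of one utterance."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Blocks until any earlier holder releases the lock, so callers queue up.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Wrapping each playback call in `with playback_lock(path):` means a second process waits for the first to finish instead of talking over it. Because the lock is advisory, it only works if every player takes the same lock file.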
Coming Soon
| Feature | What It Does |
|---|---|
| Per-session voices | Each Claude Code session gets its own voice from a pool --- no more five matildas talking at once. /voice to audition and pick. |
Documentation
Architecture (PDF) | Design Log | Testing | Changelog
Development
uv sync --all-extras # Install dependencies
make check # Run all quality gates
License
MIT