Text-to-speech CLI, MCP server, and Claude Code plugin (ElevenLabs, AWS Polly, OpenAI)

punt-vox

Voice for your AI coding assistant.

When Claude Code finishes a task, hits an error, or needs your approval --- you hear it. No need to watch the terminal. Keep working; your assistant will tell you what happened.

Platforms: macOS, Linux

Hear It

Real samples generated by vox with ElevenLabs v3. The first three are the same recap with different /vibe moods --- expressive tags change how the voice sounds without changing the words.

| Sample        | Vibe            | Voice   |
|---------------|-----------------|---------|
| Task recap    | neutral         | sarah   |
| Same recap    | [excited]       | sarah   |
| Same recap    | [weary] [sighs] | sarah   |
| Task complete | neutral         | matilda |

Quick Start

curl -fsSL https://raw.githubusercontent.com/punt-labs/vox/086a654/install.sh | sh

Restart Claude Code, then:

/vox y        # hear when tasks complete or need input
/recap        # spoken summary of what just happened

Manual install (if you already have uv)

uv tool install punt-vox
vox install
vox doctor

Verify before running

curl -fsSL https://raw.githubusercontent.com/punt-labs/vox/086a654/install.sh -o install.sh
shasum -a 256 install.sh
cat install.sh
sh install.sh

Features

  • Notification layer --- spoken summaries when tasks finish, chimes when Claude needs input
  • Session vibe --- /vibe sets the mood for all speech. Auto-mode reads session signals (test results, lint, git ops) and adapts the voice. Manual mode lets you set it yourself. ElevenLabs expressive tags ([weary], [excited], [sighs]) color every utterance.
  • Five providers --- ElevenLabs, OpenAI, AWS Polly, macOS say, and Linux espeak-ng. The full experience (natural voice, expressive tags, /vibe) requires ElevenLabs.
  • Opt-in only --- no audio until you enable it, no surprises
  • Voice or chime --- /mute switches to audio tones, no TTS API calls
  • Graceful absence --- if punt-vox isn't installed, Claude Code works exactly as before
  • MCP-native --- runs as a Claude Code plugin with slash commands and hooks
  • Audio daemon --- voxd is a system-level audio server that handles synthesis and playback. It deduplicates audio across sessions, serializes playback, and caches synthesis results.

What It Looks Like

Enable notifications

> /vox y

Vox enabled. You'll hear when tasks finish or need approval.
Pick a voice with /unmute @<name>.

Get a recap

> /recap

Speaking: "I refactored the authentication module into three files, added
comprehensive tests for the token refresh flow, and fixed a race condition
in the session middleware. All 47 tests pass."

Set the vibe

> /vibe banging my head against the wall

Vibe: banging my head against the wall → [frustrated] [sighs] [manual]

Auto-mode (default) reads session signals and adapts automatically --- after a string of test failures the voice sounds [weary], after a successful release it sounds [excited].

Switch to chime-only

> /mute

Muted — chimes only.

Chimes are mood-aware: when a vibe is active, chimes pitch-shift to match (bright for happy sessions, dark for frustrated ones). Eight distinct signals (tests pass/fail, lint pass/fail, git push, merge conflict, done, prompt) × three mood variants = 24 chime assets.
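
The 24-asset count is the cross product of the eight signals and the three mood variants. A minimal sketch of that enumeration follows. The mood names (bright, neutral, dark), the identifier spellings, and the file-naming scheme are all assumptions made for illustration, not the names punt-vox actually ships.

```python
from itertools import product

# The eight signals listed above; the identifier spellings are illustrative.
SIGNALS = [
    "tests-pass", "tests-fail", "lint-pass", "lint-fail",
    "git-push", "merge-conflict", "done", "prompt",
]

# Assumed names for the three mood variants; not taken from the project.
MOODS = ["bright", "neutral", "dark"]


def chime_assets() -> list[str]:
    """Return one hypothetical asset filename per (signal, mood) pair."""
    # The ".wav" suffix and the "signal.mood" layout are assumptions.
    return [f"{signal}.{mood}.wav" for signal, mood in product(SIGNALS, MOODS)]
```

Calling len(chime_assets()) gives 8 × 3 entries, one per chime.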

Commands

| Command          | Purpose                                               |
|------------------|-------------------------------------------------------|
| /vox y           | Enable vox (chime notifications)                      |
| /vox n           | Disable vox                                           |
| /vox c           | Continuous mode (spoken summaries on task completion) |
| /unmute          | Enable voice mode (spoken notifications)              |
| /unmute @matilda | Set session voice + enable voice                      |
| /unmute @        | Browse voice roster                                   |
| /mute            | Chimes only --- no voice                              |
| /recap           | Spoken summary of Claude's last response              |
| /vibe <mood>     | Set session mood --- voice adapts to match            |
| /vibe auto       | Auto-detect mood from session signals (default)       |
| /vibe off        | Disable vibe --- neutral voice                        |

Providers

The full experience --- natural voice with expressive tags that respond to /vibe --- requires ElevenLabs. The other providers are fallbacks for environments where ElevenLabs isn't available.

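| Provider   | API Key            | Default Voice | Best For                                              |
|------------|--------------------|---------------|-------------------------------------------------------|
| ElevenLabs | ELEVENLABS_API_KEY | matilda       | Recommended. Natural voice, expressive tags via /vibe |
| OpenAI     | OPENAI_API_KEY     | nova          | Fast notifications, low latency                       |
| AWS Polly  | AWS credentials    | joanna        | Natural voice, cost-effective                         |
| macOS say  | (none)             | samantha      | Zero-config on macOS, offline                         |
| espeak-ng  | (none)             | en            | Zero-config on Linux, offline                         |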

Auto-detection order: ElevenLabs > OpenAI > Polly (if AWS credentials valid) > say (macOS) / espeak (Linux).
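
The detection order can be read as a simple priority chain. The sketch below is a hypothetical helper, not the project's own code: the function name, the returned provider identifiers, and the aws_credentials_valid flag (standing in for whatever credential check punt-vox performs) are assumptions.

```python
import sys
from typing import Mapping, Optional


def detect_provider(
    env: Mapping[str, str],
    platform: str = sys.platform,
    aws_credentials_valid: bool = False,
) -> Optional[str]:
    """Pick a provider in the documented order, or None if nothing fits.

    Hypothetical helper: the real detection logic lives inside punt-vox.
    """
    if env.get("ELEVENLABS_API_KEY"):
        return "elevenlabs"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    # Polly is only chosen when the AWS credentials were checked and are valid.
    if aws_credentials_valid:
        return "polly"
    # Offline fallbacks depend on the operating system.
    if platform == "darwin":
        return "say"
    if platform.startswith("linux"):
        return "espeak"
    return None
```

For example, detect_provider({"OPENAI_API_KEY": "..."}, platform="linux") would select OpenAI, while an empty environment on macOS falls through to say.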

Architecture

Claude Code ◄── stdio ──► vox mcp ── WebSocket ──► voxd :8421
                                                      │
Hook scripts ──► vox hook <event> ── WebSocket ──►    │
                                                      │
Shell        ──► vox unmute "hi"  ── WebSocket ──►    │
                                                      ▼
                                                   speakers

voxd is a system-level audio daemon. It synthesizes text via TTS providers and plays audio through the speakers. It owns the playback queue (sequential, no overlap), deduplicates identical requests within 5 seconds, and caches synthesis results. It knows nothing about MCP, hooks, projects, or Claude Code.
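
To make the 5-second deduplication concrete, here is a minimal sketch of a time-window filter. It is an illustration under assumptions, not voxd's implementation: the class name is hypothetical, and what counts as an "identical request" (the key) is left to the caller.

```python
import time
from typing import Callable, Dict


class DedupWindow:
    """Reject a request whose key was already accepted within `window` seconds."""

    def __init__(
        self,
        window: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._window = window
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_accepted: Dict[str, float] = {}

    def accept(self, key: str) -> bool:
        """Return True if the request should be played, False if it is a duplicate."""
        now = self._clock()
        # Forget entries older than the window so the map stays small.
        self._last_accepted = {
            k: t for k, t in self._last_accepted.items() if now - t < self._window
        }
        if key in self._last_accepted:
            return False
        self._last_accepted[key] = now
        return True
```

A rejected duplicate does not extend the window in this sketch; whether voxd measures from the first or the most recent request is not stated above.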

vox mcp is a lightweight stdio MCP server, one per Claude Code session. It holds session state (voice, vibe, notify mode) in memory and delegates synthesis to voxd over WebSocket. It inherits its working directory from Claude Code and finds .vox/config.md by walking up from there.
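
Walking up from the working directory to find .vox/config.md can be sketched as follows. The function name is hypothetical, and the sketch assumes the search continues to the filesystem root; the project may stop earlier.

```python
from pathlib import Path
from typing import Optional


def find_vox_config(start: Path) -> Optional[Path]:
    """Return the nearest .vox/config.md at or above `start`, or None."""
    start = start.resolve()
    # Check the starting directory first, then each ancestor in turn.
    for directory in (start, *start.parents):
        candidate = directory / ".vox" / "config.md"
        if candidate.is_file():
            return candidate
    return None
```

With this approach a session started in a subdirectory of a project still picks up the project-level config.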

vox hook <event> handlers call voxd for chimes and speech. Hook shell scripts are thin gates per the hooks standard.

vox unmute and other CLI commands are one-shot WebSocket clients of voxd.

System Paths

voxd stores runtime state in system directories.

| Purpose | macOS (Homebrew)                                | Linux                            |
|---------|-------------------------------------------------|----------------------------------|
| Config  | $(brew --prefix)/etc/vox/keys.env               | /etc/vox/keys.env                |
| Logs    | $(brew --prefix)/var/log/vox/voxd.log           | /var/log/vox/voxd.log            |
| Runtime | $(brew --prefix)/var/run/vox/serve.{port,token} | /var/run/vox/serve.{port,token}  |
| Service | /Library/LaunchDaemons/com.punt-labs.voxd.plist | /etc/systemd/system/voxd.service |
| Cache   | ~/.punt-labs/vox/cache/                         | ~/.punt-labs/vox/cache/          |
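
The config, log, and runtime rows share one layout that differs only by prefix. The sketch below builds those paths from a prefix; the function name and dictionary keys are hypothetical, and the service and cache rows are left out because they do not follow the prefix pattern.

```python
from pathlib import Path
from typing import Dict


def vox_system_paths(prefix: str = "/") -> Dict[str, Path]:
    """Build the config, log, and runtime paths under `prefix`.

    Pass the output of `brew --prefix` on macOS; the default "/" matches Linux.
    """
    root = Path(prefix)
    runtime = root / "var/run/vox"
    return {
        "config": root / "etc/vox/keys.env",
        "log": root / "var/log/vox/voxd.log",
        "port": runtime / "serve.port",
        "token": runtime / "serve.token",
    }
```

For instance, vox_system_paths("/opt/homebrew") would place the log under /opt/homebrew/var/log/vox/, where /opt/homebrew is only an example prefix.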

Service Install

sudo vox daemon install    # writes keys.env, registers service, starts voxd

Requires sudo because the service plist/unit goes in a system directory. voxd runs as the installing user (not root) — it needs audio device access tied to the desktop session. The plist sets UserName; the systemd unit sets User=.

Session State

Session state (voice, provider, vibe, notify mode) lives in the MCP server's memory. The daemon is stateless with respect to sessions. Per-project enablement and initial state are read from .vox/config.md in the project directory at MCP server startup. Hook handlers also read and write .vox/config.md for signal accumulation (vibe_signals). The daemon never reads this file.

Daemon Restart

The MCP session (Claude Code ↔ vox mcp) is stdio, so daemon restarts do not affect it. The WebSocket connection (vox mcp ↔ voxd) reconnects automatically. No session data is lost.
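
Automatic reconnection is commonly built as a retry loop with a growing delay. The following is a generic sketch of that pattern, not the project's code: the function name, the attempt count, and the delays are assumptions, and connect stands in for whatever opens the WebSocket to voxd.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(attempts):
        try:
            return connect()
        except OSError:
            # ConnectionRefusedError and similar socket errors subclass OSError.
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= 2
    raise AssertionError("unreachable")
```

Because the session state lives in the MCP server's memory rather than in the connection, a loop like this can reconnect without restoring anything.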

CLI

punt-vox is also a standalone TTS tool, independent of Claude Code.

vox unmute "Hello world"                       # Synthesize + play
vox record "Hello world" -o hello.mp3          # Synthesize + save
vox record --from segments.json                # From JSON segments file
vox vibe excited                               # Set session mood
vox notify y                                   # Enable notifications
vox notify c                                   # Continuous spoken mode
vox speak n                                    # Chimes only
vox voice matilda                              # Set session voice
vox status                                     # Current state
vox version                                    # Print version
vox doctor                                     # Check setup
vox install                                    # Install Claude Code plugin
vox mcp                                        # Start MCP server (stdio)
voxd                                           # Start audio daemon
sudo vox daemon install                        # Register voxd as system service + write API keys
vox daemon status                              # Check if daemon is running

Environment Variables

| Variable       | Description               | Default          |
|----------------|---------------------------|------------------|
| TTS_PROVIDER   | Force a specific provider | auto-detect      |
| TTS_MODEL      | Model override            | provider default |
| VOX_OUTPUT_DIR | Output directory          | ~/vox-output     |
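
As an illustration of how these variables and their defaults fit together, here is a small sketch that reads them into a settings object. The VoxEnv class and read_vox_env function are hypothetical names; only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class VoxEnv:
    provider: Optional[str]  # None means auto-detect
    model: Optional[str]     # None means the provider's default model
    output_dir: Path


def read_vox_env(env: Mapping[str, str] = os.environ) -> VoxEnv:
    """Read the three variables, treating unset and empty values alike."""
    return VoxEnv(
        provider=env.get("TTS_PROVIDER") or None,
        model=env.get("TTS_MODEL") or None,
        output_dir=Path(env.get("VOX_OUTPUT_DIR") or "~/vox-output").expanduser(),
    )
```

Treating an empty string the same as an unset variable is a design choice of this sketch, not documented behaviour.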

Daemon API keys: Run sudo vox daemon install from a shell where your API keys are set (e.g., a directory with .envrc). The command writes keys to the system config directory ($(brew --prefix)/etc/vox/keys.env on macOS, /etc/vox/keys.env on Linux, chmod 0600) so the daemon can use premium providers. Run vox doctor to verify which providers are active.
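
Writing a key file that only its owner can read is worth getting right. The sketch below shows one way to do it. It is not the code that vox daemon install runs: the function name is hypothetical, and the KEY=value line format is an assumption based on the keys.env file name.

```python
import os
from pathlib import Path
from typing import Mapping


def write_keys_env(path: Path, keys: Mapping[str, str]) -> None:
    """Write KEY=value lines to `path`, readable and writable by the owner only."""
    # Create the file with mode 0600 from the start so the keys are never
    # briefly world-readable; O_TRUNC replaces any previous contents.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        for name, value in sorted(keys.items()):
            handle.write(f"{name}={value}\n")
    # A file that already existed keeps its old mode, so set it explicitly.
    os.chmod(path, 0o600)
```

Passing the mode to os.open, rather than writing first and calling chmod afterwards, avoids a window in which a newly created file has looser permissions.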

Roadmap

Shipped

  • Mic API: unified unmute/record/vibe/who MCP tools with segment-based input
  • Notification layer: /vox y|n|c, /mute, /unmute, /recap, Stop + Notification hooks
  • Multi-provider TTS engine: ElevenLabs, AWS Polly, OpenAI, macOS say, Linux espeak-ng
  • Claude Code plugin: marketplace install, MCP server, slash commands
  • CLI: unmute, record, vibe, on/off, mute, version, status, doctor
  • Two-channel display: panel summaries with voice/provider context
  • ElevenLabs streaming API for lower time-to-first-audio
  • /vibe with auto, manual, and off modes --- ElevenLabs expressive tags color every utterance
  • Auto-vibe signal accumulator: test pass/fail, lint, git ops feed mood detection
  • Per-signal chime assets and vibe-driven chimes with mood-aware pitch shifting
  • Audio daemon (voxd): system-level audio server with in-memory playback queue, dedup, synthesis cache, launchd/systemd service management

Coming Soon

| Feature            | What It Does |
|--------------------|--------------|
| Per-session voices | Each Claude Code session gets its own voice from a pool --- no more five matildas talking at once. /voice to audition and pick. |

Documentation

Architecture (PDF) | Design Log | Testing | Changelog

Development

uv sync --all-extras    # Install dependencies
make check              # Run all quality gates

License

MIT
