MCP server wrapping Mistral Voxtral 4B TTS (via mlx-audio) for Claude Code on Apple Silicon — in-process, streaming, gap-free playback

voxtral-mcp

Local voice for any MCP client (Claude Code, Claude Desktop, Cursor, etc.) via Mistral Voxtral 4B TTS (MLX 4-bit). Higher voice quality than the lightweight pocket-tts route, at the cost of a larger model and a slower time to first audio (TTFA).

  • 9 languages: 🇬🇧 English, 🇫🇷 French, 🇩🇪 German, 🇪🇸 Spanish, 🇮🇹 Italian, 🇵🇹 Portuguese, 🇳🇱 Dutch, 🇮🇳 Hindi, 🇸🇦 Arabic
  • 4 B parameters, 4-bit MLX quantization (~2.5 GB on disk)
  • ~2.4× real-time generation on Apple Silicon M-series
  • TTFA ~2 s thanks to native streaming via mlx-audio stream=True
  • Non-blocking speak(), gap-free playback via sounddevice write-mode
  • ~3 GB resident RAM once the model is loaded
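
Why faster-than-real-time generation leads to gap-free playback can be sketched with a small timing model. This is only an illustration: the function name `simulate` is hypothetical, the defaults reuse the 2.0 s chunk interval and ~2.4× speed quoted in this README, and the model ignores model loading, text processing, and audio-device latency, so it will not reproduce the ~2 s TTFA figure.

```python
def simulate(total_audio_s, chunk_s=2.0, speed=2.4):
    """Toy timing model (an assumption, not the server's code).

    Generation runs continuously at `speed` x real time and emits chunks of
    `chunk_s` seconds. Playback starts when the first chunk is ready and each
    later chunk should start exactly when the previous one ends.
    Returns (time_to_first_audio_s, number_of_gaps).
    """
    t_ready = 0.0       # wall-clock time when the current chunk is generated
    t_free = 0.0        # wall-clock time when the previous chunk stops playing
    first_audio = None
    gaps = 0
    remaining = total_audio_s
    while remaining > 1e-9:
        length = min(chunk_s, remaining)
        t_ready += length / speed
        if first_audio is None:
            first_audio = t_ready
            start = t_ready
        else:
            # A gap occurs when the chunk is not ready by the time the
            # previous chunk has finished playing.
            if t_ready > t_free + 1e-9:
                gaps += 1
            start = max(t_ready, t_free)
        t_free = start + length
        remaining -= length
    return first_audio, gaps


if __name__ == "__main__":
    print(simulate(10.0))             # generation faster than playback
    print(simulate(10.0, speed=0.8))  # generation slower than playback
```

In this model, any speed above 1× keeps every later chunk ready before it is needed, while a speed below 1× produces audible gaps between chunks.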

⚠️ Licence: the Voxtral model itself is distributed by Mistral under CC BY-NC 4.0 — non-commercial use only. This wrapper's code is MIT.

Requirements

  • macOS Apple Silicon (M1/M2/M3/M4) — required, MLX doesn't run on Intel
  • ≥16 GB RAM recommended (the 4-bit model keeps ~3 GB resident)
  • Python 3.10 – 3.13

Install

uvx voxtral-mcp --help

Or persistent:

uv tool install voxtral-mcp

Then add to your MCP client's .mcp.json:

{
  "mcpServers": {
    "voxtral": {
      "command": "uvx",
      "args": ["voxtral-mcp"]
    }
  }
}

MCP tools

| Tool | Purpose |
|---|---|
| speak(text, voice?, interrupt?) | Generate audio for text and queue it for background playback. Returns immediately; generation is streamed in the background. By default, calls are queued and played sequentially (including across turns). Pass interrupt=True to abort current playback and clear the queue first. |
| stop_speaking() | Stop current playback, drop the queue, and cancel in-flight generation. Use for explicit "mute" requests. For mid-turn interruption followed by new speech, use speak(..., interrupt=True) instead. |
| status() | Report model load state, queue depths, sample rate, and last error. |
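
The queueing behaviour described for these tools can be modelled in a few lines. The class below is a hypothetical toy (`SpeechQueue` and `finish_current` are names invented for this sketch, and the returned status fields are not the server's real ones); it tracks text only and produces no audio.

```python
from collections import deque


class SpeechQueue:
    """Toy model of the documented tool semantics; not the server's code."""

    def __init__(self):
        self.current = None      # utterance being played, if any
        self.pending = deque()   # utterances waiting their turn

    def speak(self, text, interrupt=False):
        # interrupt=True aborts playback and clears the queue first.
        if interrupt:
            self.stop_speaking()
        if self.current is None:
            self.current = text
        else:
            self.pending.append(text)

    def stop_speaking(self):
        # Stop playback and drop everything that was queued.
        self.current = None
        self.pending.clear()

    def finish_current(self):
        # Simulate the current utterance reaching the end of playback.
        self.current = self.pending.popleft() if self.pending else None

    def status(self):
        return {"playing": self.current, "queued": list(self.pending)}


if __name__ == "__main__":
    q = SpeechQueue()
    q.speak("first")
    q.speak("second")                  # queued behind "first"
    print(q.status())
    q.speak("urgent", interrupt=True)  # replaces everything
    print(q.status())
```

The sketch shows why speak(..., interrupt=True) is the better fit for mid-turn corrections: it clears the queue and starts the new utterance in a single call.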

Configuration

All settings are environment variables, set in the env block of .mcp.json:

| Variable | Default | Notes |
|---|---|---|
| VOXTRAL_MODEL | mlx-community/Voxtral-4B-TTS-2603-mlx-4bit | Any Voxtral MLX model on HF (4-bit / 6-bit / bf16) |
| VOXTRAL_STREAMING_INTERVAL | 2.0 | Approx. seconds of audio per streaming chunk |
| VOXTRAL_MAX_TOKENS | 4096 | Generation cap (in audio tokens, not characters) |
| VOXTRAL_SAMPLE_RATE | 24000 | Output sample rate |
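
As an illustration of where these variables go, the following sketch generates an .mcp.json entry with an env block. The helper `build_mcp_config` is hypothetical and the override values are arbitrary examples, not recommendations; values are written as strings because environment variables are text.

```python
import json


def build_mcp_config(**env_overrides):
    """Build the .mcp.json structure shown in Install, plus an env block.

    Hypothetical helper for illustration; only the variable names come
    from the table above.
    """
    server = {"command": "uvx", "args": ["voxtral-mcp"]}
    if env_overrides:
        # Environment variables are always strings, so convert each value.
        server["env"] = {name: str(value) for name, value in env_overrides.items()}
    return {"mcpServers": {"voxtral": server}}


# Example values only: a shorter chunk interval and a lower token cap.
config = build_mcp_config(
    VOXTRAL_STREAMING_INTERVAL=1.0,
    VOXTRAL_MAX_TOKENS=2048,
)

if __name__ == "__main__":
    print(json.dumps(config, indent=2))
```

The printed JSON can be pasted into .mcp.json in place of the minimal entry from the Install section.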

Voices

Voxtral ships with 20 preset voices across 9 languages. A non-exhaustive sample:

| Language | Voices |
|---|---|
| English | casual_male, casual_female, cheerful_female, neutral_male, neutral_female |
| French | fr_male, fr_female |
| Spanish | es_male, es_female |
| German | de_male, de_female |
| Italian | it_male, it_female |
| Portuguese | pt_male, pt_female |
| Dutch | nl_male, nl_female |
| Arabic | ar_male |
| Hindi | hi_male, hi_female |

Pass voice="fr_female" in your speak() call to switch.
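
In practice the MCP client issues the tool call for you, but it can help to see roughly what a speak() call with a voice looks like on the wire. The sketch below builds an MCP `tools/call` JSON-RPC request; the helper `speak_request` is hypothetical, the argument names follow the speak(text, voice?, interrupt?) signature above, and the `VOICES` mapping is copied from the table in this section.

```python
import json

# Voice names copied from the table above.
VOICES = {
    "English": ["casual_male", "casual_female", "cheerful_female",
                "neutral_male", "neutral_female"],
    "French": ["fr_male", "fr_female"],
    "Spanish": ["es_male", "es_female"],
    "German": ["de_male", "de_female"],
    "Italian": ["it_male", "it_female"],
    "Portuguese": ["pt_male", "pt_female"],
    "Dutch": ["nl_male", "nl_female"],
    "Arabic": ["ar_male"],
    "Hindi": ["hi_male", "hi_female"],
}


def speak_request(text, voice=None, interrupt=False, request_id=1):
    """Build a JSON-RPC 2.0 tools/call request for the speak tool."""
    arguments = {"text": text}
    if voice is not None:
        arguments["voice"] = voice
    if interrupt:
        arguments["interrupt"] = True
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "speak", "arguments": arguments},
    }


if __name__ == "__main__":
    request = speak_request("Bonjour tout le monde", voice=VOICES["French"][1])
    print(json.dumps(request, indent=2, ensure_ascii=False))
```

Omitting voice leaves the choice to the server's default, which this README does not specify.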

Claude Code users

If you're on Claude Code (CLI, desktop, or via Cursor), install the bundled plugin (MCP wiring + /voice-mode skill) in one shot:

/plugin marketplace add Vincweb/voxtral-mcp
/plugin install voxtral@vincweb-tools

See the main repo for architecture details and a side-by-side comparison with kyutai-tts-mcp (smaller / faster TTFA / permissive licence).

License

MIT for this wrapper. The underlying Voxtral model is CC BY-NC 4.0 (non-commercial) — see Mistral.
