MLX-native speech library for Apple Silicon.

mlx-speech

Local speech synthesis on Apple Silicon, running pure MLX. No cloud, no PyTorch.

Model             Best for
MossTTSLocal      shorter TTS, voice cloning, continuation
MOSS-TTSD         multi-speaker dialogue
MOSS-SoundEffect  text-to-sound-effect
VibeVoice         long-form speech, voice-conditioned generation

Requirements

  • Apple Silicon Mac (M1 or later)
  • Python 3.13+
  • uv
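
The requirements above can be checked before installing. This is a minimal sketch using only the Python standard library. The helper name `meets_requirements` is hypothetical and not part of mlx-speech, and it assumes an Apple Silicon Mac reports `Darwin` and `arm64`.

```python
import platform
import shutil
import sys


def meets_requirements(system, machine, version, uv_path):
    """Return a list of unmet requirements (an empty list means all are met).

    Hypothetical helper for illustration; not part of mlx-speech.
    """
    problems = []
    # Assumption: Apple Silicon shows up as Darwin + arm64.
    if system != "Darwin" or machine != "arm64":
        problems.append("Apple Silicon Mac (M1 or later) required")
    if tuple(version[:2]) < (3, 13):
        problems.append("Python 3.13+ required")
    if uv_path is None:
        problems.append("uv not found on PATH")
    return problems


if __name__ == "__main__":
    issues = meets_requirements(
        platform.system(), platform.machine(), sys.version_info, shutil.which("uv")
    )
    print("OK" if not issues else "\n".join(issues))
```

Passing the platform values in as arguments keeps the check easy to test on any machine.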

Installation

git clone https://github.com/appautomaton/mlx-speech.git
cd mlx-speech
uv sync

PyPI package (pip install mlx-speech) coming soon.

Convert the checkpoints you want to use — each model family has a scripts/convert_*.py entry point:

python scripts/convert_moss_local.py
python scripts/convert_moss_audio_tokenizer.py
python scripts/convert_moss_ttsd.py
python scripts/convert_moss_sound_effect.py
python scripts/convert_vibevoice.py
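
To convert only some model families, the commands above can be driven from a small wrapper. This is a sketch, not part of mlx-speech. The dictionary keys and function names are hypothetical shorthand, while the script paths are the ones listed above. It assumes the scripts run from the repo root without extra arguments, as shown.

```python
import subprocess
import sys

# Script paths as listed above; the keys are hypothetical shorthand names.
CONVERT_SCRIPTS = {
    "moss_local": "scripts/convert_moss_local.py",
    "moss_audio_tokenizer": "scripts/convert_moss_audio_tokenizer.py",
    "moss_ttsd": "scripts/convert_moss_ttsd.py",
    "moss_sound_effect": "scripts/convert_moss_sound_effect.py",
    "vibevoice": "scripts/convert_vibevoice.py",
}


def conversion_commands(families, python=sys.executable):
    """Build one argv list per requested family, in the order given."""
    unknown = [f for f in families if f not in CONVERT_SCRIPTS]
    if unknown:
        raise ValueError(f"unknown model families: {unknown}")
    return [[python, CONVERT_SCRIPTS[f]] for f in families]


def run_conversions(families):
    """Run each conversion from the repo root, stopping at the first failure."""
    for cmd in conversion_commands(families):
        subprocess.run(cmd, check=True)
```

For example, `run_conversions(["moss_local", "moss_audio_tokenizer"])` would run those two scripts in order.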

Usage

Generate speech:

python scripts/generate_moss_local.py \
  --text "Hello, this is a test." \
  --output outputs/moss_local.wav

Clone a voice:

python scripts/generate_moss_local.py \
  --mode clone \
  --text "Hello, this is a cloned sample." \
  --reference-audio reference.wav \
  --output outputs/moss_local_clone.wav
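
The two invocations above differ only in the clone-specific flags. A minimal sketch of building either command from Python follows. The helper `moss_local_command` is hypothetical, and it uses only the flags shown above (`--mode clone`, `--text`, `--reference-audio`, `--output`).

```python
import shlex


def moss_local_command(text, output, reference_audio=None):
    """Build the argv for scripts/generate_moss_local.py.

    Passing reference_audio switches to clone mode. Hypothetical helper
    for illustration; not part of mlx-speech.
    """
    cmd = ["python", "scripts/generate_moss_local.py"]
    if reference_audio is not None:
        cmd += ["--mode", "clone"]
    cmd += ["--text", text]
    if reference_audio is not None:
        cmd += ["--reference-audio", str(reference_audio)]
    cmd += ["--output", str(output)]
    return cmd


if __name__ == "__main__":
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(moss_local_command("Hello, this is a test.", "outputs/moss_local.wav")))
```

The returned list can also be passed directly to `subprocess.run(cmd, check=True)` from the repo root.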

Multi-speaker dialogue:

python scripts/generate_moss_ttsd.py \
  --text "[S1] Watson, we should go now." \
  --output outputs/ttsd.wav
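
The `--text` value above tags a line with `[S1]`. The sketch below assembles a multi-turn string in that style. It assumes the tag convention extends to `[S2]` and beyond and that turns can be joined with spaces. Neither is confirmed here, so check the MOSS-TTSD guide for the exact format. `format_dialogue` is a hypothetical helper, and the second line of dialogue in the usage note is invented.

```python
def format_dialogue(turns):
    """Join (speaker_number, line) pairs into one speaker-tagged string.

    Assumption: tags follow the [S<n>] pattern shown in the example above.
    """
    parts = []
    for speaker, line in turns:
        if speaker < 1:
            raise ValueError("speaker numbers start at 1")
        parts.append(f"[S{speaker}] {line.strip()}")
    return " ".join(parts)
```

For instance, `format_dialogue([(1, "Watson, we should go now."), (2, "Right behind you.")])` produces a single string suitable for passing as `--text`.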

Sound effect:

python scripts/generate_moss_sound_effect.py \
  --ambient-sound "rolling thunder with steady rainfall on a metal roof" \
  --duration-seconds 8 \
  --output outputs/thunder.wav

VibeVoice:

python scripts/generate_vibevoice.py \
  --text "Hello from VibeVoice." \
  --output outputs/vibevoice.wav
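
After any of the generation commands above, the output file can be sanity-checked with the standard-library `wave` module. This sketch assumes the scripts write PCM WAV files that `wave` can read, which is not confirmed here. `wav_summary` is a hypothetical helper.

```python
import wave


def wav_summary(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        frames = wf.getnframes()
        return rate, wf.getnchannels(), frames / rate
```

Calling `wav_summary("outputs/thunder.wav")` after the sound-effect example gives a duration to compare against the requested `--duration-seconds 8`.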

Exploring the Codebase

The PyPI package is still in progress. The best way to explore right now is to drop the repo into an agentic coding tool like Claude Code or Cursor — the codebase is structured and self-describing, and an agent can walk you through it quickly.

Model Guides

Each model family has a guide under docs/ covering behavior, flags, and known limitations.

Development

uv run pytest
uv run ruff check .
uv build --no-sources

Repository layout:

mlx-speech/
  src/mlx_speech/    library code
  scripts/          conversion and generation entry points
  models/           local checkpoints (not in git)
  tests/            unit and integration tests
  docs/             model-family behavior guides

License

MIT — see LICENSE

Download files

Source Distribution

mlx_speech-0.1.0.tar.gz (89.9 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7
  • Publisher: publish.yml on appautomaton/mlx-speech

Hashes for mlx_speech-0.1.0.tar.gz

Algorithm    Hash digest
SHA256       45a6f787bf830b97cf61386ef3ca8e5b3122f589732aeb403924757598b0d47d
MD5          0bab1304a3380859bbd0e914fb766a21
BLAKE2b-256  f8145eeb4b0de11e57d80d1972a4cf58cb0a5c5f87a1cb18254c83ad0b8aeadb

Built Distribution

mlx_speech-0.1.0-py3-none-any.whl (114.5 kB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7
  • Publisher: publish.yml on appautomaton/mlx-speech

Hashes for mlx_speech-0.1.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       9c93237bee393a094985bc81191fdaf65cbd2eb3eadf5e7672c7e7be89c0ca2a
MD5          75eaa6d4e6a1e619082a4d711bbc5adf
BLAKE2b-256  1d5e9d60b919981d1c95de07ab8172c76144420782d4644115434b10428c19b9
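
A downloaded file can be checked against the SHA256 digests listed above. This is a minimal sketch using the standard-library `hashlib`. The function names are hypothetical, and the digests are copied from the listing above.

```python
import hashlib

# SHA256 digests as listed above for the 0.1.0 release files.
EXPECTED_SHA256 = {
    "mlx_speech-0.1.0.tar.gz":
        "45a6f787bf830b97cf61386ef3ca8e5b3122f589732aeb403924757598b0d47d",
    "mlx_speech-0.1.0-py3-none-any.whl":
        "9c93237bee393a094985bc81191fdaf65cbd2eb3eadf5e7672c7e7be89c0ca2a",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, name):
    """Return True if the file at path matches the listed digest for name."""
    return sha256_of(path) == EXPECTED_SHA256[name]
```

For example, `verify("mlx_speech-0.1.0.tar.gz", "mlx_speech-0.1.0.tar.gz")` returns `True` only when the local file matches the listed digest.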
