MLX-native speech library for Apple Silicon.
mlx-speech
Local speech synthesis on Apple Silicon, running pure MLX. No cloud, no PyTorch.
| Model | Best for |
|---|---|
| MossTTSLocal | shorter TTS, voice cloning, continuation |
| MOSS-TTSD | multi-speaker dialogue |
| MOSS-SoundEffect | text-to-sound-effect |
| VibeVoice | long-form speech, voice-conditioned generation |
Requirements
- Apple Silicon Mac (M1 or later)
- Python 3.13+
- uv
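The requirements above can be checked before installing. The following is a minimal sketch, not part of mlx-speech: the function name `check_requirements` and its parameters are hypothetical, and it assumes that an Apple Silicon Mac reports `Darwin` and `arm64` through Python's `platform` module (a Python running under Rosetta would report `x86_64` instead).

```python
import platform
import shutil
import sys


def check_requirements(system=None, machine=None, version=None, uv_path=None):
    """Return a list of problems; an empty list means every check passed.

    All parameters are optional overrides for testing. When left as None,
    the values are read from the running interpreter and the current PATH.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    version = sys.version_info[:2] if version is None else tuple(version)
    uv_path = shutil.which("uv") if uv_path is None else uv_path

    problems = []
    # Assumption: Apple Silicon shows up as Darwin + arm64.
    if system != "Darwin" or machine != "arm64":
        problems.append(f"needs an Apple Silicon Mac, found {system}/{machine}")
    if version < (3, 13):
        problems.append(f"needs Python 3.13+, found {version[0]}.{version[1]}")
    if not uv_path:
        problems.append("uv was not found on PATH")
    return problems
```

Calling `check_requirements()` with no arguments inspects the current machine; printing the returned list shows what, if anything, is missing.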
Installation
git clone https://github.com/appautomaton/mlx-speech.git
cd mlx-speech
uv sync
PyPI package (pip install mlx-speech) coming soon.
Convert the checkpoints you want to use — each model family has a scripts/convert_*.py entry point:
python scripts/convert_moss_local.py
python scripts/convert_moss_audio_tokenizer.py
python scripts/convert_moss_ttsd.py
python scripts/convert_moss_sound_effect.py
python scripts/convert_vibevoice.py
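If you only need some of the model families, the conversion commands above can be driven from a small wrapper. This is a sketch under stated assumptions: the short family labels (`moss_local`, `vibevoice`, and so on) and the helper names are hypothetical, the script file names are the ones listed above, and each script is assumed to run without arguments as shown.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical short labels mapped to the conversion scripts listed above.
CONVERT_SCRIPTS = {
    "moss_local": "convert_moss_local.py",
    "moss_audio_tokenizer": "convert_moss_audio_tokenizer.py",
    "moss_ttsd": "convert_moss_ttsd.py",
    "moss_sound_effect": "convert_moss_sound_effect.py",
    "vibevoice": "convert_vibevoice.py",
}


def conversion_commands(families, scripts_dir="scripts", python=sys.executable):
    """Build one argv list per requested family, in the order given."""
    commands = []
    for family in families:
        if family not in CONVERT_SCRIPTS:
            raise KeyError(f"unknown model family: {family}")
        script = (Path(scripts_dir) / CONVERT_SCRIPTS[family]).as_posix()
        commands.append([python, script])
    return commands


def run_conversions(families):
    """Run the conversions one after another, stopping at the first failure."""
    for command in conversion_commands(families):
        subprocess.run(command, check=True)
```

From the repository root, `run_conversions(["moss_ttsd", "vibevoice"])` would run those two scripts in sequence. Whether one family depends on another's converted checkpoint is not stated here, so the wrapper does not try to order them for you.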
Usage
Generate speech:
python scripts/generate_moss_local.py \
--text "Hello, this is a test." \
--output outputs/moss_local.wav
Clone a voice:
python scripts/generate_moss_local.py \
--mode clone \
--text "Hello, this is a cloned sample." \
--reference-audio reference.wav \
--output outputs/moss_local_clone.wav
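To generate several clips in one go, the two `generate_moss_local.py` invocations above can be assembled programmatically. The sketch below is illustrative only: the helper names and the `line_001.wav` naming scheme are hypothetical, and it uses only the flags shown above (`--mode clone`, `--text`, `--reference-audio`, `--output`).

```python
import sys
from pathlib import Path


def moss_local_command(text, output, reference_audio=None, python=sys.executable):
    """Build the argv for one generate_moss_local.py call.

    Passing reference_audio switches to the clone invocation shown above.
    """
    command = [python, "scripts/generate_moss_local.py"]
    if reference_audio is not None:
        command += ["--mode", "clone"]
    command += ["--text", text]
    if reference_audio is not None:
        command += ["--reference-audio", Path(reference_audio).as_posix()]
    command += ["--output", Path(output).as_posix()]
    return command


def batch_commands(lines, out_dir="outputs", reference_audio=None, python=sys.executable):
    """One command per text line, writing line_001.wav, line_002.wav, ..."""
    return [
        moss_local_command(
            text,
            Path(out_dir) / f"line_{index:03d}.wav",
            reference_audio=reference_audio,
            python=python,
        )
        for index, text in enumerate(lines, start=1)
    ]
```

Each returned list can be passed to `subprocess.run(command, check=True)` from the repository root. Note that every call starts a fresh process, so the model is presumably loaded again for each line.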
Multi-speaker dialogue:
python scripts/generate_moss_ttsd.py \
--text "[S1] Watson, we should go now." \
--output outputs/ttsd.wav
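The dialogue example above tags its line with `[S1]`. A script with more than one speaker could be assembled as in the sketch below. This rests on assumptions that the example alone does not confirm: that further speakers follow the same pattern (`[S2]`, `[S3]`, ...) and that turns can be joined with a single space. The function name `format_dialogue` is hypothetical; check the MOSS-TTSD guide for the exact format.

```python
def format_dialogue(turns):
    """Join (speaker_number, line) pairs into one tagged string.

    Assumption: speakers are tagged [S1], [S2], ... and turns are
    separated by a single space.
    """
    parts = []
    for speaker, line in turns:
        if not isinstance(speaker, int) or speaker < 1:
            raise ValueError(f"speaker must be a positive integer, got {speaker!r}")
        parts.append(f"[S{speaker}] {line.strip()}")
    return " ".join(parts)
```

The resulting string would be passed as the `--text` argument of `scripts/generate_moss_ttsd.py`.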
Sound effect:
python scripts/generate_moss_sound_effect.py \
--ambient-sound "rolling thunder with steady rainfall on a metal roof" \
--duration-seconds 8 \
--output outputs/thunder.wav
VibeVoice:
python scripts/generate_vibevoice.py \
--text "Hello from VibeVoice." \
--output outputs/vibevoice.wav
Exploring the Codebase
The PyPI package is still in progress. The best way to explore right now is to drop the repo into an agentic coding tool like Claude Code or Cursor — the codebase is structured and self-describing, and an agent can walk you through it quickly.
Model Guides
Each family has a doc under docs/ covering behavior, flags, and known limitations.
Development
uv run pytest
uv run ruff check .
uv build --no-sources
mlx-speech/
    src/mlx_speech/    library code
    scripts/           conversion and generation entry points
    models/            local checkpoints (not in git)
    tests/             unit and integration tests
    docs/              model-family behavior guides
License
MIT — see LICENSE