MLX backend for pocket-tts with Apple Silicon optimization
pocket-tts-mlx
MLX backend for pocket-tts optimized for Apple Silicon.
Runtime is torch-free. Torch is only required for optional parity tests.
Installation
PyPI install:
pip install pocket-tts-mlx
Local development:
pip install -e .
Model weights are downloaded from Hugging Face on first run. For voice cloning weights, accept the model terms and authenticate:
hf auth login
Quickstart
from pocket_tts_mlx import TTSModel
model = TTSModel.load_model()
state = model.get_state_for_audio_prompt("marius")
audio = model.generate_audio(
    state,
    "Hello from MLX!",
    max_tokens=200,
    warmup_frames=1,
    trim_start_ms=40,
    fade_in_ms=15,
)
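To keep the result of the Quickstart, the samples can be written to a WAV file. The following is a minimal sketch using only the standard library. It is not part of pocket-tts-mlx: `save_wav` and `ASSUMED_SAMPLE_RATE` are hypothetical names, and the 24 kHz mono, float-in-[-1, 1] layout is an assumption that should be checked against the model you load.

```python
import sys
import wave
from array import array

# Assumption: 24 kHz mono output. Verify against the loaded model.
ASSUMED_SAMPLE_RATE = 24_000


def save_wav(samples, path, sample_rate=ASSUMED_SAMPLE_RATE):
    """Write a 1-D sequence of floats in [-1.0, 1.0] as 16-bit mono PCM.

    Returns the number of samples written.
    """
    # Clip to [-1, 1] and scale to the signed 16-bit range.
    pcm = array(
        "h",
        (int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples),
    )
    # WAV data is little-endian; swap on big-endian hosts.
    if sys.byteorder == "big":
        pcm.byteswap()
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)
```

Assuming the returned audio converts to a flat NumPy array, a call might look like `save_wav(np.array(audio).reshape(-1), "output.wav")`.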
CLI
Basic usage:
pocket-tts-mlx "Hello, world!" --voice marius --output output.wav
Cleaner onset (recommended if startup artifacts are audible):
pocket-tts-mlx "Hello, world!" --voice marius --output output.wav --warmup-frames 1 --trim-start-ms 40 --fade-in-ms 15
Onset Cleanup Options
- --warmup-frames: decode and discard initial Mimi frames to reduce decoder startup transients.
- --trim-start-ms: trim milliseconds from start of output.
- --fade-in-ms: apply linear fade-in at start.
Equivalent Python args are warmup_frames, trim_start_ms, and fade_in_ms.
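To illustrate what trimming and a linear fade-in do to the waveform, here is a small sketch in plain Python. It is an illustration under stated assumptions, not the library's implementation: `clean_onset` is a hypothetical helper, the sample rate is passed in by the caller, and warmup frames are left out because they act at the decoder level rather than on finished samples.

```python
def clean_onset(samples, sample_rate, trim_start_ms=0, fade_in_ms=0):
    """Drop the first trim_start_ms of audio, then ramp the gain linearly
    from 0 toward 1 over the next fade_in_ms."""
    # Convert milliseconds to a sample count.
    trim = int(sample_rate * trim_start_ms / 1000)
    out = list(samples[trim:])

    # Never fade over more samples than remain after trimming.
    fade = min(int(sample_rate * fade_in_ms / 1000), len(out))
    for i in range(fade):
        out[i] *= i / fade  # gain rises 0, 1/fade, 2/fade, ...
    return out
```

With a constant signal of 1.0 at a 1 kHz sample rate, `clean_onset([1.0] * 10, 1000, trim_start_ms=2, fade_in_ms=4)` drops two samples and scales the next four by 0, 0.25, 0.5, and 0.75.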
Performance Note
generate_audio() now materializes generated chunks before returning, so np.array(audio) overhead should be near zero in normal usage.
Voices
Predefined voices:
- alba
- marius
- javert
- jean
- fantine
- cosette
- eponine
- azelma
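To compare the predefined voices on the same sentence, one option is to build a CLI invocation per voice. The sketch below only constructs argument lists, using the `--voice` and `--output` flags shown in the CLI section. `build_command` and `PREDEFINED_VOICES` are hypothetical names introduced for illustration.

```python
# Voice names copied from the list above.
PREDEFINED_VOICES = [
    "alba", "marius", "javert", "jean",
    "fantine", "cosette", "eponine", "azelma",
]


def build_command(text, voice, output):
    """Return the argv list for one pocket-tts-mlx CLI call."""
    if voice not in PREDEFINED_VOICES:
        raise ValueError(f"unknown predefined voice: {voice!r}")
    return ["pocket-tts-mlx", text, "--voice", voice, "--output", output]


# One command per voice, each writing to its own file.
commands = [
    build_command("Hello, world!", voice, f"{voice}.wav")
    for voice in PREDEFINED_VOICES
]
```

Each entry can then be passed to `subprocess.run(cmd, check=True)` on a machine where the CLI is installed.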
Requirements
- Python 3.10+
- Apple Silicon Mac (M1/M2/M3/M4)
- MLX
- Internet access for initial model downloads
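The Python and hardware requirements can be checked before installing. This is a minimal sketch using the standard library; `meets_requirements` is a hypothetical helper, and the check assumes that Apple Silicon reports `Darwin` and `arm64` through the `platform` module.

```python
import platform
import sys


def meets_requirements(system=None, machine=None, version=None):
    """Return True on an arm64 macOS host running Python 3.10 or newer.

    Arguments default to the current interpreter and machine; they can be
    overridden to test other combinations.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    version = sys.version_info[:2] if version is None else version
    return (
        system == "Darwin"
        and machine == "arm64"
        and tuple(version) >= (3, 10)
    )
```

A Python interpreter running under Rosetta reports `x86_64` and would fail this check even on Apple Silicon hardware.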
Notes
- Voice cloning requires Hugging Face access to kyutai/pocket-tts.
- Non-voice-cloning weights are used automatically when voice cloning is unavailable.
File details
Details for the file pocket_tts_mlx-0.2.1.tar.gz.
File metadata
- Download URL: pocket_tts_mlx-0.2.1.tar.gz
- Upload date:
- Size: 31.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c2cad3e52764b16cb5d524e3524a8ea3b89d385d1348ea4f2b4c5a14beb2a32f |
| MD5 | 32e1cee5f64a8b10c9a0f1a01b1859e8 |
| BLAKE2b-256 | 8c160af5ff09328a3a3f3a9e43e93be7a3cd3a83596b0cc233924837de9d9dad |
File details
Details for the file pocket_tts_mlx-0.2.1-py3-none-any.whl.
File metadata
- Download URL: pocket_tts_mlx-0.2.1-py3-none-any.whl
- Upload date:
- Size: 39.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 60b8f2753b5cea523ce9fb909959812742f200c88ee9535f92930b3eaa15ea67 |
| MD5 | da9dd6a63d449d230302fc8422552262 |
| BLAKE2b-256 | fea79ebff3ef6292a6f45314c364251847d67515f3ba2ec8ea1fef1133882b63 |