IndexTTS for Apple Silicon using MLX

MLX-IndexTTS

IndexTTS for Apple Silicon using MLX. Zero-shot text-to-speech with voice cloning capabilities.

Features

Run IndexTTS 1.5/2.0 natively on Apple Silicon
RTF ~0.5 for v1.5 on M2 Max (about 2x faster than real time); ~1.3 for v2.0
Voice cloning from reference audio
v2.0: Emotion control (8 emotions)
Auto-detect model version (1.5/2.0)
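
The real-time factor (RTF) quoted above is, by the usual definition, processing time divided by the duration of the audio produced, so values below 1.0 mean faster than real time. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of mlx-indextts.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by the duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# 10 s of speech synthesized in 5 s gives RTF 0.5, i.e. 2x faster than real time.
rtf = real_time_factor(5.0, 10.0)
speedup = 1.0 / rtf
print(f"RTF={rtf:.2f}, speedup={speedup:.1f}x")
```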

Requirements

macOS with Apple Silicon (M1/M2/M3/M4)
Python 3.10+
uv package manager
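
The platform requirements above can be checked from Python before installing. This is a rough sketch using only the standard library; the helper name is made up for illustration, and it does not check for uv.

```python
import platform
import sys


def meets_requirements(system: str, machine: str, version: tuple) -> bool:
    """True for macOS ("Darwin") on arm64 (Apple Silicon) with Python 3.10 or newer."""
    return system == "Darwin" and machine == "arm64" and version >= (3, 10)


# Note: a Python interpreter running under Rosetta reports "x86_64", not "arm64".
ok = meets_requirements(platform.system(), platform.machine(), sys.version_info[:2])
print("requirements met" if ok else "unsupported platform or Python version")
```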

Installation

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and install
git clone https://github.com/user/mlx-indextts.git
cd mlx-indextts

# Basic install (generation only)
uv sync

# With model conversion support (requires torch)
uv sync --extra convert

Quick Start

1. Convert Model (auto-detects version)

# Convert IndexTTS 1.5
uv run mlx-indextts convert \
    --model-dir /path/to/indexTTS-1.5 \
    -o models/mlx-indexTTS-1.5

# Convert IndexTTS 2.0
uv run mlx-indextts convert \
    --model-dir /path/to/indexTTS-2 \
    -o models/mlx-indexTTS-2.0

2. Generate Speech (auto-detects version)

# v1.5 (the sample text means "Hello, this is a speech synthesis test.")
uv run mlx-indextts generate \
    -m models/mlx-indexTTS-1.5 \
    -r reference.wav \
    -t "你好，这是一个语音合成测试。" \
    -o output.wav

# v2.0
uv run mlx-indextts generate \
    -m models/mlx-indexTTS-2.0 \
    -r reference.wav \
    -t "你好，这是一个语音合成测试。" \
    -o output.wav

# v2.0 with emotion control (the sample text means "I'm so happy today!")
uv run mlx-indextts generate \
    -m models/mlx-indexTTS-2.0 \
    -r reference.wav \
    -t "今天真是太开心了！" \
    -o output.wav \
    --emotion happy --emo-alpha 0.8

3. Pre-compute Speaker (Faster Inference)

Pre-compute speaker conditioning to skip audio preprocessing on subsequent generations.

# v1.5
uv run mlx-indextts speaker \
    -m models/mlx-indexTTS-1.5 \
    -r reference.wav \
    -o speaker_v15.npz

# v2.0
uv run mlx-indextts speaker \
    -m models/mlx-indexTTS-2.0 \
    -r reference.wav \
    -o speaker_v20.npz

# Use the pre-computed speaker (much faster loading; the sample text means "Hello, world!")
uv run mlx-indextts generate \
    -m models/mlx-indexTTS-2.0 \
    -r speaker_v20.npz \
    -t "你好，世界！" \
    -o output.wav

Note: v1.5 and v2.0 speaker files are not interchangeable; each version requires its own .npz file.
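
One way to avoid mixing the two kinds of speaker file is to put the model version in the file name, as the speaker_v15.npz and speaker_v20.npz examples above do. The sketch below builds such a path; the helper, the "speakers" directory, and the naming scheme are assumptions for illustration, and the caller supplies the version because how the package detects it is not documented here.

```python
from pathlib import Path


def speaker_cache_path(reference_audio: str, model_version: str,
                       cache_dir: str = "speakers") -> Path:
    """Return a version-tagged .npz path for a reference recording."""
    if model_version not in ("1.5", "2.0"):
        raise ValueError(f"unknown model version: {model_version!r}")
    stem = Path(reference_audio).stem  # "voices/reference.wav" -> "reference"
    return Path(cache_dir) / f"{stem}_v{model_version}.npz"


# The same reference recording maps to a different file for each version.
print(speaker_cache_path("reference.wav", "1.5"))
print(speaker_cache_path("reference.wav", "2.0"))
```

The resulting path would be passed to -o when running the speaker command and to -r when generating.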

Python API

# v1.5
from mlx_indextts.generate import IndexTTS

# Load the converted MLX model, clone the voice in reference.wav, and save the result
tts = IndexTTS.load_model("models/mlx-indexTTS-1.5")
audio = tts.generate(text="你好", ref_audio="reference.wav")  # "你好" means "Hello"
tts.save_audio(audio, "output.wav")

# v2.0
from mlx_indextts.generate_v2 import IndexTTSv2

# The v2.0 generate() call takes an output path plus optional emotion settings
tts = IndexTTSv2("models/mlx-indexTTS-2.0")
audio = tts.generate(
    text="你好",  # "Hello"
    reference_audio="reference.wav",
    output_path="output.wav",
    emotion="happy",
    emo_alpha=0.8,
)

CLI Options

mlx-indextts generate [OPTIONS]

Required:
  -m, --model        Model directory
  -r, --ref-audio    Reference audio (.wav or .npz)
  -t, --text         Text to synthesize
  -o, --output       Output file

Common options:
  --max-tokens       Max mel tokens (default: 800 for v1.5, 1500 for v2.0)
  --temperature      Sampling temperature (default: 1.0 for v1.5, 0.8 for v2.0)
  -s, --seed         Random seed for reproducibility
  -v, --verbose      Verbose output
  -p, --play         Play audio after generation
  -q, --quantize     Runtime quantization: 4, 8, or fp32

v2.0 only:
  --emotion          Emotion: happy/sad/angry/afraid/disgusted/melancholic/surprised/calm
  --emo-alpha        Emotion intensity 0.0-1.0 (default: 1.0)
  --diffusion-steps  Diffusion steps (default: 25)
  --cfg-rate         CFG rate (default: 0.7)
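
When scripting many generations, the options above can be assembled into an argument list and handed to subprocess. The sketch below uses only flags listed in this section; the helper name is hypothetical, and the range check on --emo-alpha simply mirrors the documented 0.0-1.0 range.

```python
import shlex


def generate_command(model, ref_audio, text, output,
                     emotion=None, emo_alpha=None, seed=None):
    """Build the argv list for `uv run mlx-indextts generate`."""
    cmd = ["uv", "run", "mlx-indextts", "generate",
           "-m", model, "-r", ref_audio, "-t", text, "-o", output]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    if emotion is not None:  # v2.0 only
        cmd += ["--emotion", emotion]
    if emo_alpha is not None:  # v2.0 only
        if not 0.0 <= emo_alpha <= 1.0:
            raise ValueError("emo_alpha must be between 0.0 and 1.0")
        cmd += ["--emo-alpha", str(emo_alpha)]
    return cmd


cmd = generate_command("models/mlx-indexTTS-2.0", "reference.wav",
                       "Hello, world!", "output.wav",
                       emotion="happy", emo_alpha=0.8)
print(shlex.join(cmd))  # printable, shell-quoted form of the command
```

To actually run it, pass the list to subprocess.run(cmd, check=True).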

Version Comparison

Feature	v1.5	v2.0
Sample rate	24000 Hz	22050 Hz
Max tokens	800	1815
Default temperature	1.0	0.8
Emotion control	❌	✅ 8 emotions
S2Mel (CFM)	❌	✅
BigVGAN	Custom	NVIDIA pretrained
Runtime quantization	✅	✅
Speaker pre-compute	✅	✅
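
Because the two versions output audio at different sample rates, the same number of samples corresponds to different durations, and downstream code should not hard-code one rate. A small sketch of the conversion, using the rates from the table above (the names are illustrative):

```python
# Output sample rates from the version comparison table, in Hz.
SAMPLE_RATES = {"1.5": 24000, "2.0": 22050}


def duration_seconds(num_samples: int, version: str) -> float:
    """Convert a sample count to seconds for the given model version."""
    return num_samples / SAMPLE_RATES[version]


print(duration_seconds(48000, "1.5"))  # 48000 samples at 24000 Hz
print(duration_seconds(48000, "2.0"))  # the same count lasts longer at 22050 Hz
```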

Supported Emotions (v2.0)

English	Chinese
happy	高兴
angry	愤怒
sad	悲伤
afraid	恐惧
disgusted	反感
melancholic	低落
surprised	惊讶
calm	自然

Mixed emotions: --emotion "happy:0.6,sad:0.4"
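
The mixed-emotion string is a comma-separated list of name:weight pairs. The sketch below shows one way such a string could be parsed and validated against the eight supported names; it illustrates the format only, and how mlx-indextts itself parses the value (or whether weights must sum to 1) is not stated here.

```python
EMOTIONS = ("happy", "angry", "sad", "afraid",
            "disgusted", "melancholic", "surprised", "calm")


def parse_emotion_spec(spec: str) -> dict[str, float]:
    """Parse "happy:0.6,sad:0.4" into {"happy": 0.6, "sad": 0.4}.

    A bare name such as "calm" is treated as weight 1.0 (an assumption).
    """
    weights = {}
    for part in spec.split(","):
        name, sep, weight = part.strip().partition(":")
        name = name.strip()
        if name not in EMOTIONS:
            raise ValueError(f"unsupported emotion: {name!r}")
        weights[name] = float(weight) if sep else 1.0
    return weights


print(parse_emotion_spec("happy:0.6,sad:0.4"))
```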

Performance

Metric	v1.5	v2.0
RTF (M2 Max)	~0.5	~1.3
Load time (.wav)	~0.3s	~9s
Load time (.npz)	~0.3s	~1.5s
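
These figures can be combined into a rough wall-clock estimate. The sketch below assumes that load time and generation time simply add up and that the RTF figure excludes loading; neither assumption is confirmed by the table, so treat the result as a ballpark for an M2 Max.

```python
# Approximate figures from the performance table (M2 Max).
PERF = {
    "1.5": {"rtf": 0.5, "load_wav": 0.3, "load_npz": 0.3},
    "2.0": {"rtf": 1.3, "load_wav": 9.0, "load_npz": 1.5},
}


def estimated_seconds(version: str, audio_seconds: float,
                      precomputed_speaker: bool) -> float:
    """Estimate total time as load time plus RTF times audio duration."""
    p = PERF[version]
    load = p["load_npz"] if precomputed_speaker else p["load_wav"]
    return load + p["rtf"] * audio_seconds


# For 10 s of v2.0 speech, a pre-computed speaker saves the ~7.5 s load difference.
for precomputed in (False, True):
    print(precomputed, round(estimated_seconds("2.0", 10.0, precomputed), 1))
```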

License

MIT License

Acknowledgments

IndexTTS - Original PyTorch implementation
MLX - Apple's ML framework

Project details

Version 0.1.0, released Mar 26, 2026

Download files

Source Distribution

mlx_indextts-0.1.0.tar.gz (962.8 kB)

Upload date: Mar 26, 2026
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.30

Algorithm	Hash digest
SHA256	`47fae8781e6ebcdb439b9b019bb7f104ce6c318a6be86ab0b7d6ee92676dec57`
MD5	`0821606ee709d746195c35c4c41a5e15`
BLAKE2b-256	`c16cc319c9fd3712d43b6350fe13789914cf14938df4a1d6155e9e4228c6bac9`

Built Distribution

mlx_indextts-0.1.0-py3-none-any.whl (117.0 kB)

Upload date: Mar 26, 2026
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.30

Algorithm	Hash digest
SHA256	`0e2e4e2efd3158fe99c0d6839e9d1eece6b2082cde9e54fb650a536332c1ba34`
MD5	`b1be9ab8b182ab3c3073bde65f2e6342`
BLAKE2b-256	`bdadace4661be1e9019eac82173f7d9fbf51e0b1fdc869cc5388d9d6a8233b54`
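
A downloaded file can be checked against the SHA256 digests listed above with Python's standard hashlib module. A minimal sketch follows; the local file path is an assumption, and the helper names are illustrative.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Digest published for the source distribution.
SDIST_SHA256 = "47fae8781e6ebcdb439b9b019bb7f104ce6c318a6be86ab0b7d6ee92676dec57"
# Example (assumes the file was downloaded to the current directory):
# print(matches_digest("mlx_indextts-0.1.0.tar.gz", SDIST_SHA256))
```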
