OpenAI-compatible Text-to-Speech server for Apple Silicon, powered by Qwen3-TTS and MLX
mlx-tts-server
OpenAI-compatible Text-to-Speech server for Apple Silicon, powered by Qwen3-TTS and mlx-audio.
Runs natively on Metal — no CUDA, no cloud, just your Mac.
Installation
We recommend uv for fast, reliable Python package management:
uv pip install mlx-tts-server
Or with pip:
pip install mlx-tts-server
To install from source for development:
git clone https://github.com/realAllenSong/mlx-tts-server.git
cd mlx-tts-server
uv venv && source .venv/bin/activate
uv pip install -e .
For MP3, Opus, or AAC output, install ffmpeg:
brew install ffmpeg
Quick Start
Start the TTS server:
mlx-tts serve mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-4bit --port 8000
Synthesize speech:
curl -X POST http://localhost:8000/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model":"tts-1","input":"Hello, Apple Silicon!","voice":"ryan","response_format":"wav"}' \
--output speech.wav
open speech.wav
Python Client
Use the OpenAI Python client:
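from openai import OpenAI

# The server does not check the API key, but the client requires a value.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

# Stream the audio to disk as it arrives instead of buffering it in memory.
with client.audio.speech.with_streaming_response.create(
    model="tts-1",
    input="Hello from Apple Silicon!",
    voice="alloy",
    response_format="wav",
) as response:
    response.stream_to_file("output.wav")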
Or use httpx / requests directly:
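import httpx

resp = httpx.post(
    "http://localhost:8000/v1/audio/speech",
    json={"model": "tts-1", "input": "Hello!", "voice": "ryan", "response_format": "wav"},
    timeout=60.0,  # httpx defaults to 5 s; synthesis of longer text can take more
)
resp.raise_for_status()  # fail loudly instead of saving an error body as audio
with open("output.wav", "wb") as f:
    f.write(resp.content)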
Voice Cloning
Clone a voice from a short reference audio clip (requires a Base model):
mlx-tts serve mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16 --port 8000
curl -X POST http://localhost:8000/v1/audio/clone \
-F "input=Say this in my voice." \
-F "ref_audio=@/path/to/sample.wav" \
-F "ref_text=This is a sample." \
--output cloned.wav
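The same upload can be made from Python. The sketch below uses only the standard library and assumes the server from the command above is listening on localhost:8000. `encode_multipart` and `clone_voice` are illustrative helper names, not part of mlx-tts-server.

```python
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(fields: dict[str, str], file_field: str,
                     filename: str, file_bytes: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one file as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # The file part carries a filename and a generic binary content type.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def clone_voice(text: str, ref_path: str, ref_text: str, out_path: str,
                base_url: str = "http://localhost:8000") -> None:
    """POST the reference clip to /v1/audio/clone and save the returned audio."""
    ref = Path(ref_path)
    body, content_type = encode_multipart(
        {"input": text, "ref_text": ref_text},
        "ref_audio", ref.name, ref.read_bytes(),
    )
    req = urllib.request.Request(
        f"{base_url}/v1/audio/clone", data=body,
        headers={"Content-Type": content_type}, method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        Path(out_path).write_bytes(resp.read())
```

Calling `clone_voice("Say this in my voice.", "/path/to/sample.wav", "This is a sample.", "cloned.wav")` mirrors the curl command above.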
Voice Design
Create a voice from a text description (requires a VoiceDesign model):
mlx-tts serve mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16 --port 8000
curl -X POST http://localhost:8000/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model":"tts-1","input":"Welcome to our app.","voice":"custom","instruct":"A warm, friendly female voice with a slight British accent"}' \
--output designed.wav
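A Python equivalent of the request above, as a minimal sketch using only the standard library. It assumes a VoiceDesign model is being served on localhost:8000; `build_design_request` and `synthesize` are hypothetical helper names introduced for illustration.

```python
import json
import urllib.request


def build_design_request(text: str, description: str,
                         base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST /v1/audio/speech request for a described voice."""
    payload = {
        "model": "tts-1",         # any value; the server uses the loaded model
        "input": text,
        "voice": "custom",        # same value as in the curl example above
        "instruct": description,  # free-text description of the desired voice
    }
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(req: urllib.request.Request, out_path: str) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
```

Splitting request construction from sending keeps the payload easy to inspect or log before any audio is generated.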
Supported Models
| Model ID | Size | Type |
|---|---|---|
| mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-4bit | 0.6B 4-bit | 9 preset voices |
| mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-4bit | 1.7B 4-bit | 9 preset voices |
| mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-bf16 | 0.6B bf16 | 9 preset voices |
| mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-bf16 | 1.7B bf16 | 9 preset voices |
| mlx-community/Qwen3-TTS-12Hz-0.6B-Base-4bit | 0.6B 4-bit | Voice cloning |
| mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16 | 1.7B bf16 | Voice cloning |
| mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16 | 1.7B bf16 | Voice design |
Available Voices (CustomVoice models)
| Voice Name | Gender | OpenAI Alias |
|---|---|---|
| ryan | Male | echo |
| aiden | Male | nova |
| eric | Male | onyx |
| dylan | Male | none |
| serena | Female | alloy |
| vivian | Female | fable |
| sohee | Female | shimmer |
| uncle_fu | Special | none |
| ono_anna | Special | none |
OpenAI voice aliases (alloy, echo, fable, onyx, nova, shimmer) are
accepted and automatically mapped to the corresponding Qwen3-TTS voice.
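The mapping in the table can be written out as a small lookup. This is a client-side illustration built from the table above, not the server's actual code; `resolve_voice` is a hypothetical name, and exact-match lookup is an assumption.

```python
# OpenAI alias -> Qwen3-TTS voice, copied from the "Available Voices" table.
OPENAI_ALIASES = {
    "alloy": "serena",
    "echo": "ryan",
    "fable": "vivian",
    "onyx": "eric",
    "nova": "aiden",
    "shimmer": "sohee",
}

# The nine preset voices of the CustomVoice models.
NATIVE_VOICES = {
    "ryan", "aiden", "eric", "dylan", "serena",
    "vivian", "sohee", "uncle_fu", "ono_anna",
}


def resolve_voice(name: str) -> str:
    """Map an OpenAI alias or native name to a native Qwen3-TTS voice name."""
    voice = OPENAI_ALIASES.get(name, name)
    if voice not in NATIVE_VOICES:
        raise ValueError(f"unknown voice: {name!r}")
    return voice
```

With this table, `resolve_voice("alloy")` yields `"serena"`, while voices without an alias, such as `"dylan"`, pass through unchanged.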
API Reference
POST /v1/audio/speech
Synthesize speech from text.
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | required | Model ID (any value, server uses loaded model) |
| input | string | required | Text to synthesize |
| voice | string | "ryan" | Voice name or OpenAI alias |
| response_format | string | "wav" | wav, mp3, flac, opus, aac, pcm |
| speed | float | 1.0 | Playback speed (0.25–4.0) |
| language | string | auto | Language (auto-detected if omitted) |
| instruct | string | none | Emotion/style instruction or voice description |
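The parameters above can be checked on the client before a request is sent. The sketch below encodes the documented defaults and ranges in a dataclass. `SpeechRequest` is a hypothetical helper, and the `"tts-1"` default for `model` is a client-side convenience (the server itself requires the field).

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Formats listed in the response_format row of the table.
ALLOWED_FORMATS = {"wav", "mp3", "flac", "opus", "aac", "pcm"}


@dataclass
class SpeechRequest:
    input: str
    model: str = "tts-1"
    voice: str = "ryan"
    response_format: str = "wav"
    speed: float = 1.0
    language: Optional[str] = None   # omitted -> server auto-detects
    instruct: Optional[str] = None   # omitted -> no style instruction

    def __post_init__(self) -> None:
        if not self.input:
            raise ValueError("input must be non-empty")
        if self.response_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported format: {self.response_format!r}")
        if not 0.25 <= self.speed <= 4.0:
            raise ValueError("speed must be between 0.25 and 4.0")

    def to_payload(self) -> dict:
        """Return the JSON body, leaving out optional fields that are unset."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

`SpeechRequest(input="Hello!").to_payload()` produces a body that can be passed as `json=` to an HTTP client.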
POST /v1/audio/clone
Voice cloning via reference audio upload (multipart form).
| Field | Type | Description |
|---|---|---|
| input | string | Text to synthesize |
| ref_audio | file | Reference audio file (~3s WAV/MP3) |
| ref_text | string | Transcript of reference audio |
| response_format | string | Output format (default: wav) |
| language | string | Language (default: english) |
GET /v1/models
OpenAI-compatible model listing.
GET /v1/audio/speech/voices
List available voice names for the loaded model.
GET /health
Health check. Returns {"status": "ok", "model": "...", "uptime": 42.0}.
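Because loading a model can take a while, a client may want to poll this endpoint before sending its first request. A minimal sketch using only the standard library follows. `is_ready` and `wait_until_ready` are illustrative names, and the timeout and interval values are arbitrary assumptions.

```python
import json
import time
import urllib.error
import urllib.request


def is_ready(body: bytes) -> bool:
    """Return True if a /health response body reports status "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def wait_until_ready(base_url: str = "http://localhost:8000",
                     timeout: float = 120.0, interval: float = 1.0) -> bool:
    """Poll GET /health until it reports ok or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if is_ready(resp.read()):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not accepting connections yet; retry
        time.sleep(interval)
    return False
```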
CLI Reference
mlx-tts serve MODEL [OPTIONS]
| Option | Default | Description |
|---|---|---|
| --host | 0.0.0.0 | Host to bind |
| --port | 8000 | Port to listen on |
| --workers | 1 | Number of uvicorn workers |
| --log-level | info | Log level (debug, info, warning, error) |
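To launch the server from a script or test harness, the options above can be assembled into a command line. This sketch only uses the flags listed in the table; `build_serve_command` and `start_server` are hypothetical helpers. Note that it passes `127.0.0.1` as the host to keep the server local, whereas the CLI default of `0.0.0.0` binds all interfaces.

```python
import subprocess

LOG_LEVELS = {"debug", "info", "warning", "error"}


def build_serve_command(model: str, host: str = "127.0.0.1", port: int = 8000,
                        workers: int = 1, log_level: str = "info") -> list[str]:
    """Return the argv list for `mlx-tts serve MODEL [OPTIONS]`."""
    if log_level not in LOG_LEVELS:
        raise ValueError(f"invalid log level: {log_level!r}")
    return [
        "mlx-tts", "serve", model,
        "--host", host,
        "--port", str(port),
        "--workers", str(workers),
        "--log-level", log_level,
    ]


def start_server(model: str, host: str = "127.0.0.1", port: int = 8000,
                 workers: int = 1, log_level: str = "info") -> subprocess.Popen:
    """Start the server as a child process; the caller should terminate() it."""
    return subprocess.Popen(
        build_serve_command(model, host, port, workers, log_level)
    )
```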
Output Formats
| Format | MIME Type | Requires |
|---|---|---|
| wav | audio/wav | built-in (soundfile) |
| flac | audio/flac | built-in (soundfile) |
| pcm | audio/pcm | built-in |
| mp3 | audio/mpeg | brew install ffmpeg |
| opus | audio/opus | brew install ffmpeg |
| aac | audio/aac | brew install ffmpeg |
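The table can be turned into a quick pre-flight check that warns when a requested format needs ffmpeg and it is not on the PATH. This is a sketch built from the table above rather than the server's own logic; `mime_type` and `missing_dependency` are illustrative names.

```python
import shutil
from typing import Optional

# format -> (MIME type, needs ffmpeg), copied from the table above.
OUTPUT_FORMATS = {
    "wav": ("audio/wav", False),
    "flac": ("audio/flac", False),
    "pcm": ("audio/pcm", False),
    "mp3": ("audio/mpeg", True),
    "opus": ("audio/opus", True),
    "aac": ("audio/aac", True),
}


def mime_type(fmt: str) -> str:
    """Return the MIME type for an output format."""
    if fmt not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return OUTPUT_FORMATS[fmt][0]


def missing_dependency(fmt: str) -> Optional[str]:
    """Return "ffmpeg" if fmt needs it and it is not installed, else None."""
    if fmt not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    needs_ffmpeg = OUTPUT_FORMATS[fmt][1]
    if needs_ffmpeg and shutil.which("ffmpeg") is None:
        return "ffmpeg"
    return None
```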
License
Apache-2.0