mlx-tts-server

OpenAI-compatible Text-to-Speech server for Apple Silicon, powered by Qwen3-TTS and mlx-audio.

Runs natively on Metal — no CUDA, no cloud, just your Mac.

Chinese documentation (中文文档)

Installation

We recommend uv for fast, reliable Python package management:

uv pip install mlx-tts-server

Or with pip:

pip install mlx-tts-server

To install from source for development:

git clone https://github.com/realAllenSong/mlx-tts-server.git
cd mlx-tts-server
uv venv && source .venv/bin/activate
uv pip install -e .

For MP3, OPUS, or AAC output, install ffmpeg:

brew install ffmpeg

Quick Start

Start the TTS server:

mlx-tts serve mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-4bit --port 8000

Synthesize speech:

curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"tts-1","input":"Hello, Apple Silicon!","voice":"ryan","response_format":"wav"}' \
  --output speech.wav

open speech.wav

Python Client

Use the OpenAI Python client:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
response = client.audio.speech.create(
    model="tts-1",
    input="Hello from Apple Silicon!",
    voice="alloy",
    response_format="wav",
)
response.write_to_file("output.wav")  # stream_to_file is deprecated on non-streaming responses in recent openai releases

Or use httpx / requests directly:

import httpx

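resp = httpx.post(
    "http://localhost:8000/v1/audio/speech",
    json={"model": "tts-1", "input": "Hello!", "voice": "ryan", "response_format": "wav"},
    timeout=60.0,  # httpx defaults to 5 s, which longer inputs may exceed
)
resp.raise_for_status()  # fail loudly instead of saving an error body as audio
with open("output.wav", "wb") as f:
    f.write(resp.content)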

Voice Cloning

Clone a voice from a short reference audio clip (requires a Base model):

mlx-tts serve mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16 --port 8000

curl -X POST http://localhost:8000/v1/audio/clone \
  -F "input=Say this in my voice." \
  -F "ref_audio=@/path/to/sample.wav" \
  -F "ref_text=This is a sample." \
  --output cloned.wav

Voice Design

Create a voice from a text description (requires a VoiceDesign model):

mlx-tts serve mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16 --port 8000

curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"tts-1","input":"Welcome to our app.","voice":"custom","instruct":"A warm, friendly female voice with a slight British accent"}' \
  --output designed.wav

Supported Models

| Model ID | Size | Type |
| --- | --- | --- |
| mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-4bit | 0.6B 4-bit | 9 preset voices |
| mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-4bit | 1.7B 4-bit | 9 preset voices |
| mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-bf16 | 0.6B bf16 | 9 preset voices |
| mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-bf16 | 1.7B bf16 | 9 preset voices |
| mlx-community/Qwen3-TTS-12Hz-0.6B-Base-4bit | 0.6B 4-bit | Voice cloning |
| mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16 | 1.7B bf16 | Voice cloning |
| mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16 | 1.7B bf16 | Voice design |

Available Voices (CustomVoice models)

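| Voice Name | Gender | OpenAI Alias |
| --- | --- | --- |
| ryan | Male | echo |
| aiden | Male | nova |
| eric | Male | onyx |
| dylan | Male | |
| serena | Female | alloy |
| vivian | Female | fable |
| sohee | Female | shimmer |
| uncle_fu | Special | |
| ono_anna | Special | |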

OpenAI voice aliases (alloy, echo, fable, onyx, nova, shimmer) are accepted and automatically mapped to the corresponding Qwen3-TTS voice.
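
The alias mapping in the table above can be mirrored on the client side, for example to show users which voice they will actually hear. The sketch below copies the alias pairs from the table; the `resolve_voice` helper and its case-insensitive matching are illustrative assumptions, not the server's actual implementation.

```python
# Alias pairs copied from the "Available Voices" table. The lookup logic
# is an illustrative assumption, not the server's own code.
OPENAI_ALIASES = {
    "echo": "ryan",
    "nova": "aiden",
    "onyx": "eric",
    "alloy": "serena",
    "fable": "vivian",
    "shimmer": "sohee",
}

NATIVE_VOICES = {
    "ryan", "aiden", "eric", "dylan", "serena",
    "vivian", "sohee", "uncle_fu", "ono_anna",
}


def resolve_voice(name: str) -> str:
    """Return the Qwen3-TTS voice for a native name or an OpenAI alias."""
    key = name.strip().lower()  # case-insensitive matching is an assumption
    if key in NATIVE_VOICES:
        return key
    if key in OPENAI_ALIASES:
        return OPENAI_ALIASES[key]
    raise ValueError(f"unknown voice: {name!r}")
```

With this helper, `resolve_voice("alloy")` yields `"serena"`, while voices without an alias (dylan, uncle_fu, ono_anna) are reachable only by their native names.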

API Reference

POST /v1/audio/speech

Synthesize speech from text.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | string | required | Model ID (any value, server uses loaded model) |
| input | string | required | Text to synthesize |
| voice | string | "ryan" | Voice name or OpenAI alias |
| response_format | string | "wav" | wav, mp3, flac, opus, aac, pcm |
| speed | float | 1.0 | Playback speed (0.25–4.0) |
| language | string | auto | Language (auto-detected if omitted) |
| instruct | string | | Emotion/style instruction or voice description |
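
The parameters above can be checked on the client before a request is sent. This is a minimal sketch using only the standard library; the helper names `build_speech_payload` and `synthesize` are hypothetical, the empty-text check is an added assumption, and the server may apply its own validation rules.

```python
import json
import urllib.request

# Formats and speed range taken from the parameter table above.
ALLOWED_FORMATS = {"wav", "mp3", "flac", "opus", "aac", "pcm"}


def build_speech_payload(text, voice="ryan", response_format="wav",
                         speed=1.0, language=None, instruct=None,
                         model="tts-1"):
    """Build the JSON body for POST /v1/audio/speech, rejecting bad values early."""
    if not text:
        raise ValueError("input text must not be empty")  # client-side assumption
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
    # Optional fields are sent only when set, so server defaults apply otherwise.
    if language is not None:
        payload["language"] = language
    if instruct is not None:
        payload["instruct"] = instruct
    return payload


def synthesize(payload, base_url="http://localhost:8000"):
    """POST the payload and return the raw audio bytes."""
    req = urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

With a server running, `synthesize(build_speech_payload("Hello!", speed=1.25))` would return audio bytes that can be written to a `.wav` file.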

POST /v1/audio/clone

Voice cloning via reference audio upload (multipart form).

| Field | Type | Description |
| --- | --- | --- |
| input | string | Text to synthesize |
| ref_audio | file | Reference audio file (~3s WAV/MP3) |
| ref_text | string | Transcript of reference audio |
| response_format | string | Output format (default: wav) |
| language | string | Language (default: english) |
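
The multipart fields above can also be assembled from Python. The sketch below separates building the form from sending it; `build_clone_form` and `clone_voice` are hypothetical helper names, and the 120-second timeout is an arbitrary assumption rather than a documented requirement.

```python
from pathlib import Path


def build_clone_form(text, ref_audio_path, ref_text,
                     response_format="wav", language="english"):
    """Return (data, files) for a multipart POST to /v1/audio/clone."""
    path = Path(ref_audio_path)
    data = {
        "input": text,
        "ref_text": ref_text,
        "response_format": response_format,
        "language": language,
    }
    # httpx accepts (filename, bytes) tuples for file fields.
    files = {"ref_audio": (path.name, path.read_bytes())}
    return data, files


def clone_voice(text, ref_audio_path, ref_text,
                base_url="http://localhost:8000", **kwargs):
    """Send the clone request and return the synthesized audio bytes."""
    import httpx  # imported lazily so the form builder works without httpx

    data, files = build_clone_form(text, ref_audio_path, ref_text, **kwargs)
    resp = httpx.post(
        f"{base_url}/v1/audio/clone",
        data=data,
        files=files,
        timeout=120.0,  # assumed generous limit; tune for your hardware
    )
    resp.raise_for_status()
    return resp.content
```

As with the curl example, this needs a Base model loaded on the server.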

GET /v1/models

OpenAI-compatible model listing.

GET /v1/audio/speech/voices

List available voice names for the loaded model.

GET /health

Health check. Returns {"status": "ok", "model": "...", "uptime": 42.0}.
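
Because loading a model can take a while after `mlx-tts serve` starts, a client may want to poll the health endpoint before sending requests. The following is a rough sketch using only the standard library; `wait_until_ready` and its timeout values are assumptions, and only the `status` field shown above is inspected.

```python
import json
import time
import urllib.request


def is_healthy(payload: dict) -> bool:
    """True when a /health response body reports status "ok"."""
    return payload.get("status") == "ok"


def wait_until_ready(base_url="http://localhost:8000", timeout=60.0, interval=1.0):
    """Poll GET /health until it reports ok; return the body or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                body = json.load(resp)
            if is_healthy(body):
                return body
        except (OSError, ValueError):
            # OSError covers refused connections and URLError;
            # ValueError covers a body that is not valid JSON.
            pass
        time.sleep(interval)
    raise TimeoutError(f"server at {base_url} not healthy after {timeout} s")
```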

CLI Reference

mlx-tts serve MODEL [OPTIONS]

| Option | Default | Description |
| --- | --- | --- |
| --host | 0.0.0.0 | Host to bind |
| --port | 8000 | Port to listen on |
| --workers | 1 | Number of uvicorn workers |
| --log-level | info | Log level (debug, info, warning, error) |
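
When the server is launched from a script or test harness, the options above can be turned into an argument list for `subprocess`. This is a sketch only: `build_serve_command` and `start_server` are hypothetical helpers, and they use just the flags and defaults listed in the table.

```python
import subprocess

LOG_LEVELS = {"debug", "info", "warning", "error"}


def build_serve_command(model, host="0.0.0.0", port=8000,
                        workers=1, log_level="info"):
    """Return the argv list for `mlx-tts serve` with the documented options."""
    if log_level not in LOG_LEVELS:
        raise ValueError(f"unsupported log level: {log_level!r}")
    return [
        "mlx-tts", "serve", model,
        "--host", host,
        "--port", str(port),
        "--workers", str(workers),
        "--log-level", log_level,
    ]


def start_server(model, **options):
    """Launch the server as a child process; the caller must terminate it."""
    return subprocess.Popen(build_serve_command(model, **options))
```

Passing `host="127.0.0.1"` keeps the server reachable only from the local machine, whereas the default `0.0.0.0` binds all interfaces.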

Output Formats

| Format | MIME Type | Requires |
| --- | --- | --- |
| wav | audio/wav | built-in (soundfile) |
| flac | audio/flac | built-in (soundfile) |
| pcm | audio/pcm | built-in |
| mp3 | audio/mpeg | brew install ffmpeg |
| opus | audio/opus | brew install ffmpeg |
| aac | audio/aac | brew install ffmpeg |
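
The format table above can be encoded as a small lookup, for instance to pick a Content-Type when re-serving audio or to check for ffmpeg before requesting a compressed format. This is an illustrative sketch; the helper names are hypothetical, and looking for `ffmpeg` on `PATH` is an assumption about how availability could be checked on the machine running the server.

```python
import shutil

# format -> (MIME type, needs ffmpeg), copied from the table above.
FORMATS = {
    "wav": ("audio/wav", False),
    "flac": ("audio/flac", False),
    "pcm": ("audio/pcm", False),
    "mp3": ("audio/mpeg", True),
    "opus": ("audio/opus", True),
    "aac": ("audio/aac", True),
}


def content_type_for(fmt: str) -> str:
    """Return the MIME type for an output format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return FORMATS[fmt][0]


def format_available(fmt: str) -> bool:
    """True if the format is built in, or ffmpeg is found on PATH."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    needs_ffmpeg = FORMATS[fmt][1]
    return not needs_ffmpeg or shutil.which("ffmpeg") is not None
```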

License

Apache-2.0
