Whisper-style CLI for Qwen3-TTS text-to-speech

qwen-tts-cli

Whisper-style CLI for Qwen3-TTS text-to-speech. One command, instant speech.

Install

# Apple Silicon (recommended for Mac — 6x faster)
pip install "qwen-tts-cli[mlx]"

# CUDA / CPU
pip install "qwen-tts-cli[transformers]"

Usage

# Just speak
qwen-tts "Hello, world!"

# Choose a speaker and style
qwen-tts "I can't believe it!" --speaker Aiden --instruct "Speak with excitement"

# Save to a specific file
qwen-tts "Good morning." -o greeting.wav

# Use the larger model
qwen-tts "Higher quality voice." --model 1.7B

# Force a specific backend (auto-detected by default)
qwen-tts "Fast on Mac!" --backend mlx

# Clone a voice from a 3-second sample
qwen-tts "Now I sound like someone else." --clone reference.wav --ref-text "Transcript of the reference audio."

# Design a voice from a description
qwen-tts "Hi there!" --design --instruct "A warm, deep male voice with a calm tone"

# Read from stdin
echo "Pipe text in" | qwen-tts -

# List available speakers
qwen-tts --list-speakers
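
The commands above each synthesize a single utterance. For a batch of lines, a small shell loop is one option. The sketch below relies only on the documented `-o` and `--no-play` flags; the `speak_lines` helper, the `QWEN_TTS` override variable, and the `line_NNN.wav` naming scheme are hypothetical and not part of qwen-tts-cli.

```shell
# Hypothetical helper (not part of qwen-tts-cli): synthesize one WAV per
# non-empty line read from stdin, using only the documented -o and --no-play flags.
# QWEN_TTS is an assumed override so the loop can be dry-run, e.g. QWEN_TTS=echo.
speak_lines() {
  n=0
  while IFS= read -r line; do
    # Skip blank lines so they do not produce empty audio files
    [ -z "$line" ] && continue
    n=$((n + 1))
    out=$(printf 'line_%03d.wav' "$n")
    # </dev/null keeps the command from consuming the loop's remaining input
    ${QWEN_TTS:-qwen-tts} "$line" -o "$out" --no-play </dev/null
  done
}
```

With a hypothetical `script.txt`, `speak_lines < script.txt` would write `line_001.wav`, `line_002.wav`, and so on. Setting `QWEN_TTS=echo` first prints the commands instead of running them.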

Options

positional arguments:
  text                    Text to speak. Use "-" to read from stdin.

options:
  -o, --output FILE       Output audio file (default: output.wav)
  -m, --model SIZE        Model: 0.6B, 1.7B, or full HF ID (default: 0.6B)
  -b, --backend BACKEND   Inference backend: transformers, mlx (default: auto)
  -s, --speaker NAME      Speaker voice (default: Ryan)
  -l, --language LANG     Language (default: Auto)
  -i, --instruct TEXT     Style/emotion instruction
  --device DEVICE         Force device: cuda:0, mps, cpu (default: auto, transformers only)
  --play / --no-play      Play audio after generation (default: on for macOS)
  --list-speakers         List available speakers and exit

voice cloning:
  --clone AUDIO           Reference audio for voice cloning
  --ref-text TEXT         Transcript of reference audio

voice design:
  --design                Design a voice using --instruct description

Speakers

Speaker	Description	Language
Ryan	Dynamic rhythmic male	English
Aiden	Sunny clear male	English
Vivian	Bright young female	Chinese
Serena	Warm gentle female	Chinese
Uncle_Fu	Seasoned mellow male	Chinese
Dylan	Clear natural male	Chinese (Beijing)
Eric	Lively bright male	Chinese (Sichuan)
Ono_Anna	Playful light female	Japanese
Sohee	Warm emotional female	Korean
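
A script that handles several languages may want to pick a speaker from this table automatically. The lookup below is a minimal sketch: `speaker_for` is a hypothetical helper, it returns the first speaker listed for each language, and it falls back to the documented default (Ryan) for anything else.

```shell
# Hypothetical helper: map a language name to one speaker from the Speakers table.
# Picks the first listed speaker for each language; unknown languages
# fall back to the documented default speaker, Ryan.
speaker_for() {
  case "$1" in
    English)  echo Ryan ;;
    Chinese)  echo Vivian ;;
    Japanese) echo Ono_Anna ;;
    Korean)   echo Sohee ;;
    *)        echo Ryan ;;
  esac
}
```

Usage might look like `qwen-tts "Hello" --speaker "$(speaker_for English)"`.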

Supported languages

Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian.

Backends

Transformers (default)

Uses PyTorch + HuggingFace Transformers. Works on all platforms.

Platform	Device	Precision
NVIDIA GPU	cuda	bfloat16
Apple Silicon	mps	float32
CPU	cpu	float32

MLX (Apple Silicon)

Uses mlx-audio with quantized models (8-bit for 1.7B, 4-bit for 0.6B) for native Apple Silicon acceleration.

qwen-tts "Hello!" --backend mlx

MLX Model	Mode	HuggingFace ID
1.7B 8-bit	speak	`mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit`
1.7B 8-bit	clone	`mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit`
0.6B 4-bit	clone	`mlx-community/Qwen3-TTS-12Hz-0.6B-Base-4bit`
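
The CLI auto-detects the backend by default, and its detection logic is not described here. A wrapper script that prefers to pass `--backend` explicitly could use a check like the one below. This is only a sketch: `pick_backend` is a hypothetical helper, and it assumes that Apple Silicon reports `Darwin` and `arm64` from `uname`.

```shell
# Hypothetical helper: choose a value for --backend from OS and CPU architecture.
# Arguments are optional and default to uname output; passing them explicitly
# makes the function easy to test on any machine.
pick_backend() {
  os=${1:-$(uname -s)}
  arch=${2:-$(uname -m)}
  if [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]; then
    echo mlx            # Apple Silicon: use the MLX backend
  else
    echo transformers   # everything else: PyTorch + Transformers
  fi
}
```

For example: `qwen-tts "Hello!" --backend "$(pick_backend)"`.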

Benchmark (Apple Silicon)

Tested on a 16GB M1 MacBook Pro with the same input text (~14s of audio output):

Model	Load time	Avg. generation time	RTF
Transformers 0.6B (mps)	10.6s	61.4s	4.36
Transformers 1.7B (mps)	85.0s	117.7s	8.08
MLX 1.7B 8-bit	2.3s	10.2s	1.00

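In this benchmark, MLX 1.7B 8-bit generates about 6x faster than Transformers 0.6B and about 11.5x faster than Transformers 1.7B (both on mps), while using less memory. RTF (real-time factor) is generation time divided by audio duration, so an RTF of 1.0 means generation runs at real-time speed.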
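
The ratios behind these figures are easy to recompute. The sketch below assumes the usual definition of RTF (generation time divided by audio duration); `ratio`, `rtf`, and `speedup` are hypothetical helper names, and `awk` is used because POSIX shell arithmetic is integer-only.

```shell
# ratio A B: print A / B rounded to two decimals
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# rtf GEN_SECONDS AUDIO_SECONDS: real-time factor (assumed definition)
rtf() { ratio "$1" "$2"; }

# speedup BASELINE_SECONDS FASTER_SECONDS: how many times faster the second run is
speedup() { ratio "$1" "$2"; }
```

Using the average generation times from the table, `speedup 61.4 10.2` prints `6.02` (Transformers 0.6B vs. MLX) and `speedup 117.7 10.2` prints `11.54` (Transformers 1.7B vs. MLX).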

License

Apache-2.0 (same as Qwen3-TTS)

Release history

Version	Date
0.5.0	Feb 26, 2026
0.4.0	Feb 25, 2026
0.3.0 (this version)	Feb 24, 2026
0.2.0	Feb 24, 2026
0.1.0	Feb 23, 2026

Download files

Source Distribution

qwen_tts_cli-0.3.0.tar.gz (13.1 kB)

Uploaded Feb 24, 2026 Source

Built Distribution

qwen_tts_cli-0.3.0-py3-none-any.whl (12.5 kB)

Uploaded Feb 24, 2026 Python 3

File details

Details for the file qwen_tts_cli-0.3.0.tar.gz.

File metadata

Download URL: qwen_tts_cli-0.3.0.tar.gz
Upload date: Feb 24, 2026
Size: 13.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.11

File hashes

Hashes for qwen_tts_cli-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`d8ff546e27e7f58ba5a359536a7d1bb15ca6f22b2f8f9651fd86e03c09b46143`
MD5	`c80d87067b6e8d61128bc5e8b05a57e9`
BLAKE2b-256	`edeefa90dd12dbc384e6a29b68bde1bc7a4c1ae855b901f0f6491f519ef7b80b`
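
A downloaded file can be checked against the SHA256 digest listed here. The function below is a minimal sketch: `verify_sha256` is a hypothetical helper, and it assumes either `sha256sum` (common on Linux) or `shasum` (shipped with macOS) is on the PATH.

```shell
# Hypothetical helper: succeed only if FILE's SHA256 digest equals EXPECTED.
# Uses sha256sum when available, otherwise falls back to shasum -a 256.
verify_sha256() {
  file=$1
  expected=$2
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi
  # The exit status of this comparison is the function's result
  [ "$actual" = "$expected" ]
}
```

For the source distribution: `verify_sha256 qwen_tts_cli-0.3.0.tar.gz d8ff546e27e7f58ba5a359536a7d1bb15ca6f22b2f8f9651fd86e03c09b46143 && echo OK`.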

File details

Details for the file qwen_tts_cli-0.3.0-py3-none-any.whl.

File metadata

Download URL: qwen_tts_cli-0.3.0-py3-none-any.whl
Upload date: Feb 24, 2026
Size: 12.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.11

File hashes

Hashes for qwen_tts_cli-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`130e3aa9a8f761d1cd36b067f2fef162d716f2c2be3a57ecf32b5bd97291d2c8`
MD5	`88ec95bf9a8694417331bc3192c64934`
BLAKE2b-256	`aa1ff0028c9116b6c99d0c23f76a2529ed4410e7e63fcb9e214db664c9dcd7d3`
