MOSS-TTS-Local-Transformer-v1.5 voice cloning via ONNX Runtime
moss-tts-local-onnx-cli
Voice cloning with MOSS-TTS-Local-Transformer-v1.5, running on ONNX Runtime. The ONNX models are converted from the original PyTorch model (4.13B parameters: a Qwen3 global transformer plus a GPT2 local transformer).
Quick Start
Run it in one line with uvx (the first run automatically downloads about 31GB of model files):
uvx moss-tts-local-onnx-cli --text "你好,这是一段测试语音合成。" --reference ref.wav --output output.wav
Installation
# Install with uv
uv pip install moss-tts-local-onnx-cli
# Or with pip
pip install moss-tts-local-onnx-cli
# For ModelScope download support (China mainland):
pip install "moss-tts-local-onnx-cli[modelscope]"
CLI Usage
# Basic voice cloning
moss-tts-local-onnx-cli --text "你好" --reference speaker.wav --output output.wav
# With language hint
moss-tts-local-onnx-cli --text "Hello world" --reference speaker.wav --language English --output output.wav
# Download from ModelScope (China)
moss-tts-local-onnx-cli --text "你好" --reference speaker.wav --source modelscope --output output.wav
# Custom model directory
moss-tts-local-onnx-cli --text "你好" --reference speaker.wav --model-dir /path/to/models --output output.wav
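The commands above can also be assembled from Python when scripting many runs. The following is a minimal sketch that only uses the flags shown in this section (`--text`, `--reference`, `--output`, `--language`, `--source`, `--model-dir`); the helper name `build_command` is hypothetical and not part of the package.

```python
import shlex
from typing import List, Optional


def build_command(
    text: str,
    reference: str,
    output: str,
    language: Optional[str] = None,
    source: Optional[str] = None,
    model_dir: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the CLI (hypothetical helper).

    Only flags that appear in the README examples are emitted.
    """
    cmd = [
        "moss-tts-local-onnx-cli",
        "--text", text,
        "--reference", reference,
        "--output", output,
    ]
    if language:
        cmd += ["--language", language]
    if source:
        cmd += ["--source", source]
    if model_dir:
        cmd += ["--model-dir", model_dir]
    return cmd


if __name__ == "__main__":
    cmd = build_command("Hello world", "speaker.wav", "output.wav", language="English")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To actually run it (requires the package and models to be installed):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with text that contains spaces or punctuation.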
Python API
from moss_tts_local_onnx_cli import MossTTS
# Initialize once (auto-downloads models)
tts = MossTTS()
# Generate multiple times without reloading
audio = tts.generate("你好,这是第一段。", reference="ref.wav", language="Chinese")
tts.save(audio, "output1.wav")
audio2 = tts.generate("这是第二段。", reference="ref.wav")
tts.save(audio2, "output2.wav")
# Or generate and save in one call
tts.generate("你好", reference="ref.wav", output="output3.wav")
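Because the models are loaded once per `MossTTS` instance, batch jobs are a natural fit. Below is a small sketch of a batch helper; it assumes only the `generate(text, reference=..., output=...)` call shown above, and the function name `synthesize_batch` and the `output{i}.wav` naming scheme are illustrative choices, not part of the package.

```python
from pathlib import Path
from typing import Iterable, List


def synthesize_batch(tts, texts: Iterable[str], reference: str, out_dir: str) -> List[Path]:
    """Synthesize each text with the same reference voice (hypothetical helper).

    `tts` is any object exposing generate(text, reference=..., output=...),
    as in the one-call example above.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for i, text in enumerate(texts, start=1):
        path = out / f"output{i}.wav"
        # Generate and save in one call, reusing the already loaded models.
        tts.generate(text, reference=reference, output=str(path))
        paths.append(path)
    return paths


# Usage (requires the package and the downloaded models):
# from moss_tts_local_onnx_cli import MossTTS
# synthesize_batch(MossTTS(), ["First line.", "Second line."], "ref.wav", "out")
```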
Audio Codec
The audio codec (MOSS-Audio-Tokenizer-v2) is required to decode audio tokens into a waveform. It is auto-detected at ~/.cache/modelscope/hub/models/OpenMOSS/MOSS-Audio-Tokenizer-v2.
To download it:
pip install modelscope
python -c "from modelscope import snapshot_download; snapshot_download('OpenMOSS/MOSS-Audio-Tokenizer-v2')"
Or specify a custom path:
tts = MossTTS(codec_path="/path/to/MOSS-Audio-Tokenizer-v2")
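To check ahead of time whether the codec is where it is expected, a short lookup like the one below may help. It is only a sketch of the "custom path, else default cache location" idea described above; the helper `resolve_codec_path` is hypothetical, and the package's own detection logic may differ.

```python
from pathlib import Path
from typing import Optional

# Default location named in this README.
DEFAULT_CODEC_DIR = (
    Path.home() / ".cache/modelscope/hub/models/OpenMOSS/MOSS-Audio-Tokenizer-v2"
)


def resolve_codec_path(
    custom: Optional[str] = None, default: Path = DEFAULT_CODEC_DIR
) -> Optional[Path]:
    """Return the codec directory if it exists, else None (hypothetical helper).

    A custom path, when given, takes precedence over the default cache location.
    """
    path = Path(custom).expanduser() if custom else default
    return path if path.is_dir() else None


if __name__ == "__main__":
    found = resolve_codec_path()
    if found is None:
        print("Codec not found; download it or pass codec_path explicitly.")
    else:
        print(f"Codec found at {found}")
```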
Model Conversion
To convert the original PyTorch model to ONNX yourself:
# Install export dependencies
pip install torch transformers onnx
# Run export (requires ~18GB RAM)
python scripts/export_onnx.py
# Validate correctness
python scripts/validate.py
See scripts/export_onnx.py for the full conversion pipeline.
Model Architecture
| ONNX Model | Description | Size |
|---|---|---|
| global_prefill.onnx | Qwen3 36-layer full-sequence forward | ~15GB |
| global_decode.onnx | Qwen3 single-token decode with KV cache | ~15GB |
| local_transformer_first.onnx | GPT2 1-layer (first step) | 290MB |
| local_transformer.onnx | GPT2 1-layer (with KV cache) | 290MB |
| lm_heads.npz | LM head weight matrices | 120MB |
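As a quick sanity check, the sizes listed in the table add up to roughly the ~31GB download quoted elsewhere in this README. The sketch below assumes decimal units (1GB = 1000MB) and treats the approximate ~15GB figures as exact, so the result is only indicative.

```python
# Sizes copied from the table above, converted to GB.
# Assumption: decimal units (290MB -> 0.29GB, 120MB -> 0.12GB).
SIZES_GB = {
    "global_prefill.onnx": 15.0,
    "global_decode.onnx": 15.0,
    "local_transformer_first.onnx": 0.29,
    "local_transformer.onnx": 0.29,
    "lm_heads.npz": 0.12,
}


def total_size_gb(sizes: dict) -> float:
    """Sum the per-file sizes in GB."""
    return sum(sizes.values())


if __name__ == "__main__":
    print(f"total: {total_size_gb(SIZES_GB):.1f} GB")  # prints "total: 30.7 GB"
```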
Requirements
- Python >= 3.10
- ~32GB RAM (for loading models)
- ~31GB disk space (for model files)
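Two of these requirements can be checked with the standard library before starting a large download. The sketch below covers the Python version and free disk space only; a RAM check is left out because the standard library has no portable call for it. The function names and the use of decimal gigabytes are assumptions made for illustration.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)       # from the requirements list above
REQUIRED_DISK_GB = 31      # approximate size of the model files


def check_python(version_info=sys.version_info) -> bool:
    """True if the interpreter is at least Python 3.10."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def check_disk(path: str = ".", required_gb: float = REQUIRED_DISK_GB) -> bool:
    """True if the filesystem holding `path` has enough free space.

    Assumption: 1GB is counted as 10**9 bytes.
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb


if __name__ == "__main__":
    print("Python OK:", check_python())
    print("Disk OK:", check_disk())
```

Point `check_disk` at the directory where the models will be stored, since that may be on a different filesystem than the current directory.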
License
MIT
Download files
File details
Details for the file moss_tts_local_onnx_cli-0.1.0.tar.gz.
File metadata
- Download URL: moss_tts_local_onnx_cli-0.1.0.tar.gz
- Size: 6.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | efaf06bea6a63540569d6480d68d84dd28073fa43b9ce929985396ed460bb59e |
| MD5 | 59b705c2500780b400d01096b39950c4 |
| BLAKE2b-256 | e69d5b72f4865060c5e47889feef37602ba2def20cea1aedde709e1b937f9cba |
File details
Details for the file moss_tts_local_onnx_cli-0.1.0-py3-none-any.whl.
File metadata
- Download URL: moss_tts_local_onnx_cli-0.1.0-py3-none-any.whl
- Size: 9.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9179ad9d3820a7d124e00d2f2b9d86a898ad4a98c93844fd1318add2b1c598e7 |
| MD5 | 3288228e3232828ca0f08f4b9b219b9d |
| BLAKE2b-256 | 05f539f4bde100cdcf28c898a20540be13ba9fe55593ad9b60e3bffa17154fc9 |
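A downloaded file can be compared against the digests listed above using Python's `hashlib`. This is a minimal sketch: the helper names are hypothetical, and BLAKE2b-256 is computed here as BLAKE2b with a 32-byte digest, which is how that label is normally understood.

```python
import hashlib


def file_digests(path: str, chunk_size: int = 1 << 20) -> dict:
    """Compute SHA256, MD5 and BLAKE2b-256 hex digests of a file.

    The file is read in chunks so large archives do not need to fit in memory.
    """
    hashers = {
        "sha256": hashlib.sha256(),
        "md5": hashlib.md5(),
        "blake2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}


def matches_sha256(path: str, expected_hex: str) -> bool:
    """True if the file's SHA256 equals the expected hex digest."""
    return file_digests(path)["sha256"] == expected_hex.strip().lower()


# Usage, with the file name and digest taken from the listing above:
# matches_sha256(
#     "moss_tts_local_onnx_cli-0.1.0-py3-none-any.whl",
#     "9179ad9d3820a7d124e00d2f2b9d86a898ad4a98c93844fd1318add2b1c598e7",
# )
```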