MOSS-TTS-Local-Transformer-v1.5 voice cloning via ONNX Runtime

moss-tts-local-onnx-cli

An ONNX Runtime port of MOSS-TTS-Local-Transformer-v1.5 for voice cloning, converted from the original PyTorch model (4.13B parameters; a Qwen3 global transformer plus a GPT2 local transformer).

Quick Start

One-liner with uvx (downloads about 31GB of model files automatically on first run):

uvx moss-tts-local-onnx-cli --text "你好,这是一段测试语音合成。" --reference ref.wav --output output.wav

Installation

# Install with uv
uv pip install moss-tts-local-onnx-cli

# Or with pip
pip install moss-tts-local-onnx-cli

# For ModelScope download support (China mainland):
pip install "moss-tts-local-onnx-cli[modelscope]"

CLI Usage

# Basic voice cloning
moss-tts-local-onnx-cli --text "你好" --reference speaker.wav --output output.wav

# With language hint
moss-tts-local-onnx-cli --text "Hello world" --reference speaker.wav --language English --output output.wav

# Download from ModelScope (China)
moss-tts-local-onnx-cli --text "你好" --reference speaker.wav --source modelscope --output output.wav

# Custom model directory
moss-tts-local-onnx-cli --text "你好" --reference speaker.wav --model-dir /path/to/models --output output.wav
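
When the CLI is driven from a script, it can help to assemble the argument list in one place. The sketch below uses only the flags shown in the examples above (`--text`, `--reference`, `--language`, `--source`, `--model-dir`, `--output`); the helper name `build_command` is hypothetical and not part of the package.

```python
import shlex


def build_command(text, reference, output, language=None, source=None, model_dir=None):
    """Build an argv list for the CLI from the flags documented above.

    Hypothetical helper: it only assembles arguments and does not run anything.
    """
    cmd = ["moss-tts-local-onnx-cli", "--text", text, "--reference", reference]
    if language is not None:
        cmd += ["--language", language]
    if source is not None:
        cmd += ["--source", source]
    if model_dir is not None:
        cmd += ["--model-dir", str(model_dir)]
    cmd += ["--output", output]
    return cmd


if __name__ == "__main__":
    cmd = build_command("Hello world", "speaker.wav", "output.wav", language="English")
    # Print a shell-quoted form for inspection.
    print(shlex.join(cmd))
    # To execute it (this downloads the models on first use):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` rather than a single string avoids shell-quoting problems with non-ASCII or multi-word text.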

Python API

from moss_tts_local_onnx_cli import MossTTS

# Initialize once (auto-downloads models)
tts = MossTTS()

# Generate multiple times without reloading
audio = tts.generate("你好,这是第一段。", reference="ref.wav", language="Chinese")
tts.save(audio, "output1.wav")

audio2 = tts.generate("这是第二段。", reference="ref.wav")
tts.save(audio2, "output2.wav")

# Or generate and save in one call
tts.generate("你好", reference="ref.wav", output="output3.wav")
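
Because the model is loaded once and reused, a batch of segments can be synthesized in a loop. The following is a minimal sketch: `synthesize_batch` and the `segment_NNN.wav` naming are hypothetical, and it assumes that `generate` accepts `language` and `output` together, which the examples above show only separately.

```python
from pathlib import Path


def synthesize_batch(tts, texts, reference, out_dir, language=None):
    """Call tts.generate once per text and return the output paths.

    `tts` is any object with a generate(text, reference=..., output=...) method,
    such as the MossTTS instance created above. Hypothetical helper.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, text in enumerate(texts, start=1):
        out_path = out_dir / f"segment_{index:03d}.wav"
        kwargs = {"reference": reference, "output": str(out_path)}
        if language is not None:
            # Assumption: language and output can be combined in one call.
            kwargs["language"] = language
        tts.generate(text, **kwargs)
        paths.append(out_path)
    return paths
```

With the real package this would be called as `synthesize_batch(MossTTS(), ["你好", "这是第二段。"], "ref.wav", "out")`.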

Audio Codec

The audio codec (MOSS-Audio-Tokenizer-v2) is required to decode the generated audio tokens into a waveform. It is auto-detected when present in the ModelScope cache at ~/.cache/modelscope/hub/models/OpenMOSS/MOSS-Audio-Tokenizer-v2.

To download it:

pip install modelscope
python -c "from modelscope import snapshot_download; snapshot_download('OpenMOSS/MOSS-Audio-Tokenizer-v2')"

Or specify a custom path:

tts = MossTTS(codec_path="/path/to/MOSS-Audio-Tokenizer-v2")
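
To check ahead of time whether the codec is where the package expects it, something like the sketch below can be used. It only mirrors the default location documented above; the package's own detection logic is not shown in this README, and `resolve_codec_path` is a hypothetical helper.

```python
from pathlib import Path

# Default ModelScope cache location documented above.
DEFAULT_CODEC_DIR = (
    Path.home() / ".cache" / "modelscope" / "hub" / "models"
    / "OpenMOSS" / "MOSS-Audio-Tokenizer-v2"
)


def resolve_codec_path(explicit=None, default=DEFAULT_CODEC_DIR):
    """Return the first non-empty codec directory, or None if none is found.

    An explicit path, when given, is tried before the default cache location.
    """
    candidates = []
    if explicit:
        candidates.append(Path(explicit).expanduser())
    candidates.append(Path(default))
    for candidate in candidates:
        if candidate.is_dir() and any(candidate.iterdir()):
            return candidate
    return None
```

If the function returns a path, it can be passed on as `MossTTS(codec_path=str(path))`; if it returns `None`, the codec still needs to be downloaded.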

Model Conversion

To convert the original PyTorch model to ONNX yourself:

# Install export dependencies
pip install torch transformers onnx

# Run export (requires ~18GB RAM)
python scripts/export_onnx.py

# Validate correctness
python scripts/validate.py

See scripts/export_onnx.py for the full conversion pipeline.
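
The contents of scripts/validate.py are not reproduced here. As a general illustration of what validating a converted model usually involves, the sketch below compares a reference output with a converted-model output element by element. The function names and the `atol=1e-3` tolerance are assumptions, not values taken from the project.

```python
def max_abs_diff(reference, candidate):
    """Largest absolute element-wise difference between two equal-length sequences."""
    if len(reference) != len(candidate):
        raise ValueError("outputs have different lengths")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(reference, candidate, atol=1e-3):
    """True when every element differs by at most atol (an assumed tolerance)."""
    return max_abs_diff(reference, candidate) <= atol


if __name__ == "__main__":
    pytorch_logits = [0.1000, 0.2000, -1.5000]  # made-up illustrative values
    onnx_logits = [0.1004, 0.1997, -1.5002]     # made-up illustrative values
    print(outputs_match(pytorch_logits, onnx_logits))
```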

Model Architecture

File                          Description                               Size
global_prefill.onnx           Qwen3 36-layer full-sequence forward      ~15GB
global_decode.onnx            Qwen3 single-token decode with KV cache   ~15GB
local_transformer_first.onnx  GPT2 1-layer (first step)                 290MB
local_transformer.onnx        GPT2 1-layer (with KV cache)              290MB
lm_heads.npz                  LM head weight matrices                   120MB
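
As a quick arithmetic check, the file sizes listed above add up to the roughly 31GB download quoted elsewhere in this README. The sketch treats 1GB as 1000MB and takes the two approximate 15GB figures at face value.

```python
# Sizes from the table above, in GB (assuming 1 GB = 1000 MB).
SIZES_GB = {
    "global_prefill.onnx": 15.0,
    "global_decode.onnx": 15.0,
    "local_transformer_first.onnx": 0.29,
    "local_transformer.onnx": 0.29,
    "lm_heads.npz": 0.12,
}


def total_size_gb(sizes=SIZES_GB):
    """Sum the per-file sizes in GB."""
    return sum(sizes.values())


if __name__ == "__main__":
    print(f"{total_size_gb():.1f} GB")  # prints "30.7 GB"
```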

Requirements

  • Python >= 3.10
  • ~32GB RAM (for loading models)
  • ~31GB disk space (for model files)
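
The requirements above can be checked before starting the first download. This is a rough sketch using only the standard library; the thresholds are the approximate figures from the list, the function names are hypothetical, and the RAM probe relies on `os.sysconf`, which is not available on every platform (it returns `None` in that case).

```python
import os
import shutil
import sys

MIN_PYTHON = (3, 10)
MIN_RAM_GB = 32   # approximate figure from the list above
MIN_DISK_GB = 31  # approximate figure from the list above


def python_ok(version=None):
    """True if the given (major, minor) version, or the running one, is >= 3.10."""
    current = sys.version_info[:2] if version is None else tuple(version[:2])
    return tuple(current) >= MIN_PYTHON


def total_ram_gb():
    """Total physical memory in GB, or None where os.sysconf cannot report it."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None
    return pages * page_size / 1e9


def free_disk_gb(path="."):
    """Free space in GB on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    ram = total_ram_gb()
    print("Python >= 3.10:", python_ok())
    print("RAM:", "unknown" if ram is None else f"{ram:.1f} GB (want ~{MIN_RAM_GB})")
    print(f"Free disk: {free_disk_gb():.1f} GB (want ~{MIN_DISK_GB})")
```

Point `free_disk_gb` at the directory where the models will be stored, since the cache may live on a different volume than the working directory.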

License

MIT
