
MOSS-TTS-Local-Transformer-v1.5 voice cloning via ONNX Runtime


moss-tts-local-onnx-cli

A command-line tool and Python API for voice cloning with MOSS-TTS-Local-Transformer-v1.5 on ONNX Runtime. The ONNX models are converted from the original PyTorch model (4.13B parameters: a Qwen3 global transformer plus a GPT2 local transformer).

Quick Start

Run it in one line with uvx (the first run auto-downloads about 31GB of model files). The sample text means "Hello, this is a speech synthesis test.":

uvx moss-tts-local-onnx-cli --text "你好,这是一段测试语音合成。" --reference ref.wav --output output.wav
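
Before passing a reference clip to the command above, it can help to confirm that the file is a readable WAV and to see its basic properties. The following is a minimal sketch using Python's standard `wave` module, which only reads uncompressed PCM WAV files. The helper name `describe_wav` is hypothetical, and this README does not state which sample rates, channel counts, or clip lengths the tool expects, so the function only reports what it finds.

```python
import wave


def describe_wav(path):
    """Return basic properties of an uncompressed PCM WAV file."""
    with wave.open(str(path), "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wf.getsampwidth(),
            # Duration in seconds; guard against a zero sample rate.
            "duration_s": frames / rate if rate else 0.0,
        }


# Example: print(describe_wav("ref.wav"))
```

If `wave.open` raises `wave.Error`, the file is not plain PCM WAV. Whether the tool accepts other formats is not documented here.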

Installation

# Install with uv
uv pip install moss-tts-local-onnx-cli

# Or with pip
pip install moss-tts-local-onnx-cli

# For ModelScope download support (China mainland):
pip install "moss-tts-local-onnx-cli[modelscope]"

CLI Usage

# Basic voice cloning ("你好" means "Hello")
moss-tts-local-onnx-cli --text "你好" --reference speaker.wav --output output.wav

# With language hint
moss-tts-local-onnx-cli --text "Hello world" --reference speaker.wav --language English --output output.wav

# Download from ModelScope (China)
moss-tts-local-onnx-cli --text "你好" --reference speaker.wav --source modelscope --output output.wav

# Custom model directory
moss-tts-local-onnx-cli --text "你好" --reference speaker.wav --model-dir /path/to/models --output output.wav
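
To drive the CLI from a script, the flags shown above can be assembled into an argument list and passed to `subprocess.run`. This is a sketch only: the helper names `build_command` and `run_cli` are hypothetical, and it uses just the flags that appear in the examples above (`--text`, `--reference`, `--output`, `--language`, `--source`, `--model-dir`).

```python
import subprocess


def build_command(text, reference, output, language=None, source=None, model_dir=None):
    """Build the argv list for moss-tts-local-onnx-cli from the documented flags."""
    cmd = [
        "moss-tts-local-onnx-cli",
        "--text", text,
        "--reference", reference,
        "--output", output,
    ]
    if language:
        cmd += ["--language", language]
    if source:
        cmd += ["--source", source]
    if model_dir:
        cmd += ["--model-dir", model_dir]
    return cmd


def run_cli(*args, **kwargs):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(*args, **kwargs), check=True)


# Example: run_cli("Hello world", "speaker.wav", "output.wav", language="English")
```

Passing a list rather than a shell string avoids quoting problems with non-ASCII or punctuated text.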

Python API

from moss_tts_local_onnx_cli import MossTTS

# Initialize once (auto-downloads models)
tts = MossTTS()

# Generate multiple times without reloading the models
# Text: "Hello, this is the first segment."
audio = tts.generate("你好,这是第一段。", reference="ref.wav", language="Chinese")
tts.save(audio, "output1.wav")

# Text: "This is the second segment."
audio2 = tts.generate("这是第二段。", reference="ref.wav")
tts.save(audio2, "output2.wav")

# Or generate and save in one call
tts.generate("你好", reference="ref.wav", output="output3.wav")
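
Because the models load once per `MossTTS` instance, a batch of texts can reuse the same object. The sketch below assumes the `generate(text, reference=..., language=..., output=...)` call shown above; combining `language` and `output` in a single call is an assumption, since the examples only show them separately. The helper name `synthesize_batch` and the `segment_NNN.wav` naming scheme are hypothetical.

```python
from pathlib import Path


def synthesize_batch(tts, texts, reference, out_dir, language=None):
    """Call tts.generate once per text, writing numbered WAV files into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(texts, start=1):
        out_path = out_dir / f"segment_{i:03d}.wav"
        kwargs = {"reference": reference, "output": str(out_path)}
        if language is not None:
            # Assumption: language and output can be passed together.
            kwargs["language"] = language
        tts.generate(text, **kwargs)
        paths.append(out_path)
    return paths


# Example (requires the package and downloaded models):
#   from moss_tts_local_onnx_cli import MossTTS
#   synthesize_batch(MossTTS(), ["Hello", "World"], "ref.wav", "out", language="English")
```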

Audio Codec

The audio codec (MOSS-Audio-Tokenizer-v2) is required for decoding audio tokens. It is auto-detected if it is present in the ModelScope cache at ~/.cache/modelscope/hub/models/OpenMOSS/MOSS-Audio-Tokenizer-v2.

To download it:

pip install modelscope
python -c "from modelscope import snapshot_download; snapshot_download('OpenMOSS/MOSS-Audio-Tokenizer-v2')"

Or specify a custom path:

tts = MossTTS(codec_path="/path/to/MOSS-Audio-Tokenizer-v2")
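
To check which codec directory is available before constructing `MossTTS`, the lookup described above can be mirrored in a few lines. This is an illustrative sketch, not the package's actual detection logic: the helper name `resolve_codec_path` is hypothetical, and falling back to the default cache location when a custom path is missing is a design choice made here for the example.

```python
from pathlib import Path

# Default ModelScope cache location named in this README.
DEFAULT_CODEC_DIR = (
    Path.home() / ".cache" / "modelscope" / "hub" / "models"
    / "OpenMOSS" / "MOSS-Audio-Tokenizer-v2"
)


def resolve_codec_path(custom=None, default=DEFAULT_CODEC_DIR):
    """Return the first existing codec directory, or None if none exists."""
    candidates = [Path(custom)] if custom else []
    candidates.append(Path(default))
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None


# Example:
#   path = resolve_codec_path()
#   if path is None: download the codec as shown above, then retry.
```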

Model Conversion

To convert the original PyTorch model to ONNX yourself:

# Install export dependencies
pip install torch transformers onnx

# Run export (requires ~18GB RAM)
python scripts/export_onnx.py

# Validate correctness
python scripts/validate.py

See scripts/export_onnx.py for the full conversion pipeline.
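
The contents of scripts/validate.py are not shown in this README. As a general illustration of how an exported model is commonly checked against its source, the sketch below compares two flat sequences of output values by maximum absolute difference. The function names and the tolerance of 1e-3 are assumptions chosen for the example, not values taken from the project.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(reference) != len(candidate):
        raise ValueError("output lengths differ")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(reference, candidate, atol=1e-3):
    """True when every element of candidate is within atol of reference."""
    return max_abs_diff(reference, candidate) <= atol


# Example: compare values from the PyTorch model with values from the ONNX model.
#   outputs_match(pytorch_values, onnx_values)
```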

Model Architecture

ONNX Model                     Description                                Size
global_prefill.onnx            Qwen3 36-layer full-sequence forward       ~15GB
global_decode.onnx             Qwen3 single-token decode with KV cache    ~15GB
local_transformer_first.onnx   GPT2 1-layer (first step)                  290MB
local_transformer.onnx         GPT2 1-layer (with KV cache)               290MB
lm_heads.npz                   LM head weight matrices                    120MB
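
The file names and sizes listed above can be used to confirm that a custom model directory is complete and to see where the roughly 31GB total comes from. This is a sketch under assumptions: it supposes the files sit directly inside the model directory, and the helper name `missing_model_files` is hypothetical. The actual directory may contain additional files that are not listed here.

```python
from pathlib import Path

# Approximate sizes in GB, taken from the list above.
EXPECTED_FILES_GB = {
    "global_prefill.onnx": 15.0,
    "global_decode.onnx": 15.0,
    "local_transformer_first.onnx": 0.29,
    "local_transformer.onnx": 0.29,
    "lm_heads.npz": 0.12,
}


def missing_model_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES_GB if not (model_dir / name).is_file()]


total_gb = sum(EXPECTED_FILES_GB.values())
print(f"approximate total: {total_gb:.1f} GB")

# Example: print(missing_model_files("/path/to/models"))
```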

Requirements

  • Python >= 3.10
  • ~32GB RAM (for loading models)
  • ~31GB disk space (for model files)
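
The Python version and disk space requirements above can be checked before starting a 31GB download. This is a minimal sketch using only the standard library. The helper name `check_requirements` is hypothetical, and RAM is not checked because the standard library has no portable way to query it.

```python
import shutil
import sys


def check_requirements(path=".", min_free_gb=31, min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        found = ".".join(str(n) for n in sys.version_info[:2])
        needed = ".".join(str(n) for n in min_python)
        problems.append(f"Python {needed}+ required, found {found}")
    # Free space on the filesystem that contains `path`, in decimal gigabytes.
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < min_free_gb:
        problems.append(f"about {min_free_gb} GB free disk needed, found {free_gb:.1f} GB")
    return problems


# Example:
#   for problem in check_requirements():
#       print("warning:", problem)
```

Point `path` at the filesystem where the models will be stored, since that may differ from the current directory.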

License

MIT
