GPT-SoVITS ONNX Inference Engine & Model Converter

 ___       ___  ___  ________   ________  ___      ___ ________     ___    ___ 
|\  \     |\  \|\  \|\   ___  \|\   __  \|\  \    /  /|\   __  \   |\  \  /  /|
\ \  \    \ \  \\\  \ \  \\ \  \ \  \|\  \ \  \  /  / | \  \|\  \  \ \  \/  / /
 \ \  \    \ \  \\\  \ \  \\ \  \ \   __  \ \  \/  / / \ \  \\\  \  \ \    / / 
  \ \  \____\ \  \\\  \ \  \\ \  \ \  \ \  \ \    / /   \ \  \\\  \  /     \/  
   \ \_______\ \_______\ \__\\ \__\ \__\ \__\ \__/ /     \ \_______\/  /\   \  
    \|_______|\|_______|\|__| \|__|\|__|\|__|\|__|/       \|_______/__/ /\ __\ 
                                                                   |__|/ \|__| 

🔮 LunaVox: GPT-SoVITS Lightweight Inference Engine

Experience near-instantaneous speech synthesis on your CPU


LunaVox is a lightweight inference engine built on the open-source TTS project GPT-SoVITS. It combines TTS inference, ONNX model conversion, an API server, and other core features, with a focus on performance and convenience.

  • ✅ Supported Model Version: GPT-SoVITS V2
  • ✅ Supported Language: Japanese
  • ✅ Supported Python Version: >= 3.9

🚀 Performance Advantages

LunaVox optimizes the original model for outstanding CPU performance.

Feature                   🔮 LunaVox   Official PyTorch Model   Official ONNX Model
First Inference Latency   1.13s        1.35s                    3.57s
Runtime Size              ~200MB       ~several GB              Similar to LunaVox
Model Size                ~230MB       Similar to LunaVox       ~750MB

📝 Note: Because GPU inference does not significantly reduce first-packet latency compared with CPU, we currently provide only a CPU version to ensure the best out-of-the-box experience.

📝 Latency Test Info: All latency figures are averages over a test set of 100 Japanese sentences (~20 characters each), measured on an Intel Core i7-13620H CPU.
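For readers who want to run a comparable measurement on their own hardware, here is a minimal timing sketch. It is not the benchmark script behind the table above: `average_latency` and its `synthesize` parameter are hypothetical names, and it times each full call rather than the first packet, so the numbers may not be directly comparable.

```python
import time
from statistics import mean
from typing import Callable, Iterable


def average_latency(synthesize: Callable[[str], object], sentences: Iterable[str]) -> float:
    """Return the mean wall-clock time, in seconds, of one `synthesize` call.

    `synthesize` is any callable that takes a sentence and runs TTS on it
    (hypothetical hook; wrap whichever engine you want to measure).
    """
    timings = []
    for sentence in sentences:
        start = time.perf_counter()
        synthesize(sentence)
        timings.append(time.perf_counter() - start)
    if not timings:
        raise ValueError("sentences must not be empty")
    return mean(timings)
```

To time LunaVox itself, `synthesize` could be a small wrapper that calls `lunavox.tts(...)` with `play=False` so that audio playback is not included in the measurement.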


🏁 QuickStart

⚠️ Important: It is recommended to run LunaVox in Administrator mode to avoid potential performance degradation.

📦 Installation

Install via pip:

pip install lunavox-tts

📝 Installation may fail while installing pyopenjtalk. pyopenjtalk includes C extensions, and its publisher does not currently provide pre-compiled binary packages (wheels), so it has to be built locally. On Windows, this requires installing Visual Studio Build Tools with the "Desktop development with C++" workload selected during installation.
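After installing, a quick way to confirm that the build step succeeded is to check whether both packages can be found by the interpreter. The sketch below uses only the standard library; `is_importable` is a hypothetical helper name, and the module names are the ones used in this README.

```python
from importlib.util import find_spec


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("pyopenjtalk", "lunavox_tts"):
        status = "found" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```

If pyopenjtalk is reported as missing, the C++ build tools described above are the first thing to check.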

⚡️ Quick Tryout

No GPT-SoVITS model yet? No problem! LunaVox includes predefined speaker characters for immediate use without any model files. Run the script below to hear it in action:

python Tutorial/quick_tryout.py

This script automatically downloads the required dependencies and plays a sample audio clip.

🎤 TTS Best Practices

A simple TTS inference example:

import lunavox_tts as lunavox

# Step 1: Load character voice model
lunavox.load_character(
    character_name='<CHARACTER_NAME>',  # Replace with your character name
    onnx_model_dir=r"<PATH_TO_CHARACTER_ONNX_MODEL_DIR>",  # Folder containing ONNX model
)

# Step 2: Set reference audio (for emotion and intonation cloning)
lunavox.set_reference_audio(
    character_name='<CHARACTER_NAME>',  # Must match loaded character name
    audio_path=r"<PATH_TO_REFERENCE_AUDIO>",  # Path to reference audio
    audio_text="<REFERENCE_AUDIO_TEXT>",  # Corresponding text
)

# Step 3: Run TTS inference and generate audio
lunavox.tts(
    character_name='<CHARACTER_NAME>',  # Must match loaded character
    text="<TEXT_TO_SYNTHESIZE>",  # Text to synthesize
    play=True,  # Play audio directly
    save_path="<OUTPUT_AUDIO_PATH>",  # Output audio file path
)

print("🎉 Audio generation complete!")
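When several lines need to be synthesized with the same character, the call from Step 3 can be wrapped in a loop. The following is a sketch under assumptions: `synthesize_batch` is a hypothetical helper, not part of LunaVox, and the `.wav` extension is a guess at the output format. The TTS function is passed in as an argument so the helper relies only on the keyword arguments shown in the example above.

```python
from pathlib import Path
from typing import Callable, List, Sequence


def synthesize_batch(
    tts: Callable[..., object],
    character_name: str,
    texts: Sequence[str],
    out_dir: str,
) -> List[Path]:
    """Synthesize each text to its own numbered file and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, text in enumerate(texts, start=1):
        # Assumption: the engine writes WAV files; adjust the suffix if not.
        save_path = out / f"line_{index:03d}.wav"
        tts(
            character_name=character_name,
            text=text,
            play=False,  # skip playback when generating many files
            save_path=str(save_path),
        )
        paths.append(save_path)
    return paths
```

With a character loaded and reference audio set as in Steps 1 and 2, this would be called as `synthesize_batch(lunavox.tts, '<CHARACTER_NAME>', ["<TEXT_1>", "<TEXT_2>"], r"<OUTPUT_DIR>")`.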

🔧 Model Conversion

To convert original GPT-SoVITS models for LunaVox, ensure torch is installed:

pip install torch

Tip: convert_to_onnx currently supports only V2 models.

Use the built-in conversion tool:

import lunavox_tts as lunavox

lunavox.convert_to_onnx(
    torch_pth_path=r"<YOUR .PTH MODEL FILE>",  # Replace with your .pth file
    torch_ckpt_path=r"<YOUR .CKPT CHECKPOINT FILE>",  # Replace with your .ckpt file
    output_dir=r"<ONNX MODEL OUTPUT DIRECTORY>"  # Directory to save ONNX model
)
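Before running a conversion, it can help to validate the two input paths so that a typo fails fast with a readable message. This is a minimal sketch: `check_conversion_inputs` is a hypothetical helper, not part of LunaVox, and the extension checks simply mirror the `.pth` and `.ckpt` names used in the example above.

```python
from pathlib import Path
from typing import List


def check_conversion_inputs(torch_pth_path: str, torch_ckpt_path: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    expected = (
        ("torch_pth_path", torch_pth_path, ".pth"),
        ("torch_ckpt_path", torch_ckpt_path, ".ckpt"),
    )
    for label, raw, suffix in expected:
        path = Path(raw)
        if path.suffix.lower() != suffix:
            problems.append(f"{label}: expected a {suffix} file, got {path.name!r}")
        if not path.is_file():
            problems.append(f"{label}: file not found: {path}")
    return problems
```

If the returned list is empty, the same two paths can be passed on to `lunavox.convert_to_onnx(...)`. Otherwise, print the problems and stop.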

🌐 Launch FastAPI Server

LunaVox includes a lightweight FastAPI server:

import lunavox_tts as lunavox

# Start server
lunavox.start_server(
    host="0.0.0.0",  # Host address
    port=8000,  # Port
    workers=1  # Number of workers
)

For request formats and API details, see our API Server Tutorial.
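When the server is started from a separate process or script, a client may want to wait until the port accepts connections before sending requests. The sketch below makes no assumptions about the server's endpoints and only checks TCP reachability. `wait_for_port` is a hypothetical helper, not part of LunaVox.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.2) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); retry after a short pause.
            time.sleep(interval)
    return False
```

Note that `0.0.0.0` in the server example means "listen on all interfaces". A client on the same machine would typically connect to `127.0.0.1`, for example `wait_for_port("127.0.0.1", 8000)`.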


⌨️ Launch CMD Client

LunaVox provides a simple command-line client for quick testing and interactive use:

import lunavox_tts as lunavox

# Launch CLI client
lunavox.launch_command_line_client()

📝 Roadmap

  • 🌐 Language Expansion

    • Add support for Chinese and English.
  • 🚀 Model Compatibility

    • Support for V2Proplus, V3, V4, and more.
  • 📦 Easy Deployment

    • Release Docker images.
    • Provide out-of-the-box Windows / Linux bundles.
