An implementation of MeloTTS using onnxruntime

MeloTTS and OpenVoice Tone Clone by ONNX

Introduction

This is an implementation of MeloTTS and OpenVoice Tone Clone using onnxruntime. We restructured the text utilities to make them faster and converted the models to ONNX format. The models are stored in this repository with Git LFS.

So far only the zh-mix-en language is implemented. Other languages will follow soon.

Usage

# tts demo
from melo_onnx import MeloTTX_ONNX
import soundfile

model_path = "path/to/folder/of/model_tts"
tts = MeloTTX_ONNX(model_path)
audio = tts.speak("今天天气真nice。", tts.speakers[0])

soundfile.write("path/of/result.wav", audio, samplerate=tts.sample_rate)

# Tone clone demo
from melo_onnx import OpenVoiceToneClone_ONNX
import soundfile

tc = OpenVoiceToneClone_ONNX("path/to/folder/of/model_tone_clone")

# Extract the tone color of the reference voice.
# soundfile.read returns a (data, sample_rate) tuple.
tgt, tgt_rate = soundfile.read("path/of/audio_for_tone_color", dtype='float32')
tgt = tc.resample(tgt, tgt_rate)
tgt_tone_color = tc.extract_tone_color(tgt)

# Resample the source audio, then convert it to the reference tone color.
src, src_rate = soundfile.read("path/of/audio_to_change_tone", dtype='float32')
src = tc.resample(src, src_rate)
result = tc.tone_clone(src, tgt_tone_color)
soundfile.write("path/of/result.wav", result, tc.sample_rate)

Models can be downloaded from ModelScope or Hugging Face:

model        repositories                 comment
tts          ModelScope or Hugging Face   zh-mix-en
tone clone   ModelScope or Hugging Face

Parameters of the MeloTTX_ONNX constructor:

  • model_path str. Path of the folder that stores the model.
  • execution_provider str. The device onnxruntime runs on: CUDA, CPU, or another provider. If None, the library chooses the better of CUDA and CPU.
  • verbose bool. Set to True to display detailed information while the library is working.
  • onnx_session_options onnxruntime.SessionOptions. Custom options for the ONNX session.
  • onnx_params dict. Any other parameters to pass to onnxruntime.InferenceSession.
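The constructor parameters above could be combined as in the following sketch. It assumes the documented parameter names are accepted as keyword arguments and that "CPU" is a valid execution_provider value, as the list suggests. `build_tts_kwargs` and `load_tts_on_cpu` are hypothetical helpers for illustration, not part of the library.

```python
def build_tts_kwargs(model_path, device=None, verbose=False, session_options=None):
    """Collect constructor arguments, omitting optional ones that were not set.

    Hypothetical helper: the keyword names mirror the parameter list above and
    are assumed to be accepted as keywords by MeloTTX_ONNX.
    """
    kwargs = {"model_path": model_path, "verbose": verbose}
    if device is not None:
        # For example "CPU" or "CUDA"; leaving it out lets the library choose.
        kwargs["execution_provider"] = device
    if session_options is not None:
        kwargs["onnx_session_options"] = session_options
    return kwargs


def load_tts_on_cpu(model_path, threads=2):
    """Create a MeloTTX_ONNX instance pinned to the CPU (illustrative only)."""
    # Imported lazily so build_tts_kwargs stays usable without the packages.
    import onnxruntime
    from melo_onnx import MeloTTX_ONNX

    options = onnxruntime.SessionOptions()
    options.intra_op_num_threads = threads  # threads used inside one operator
    return MeloTTX_ONNX(**build_tts_kwargs(
        model_path, device="CPU", verbose=True, session_options=options))
```

Calling `load_tts_on_cpu("path/to/folder/of/model_tts")` would then return an instance ready for `speak`, provided the model folder exists.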

Parameters of MeloTTX_ONNX.speak:

  • text str. The text to synthesize.
  • speaker str. The speaker to use.
  • speed float. The speed of the speech.
  • sdp_ratio float.
  • noise_scale float.
  • noise_scale_w float.
  • pbar function. A progress-bar wrapper such as tqdm.
  • Returns numpy.array. The audio data of the result.

Some useful properties of a MeloTTX_ONNX instance:

  • speakers: [str]. Read-only. The available speakers.
  • sample_rate: int. Read-only. The sample rate of the synthesized audio.
  • language: str. Read-only. The language of the current model.
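As a sketch of how these properties might be used together with speak, the function below renders the same text once per available speaker. `render_all_speakers` and its file naming scheme are hypothetical, and it assumes `speed` can be passed as a keyword argument and that speaker names are safe to use in file names.

```python
def render_all_speakers(tts, text, write_wav, speed=1.0):
    """Synthesize `text` once for every speaker the model offers.

    `tts` is expected to behave like a MeloTTX_ONNX instance (speakers,
    sample_rate, language, speak). `write_wav(path, audio, sample_rate)` is a
    writer supplied by the caller, for example soundfile.write.
    Returns the list of file names that were written.
    """
    written = []
    for speaker in tts.speakers:
        # Assumption: speed is accepted as a keyword argument.
        audio = tts.speak(text, speaker, speed=speed)
        path = f"{tts.language}_{speaker}.wav"  # hypothetical naming scheme
        write_wav(path, audio, tts.sample_rate)
        written.append(path)
    return written
```

With a loaded model this could be called as `render_all_speakers(tts, "今天天气真nice。", soundfile.write)`.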

Parameters of the OpenVoiceToneClone_ONNX constructor:

  • model_path str. Path of the folder that stores the model.
  • execution_provider str. The device onnxruntime runs on: CUDA, CPU, or another provider. If None, the library chooses the better of CUDA and CPU.
  • verbose bool. Set to True to display detailed information while the library is working.
  • onnx_session_options onnxruntime.SessionOptions. Custom options for the ONNX session.
  • onnx_params dict. Any other parameters to pass to onnxruntime.InferenceSession.

Parameters of OpenVoiceToneClone_ONNX.extract_tone_color:

  • audio numpy.array. The audio data.
  • Returns numpy.array. The tone color vector.

Parameters of OpenVoiceToneClone_ONNX.mix_tone_color:

  • color list[numpy.array]. The list of tone colors to mix. Each element should be a result of extract_tone_color.
  • Returns numpy.array. The mixed tone color vector.

Parameters of OpenVoiceToneClone_ONNX.tone_clone:

  • audio numpy.array. The audio data whose tone will be changed.
  • target_tone_color numpy.array. The tone color to clone. It should be a result of extract_tone_color or mix_tone_color.
  • tau float
  • Returns numpy.array. The resulting audio.
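The methods described above can be chained to convert a recording to a blend of several reference voices. The following is a minimal sketch under the assumption that the methods take the documented arguments positionally; `clone_with_mixed_tone` is a hypothetical helper, not part of the library.

```python
def clone_with_mixed_tone(tc, source_audio, source_rate, references):
    """Convert `source_audio` to a blend of one or more reference voices.

    `tc` is expected to behave like an OpenVoiceToneClone_ONNX instance.
    `references` is a non-empty list of (audio, sample_rate) pairs, such as
    the tuples returned by soundfile.read(path, dtype="float32").
    """
    if not references:
        raise ValueError("at least one reference recording is required")

    colors = []
    for audio, rate in references:
        # Bring each reference to the model's rate before extracting its color.
        resampled = tc.resample(audio, rate)
        colors.append(tc.extract_tone_color(resampled))

    # A single reference needs no mixing.
    target = colors[0] if len(colors) == 1 else tc.mix_tone_color(colors)

    source = tc.resample(source_audio, source_rate)
    return tc.tone_clone(source, target)
```

The returned audio can be written with `soundfile.write("path/of/result.wav", result, tc.sample_rate)`, as in the tone clone demo.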

Parameters of OpenVoiceToneClone_ONNX.resample:

  • audio numpy.array. The source audio to resample.
  • original_rate int. The original sample rate of the source audio.
  • Returns numpy.array. The resampled audio data.
