An implementation of MeloTTS with onnxruntime

Project description

MeloTTS and OpenVoice Tone Clone by ONNX

Introduction

This is an implementation of MeloTTS and OpenVoice Tone Clone on onnxruntime. We restructured the text utilities to speed them up and converted the models to ONNX format. The models are stored in this repository via LFS.

Only the zh-mix-en language is implemented so far. Other languages will come soon.

Usage

# tts demo
from melo_onnx import MeloTTX_ONNX
import soundfile

model_path = "path/to/folder/of/model_tts"
tts = MeloTTX_ONNX(model_path)
audio = tts.speak("今天天气真nice。", tts.speakers[0])

soundfile.write("path/of/result.wav", audio, samplerate=tts.sample_rate)

# Tone clone demo
from melo_onnx import OpenVoiceToneClone_ONNX
import soundfile

tc = OpenVoiceToneClone_ONNX("path/to/folder/of/model_tone_clone")

# Load the reference audio and extract its tone color
tgt, tgt_rate = soundfile.read("path/of/audio_for_tone_color", dtype='float32')
tgt = tc.resample(tgt, tgt_rate)
tgt_tone_color = tc.extract_tone_color(tgt)

# Load the audio to convert and apply the reference tone color
src, src_rate = soundfile.read("path/of/audio_to_change_tone", dtype='float32')
src = tc.resample(src, src_rate)
result = tc.tone_clone(src, tgt_tone_color)
soundfile.write("path/of/result.wav", result, tc.sample_rate)

Models can be downloaded from ModelScope or Hugging Face:

model	repositories	comment
tts	modelscope or huggingface	zh-mix-en
tone clone	modelscope or huggingface

The parameters of the constructor of MeloTTX_ONNX:

model_path str. The path of the folder that stores the model.
execution_provider str. The device for onnxruntime: CUDA, CPU, or others. If it is None, the library chooses the better one between CUDA and CPU.
verbose bool. Set to True to display detailed information while the library is working.
onnx_session_options onnxruntime.SessionOptions. Special options for the ONNX session.
onnx_params dict. Any other parameters you want to pass to onnxruntime.InferenceSession.
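The fallback described for execution_provider (prefer CUDA, otherwise CPU) can be sketched as follows. This is only an illustration of the idea, not the library's actual code: pick_provider is a hypothetical helper, and the names it checks are onnxruntime's standard provider identifiers. Which spelling the constructor itself expects for execution_provider is not stated here, so the result is not passed to it.

```python
# Sketch of a "prefer CUDA, else CPU" choice. pick_provider is a
# hypothetical helper for illustration; it is not part of melo_onnx.

def pick_provider(available):
    """Return the preferred onnxruntime provider name found in `available`."""
    for name in ("CUDAExecutionProvider", "CPUExecutionProvider"):
        if name in available:
            return name
    raise ValueError("neither the CUDA nor the CPU provider is available")


# With onnxruntime installed, the real list comes from:
#   import onnxruntime
#   pick_provider(onnxruntime.get_available_providers())
```

Keeping the choice in a pure function makes it easy to check on a machine without a GPU.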

The parameters of the MeloTTX_ONNX.speak:

text str. The text you want to synthesize.
speaker str. The speaker you want to use.
speed float. The speed of the speech.
sdp_ratio float.
noise_scale float.
noise_scale_w float.
pbar function. A progress-bar wrapper, such as tqdm.
Returns numpy.array. The audio data of the result.
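If soundfile is not available, the array returned by speak can also be written with Python's standard wave module. The sketch below assumes the audio is mono with float samples in the range [-1, 1]; that is an assumption, not something this page confirms. float_to_pcm16 and write_wav are hypothetical helpers, not part of melo_onnx.

```python
import struct
import wave


def float_to_pcm16(samples):
    """Clip floats to [-1, 1] and pack them as little-endian 16-bit PCM."""
    ints = [int(round(max(-1.0, min(1.0, float(s))) * 32767)) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


def write_wav(path, samples, sample_rate):
    """Write float samples to a 16-bit WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # assumption: the audio is mono
        wav.setsampwidth(2)   # 2 bytes per sample, i.e. 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(float_to_pcm16(samples))
```

A call would then look like write_wav("path/of/result.wav", audio, tts.sample_rate).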

Some useful properties of a MeloTTX_ONNX instance:

speakers: [str]. Readonly. The available speakers
sample_rate: int. Readonly. The sample rate of the synthesis result
language: str. Readonly. The language of the current model
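These properties make it easy to audition every voice a model offers. The sketch below uses only speakers and speak as documented above; sample_every_speaker is a hypothetical helper name.

```python
def sample_every_speaker(tts, text):
    """Return {speaker: audio} by calling tts.speak once per available speaker."""
    return {speaker: tts.speak(text, speaker) for speaker in tts.speakers}
```

Each returned array can then be saved with the sample rate given by tts.sample_rate.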

The parameters of the constructor of OpenVoiceToneClone_ONNX:

model_path str. The path of the folder that stores the model.
execution_provider str. The device for onnxruntime: CUDA, CPU, or others. If it is None, the library chooses the better one between CUDA and CPU.
verbose bool. Set to True to display detailed information while the library is working.
onnx_session_options onnxruntime.SessionOptions. Special options for the ONNX session.
onnx_params dict. Any other parameters you want to pass to onnxruntime.InferenceSession.

The parameters of OpenVoiceToneClone_ONNX.extract_tone_color

audio numpy.array. The data of the audio
Returns numpy.array. The tone color vector

The parameters of OpenVoiceToneClone_ONNX.mix_tone_color

color list[numpy.array]. The list of the tone colors you want to mix. Each element should be the result of extract_tone_color.
Returns numpy.array. The tone color vector
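A tone color blended from several reference clips could be built as sketched below. It chains resample, extract_tone_color, and mix_tone_color as documented on this page; blended_tone_color is a hypothetical helper, and the (audio, original_rate) pair layout of clips is an assumption chosen to match what soundfile.read returns.

```python
def blended_tone_color(tc, clips):
    """Mix the tone colors of several (audio, original_rate) reference clips."""
    colors = []
    for audio, original_rate in clips:
        # Resample first, as in the tone clone demo, then extract the color.
        resampled = tc.resample(audio, original_rate)
        colors.append(tc.extract_tone_color(resampled))
    return tc.mix_tone_color(colors)
```

The result can be passed to tone_clone as target_tone_color, like the output of extract_tone_color.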

The parameters of OpenVoiceToneClone_ONNX.tone_clone

audio numpy.array. The audio data whose tone will be changed.
target_tone_color numpy.array. The tone color that you want to clone. It should be the result of extract_tone_color or mix_tone_color.
tau float
Returns numpy.array. The resulting audio.

The parameters of OpenVoiceToneClone_ONNX.resample

audio numpy.array. The source audio you want to resample.
original_rate int. The original sample rate of the source audio.
Returns numpy.array. The audio data after resampling.

Project details

Release history

0.0.2

Oct 27, 2024

0.0.1

Oct 3, 2024

Download files

Source Distribution

melotts_onnx-0.0.2.tar.gz (4.6 MB)

Uploaded Oct 27, 2024 Source

Built Distribution

melotts_onnx-0.0.2-py3-none-any.whl (4.7 MB)

Uploaded Oct 27, 2024 Python 3

File details

Details for the file melotts_onnx-0.0.2.tar.gz.

File metadata

Download URL: melotts_onnx-0.0.2.tar.gz
Upload date: Oct 27, 2024
Size: 4.6 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for melotts_onnx-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`c8678ed0a7f153aab641de8767f58be2c6501950c3c11465bba092fa0176ac8b`
MD5	`668151cb9dd91a3eb434d33cebe67464`
BLAKE2b-256	`14c462463fea19b7213846f04e061be6ed8ef0a249b877f5ac826d24564aa97d`

File details

Details for the file melotts_onnx-0.0.2-py3-none-any.whl.

File metadata

Download URL: melotts_onnx-0.0.2-py3-none-any.whl
Upload date: Oct 27, 2024
Size: 4.7 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for melotts_onnx-0.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`256ffc6b7622ecdbcf86a75427ba573c2d58414466602c99b9784449093772c8`
MD5	`f6acc1953da02321a5510c7ab7e329d2`
BLAKE2b-256	`031f5ab1f0296d49e426a013e850eb8aa2d779bfff3de585f876912d661ce36a`
