Python wrapper for fast inference with RVC

infer_rvc_python

A streamlined Python wrapper for fast inference with RVC (Retrieval-based Voice Conversion). It is intended for inference only.

Getting Started

Installation

pip install infer_rvc_python

Usage

Initialize the base class

from infer_rvc_python import BaseLoader

converter = BaseLoader(only_cpu=False, hubert_path=None, rmvpe_path=None)

Define a tag and select the model along with other parameters.

converter.apply_conf(
    tag="yoimiya",
    file_model="model.pth",
    pitch_algo="rmvpe+",
    pitch_lvl=0,
    file_index="model.index",
    index_influence=0.66,
    respiration_median_filtering=3,
    envelope_ratio=0.25,
    consonant_breath_protection=0.33,
)
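
When several voices share most of their settings, the keyword arguments above can be assembled once and reused. The following is a minimal sketch, not part of infer_rvc_python: `build_conf` and `DEFAULT_CONF` are hypothetical helpers, the default values are simply the ones shown in the example above, and the file names for the second voice are placeholders.

```python
# Hypothetical helper, not part of infer_rvc_python: collects apply_conf
# keyword arguments so several tags can share the same base settings.
DEFAULT_CONF = {
    "pitch_algo": "rmvpe+",
    "pitch_lvl": 0,
    "index_influence": 0.66,
    "respiration_median_filtering": 3,
    "envelope_ratio": 0.25,
    "consonant_breath_protection": 0.33,
}


def build_conf(tag, file_model, file_index, **overrides):
    """Return a dict of keyword arguments for converter.apply_conf."""
    unknown = set(overrides) - set(DEFAULT_CONF)
    if unknown:
        # Catch misspelled setting names before they reach the converter.
        raise ValueError(f"Unknown settings: {sorted(unknown)}")
    conf = dict(DEFAULT_CONF)
    conf.update(overrides)
    conf.update(tag=tag, file_model=file_model, file_index=file_index)
    return conf


confs = [
    build_conf("yoimiya", "model.pth", "model.index"),
    # Placeholder file names for a second voice with a different pitch_lvl.
    build_conf("sunshine", "sunshine.pth", "sunshine.index", pitch_lvl=2),
]

# With a BaseLoader instance named `converter`:
# for conf in confs:
#     converter.apply_conf(**conf)
```

Rejecting unknown keys up front means a typo in a setting name fails immediately instead of surfacing later inside the library.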

Select the audio file or files you want to convert.

# audio_files = ["audio.wav", "haha.mp3"]
audio_files = "myaudio.mp3"

# speakers_list = ["sunshine", "yoimiya"]
speakers_list = "yoimiya"

Perform inference

result = converter(
    audio_files,
    speakers_list,
    overwrite=False,
    parallel_workers=4
)

The result is a list with the paths of the converted files.
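
Because the result is a list of paths, it can be checked with the standard library before further processing. This is a minimal sketch: `split_existing` is a hypothetical helper, and the path in the usage line is made up for illustration.

```python
from pathlib import Path


def split_existing(paths):
    """Split a list of output paths into (found, missing) lists."""
    found, missing = [], []
    for p in paths:
        (found if Path(p).is_file() else missing).append(str(p))
    return found, missing


# In real use, pass the list returned by converter(...).
found, missing = split_existing(["definitely_missing_output.wav"])
print(f"{len(found)} converted file(s) found, {len(missing)} missing")
```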

Unload models

converter.unload_models()

Preloading model (Reduces inference time)

The first call preloads the model for the given tag. Subsequent inference calls with the same tag reuse the preloaded components, which reduces inference time.

result_array, sample_rate = converter.generate_from_cache(
    audio_data="myaudiofile_path.wav",
    tag="yoimiya",
)
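
One way to observe the effect of preloading is to time the first call and a later call with the same tag. The sketch below uses only the standard library; `timed` is a hypothetical helper, and `sum` stands in for the real call so the snippet runs without any model files.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in call so the sketch runs on its own.
result, seconds = timed(sum, [1, 2, 3])
print(f"call took {seconds:.6f} s")

# With a configured converter, the same helper could wrap the real call:
# (result_array, sample_rate), seconds = timed(
#     converter.generate_from_cache,
#     audio_data="myaudiofile_path.wav",
#     tag="yoimiya",
# )
```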

The audio_data parameter can be either a file path or a tuple of the form (array_data, sampling_rate).

# array_data and source_sample_rate come from your own audio, for example:
# array_data = np.array([-22, -22, -15, ..., 0, 0, 0], dtype=np.int16)
# source_sample_rate = 16000
data = (array_data, source_sample_rate)
result_array, sample_rate = converter.generate_from_cache(
    audio_data=data,
    tag="yoimiya",
)
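
For a quick check without an audio file on disk, the tuple form can be built from a synthetic signal. This is a minimal sketch assuming NumPy is installed; `make_test_tone` is a hypothetical helper, and the int16 dtype and 16000 Hz rate simply mirror the commented example above.

```python
import numpy as np


def make_test_tone(freq_hz=440.0, duration_s=1.0, sample_rate=16000):
    """Return (int16 sine wave, sample_rate) at half of full scale."""
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    tone = 0.5 * np.sin(2 * np.pi * freq_hz * t)
    return (tone * 32767).astype(np.int16), sample_rate


array_data, source_sample_rate = make_test_tone()
data = (array_data, source_sample_rate)
print(array_data.dtype, array_data.shape, source_sample_rate)

# The tuple can then be passed as audio_data:
# converter.generate_from_cache(audio_data=data, tag="yoimiya")
```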

In both cases the result is a tuple (array, sample_rate), which you can save to a file or play in a notebook.

# Save to a file
import soundfile as sf

sf.write(
    file="output_file.wav",
    samplerate=sample_rate,
    data=result_array,
)

# Play in a notebook (requires IPython)
from IPython.display import Audio

Audio(result_array, rate=sample_rate)

When settings or the tag are altered, the model requires reloading. To maintain multiple preloaded models, you can instantiate another BaseLoader object.

second_converter = BaseLoader()
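
If more than two voices need to stay preloaded, the loader objects can be kept in a small registry keyed by tag. The following is a sketch under assumptions: `LoaderPool` is a hypothetical class, not part of infer_rvc_python, and `object` is used as a stand-in factory so the snippet runs without the library installed.

```python
class LoaderPool:
    """Keep one loader object per tag, created on first use."""

    def __init__(self, loader_factory):
        self._factory = loader_factory
        self._loaders = {}

    def get(self, tag):
        # Create the loader the first time a tag is requested, then reuse it.
        if tag not in self._loaders:
            self._loaders[tag] = self._factory()
        return self._loaders[tag]

    def tags(self):
        return sorted(self._loaders)


# Stand-in factory; in real use, pass BaseLoader (or a lambda that
# builds one with the desired arguments) instead of object.
pool = LoaderPool(loader_factory=object)
first = pool.get("yoimiya")
again = pool.get("yoimiya")
other = pool.get("sunshine")
print(pool.tags())
```

Each loader obtained this way would still need its own apply_conf call for its tag before generate_from_cache is used, and every additional preloaded model presumably adds to memory use.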

License

This project is licensed under the MIT License.

Disclaimer

This software is provided for educational and research purposes only. The authors and contributors of this project do not endorse or encourage any misuse or unethical use of this software. Any use of this software for purposes other than those intended is solely at the user's own risk. The authors and contributors shall not be held responsible for any damages or liabilities arising from inappropriate use of this software.
