Skip to main content

Run Retrieval-based Voice Conversion training and inference with ease.

Project description

ZeroRVC

Run Retrieval-based Voice Conversion training and inference with ease.

Features

  • Dataset Preparation
  • Hugging Face Datasets Integration
  • Hugging Face Accelerate Integration
  • Trainer API
  • Inference API
    • Index Support
  • Tensorboard Support
  • FP16 Support

Dataset Preparation

ZeroRVC provides a simple API to prepare your dataset for training. You only need to provide the path to your audio files. The feature extraction models will be downloaded automatically, or you can provide your own with the hubert and rmvpe arguments.

from zerorvc import prepare

dataset = prepare("./my-voices")

Since dataset is a Hugging Face Dataset object, you can easily push it to the Hugging Face Hub.

dataset.push_to_hub("my-rvc-dataset", token=HF_TOKEN)

And bring the preprocessed dataset back with the following code.

from datasets import load_dataset

dataset = load_dataset("my-rvc-dataset")

Training

Once you've prepared your dataset, you can start training your model with the RVCTrainer.

from tqdm import tqdm
from zerorvc import RVCTrainer

epochs = 100
trainer = RVCTrainer(checkpoint_dir="./checkpoints")
training = tqdm(
    trainer.train(
        dataset=dataset["train"], # preprocessed dataset
        resume_from=trainer.latest_checkpoint(), # resume training from the latest checkpoint if any
        epochs=epochs, batch_size=8
    )
)

# Training loop: iterate over epochs
for checkpoint in training:
    training.set_description(
        f"Epoch {checkpoint.epoch}/{epochs} loss: (gen: {checkpoint.loss_gen:.4f}, fm: {checkpoint.loss_fm:.4f}, mel: {checkpoint.loss_mel:.4f}, kl: {checkpoint.loss_kl:.4f}, disc: {checkpoint.loss_disc:.4f})"
    )

    # Save checkpoint every 10 epochs
    if checkpoint.epoch % 10 == 0:
        checkpoint.save(checkpoint_dir=trainer.checkpoint_dir)
        # Directly push the synthesizer to the Hugging Face Hub
        checkpoint.G.push_to_hub("my-rvc-model", token=HF_TOKEN)

print("Training completed.")

Inference

ZeroRVC provides an easy API to convert your voice with the trained model.

from zerorvc import RVC
import soundfile as sf

rvc = RVC.from_pretrained("my-rvc-model")
samples = rvc.convert("test.mp3")
sf.write("output.wav", samples, rvc.sr)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

zerorvc-0.0.5.tar.gz (34.4 kB view details)

Uploaded Source

Built Distribution

zerorvc-0.0.5-py3-none-any.whl (43.5 kB view details)

Uploaded Python 3

File details

Details for the file zerorvc-0.0.5.tar.gz.

File metadata

  • Download URL: zerorvc-0.0.5.tar.gz
  • Upload date:
  • Size: 34.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.0

File hashes

Hashes for zerorvc-0.0.5.tar.gz
Algorithm Hash digest
SHA256 0e9106fec5d878662ed8c42610c145ff24e7a9d5d8221cde1d093755f1448f63
MD5 0e794c6a48b01897025dbdf985ce7796
BLAKE2b-256 b475a14957bdf29b1b876e0fb5796d41181a6b73c20976e0e142f34a32308a59

See more details on using hashes here.

File details

Details for the file zerorvc-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: zerorvc-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 43.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.0

File hashes

Hashes for zerorvc-0.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 d36f0c781949ecc9f9a44b8adeb5e89951276e33b9b1265c3c2976571871596f
MD5 6a0d32b470e62abaa872008cb51498e7
BLAKE2b-256 5e481a3af5c50de9f1145bab2039bd88e0cd7c552f33380be7e21fcf3cefa06e

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page