ZeroRVC

Run Retrieval-based Voice Conversion training and inference with ease.

Features

  • Dataset Preparation
  • Hugging Face Datasets Integration
  • Hugging Face Accelerate Integration
  • Trainer API
  • Inference API
    • Index Support
  • TensorBoard Support
  • FP16 Support

Dataset Preparation

ZeroRVC provides a simple API to prepare your dataset for training: you only need to provide the path to a folder of audio files. The feature extraction models are downloaded automatically, or you can supply your own through the hubert and rmvpe arguments.

from zerorvc import prepare

dataset = prepare("./my-voices")
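
Before calling prepare, it can help to confirm which files the folder actually contains. The sketch below is not part of ZeroRVC: list_audio_files and the AUDIO_EXTENSIONS set are hypothetical names, and the extensions are an assumption for illustration, so adjust them to your data.

```python
from pathlib import Path

# Assumed extensions for illustration; not a list taken from ZeroRVC.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def list_audio_files(root: str) -> list[Path]:
    """Return audio files under root (searched recursively), sorted by path."""
    base = Path(root)
    if not base.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    return sorted(
        p for p in base.rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
```

For example, len(list_audio_files("./my-voices")) gives the number of candidate files found under the folder.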

Since dataset is a Hugging Face Dataset object, you can easily push it to the Hugging Face Hub.

dataset.push_to_hub("my-rvc-dataset", token=HF_TOKEN)
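
HF_TOKEN is not defined in the snippet above. One common way to supply it, shown here as a sketch, is to read it from an environment variable rather than hard-coding it. The helper name read_hf_token and the variable name HF_TOKEN are assumptions made for illustration.

```python
import os


def read_hf_token(var_name: str = "HF_TOKEN") -> str:
    """Read an access token from the environment, failing early if it is missing."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return token
```

With this in place, HF_TOKEN = read_hf_token() can precede the push_to_hub call.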

To load the preprocessed dataset back from the Hub, use load_dataset.

from datasets import load_dataset

# If the dataset lives under your account, pass the full repo id, e.g. "<username>/my-rvc-dataset"
dataset = load_dataset("my-rvc-dataset")

Training

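Once you've prepared your dataset, you can start training your model with RVCTrainer. In the example below, trainer.train is iterated over directly and yields one checkpoint per epoch, so logging and saving stay under your control.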

from tqdm import tqdm
from zerorvc import RVCTrainer

epochs = 100
trainer = RVCTrainer(checkpoint_dir="./checkpoints")
training = tqdm(
    trainer.train(
        dataset=dataset["train"],  # preprocessed dataset
        resume_from=trainer.latest_checkpoint(),  # resume from the latest checkpoint, if any
        epochs=epochs,
        batch_size=8,
    )
)

# Training loop: iterate over epochs
for checkpoint in training:
    training.set_description(
        f"Epoch {checkpoint.epoch}/{epochs} loss: (gen: {checkpoint.loss_gen:.4f}, fm: {checkpoint.loss_fm:.4f}, mel: {checkpoint.loss_mel:.4f}, kl: {checkpoint.loss_kl:.4f}, disc: {checkpoint.loss_disc:.4f})"
    )

    # Save checkpoint every 10 epochs
    if checkpoint.epoch % 10 == 0:
        checkpoint.save(checkpoint_dir=trainer.checkpoint_dir)
        # Directly push the synthesizer to the Hugging Face Hub
        checkpoint.G.push_to_hub("my-rvc-model", token=HF_TOKEN)

print("Training completed.")
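
The loop above saves whenever the epoch number is a multiple of 10, so the final epoch is only saved when epochs is itself a multiple of 10. A small sketch of a save rule that also covers the last epoch follows. should_save is a hypothetical helper, not a ZeroRVC function, and it assumes checkpoint.epoch counts from 1.

```python
def should_save(epoch: int, total_epochs: int, every: int = 10) -> bool:
    """Return True on every `every`-th epoch and on the final epoch."""
    if every <= 0:
        raise ValueError("every must be a positive integer")
    return epoch % every == 0 or epoch == total_epochs


# Epochs that would trigger a save for a 25-epoch run.
saved = [e for e in range(1, 26) if should_save(e, total_epochs=25)]
print(saved)  # [10, 20, 25]
```

In the training loop, the condition checkpoint.epoch % 10 == 0 could then be replaced with should_save(checkpoint.epoch, epochs).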

Inference

ZeroRVC provides a simple API for converting audio with a trained model.

from zerorvc import RVC
import soundfile as sf

rvc = RVC.from_pretrained("my-rvc-model")
samples = rvc.convert("test.mp3")
sf.write("output.wav", samples, rvc.sr)
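
To sanity-check the converted file, the standard-library wave module can report its sample rate and duration. This sketch assumes output.wav was written as integer PCM, which the wave module can read. wav_info is a hypothetical helper, not part of ZeroRVC or soundfile.

```python
import wave


def wav_info(path: str) -> tuple[int, float]:
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        sample_rate = wf.getframerate()
        duration = wf.getnframes() / sample_rate
    return sample_rate, duration
```

Calling wav_info("output.wav") after the conversion should report the same sample rate that was passed to sf.write.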
