Skip to main content

Run Retrieval-based Voice Conversion training and inference with ease.

Project description

ZeroRVC

Run Retrieval-based Voice Conversion training and inference with ease.

Features

  • Dataset Preparation
  • Hugging Face Datasets Integration
  • Hugging Face Accelerate Integration
  • Trainer API
  • Inference API
    • Index Support
  • Tensorboard Support
  • FP16 Support

Dataset Preparation

ZeroRVC provides a simple API to prepare your dataset for training. You only need to provide the path to your audio files. The feature extraction models will be downloaded automatically, or you can provide your own with the hubert and rmvpe arguments.

from zerorvc import prepare

dataset = prepare("./my-voices")

Since dataset is a Hugging Face Dataset object, you can easily push it to the Hugging Face Hub.

dataset.push_to_hub("my-rvc-dataset", token=HF_TOKEN)

And bring the preprocessed dataset back with the following code.

from datasets import load_dataset

dataset = load_dataset("my-rvc-dataset")

Training

Once you've prepared your dataset, you can start training your model with the RVCTrainer.

from tqdm import tqdm
from zerorvc import RVCTrainer

epochs = 100
trainer = RVCTrainer(checkpoint_dir="./checkpoints")
training = tqdm(
    trainer.train(
        dataset=dataset["train"], # preprocessed dataset
        resume_from=trainer.latest_checkpoint(), # resume training from the latest checkpoint if any
        epochs=epochs, batch_size=8
    )
)

# Training loop: iterate over epochs
for checkpoint in training:
    training.set_description(
        f"Epoch {checkpoint.epoch}/{epochs} loss: (gen: {checkpoint.loss_gen:.4f}, fm: {checkpoint.loss_fm:.4f}, mel: {checkpoint.loss_mel:.4f}, kl: {checkpoint.loss_kl:.4f}, disc: {checkpoint.loss_disc:.4f})"
    )

    # Save checkpoint every 10 epochs
    if checkpoint.epoch % 10 == 0:
        checkpoint.save(checkpoint_dir=trainer.checkpoint_dir)
        # Directly push the synthesizer to the Hugging Face Hub
        checkpoint.G.push_to_hub("my-rvc-model", token=HF_TOKEN)

print("Training completed.")

Inference

ZeroRVC provides an easy API to convert your voice with the trained model.

from zerorvc import RVC
import soundfile as sf

rvc = RVC.from_pretrained("my-rvc-model")
samples = rvc.convert("test.mp3")
sf.write("output.wav", samples, rvc.sr)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

zerorvc-0.0.4.tar.gz (33.6 kB view details)

Uploaded Source

Built Distribution

zerorvc-0.0.4-py3-none-any.whl (42.4 kB view details)

Uploaded Python 3

File details

Details for the file zerorvc-0.0.4.tar.gz.

File metadata

  • Download URL: zerorvc-0.0.4.tar.gz
  • Upload date:
  • Size: 33.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.0

File hashes

Hashes for zerorvc-0.0.4.tar.gz
Algorithm Hash digest
SHA256 dcf9c58eaa39c208bb3b2d002a7dab25781b263330fbdde30808a4baa731b41d
MD5 4ac262e3c55ab23b00c91ba4097112f2
BLAKE2b-256 9b483bf9869c685193dbe66eace53a9338940837a8e8f0de62980fda2a732dbc

See more details on using hashes here.

File details

Details for the file zerorvc-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: zerorvc-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 42.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.0

File hashes

Hashes for zerorvc-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 d7f16931bd014e5d828a85ff6b2230b756e780e1d28e74c52ac7dce6802fa89f
MD5 3d724d02552fe5c0413ee48e9664ef43
BLAKE2b-256 5b874a0979f1451d636c288a187301b07763689efddf6a7b92739f734c2f7650

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page