ZeroRVC
Run Retrieval-based Voice Conversion training and inference with ease.
Features
- Dataset Preparation
- Hugging Face Datasets Integration
- Hugging Face Accelerate Integration
- Trainer API
- Inference API
- Index Support
- Tensorboard Support
- FP16 Support
Dataset Preparation
ZeroRVC provides a simple API to prepare your dataset for training. You only need to provide the path to your audio files. The feature extraction models are downloaded automatically, or you can provide your own with the hubert and rmvpe arguments.
from zerorvc import prepare
dataset = prepare("./my-voices")
Since dataset is a Hugging Face Dataset object, you can easily push it to the Hugging Face Hub. HF_TOKEN here stands for your Hugging Face access token.
dataset.push_to_hub("my-rvc-dataset", token=HF_TOKEN)
You can load the preprocessed dataset back from the Hub with the following code.
from datasets import load_dataset
dataset = load_dataset("my-rvc-dataset")
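The push_to_hub call above uses HF_TOKEN without defining it. One common way to supply it, shown here as a minimal sketch rather than anything ZeroRVC prescribes, is to read it from an environment variable. The helper name get_hf_token and the variable name HF_TOKEN are assumptions made for illustration.

```python
import os


def get_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the access token stored in an environment variable.

    `get_hf_token` is a hypothetical helper, not part of ZeroRVC.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail early with a clear message instead of failing inside the upload.
        raise RuntimeError(
            f"Set the {env_var} environment variable to a Hugging Face access token."
        )
    return token


# Usage (before calling dataset.push_to_hub):
# HF_TOKEN = get_hf_token()
```

Keeping the token out of the script avoids committing it to version control by accident.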
Training
Once you've prepared your dataset, you can start training your model with the RVCTrainer.
from tqdm import tqdm
from zerorvc import RVCTrainer
epochs = 100
trainer = RVCTrainer(checkpoint_dir="./checkpoints")
training = tqdm(
    trainer.train(
        dataset=dataset["train"],  # preprocessed dataset
        resume_from=trainer.latest_checkpoint(),  # resume training from the latest checkpoint if any
        epochs=epochs,
        batch_size=8,
    )
)
# Training loop: iterate over epochs
for checkpoint in training:
    training.set_description(
        f"Epoch {checkpoint.epoch}/{epochs} loss: (gen: {checkpoint.loss_gen:.4f}, fm: {checkpoint.loss_fm:.4f}, mel: {checkpoint.loss_mel:.4f}, kl: {checkpoint.loss_kl:.4f}, disc: {checkpoint.loss_disc:.4f})"
    )
    # Save checkpoint every 10 epochs
    if checkpoint.epoch % 10 == 0:
        checkpoint.save(checkpoint_dir=trainer.checkpoint_dir)
        # Directly push the synthesizer to the Hugging Face Hub
        checkpoint.G.push_to_hub("my-rvc-model", token=HF_TOKEN)
print("Training completed.")
You can also push the whole GAN weights to the Hugging Face Hub.
checkpoint.push_to_hub("my-rvc-model", token=HF_TOKEN)
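The training loop above iterates over trainer.train(...), which yields one checkpoint per epoch and leaves the decision of when to save to the caller. The sketch below illustrates that pattern without ZeroRVC, using a plain generator. FakeCheckpoint, fake_train and run are hypothetical stand-ins, and resume_from is simplified to an epoch number, whereas the real API takes the result of trainer.latest_checkpoint().

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class FakeCheckpoint:
    """Hypothetical stand-in for the object yielded by trainer.train()."""
    epoch: int
    loss_mel: float


def fake_train(epochs: int, resume_from: int = 0) -> Iterator[FakeCheckpoint]:
    """Yield one checkpoint per epoch, starting after `resume_from`."""
    for epoch in range(resume_from + 1, epochs + 1):
        # Made-up, steadily decreasing loss so there is something to report.
        yield FakeCheckpoint(epoch=epoch, loss_mel=1.0 / epoch)


def run(epochs: int, save_every: int, resume_from: int = 0) -> List[int]:
    """Consume the generator and record the epochs at which a save would occur."""
    saved = []
    for checkpoint in fake_train(epochs, resume_from=resume_from):
        if checkpoint.epoch % save_every == 0:
            # The real loop would call checkpoint.save(...) here.
            saved.append(checkpoint.epoch)
    return saved


if __name__ == "__main__":
    print(run(epochs=30, save_every=10))                  # [10, 20, 30]
    print(run(epochs=30, save_every=10, resume_from=20))  # [30]
```

Because the generator does no saving of its own, the save interval, progress reporting and Hub uploads all stay in the caller's loop.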
Inference
ZeroRVC provides an easy API to convert your voice with the trained model.
from zerorvc import RVC
import soundfile as sf
rvc = RVC.from_pretrained("my-rvc-model")
samples = rvc.convert("test.mp3")
sf.write("output.wav", samples, rvc.sr)
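To convert several files in one go, the convert and write steps above can be wrapped in a small loop. This is a sketch under stated assumptions: convert_folder is a hypothetical helper, not a ZeroRVC function, and the set of file suffixes is a guess that should be adjusted to the formats you actually use.

```python
from pathlib import Path
from typing import Callable, List

# Assumption: which input formats to pick up; adjust as needed.
AUDIO_SUFFIXES = {".wav", ".mp3", ".flac"}


def convert_folder(in_dir, out_dir, convert: Callable, write: Callable) -> List[Path]:
    """Convert every audio file in `in_dir` and write the results to `out_dir`.

    `convert(path)` returns the converted samples and `write(path, samples)`
    stores them. Both are injected so the helper does not depend on any library.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(in_dir.iterdir()):
        if not src.is_file() or src.suffix.lower() not in AUDIO_SUFFIXES:
            continue  # skip subfolders and non-audio files
        # Note: inputs sharing a stem (a.mp3, a.wav) map to the same output name.
        dst = out_dir / (src.stem + ".wav")
        write(str(dst), convert(str(src)))
        written.append(dst)
    return written
```

With the objects from the example above, a call could look like convert_folder("inputs", "outputs", rvc.convert, lambda path, samples: sf.write(path, samples, rvc.sr)).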
File details
Details for the file zerorvc-0.0.8.tar.gz.
File metadata
- Download URL: zerorvc-0.0.8.tar.gz
- Upload date:
- Size: 34.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | d1e9443ab22b216c262933b7ea06b96fc986a40f24b96deefec4ff98d2fbc948
MD5 | fd59fd120042ad9024c512130ddfea68
BLAKE2b-256 | 915d38489b5290c05f7828553da075540e026af4cd34cd1ae04ae0764fd2f8f0
File details
Details for the file zerorvc-0.0.8-py3-none-any.whl.
File metadata
- Download URL: zerorvc-0.0.8-py3-none-any.whl
- Upload date:
- Size: 43.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4e3056b7fb1c9cc7bb75268b897d93d0aa3834254678f6b086ebf609e695b18c
MD5 | cc67b43381747566414df417a5e1f53f
BLAKE2b-256 | 60cc8d1b12cf3dd9a484b23c72c13c77bae7c5a52ceb45e20a4a6187cbdcf8c5