vtuna
Tuning adapters for visual generation models, a library highly inspired by torchtune.
Introduction
vtuna is a PyTorch library for easily authoring, fine-tuning, and experimenting with visual generation models, especially text-to-image diffusion models.
diffusers is the go-to library for state-of-the-art pretrained diffusion models. However, if your goal is to develop diffusion adapters that do not yet exist, you will need vtuna. The main goal of vtuna is to support researchers in freely exploring new adapter technologies for visual generative models.
vtuna provides:
- Easy-to-use and hackable training recipes for popular fine-tuning and adapting techniques (LoRA, IP-Adapter, ControlNet, ELLA, ...).
- YAML configs for easily configuring training, evaluation, or inference recipes.
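To make the adapter idea above concrete, here is a minimal sketch of a LoRA-style linear layer in plain PyTorch: the pretrained weights are frozen and a trainable low-rank update is added on top. The class and all names are illustrative only, not vtuna's actual API.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update:
    y = W x + (alpha / r) * B A x. Illustrative sketch, not vtuna's API."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        # A is small random, B starts at zero, so training begins at the base model
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(16, 8), r=4)
x = torch.randn(2, 16)
y = layer(x)  # same shape as the base layer's output
```

Because `lora_b` is initialized to zero, the adapted layer initially reproduces the base layer exactly, and only the small `A`/`B` matrices receive gradients.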
vtuna focuses on integrating with popular tools and libraries from the ecosystem. These are just a few examples, with more under development:
- Hugging Face Diffusers for diffusion models and pipelines
- Hugging Face Hub for accessing model weights
- Hugging Face Datasets for access to training and evaluation datasets
- DeepSpeed for distributed training
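As an illustration of how these pieces could be tied together, a YAML recipe config might look like the sketch below. Every key, value, and identifier here is hypothetical, not vtuna's actual schema.

```yaml
# Hypothetical recipe config -- field names are illustrative only.
model:
  base: stabilityai/stable-diffusion-2-1   # Hugging Face Hub model id
adapter:
  type: lora
  rank: 8
dataset:
  name: lambdalabs/pokemon-blip-captions   # Hugging Face Datasets id
training:
  epochs: 10
  lr: 1.0e-4
  distributed: deepspeed
```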
Get Started
vtuna is currently under development.
Design Principles
vtuna embodies PyTorch’s design philosophy, especially "usability over everything else".
Simplicity and Extensibility
vtuna is designed to be easy to understand, use and extend.
- Composition over implementation inheritance - layers of inheritance for code reuse make the code hard to read and extend
- Code duplication is preferred over unnecessary abstractions
- Modular building blocks over monolithic components
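The composition principle can be sketched in a few lines: a training recipe is assembled from independent callables rather than by subclassing a monolithic trainer. All names below are illustrative, not vtuna's actual API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable

# Hypothetical sketch of "composition over inheritance": the recipe holds its
# model, optimizer, and data parts as plain components that can each be swapped.
@dataclass
class TrainRecipe:
    forward: Callable[[Any], float]   # model component: batch -> loss
    step: Callable[[float], None]     # optimizer component: applies an update
    batches: Iterable[Any]            # data component

    def run(self) -> int:
        count = 0
        for batch in self.batches:
            self.step(self.forward(batch))
            count += 1
        return count

# Swapping any component requires no subclassing -- just pass a different one.
losses: list = []
recipe = TrainRecipe(forward=lambda b: float(b), step=losses.append, batches=[1, 2, 3])
```

Because each part is injected rather than inherited, replacing (say) the optimizer step for an experiment touches one argument instead of a class hierarchy.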
Acknowledgements
This repository is highly inspired by torchtune.
License
vtuna is released under the Apache License 2.0.