Unifying Generative Autoencoders in Python


pythae

This library implements some of the most common (Variational) Autoencoder models. In particular, it makes it possible to run benchmark experiments and comparisons by training the models with the same autoencoding neural network architecture. The "make your own autoencoder" feature lets you train any of these models with your own data and your own Encoder and Decoder neural networks.

Installation

To install the latest version of this library, run the following using pip:

$ pip install git+https://github.com/clementchadebec/benchmark_VAE.git

Alternatively, you can clone the GitHub repo to access the tests, tutorials and scripts

$ git clone https://github.com/clementchadebec/benchmark_VAE.git

and install the library

$ cd benchmark_VAE
$ pip install -e .
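
To confirm that the installation succeeded, the installed version can be queried from the environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name `pythae` is assumed from the package title above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "pythae" is assumed to be the distribution name registered by the install.
    version = installed_version("pythae")
    print(version if version is not None else "pythae is not installed")
```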

Available Models

Below is the list of the models currently implemented in the library.

Models	Paper	Official Implementation
Autoencoder (AE)
Variational Autoencoder (VAE)	link
Beta Variational Autoencoder (BetaVAE)	link
VAE with Linear Normalizing Flows (VAE_LinNF)	link
VAE with Inverse Autoregressive Flows (VAE_IAF)	link	link
Disentangled Beta Variational Autoencoder (DisentangledBetaVAE)	link
Disentangling by Factorising (FactorVAE)	link
Beta-TC-VAE (BetaTCVAE)	link	link
Importance Weighted Autoencoder (IWAE)	link	link
VAE with perceptual metric similarity (MSSSIM_VAE)	link
Wasserstein Autoencoder (WAE)	link	link
Info Variational Autoencoder (INFOVAE_MMD)	link
VAMP Autoencoder (VAMP)	link	link
Hyperspherical VAE (SVAE)	link	link
Adversarial Autoencoder (Adversarial_AE)	link
Variational Autoencoder GAN (VAEGAN) 🥗	link	link
Vector Quantized VAE (VQVAE)	link	link
Hamiltonian VAE (HVAE)	link	link
Regularized AE with L2 decoder param (RAE_L2)	link	link
Regularized AE with gradient penalty (RAE_GP)	link	link
Riemannian Hamiltonian VAE (RHVAE)	link	link

See reconstruction and generation results for all aforementioned models.

Available Samplers

Below is the list of the samplers currently implemented in the library.

Samplers	Models	Paper	Official Implementation
Normal prior (NormalSampler)	all models	link
Gaussian mixture (GaussianMixtureSampler)	all models	link	link
Two stage VAE sampler (TwoStageVAESampler)	all VAE based models	link	link
Unit sphere uniform sampler (HypersphereUniformSampler)	SVAE	link	link
VAMP prior sampler (VAMPSampler)	VAMP	link	link
Manifold sampler (RHVAESampler)	RHVAE	link	link
Masked Autoregressive Flow Sampler (MAFSampler)	all models	link	link
Inverse Autoregressive Flow Sampler (IAFSampler)	all models	link	link
PixelCNN (PixelCNNSampler)	VQVAE	link
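
The "Models" column of the table above can be encoded as a small lookup to check which samplers apply to a given model. This is only a sketch: `SAMPLER_RULES` and `compatible_samplers` are hypothetical names that are not part of the library, and whether a model counts as "VAE based" is left to the caller.

```python
from typing import Dict, List

# Rules transcribed from the table above:
# "all" = any model, "vae" = any VAE-based model, otherwise a single model name.
SAMPLER_RULES: Dict[str, str] = {
    "NormalSampler": "all",
    "GaussianMixtureSampler": "all",
    "TwoStageVAESampler": "vae",
    "HypersphereUniformSampler": "SVAE",
    "VAMPSampler": "VAMP",
    "RHVAESampler": "RHVAE",
    "MAFSampler": "all",
    "IAFSampler": "all",
    "PixelCNNSampler": "VQVAE",
}


def compatible_samplers(model_name: str, is_vae_based: bool) -> List[str]:
    """List the samplers the table marks as usable with the given model."""
    result = []
    for sampler, rule in SAMPLER_RULES.items():
        if rule == "all":
            result.append(sampler)
        elif rule == "vae" and is_vae_based:
            result.append(sampler)
        elif rule == model_name:
            result.append(sampler)
    return result


if __name__ == "__main__":
    print(compatible_samplers("VAMP", is_vae_based=True))
```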

Launching a model training

To launch a model training, you only need to call a TrainingPipeline instance.

>>> from pythae.pipelines import TrainingPipeline
>>> from pythae.models import VAE, VAEConfig
>>> from pythae.trainers import BaseTrainerConfig

>>> # Set up the training configuration
>>> my_training_config = BaseTrainerConfig(
...	output_dir='my_model',
...	num_epochs=50,
...	learning_rate=1e-3,
...	batch_size=200,
...	steps_saving=None
... )
>>> # Set up the model configuration 
>>> my_vae_config = VAEConfig(
...	input_dim=(1, 28, 28),
...	latent_dim=10
... )
>>> # Build the model
>>> my_vae_model = VAE(
...	model_config=my_vae_config
... )
>>> # Build the Pipeline
>>> pipeline = TrainingPipeline(
... 	training_config=my_training_config,
... 	model=my_vae_model
... )
>>> # Launch the Pipeline
>>> pipeline(
...	train_data=your_train_data, # must be torch.Tensor or np.array 
...	eval_data=your_eval_data # must be torch.Tensor or np.array
... )

At the end of training, the best model weights, model configuration and training configuration are stored in a final_model folder available in my_model/MODEL_NAME_training_YYYY-MM-DD_hh-mm-ss (with my_model being the output_dir argument of the BaseTrainerConfig). If you further set the steps_saving argument to a certain value, folders named checkpoint_epoch_k containing the best model weights, optimizer, scheduler, configuration and training configuration at epoch k will also appear in my_model/MODEL_NAME_training_YYYY-MM-DD_hh-mm-ss.
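
Since every run gets its own timestamped folder, it can be handy to locate the most recent one programmatically. The sketch below relies only on the folder naming described above (MODEL_NAME_training_YYYY-MM-DD_hh-mm-ss, final_model, checkpoint_epoch_k); the helper names `latest_run_dir` and `checkpoint_epochs` are hypothetical and not part of the library.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import List, Optional

# Folder names as described above: MODEL_NAME_training_YYYY-MM-DD_hh-mm-ss
_RUN_PATTERN = re.compile(
    r"^(?P<model>.+)_training_(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})$"
)
_CKPT_PATTERN = re.compile(r"^checkpoint_epoch_(\d+)$")


def latest_run_dir(output_dir: str) -> Optional[Path]:
    """Return the most recent training folder in output_dir, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    runs = []
    for child in root.iterdir():
        match = _RUN_PATTERN.match(child.name)
        if child.is_dir() and match:
            stamp = datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S")
            runs.append((stamp, child))
    if not runs:
        return None
    # Compare on the parsed timestamp only.
    return max(runs, key=lambda item: item[0])[1]


def checkpoint_epochs(run_dir: Path) -> List[int]:
    """Return the epochs k for which a checkpoint_epoch_k folder exists."""
    epochs = []
    for child in run_dir.iterdir():
        match = _CKPT_PATTERN.match(child.name)
        if child.is_dir() and match:
            epochs.append(int(match.group(1)))
    return sorted(epochs)
```

The path `latest_run_dir('my_model') / 'final_model'` is then the folder to reload the trained model from.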

Launching a training on benchmark datasets

We also provide an example training script here that can be used to train the models on benchmark datasets (mnist, cifar10, celeba ...). The script can be launched with the following command line

python training.py --dataset mnist --model_name ae --model_config 'configs/ae_config.json' --training_config 'configs/base_training_config.json'

See README.md for further details on this script
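
To train several models in a row, the command above can be assembled from Python. This is a sketch under a stated assumption: model config files are taken to follow the `configs/<model_name>_config.json` naming seen in the example command, which may not hold for every model. The helper name `build_training_command` is hypothetical.

```python
import shlex
from typing import List


def build_training_command(
    dataset: str,
    model_name: str,
    training_config: str = "configs/base_training_config.json",
) -> List[str]:
    """Build the argument list for the training script shown above."""
    # Assumption: config files are named configs/<model_name>_config.json.
    model_config = f"configs/{model_name}_config.json"
    return [
        "python", "training.py",
        "--dataset", dataset,
        "--model_name", model_name,
        "--model_config", model_config,
        "--training_config", training_config,
    ]


if __name__ == "__main__":
    for name in ("ae", "vae"):
        # Print the command; pass the list to subprocess.run to execute it.
        print(shlex.join(build_training_command("mnist", name)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the folder that contains the script.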

Launching data generation

Using the `GenerationPipeline`

The easiest way to launch a data generation from a trained model is to use the built-in GenerationPipeline provided in Pythae. Say you want to generate 100 samples using a MAFSampler: all you have to do is 1) reload the trained model, 2) define the sampler's configuration and 3) create and launch the GenerationPipeline as follows

>>> from pythae.models import AutoModel
>>> from pythae.samplers import MAFSamplerConfig
>>> from pythae.pipelines import GenerationPipeline
>>> from pythae.trainers import BaseTrainerConfig
>>> # Retrieve the trained model
>>> my_trained_vae = AutoModel.load_from_folder(
...	'path/to/your/trained/model'
... )
>>> my_sampler_config = MAFSamplerConfig(
...	n_made_blocks=2,
...	n_hidden_in_made=3,
...	hidden_size=128
... )
>>> # Build the pipeline
>>> pipe = GenerationPipeline(
...	model=my_trained_vae,
...	sampler_config=my_sampler_config
... )
>>> # Launch data generation
>>> generated_samples = pipe(
...	num_samples=100,
...	return_gen=True, # If false returns nothing
...	train_data=train_data, # Needed to fit the sampler
...	eval_data=eval_data, # Needed to fit the sampler
...	training_config=BaseTrainerConfig(num_epochs=200) # TrainingConfig to use to fit the sampler
... )

Using the Samplers

Alternatively, you can launch the data generation process from a trained model directly with the sampler. For instance, to generate new data with your sampler, run the following.

>>> from pythae.models import AutoModel
>>> from pythae.samplers import NormalSampler
>>> # Retrieve the trained model
>>> my_trained_vae = AutoModel.load_from_folder(
...	'path/to/your/trained/model'
... )
>>> # Define your sampler
>>> my_sampler = NormalSampler(
...	model=my_trained_vae
... )
>>> # Generate samples
>>> gen_data = my_sampler.sample(
...	num_samples=50,
...	batch_size=10,
...	output_dir=None,
...	return_gen=True
... )

If you set output_dir to a specific path, the generated images are saved as .png files named 00000000.png, 00000001.png ... A sampler can be used with any model it is suited to. For instance, a GaussianMixtureSampler instance can be used to generate from any model, but a VAMPSampler is only usable with a VAMP model. Check here to see which ones apply to your model. Be careful: some samplers, such as the GaussianMixtureSampler, may need to be fitted by calling the fit method before use. Below is an example for the GaussianMixtureSampler.

>>> from pythae.models import AutoModel
>>> from pythae.samplers import GaussianMixtureSampler, GaussianMixtureSamplerConfig
>>> # Retrieve the trained model
>>> my_trained_vae = AutoModel.load_from_folder(
...	'path/to/your/trained/model'
... )
>>> # Define your sampler
>>> gmm_sampler_config = GaussianMixtureSamplerConfig(
...	n_components=10
... )
>>> gmm_sampler = GaussianMixtureSampler(
...	sampler_config=gmm_sampler_config,
...	model=my_trained_vae
... )
>>> # Fit the sampler
>>> gmm_sampler.fit(train_dataset)
>>> # Generate samples
>>> gen_data = gmm_sampler.sample(
...	num_samples=50,
...	batch_size=10,
...	output_dir=None,
...	return_gen=True
... )
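
When output_dir is set, the saved samples can be collected back in generation order by relying on the zero-padded file names (00000000.png, 00000001.png ...) mentioned above. This is a minimal sketch using only the standard library; the helper name `list_generated_images` is hypothetical.

```python
from pathlib import Path
from typing import List


def list_generated_images(output_dir: str) -> List[Path]:
    """Return the saved .png samples sorted by their numeric index."""
    # Keep only files whose stem is purely numeric, e.g. 00000042.png.
    images = [p for p in Path(output_dir).glob("*.png") if p.stem.isdigit()]
    return sorted(images, key=lambda p: int(p.stem))
```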

Define your own Autoencoder architecture

Pythae lets you define your own neural networks within the VAE models. For instance, say you want to train a Wasserstein AE with a specific encoder and decoder; you can do the following:

>>> import torch
>>> from pythae.models.nn import BaseEncoder, BaseDecoder
>>> from pythae.models.base.base_utils import ModelOutput
>>> class My_Encoder(BaseEncoder):
...	def __init__(self, args=None): # Args is a ModelConfig instance
...		BaseEncoder.__init__(self)
...		self.layers = my_nn_layers()
...		
...	def forward(self, x:torch.Tensor) -> ModelOutput:
...		out = self.layers(x)
...		output = ModelOutput(
...			embedding=out # Set the output from the encoder in a ModelOutput instance 
...		)
...		return output
...
>>> class My_Decoder(BaseDecoder):
...	def __init__(self, args=None):
...		BaseDecoder.__init__(self)
...		self.layers = my_nn_layers()
...		
...	def forward(self, x:torch.Tensor) -> ModelOutput:
...		out = self.layers(x)
...		output = ModelOutput(
...			reconstruction=out # Set the output from the decoder in a ModelOutput instance
...		)
...		return output
...
>>> my_encoder = My_Encoder()
>>> my_decoder = My_Decoder()

And now build the model

>>> from pythae.models import WAE_MMD, WAE_MMD_Config
>>> # Set up the model configuration 
>>> my_wae_config = WAE_MMD_Config(
...	input_dim=(1, 28, 28),
...	latent_dim=10
... )
>>> # Build the model
>>> my_wae_model = WAE_MMD(
...	model_config=my_wae_config,
...	encoder=my_encoder, # pass your encoder as argument when building the model
...	decoder=my_decoder # pass your decoder as argument when building the model
... )

Important note 1: For all AE-based models (AE, WAE, RAE_L2, RAE_GP), both the encoder and decoder must return a ModelOutput instance. For the encoder, the ModelOutput instance must contain the embeddings under the key embedding. For the decoder, the ModelOutput instance must contain the reconstructions under the key reconstruction.

Important note 2: For all VAE-based models (VAE, BetaVAE, IWAE, HVAE, VAMP, RHVAE), both the encoder and decoder must return a ModelOutput instance. For the encoder, the ModelOutput instance must contain the embeddings and log-covariance matrices (of shape batch_size x latent_space_dim) under the keys embedding and log_covariance, respectively. For the decoder, the ModelOutput instance must contain the reconstructions under the key reconstruction.
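
The two notes above can be turned into a quick sanity check to run on a custom encoder before training. This is a sketch, not part of the library: `check_encoder_output` is a hypothetical helper, it assumes the encoder output supports `in` and key lookup like a dictionary, and it assumes the embeddings share the batch_size x latent_space_dim shape stated for the log-covariances.

```python
from typing import Any, Mapping


def check_encoder_output(
    output: Mapping[str, Any],
    batch_size: int,
    latent_dim: int,
    vae_based: bool,
) -> None:
    """Raise if the encoder output misses a required key or has a wrong shape."""
    required = ["embedding"]
    if vae_based:
        # VAE-based models also need the log-covariance (note 2).
        required.append("log_covariance")
    for key in required:
        if key not in output:
            raise KeyError(f"encoder output is missing the key '{key}'")
        # Works with any object exposing .shape (e.g. torch.Tensor, np.ndarray).
        shape = tuple(output[key].shape)
        if shape != (batch_size, latent_dim):
            raise ValueError(
                f"'{key}' has shape {shape}, expected {(batch_size, latent_dim)}"
            )
```

A typical use is to pass one batch through the encoder and call the check on the result before building the model.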

Using benchmark neural nets

You can also find predefined neural network architectures for the most common data sets (i.e. MNIST, CIFAR, CELEBA ...) that can be loaded as follows

>>> from pythae.models.nn.benchmark.mnist import (
...	Encoder_Conv_AE_MNIST, # For AE based model (only return embeddings)
...	Encoder_Conv_VAE_MNIST, # For VAE based model (return embeddings and log_covariances)
...	Decoder_Conv_AE_MNIST
... )

Replace mnist with cifar or celeba to access the other neural nets.

Getting your hands on the code

To help you understand how pythae works and how you can train your models with this library, we also provide tutorials:

making_your_own_autoencoder.ipynb shows you how to pass your own networks to the models implemented in pythae
models_training folder provides notebooks showing how to train each implemented model and how to sample from it using pythae.samplers.
scripts folder provides in particular an example of a training script to train the models on benchmark data sets (mnist, cifar10, celeba ...)

Dealing with issues

If you experience any issues while running the code, or want to request new features/models, please open an issue on GitHub.

Contributing 🚀

Want to contribute to this library by adding a model or a sampler, or simply by fixing a bug? That's awesome! Thank you! Please see CONTRIBUTING.md to follow the main contributing guidelines.

Results

Reconstruction

First let's have a look at the reconstructed samples taken from the evaluation set.

Models	MNIST	CELEBA
Eval data
AE
VAE
Beta-VAE
VAE Lin NF
VAE IAF
Disentangled Beta-VAE
FactorVAE
BetaTCVAE
IWAE
MSSSIM_VAE
WAE
INFO VAE
VAMP
SVAE
Adversarial_AE
VAE_GAN
VQVAE
HVAE
RAE_L2
RAE_GP
Riemannian Hamiltonian VAE (RHVAE)

Generation

Here, we show samples generated with each model implemented in the library and with different samplers.

Models	MNIST	CELEBA
AE + GaussianMixtureSampler
VAE + NormalSampler
VAE + GaussianMixtureSampler
VAE + TwoStageVAESampler
VAE + MAFSampler
Beta-VAE + NormalSampler
VAE Lin NF + NormalSampler
VAE IAF + NormalSampler
Disentangled Beta-VAE + NormalSampler
FactorVAE + NormalSampler
BetaTCVAE + NormalSampler
IWAE + NormalSampler
MSSSIM_VAE + NormalSampler
WAE + NormalSampler
INFO VAE + NormalSampler
SVAE + HypersphereUniformSampler
VAMP + VAMPSampler
Adversarial_AE + NormalSampler
VAEGAN + NormalSampler
VQVAE + MAFSampler
HVAE + NormalSampler
RAE_L2 + GaussianMixtureSampler
RAE_GP + GaussianMixtureSampler
Riemannian Hamiltonian VAE (RHVAE) + RHVAE Sampler


