FaceAI-BGImpact implements several generative models and studies how altering the image background affects both the generated images and the training of the models.
FaceAI - Background Impact
Code for the paper:
"Behind the Face: Unveiling the Effects of Background Subtraction on VAE and GAN Model Efficacy"
This study focuses on removing the background from datasets of faces
to gauge the effect on the training and performance of facial generative models.
We are also interested in the effect on the interpretability of the models' latent spaces.
(Paper available here).
The repository contains:
- The faceai-bgimpact package:
  - Data processing scripts to create the FFHQ-Blur and FFHQ-Grey datasets
  - A unified Deep Learning framework for training and evaluating generative AI models
  - Models enabling latent-space exploration with PCA
  - A set of scripts to train, evaluate and generate images and videos from the models
- A set of pre-trained models(*)
- A web application to take control of the pre-trained models(*)

(*) = not included in the PyPI package.
The package
The package is published on PyPI and can be installed using
pip install faceai-bgimpact
To install it from source instead, clone the repository and run poetry install from the root folder.
Datasets
Download script
We uploaded the created datasets to Kaggle. To download them, please set up your Kaggle API token, then run:
faceai-bgimpact download-all-ffhq
Generate
You can also choose to generate the Grey and Blur datasets yourself. To do so, first download the raw dataset using:
faceai-bgimpact download-all-ffhq --raw
Then, you can generate the masks using:
faceai-bgimpact create-masks
And finally, generate the grey and blur datasets using:
faceai-bgimpact create-blur-and-grey
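As an illustration of what these steps produce, here is a minimal sketch of the masking idea, assuming a binary face mask where white marks the face. The function and file names are hypothetical, not the package's actual API:

```python
# Minimal sketch of the masking idea behind the blurred/greyed datasets.
# Function and file names are hypothetical, not the package's actual API.
from PIL import Image, ImageFilter

def blur_and_grey_versions(image_path, mask_path, radius=10):
    """Return (blurred-background, grey-background) versions of a face image."""
    img = Image.open(image_path).convert("RGB")
    # Binary mask: white (255) where the face is, black elsewhere.
    mask = Image.open(mask_path).convert("L").resize(img.size)

    # FFHQ-Blur: Gaussian-blur the whole image, then paste the face back.
    blurred = img.filter(ImageFilter.GaussianBlur(radius))
    blurred.paste(img, mask=mask)

    # FFHQ-Grey: flat grey background, face pasted on top.
    grey = Image.new("RGB", img.size, (127, 127, 127))
    grey.paste(img, mask=mask)
    return blurred, grey

blur_img, grey_img = blur_and_grey_versions("00000.png", "00000_mask.png")
```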
Kaggle
The datasets are also available directly on Kaggle.
Models
The package allows you to train and evaluate the following models:
- Variational Auto-Encoder (VAE)
- DCGAN
- StyleGAN
using these example commands:
- faceai-bgimpact train --model VAE --dataset ffhq_raw
  (Train a VAE on the raw FFHQ dataset from scratch)
- faceai-bgimpact train --model StyleGAN --dataset ffhq_grey --checkpoint-epoch 80
  (Resume training a StyleGAN on the greyed-out FFHQ dataset from epoch 80)
Notes on implementation
We implemented the models from scratch in PyTorch. Here are the main inspirations we drew on, where applicable:
- StyleGAN:
- There is no official PyTorch implementation of StyleGAN 1. Most of the code was written by hand from reading the StyleGAN and ProGAN papers, although some building blocks were taken from the repositories listed below.
- hukkelas/progan-pytorch: Inspiration for the progressive-growing structure, no actual code was used.
- aladdinpersson/Machine-Learning-Collection: We used this repository for the most granular building blocks, like the weight-scaled Conv2d layer, the pixelwise normalization layer, and the minibatch standard deviation layer.
- NVlabs/stylegan2: Official StyleGAN2 implementation. StyleGAN2 has a very different architecture from StyleGAN1, but we used the R1 loss function from this repository. StyleGAN normally uses WGAN-GP regularization, but we had convergence issues; switching to R1 regularization solved them (an R1 sketch follows this list).
- VAE:
- The VAE was implemented from scratch, using the VAE paper as a reference.
- PCA:
- The latent-space exploration using PCA was implemented entirely from scratch (a PCA sketch also follows this list).
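To make the R1 point concrete, here is a minimal sketch of R1 regularization (a gradient penalty computed on real images only); the discriminator and the gamma weighting are placeholders, not this repository's exact code:

```python
# Minimal sketch of R1 regularization (gradient penalty on real images only).
# `discriminator` and `gamma` are placeholders, not the repository's exact code.
import torch

def r1_penalty(discriminator, real_images, gamma=10.0):
    real_images = real_images.detach().requires_grad_(True)
    scores = discriminator(real_images)
    # Gradient of the discriminator's output w.r.t. the real inputs.
    (grad,) = torch.autograd.grad(
        outputs=scores.sum(), inputs=real_images, create_graph=True
    )
    # Penalize the per-sample squared gradient norm.
    return (gamma / 2) * grad.pow(2).flatten(start_dim=1).sum(dim=1).mean()
```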
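And a minimal sketch of the PCA idea, assuming a StyleGAN-style mapping network from Z to W; all names here are illustrative:

```python
# Sketch of PCA-based latent exploration; names are illustrative.
import torch

@torch.no_grad()
def pca_directions(mapping_network, latent_dim=512, n_samples=10_000, k=10):
    """Fit PCA on sampled W-space vectors, return the mean and top-k directions."""
    z = torch.randn(n_samples, latent_dim)
    w = mapping_network(z)                     # map Z -> W
    w_mean = w.mean(dim=0, keepdim=True)
    # Principal components via SVD of the centered sample matrix.
    _, _, vh = torch.linalg.svd(w - w_mean, full_matrices=False)
    return w_mean, vh[:k]

# Walking along the i-th component probes what it encodes:
#   w_edit = w_mean + alpha * components[i]
```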
We also introduced a unified framework for the models: an AbstractModel class is inherited by the VAE, DCGAN and StyleGAN classes. It enforces a common structure for the models, allowing the scripts to be nearly model-agnostic.
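A hedged sketch of what such a common interface can look like (the method names are hypothetical, not the actual class signature):

```python
# Illustrative sketch of a unified model interface; method names are
# hypothetical and may differ from the actual AbstractModel class.
from abc import ABC, abstractmethod

class AbstractModel(ABC):
    def __init__(self, dataset_name):
        self.dataset_name = dataset_name

    @abstractmethod
    def train(self, num_epochs, batch_size, save_interval):
        """Run the training loop."""

    @abstractmethod
    def generate_images(self, num_images):
        """Sample images from the trained model."""

    @abstractmethod
    def load_checkpoint(self, path):
        """Restore weights to resume training or generate."""

# VAE, DCGAN and StyleGAN each subclass AbstractModel, so the train and
# create-video scripts can drive any of them through the same interface.
```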
We also put in place rigorous code standards, using pre-commit hooks (black, flake8, prettier) to enforce code formatting and linting, as well as automated tests using PyTest and a code review process based on pull requests. We used Poetry for package management.
To run the pre-commit hooks, install them with pre-commit install, then run pre-commit run (or pre-commit run --all-files to run on all files).
To run the tests, install PyTest with pip install pytest, then run pytest -v (or poetry run pytest -v if you are using Poetry).
Scripts
The package also includes a set of scripts to train, evaluate and generate images and videos from the models.
Training
The train script is an entry point for training a model. It includes these command-line arguments:
- --model: Required. Specifies the type of model to train; options are "DCGAN", "StyleGAN", "VAE".
- --dataset: Required. Selects the dataset to use; options are "ffhq_raw", "ffhq_blur", "ffhq_grey".
- --latent-dim: Optional. Defines the dimension of the latent space for generative models.
- --config-path: Optional. Path to a custom JSON configuration file for model training.
- --lr: Optional. Learning rate for DCGAN training.
- --dlr: Optional. Discriminator learning rate for StyleGAN training.
- --glr: Optional. Generator learning rate for StyleGAN training.
- --mlr: Optional. W-mapping learning rate for StyleGAN training.
- --loss: Optional. Specifies the loss function to use; defaults to "r1", choices are "wgan", "wgan-gp", "r1".
- --batch-size: Optional. Defines the batch size during training.
- --num-epochs: Optional. Sets the number of epochs to train for.
- --save-interval: Optional. Epoch interval between saves of models and generated images.
- --image-interval: Optional. Iteration interval between saves of generated images.
- --list: Optional. Lists all available checkpoints if set.
- --checkpoint-epoch: Optional. Epoch number of a saved checkpoint to resume training from.
- --checkpoint-path: Optional. Path to a specific checkpoint file to resume training from; takes precedence over --checkpoint-epoch.
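As an illustration, several of the optional flags can be combined in a single call (the values below are arbitrary examples, not recommended settings):
faceai-bgimpact train --model StyleGAN --dataset ffhq_blur --loss r1 --batch-size 32 --num-epochs 100 --save-interval 5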
Training video
The create-video script is an entry point to create a video from images saved throughout the training process. It includes these command-line arguments:
- --model: Required. Specifies the type of model used; options are "DCGAN", "StyleGAN", "VAE".
- --dataset: Required. Selects the dataset used; options are "ffhq_raw", "ffhq_blur", "ffhq_grey".
- --frame-rate: Optional. Defines the frame rate of the video.
- --skip-frames: Optional. Defines the number of images to skip between consecutive frames of the video.
Example usage:
faceai-bgimpact create-video --model StyleGAN --dataset ffhq_grey
Output video: on the left, the generated image at the current resolution and alpha; on the right, a real image at the same resolution and alpha.
The web application
We developed a web application from scratch to control the latent space of StyleGAN, using Vue.js for the frontend and Flask-RESTx (Python) for the backend. It is too resource-intensive to be hosted on a free server, so the best course of action is to host it locally.
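As a purely hypothetical sketch of this architecture, a Flask-RESTx endpoint receiving latent coordinates from the frontend might look like this (not the webapp's actual routes or payloads):

```python
# Hypothetical sketch of a latent-control endpoint; the real webapp's
# routes and payloads may differ.
from flask import Flask, request
from flask_restx import Api, Resource

app = Flask(__name__)
api = Api(app)

@api.route("/generate")
class Generate(Resource):
    def post(self):
        # PCA-slider coordinates sent by the Vue.js frontend.
        coords = request.json.get("pca_coordinates", [])
        # A real implementation would map these onto the StyleGAN latent
        # space and return the generated image.
        return {"received": coords}

if __name__ == "__main__":
    app.run(port=8082)
```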
⚠️ Warning ⚠️: Since it contains PyTorch, the environment is quite heavy; at least 5 GB of free space is required.
The web application is dockerized, so please install Docker first. You can then refer to this video to install and run the application. The general steps are:
git clone https://github.com/thomktz/FaceAI-BGImpact.git
cd FaceAI-BGImpact/webapp
docker compose up --build
Then, in a browser, go to http://localhost:8082
When you're done, don't forget to remove the Docker image and container, as they total about 5 GB.
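One way to do this, assuming you are still in the webapp folder:
docker compose down --rmi all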
Folder structure
The package is structured as follows:
FaceAI-BGImpact
├── faceai_bgimpact
│   ├── configs                     # Configuration files for models
│   │   ├── default_dcgan_config.py
│   │   ├── default_stylegan_config.py
│   │   └── default_vae_config.py
│   ├── data_processing             # Scripts and notebooks for data preprocessing
│   │   ├── paths.py                # Data paths
│   │   ├── download_raw_ffhq.py    # Functions to download raw FFHQ dataset
│   │   ├── create_masks.py         # Functions to create masks for FFHQ dataset
│   │   ├── create_blur_and_grey.py # Functions to create blurred and greyed-out FFHQ datasets
│   │   └── download_all_ffhq.py    # Functions to download all FFHQ datasets
│   ├── models
│   │   ├── dcgan_                  # DCGAN model implementation
│   │   │   ├── dcgan.py
│   │   │   ├── discriminator.py
│   │   │   └── generator.py
│   │   ├── stylegan_               # StyleGAN model implementation
│   │   │   ├── discriminator.py
│   │   │   ├── generator.py
│   │   │   ├── loss.py             # Loss functions for StyleGAN
│   │   │   └── stylegan.py
│   │   ├── vae_                    # VAE model implementation
│   │   │   ├── decoder.py
│   │   │   ├── encoder.py
│   │   │   └── vae.py
│   │   ├── abstract_model.py       # Abstract model class for common functionalities
│   │   ├── data_loader.py          # Data loading utilities
│   │   └── utils.py
│   ├── scripts
│   │   ├── train.py                # Script to train models
│   │   ├── create_video.py         # Script to create videos from generated images
│   │   ├── generate_images.py
│   │   └── graph_fids.py
│   ├── main.py                     # Entry point for the package
│   └── Nirmala.ttf                 # Font file used in the project
├── tests/                          # Pytests
├── webapp/                         # Web application folder
├── report/                         # LaTeX report
├── README.md
└── pyproject.toml                  # Poetry package management configuration file