scivae

Check out our docs: https://arianemora.github.io/scivae/

If you use this please cite: https://doi.org/10.1101/2021.06.22.449386

scivae is a wrapper around a Keras autoencoder (AE) that lets you build, save and visualise a variational autoencoder (VAE).

Blogs and notebooks used as references are noted in the code, and a few are listed at the end of this README.

The primary difference between a VAE and a normal AE is in how the loss function is computed. Here the loss has been abstracted out to the Loss class (in loss.py), where a distance metric (MMD or KL) can be combined with a reconstruction loss (MSE or COR).
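To make the idea of "reconstruction loss plus a weighted distance metric" concrete, here is a minimal, dependency-free sketch. It is not the code in loss.py: the function names (mse, kl_to_standard_normal, rbf_kernel_mean, mmd, total_loss), the Gaussian kernel and its sigma bandwidth are all assumptions made for illustration, and the weight argument only plays the role that 'mmd_weight' appears to play in the config.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error over all values in a batch of rows."""
    n_values = sum(len(row) for row in x)
    squared = sum((a - b) ** 2 for r, r_hat in zip(x, x_hat) for a, b in zip(r, r_hat))
    return squared / n_values


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, exp(log_var)) || N(0, 1)), averaged over the batch."""
    per_row = [
        -0.5 * sum(1 + lv - m ** 2 - math.exp(lv) for m, lv in zip(m_row, lv_row))
        for m_row, lv_row in zip(mu, log_var)
    ]
    return sum(per_row) / len(per_row)


def rbf_kernel_mean(a, b, sigma=1.0):
    """Mean Gaussian-kernel similarity between every row of a and every row of b."""
    total = 0.0
    for u in a:
        for v in b:
            sq_dist = sum((p - q) ** 2 for p, q in zip(u, v))
            total += math.exp(-sq_dist / (2 * sigma ** 2))
    return total / (len(a) * len(b))


def mmd(z, prior_samples, sigma=1.0):
    """Biased estimate of squared MMD between latent codes and prior samples."""
    return (rbf_kernel_mean(z, z, sigma)
            + rbf_kernel_mean(prior_samples, prior_samples, sigma)
            - 2 * rbf_kernel_mean(z, prior_samples, sigma))


def total_loss(x, x_hat, divergence, weight=1.0):
    """Reconstruction error plus a weighted divergence term."""
    return mse(x, x_hat) + weight * divergence


# Toy values, invented for illustration only
x = [[0.0, 1.0], [1.0, 0.0]]
x_hat = [[0.0, 0.5], [1.0, 0.0]]
z = [[0.1, -0.2], [0.3, 0.4]]
prior = [[0.0, 0.0], [1.0, -1.0]]

loss_mmd = total_loss(x, x_hat, mmd(z, prior), weight=1.0)
zeros = [[0.0, 0.0], [0.0, 0.0]]
loss_kl = total_loss(x, x_hat, kl_to_standard_normal(zeros, zeros))
```

The MMD term compares the latent codes with samples drawn from the prior, whereas the KL term needs the mean and log-variance that the encoder outputs for each row. In a real model both would be computed on Keras tensors rather than Python lists.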

The VAE class (in vae.py) has the general VAE structure.

The VAE state can be saved so that you can re-use your trained model on the same data and get the same latent space (or use the trained VAE on new data).

Optimiser was a temporary deviation: you pass in a VAE structure and the optimisation class uses an evolutionary algorithm to search for the best VAE structure, which it then returns.

Validate allows for running simple validations using scikit-learn. If your primary interest is a meaningful latent space that captures the key features of the dataset, it can be good to compare how much "information" has been captured between your classes. A good way of measuring this is to pass in the latent space and a set of labels and see whether a simple classifier can distinguish your classes better than with the raw data.
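The comparison described above can be sketched without any dependencies. The nearest-centroid classifier below is only a stand-in for whichever scikit-learn classifier Validate uses, nearest_centroid_accuracy is a hypothetical helper, and all numbers are toy values invented for illustration (they are not results from scivae).

```python
from collections import defaultdict


def nearest_centroid_accuracy(train_x, train_y, test_x, test_y):
    """Fit one centroid per class, then score test rows by their closest centroid."""
    grouped = defaultdict(list)
    for row, label in zip(train_x, train_y):
        grouped[label].append(row)
    centroids = {
        label: [sum(col) / len(col) for col in zip(*rows)]
        for label, rows in grouped.items()
    }

    def predict(row):
        # Pick the class whose centroid has the smallest squared distance to the row
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(row, centroids[lab])),
        )

    hits = sum(predict(row) == label for row, label in zip(test_x, test_y))
    return hits / len(test_y)


train_labels = ["a", "a", "b", "b"]
test_labels = ["a", "b"]

# Raw features: the two classes overlap heavily
raw_train = [[0.0, 5.0], [1.0, -5.0], [0.0, -4.0], [1.5, 4.0]]
raw_test = [[0.7, 1.0], [0.5, -1.0]]

# Latent codes: the two classes sit in separate regions
latent_train = [[-1.0, 0.0], [-0.8, 0.1], [1.0, 0.0], [0.9, -0.1]]
latent_test = [[-0.9, 0.0], [0.95, 0.0]]

raw_acc = nearest_centroid_accuracy(raw_train, train_labels, raw_test, test_labels)
latent_acc = nearest_centroid_accuracy(latent_train, train_labels, latent_test, test_labels)
print(f"raw accuracy: {raw_acc:.2f}, latent accuracy: {latent_acc:.2f}")
```

If the latent accuracy is clearly higher than the raw accuracy on held-out rows, that suggests the latent space has captured class-relevant structure.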

Users

Check out the install page and the documentation or our package on pip: https://pypi.org/project/scivae/1.0.4/

pip install scivae

Developers

Install required packages

pip install -r requirements.txt

It is very easy to call the basic VAE. Simply install the package (or use the raw code), then set up a config dictionary. Its keys are largely self-explanatory.

from scivae import *
- loss: loss dictionary see Loss class for input details
- encoding: a dictionary of encoding layers, number of nodes and activation function
- decoding: same as above but for decoding layers
- latent: configs for latent space. See (def optimiser(self, optimiser_name: str, params: dict):) in vae.py for details

config_mse = {'loss': {'loss_type': 'mse', 'distance_metric': 'mmd', 'mmd_weight': 1},
              'encoding': {'layers': [{'num_nodes': 32, 'activation_fn': 'selu'},
                                      {'num_nodes': 16, 'activation_fn': 'selu'}]},
              'decoding': {'layers': [{'num_nodes': 16, 'activation_fn': 'selu'},
                                      {'num_nodes': 32, 'activation_fn': 'selu'}]},
              'latent': {'num_nodes': 2},
              'optimiser': {'params': {}, 'name': 'adam'}}
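Because the decoding layers in this example simply mirror the encoding layers, the config can also be generated programmatically. The helper below is a hypothetical convenience (make_config is not part of scivae); it only reuses the keys and values that appear in config_mse.

```python
def make_config(encoder_nodes, latent_nodes, loss_type="mse",
                distance_metric="mmd", mmd_weight=1, activation_fn="selu"):
    """Build a config dict shaped like config_mse, mirroring the encoder for the decoder."""

    def layers(sizes):
        return [{"num_nodes": n, "activation_fn": activation_fn} for n in sizes]

    return {
        "loss": {
            "loss_type": loss_type,
            "distance_metric": distance_metric,
            "mmd_weight": mmd_weight,
        },
        "encoding": {"layers": layers(encoder_nodes)},
        # The decoder walks back up through the same layer sizes in reverse order
        "decoding": {"layers": layers(list(reversed(encoder_nodes)))},
        "latent": {"num_nodes": latent_nodes},
        "optimiser": {"params": {}, "name": "adam"},
    }


# Same structure as config_mse: 32 -> 16 -> 2 latent nodes -> 16 -> 32
config = make_config([32, 16], latent_nodes=2)
```

Generating the config this way keeps the encoder and decoder symmetric when you experiment with different layer sizes.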

Run the VAE. Each column of the numeric data is expected to follow an approximately normal distribution. The input is a numpy array in which each row is the list of features for one label. Labels carry no meaning for training; they just need to be a list of the same length as the number of rows, and are only used for downstream analyses (e.g. colouring).
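One common way to put columns on a comparable scale before training is to z-score them. The sketch below is an assumption about preprocessing rather than something scivae does for you, and standardise_columns is a hypothetical helper. Note that z-scoring only centres and rescales each column; it does not make a skewed column normal, so a transform (e.g. log) may still be needed first.

```python
import statistics


def standardise_columns(rows):
    """Z-score each column: subtract its mean and divide by its standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; constant columns fall back to 1.0 to avoid dividing by zero
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stdevs)] for row in rows]


# Toy data: three rows (samples) with two features on very different scales
raw = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
scaled = standardise_columns(raw)

# One label per row; labels are only used downstream (e.g. for colouring)
labels = ["a", "b", "c"]
```

The resulting list of rows can be converted with numpy.asarray(scaled) before it is passed to the VAE.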

Note that for most configs we want input_data = output_data; however, this has been left modular so that it can be extended (e.g. to denoising) in the future.

# vae_mse = VAE(input_data, output_data, labels, config_mse, 'vae_label')
vae_mse = VAE(numeric_data, numeric_data, labels, config_mse, 'vae_label')
# Set batch size and number of epochs
vae_mse.encode('default', epochs=500, batch_size=50)
encoded_data_vae_mse = vae_mse.get_encoded_data()

The VAE can also be used to encode new data.

# Note the [0] on the end - the encoded data is just returned as the 0th element of a np array.
new_data_encoded = vae_mse.encode_new_data(some_new_np_array)[0]

Visualisation works the same way as plotting the PCs returned by PCA, i.e. the code below draws a scatter plot of the first and second latent nodes.

import matplotlib.pyplot as plt
plt.scatter(encoded_data_vae_mse[:,0], encoded_data_vae_mse[:,1])

Tests

See tests for further examples.

References

    https://github.com/pren1/keras-MMD-Variational-Autoencoder/blob/master/Keras_MMD_Variational_Autoencoder.ipynb
    https://github.com/s-omranpour/X-VAE-keras/blob/master/VAE/VAE_MMD.ipynb
    https://github.com/ShengjiaZhao/MMD-Variational-Autoencoder/blob/master/mmd_vae.py
    https://github.com/CancerAI-CL/IntegrativeVAEs/blob/master/code/models/mmvae.py
