GLDM_PyPi

This repository contains the gldm Python package.

Environment Setup

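gldm depends on the PyTorch, PyTorch Geometric and RDKit libraries. To install the dependencies, create a conda environment from the provided config file (referred to here as GLDM.yml, while the command below uses environment.yml; use whichever file name is present in your copy of the repository):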

conda env create --file=environment.yml
conda activate gldm

gldm can then be installed via pip:

pip install gldm
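
After installation, it can help to confirm that the dependencies are visible from the active environment. The following is a minimal sketch, not part of gldm: it assumes the usual import names torch, torch_geometric and rdkit for the three libraries, and the helper name missing_modules is made up for illustration.

```python
from importlib.util import find_spec

# Import names assumed for the dependencies named above, plus gldm itself.
REQUIRED_MODULES = ("torch", "torch_geometric", "rdkit", "gldm")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks the modules up without importing them, so it runs quickly even when PyTorch is installed.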

Adapt pytorch-geometric Package

gldm requires customized pre-batching and mini-batching behavior in pytorch-geometric. See adapted_pkgs/pyg/README.md for details.

Inference with Pretrained Models

Download Pretrained Models

Our models were developed in two stages. First, the autoencoder models were trained to learn a latent space of the molecular graphs. Then, the diffusion models were trained to generate samples in that latent space. The pretrained autoencoder models can be downloaded from here, and the diffusion models can be downloaded from here.

Usage

A simple example that generates 10 molecules from dummy gene expression profiles, saves them to samples.pkl and saves an image of them as samples.png:

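from gldm import sampleMol
import torch

if __name__ == "__main__":
    model_path = 'GLDM_models/GLDM_WAE_cond.pt'
    config_file = 'config/GLDM_WAE_cond.yml'
    # Alternatively, pass gene_expression_file instead of gene_expression:
    # gene_expression_file = 'pseudo_gene_expr_dose.pt'
    # Dummy gene expression profiles with shape (10, 979).
    dummy_gene_expr = torch.rand((10, 979))
    sampleMol(
        model_path,
        config_file,
        gene_expression=dummy_gene_expr,
        num_samples=10,
        output_file='samples.pkl',  # pickle file with the generated molecules
        save_img='samples.png',     # image of the generated molecules
    )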

A similar example can be found in Example_usage.ipynb.
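
To inspect the molecules written to samples.pkl by the example above, the file can be read back with Python's pickle module. This is a minimal sketch: the structure of the pickled object is not documented here, so the code only loads it and reports its type. The helper name load_samples is hypothetical.

```python
import pickle
from pathlib import Path


def load_samples(path):
    """Load whatever object was pickled at `path`."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    samples_path = Path("samples.pkl")  # output_file from the example above
    if samples_path.exists():
        samples = load_samples(samples_path)
        # The container type is an unknown here, so inspect before using it.
        print(type(samples))
    else:
        print(f"{samples_path} not found; run the sampling example first.")
```

Only unpickle files from sources you trust. If the file holds library objects rather than plain strings, the corresponding library (for example RDKit) must be importable when loading.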

The sampleMol function has the following parameters:

| Parameter | Description |
| --- | --- |
| model_path | The path to the pretrained model file. |
| config_file | The path to the configuration file. |
| gene_expression | A PyTorch tensor, a NumPy array or a list representing the gene expression. |
| gene_expression_file | The path to the file containing the gene expression tensors. Ignored if gene_expression is provided. |
| num_samples | The number of molecules to be generated. |
| output_file | The pickle file where the generated molecules will be saved. |
| save_img | The path where the generated image will be saved. |
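
The precedence between gene_expression and gene_expression_file described above can be pictured with the sketch below. It is not gldm's implementation: resolve_gene_expression and its load_file argument are hypothetical names used only to illustrate the documented rule that an in-memory value wins over a file path.

```python
def resolve_gene_expression(gene_expression=None, gene_expression_file=None,
                            load_file=None):
    """Pick the gene expression input following the documented precedence.

    Hypothetical helper: `load_file` is any callable that reads a path and
    returns the stored data.
    """
    if gene_expression is not None:
        # An explicit value takes priority; the file path is ignored.
        return gene_expression
    if gene_expression_file is not None:
        if load_file is None:
            raise ValueError("load_file is required to read gene_expression_file")
        return load_file(gene_expression_file)
    # Neither was given, as would be the case for unconditional sampling.
    return None


if __name__ == "__main__":
    in_memory = [[0.1, 0.2, 0.3]]
    chosen = resolve_gene_expression(in_memory, "pseudo_gene_expr_dose.pt")
    print(chosen is in_memory)
```

For a .pt file, torch.load would be a natural choice for load_file, assuming the file was written with torch.save.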

Note:

  • The default configuration files are provided in the config folder. Remember to change config['model']['first_stage_config']['ckpt_path'] to the path where you stored the pretrained autoencoder models.
  • We provide conditional and unconditional models using VAE, AAE and WAE losses. Please ensure the autoencoder model and the diffusion model are consistent in terms of the loss function and the conditional setting.
  • Ensure the gene expression data are provided if the conditional models are used.
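
The ckpt_path change mentioned in the first note can be made by hand in the YAML file or programmatically. The sketch below works on an already parsed configuration dictionary; the helper name with_autoencoder_ckpt, the stand-in dictionary and the example paths are hypothetical, and real configuration files contain more keys than shown.

```python
import copy


def with_autoencoder_ckpt(config, ckpt_path):
    """Return a copy of `config` with the first-stage checkpoint path replaced."""
    updated = copy.deepcopy(config)
    updated["model"]["first_stage_config"]["ckpt_path"] = str(ckpt_path)
    return updated


if __name__ == "__main__":
    # Stand-in for a parsed configuration file.
    config = {"model": {"first_stage_config": {"ckpt_path": "path/to/autoencoder.pt"}}}
    # Hypothetical location of a downloaded autoencoder checkpoint.
    updated = with_autoencoder_ckpt(config, "my_models/autoencoder.pt")
    print(updated["model"]["first_stage_config"]["ckpt_path"])
```

With PyYAML installed, the dictionary could be read with yaml.safe_load and written back with yaml.safe_dump; note that this drops any comments in the original file.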

Training GLDM from Scratch

Coming soon ...
