Package for invariant VAE models on single-cell data

inVAE

inVAE is a conditionally invariant variational autoencoder that identifies both spurious features (distractors) and invariant features. It leverages variability across domains to learn conditionally invariant representations. We show that inVAE captures biological variation in single-cell datasets obtained from diverse conditions and labs. inVAE incorporates biological covariates and mechanisms, such as disease states, to learn an invariant data representation, which significantly improves cell classification accuracy.

Installation

  1. PyPI only
    pip install invae

  2. Development version (latest version on GitHub)
    git clone https://github.com/theislab/inVAE.git
    cd inVAE
    pip install .

Example

Integration of Human Lung Cell Atlas using both healthy and disease samples

Usage

  1. Load the data:
    import scanpy as sc
    adata = sc.read('path/to/data')
  2. Optional - Split the data into train, validation, and test sets (needed in the supervised case, where a classifier is trained as well)
  3. Initialize the model, either Factorized or Non-Factorized:
from inVAE import FinVAE, NFinVAE

inv_covar_keys = {
    'cont': [],
    'cat': ['cell_type', 'donor'] #set to the keys in the adata
}

spur_covar_keys = {
    'cont': [],
    'cat': ['site'] #set to the keys in the adata
}

model = FinVAE(
    adata = adata_train,
    layer = 'counts', # The layer where the raw counts are stored in adata (None for adata.X: default)
    inv_covar_keys = inv_covar_keys,
    spur_covar_keys = spur_covar_keys
)

or

model = NFinVAE(
    adata = adata_train,
    layer = 'counts', # The layer where the raw counts are stored in adata (None for adata.X: default)
    inv_covar_keys = inv_covar_keys,
    spur_covar_keys = spur_covar_keys
)
  4. Train the generative model:
    model.train(n_epochs=500, lr_train=0.001, weight_decay=0.0001)
  5. Get the latent representation:
    latent = model.get_latent_representation(adata)
  6. Optional - Train the classifier (for cell types):
model.train_classifier(
    adata_val,
    batch_key = 'batch',
    label_key = 'cell_type',
)
  7. Optional - Predict cell types:
    pred_test = model.predict(adata_test, dataset_type='test')

  8. Optional - Save and load the model:

model.save('./checkpoints/path.pt')
model.load('./checkpoints/path.pt')
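
Two of the steps above leave details to the user: the covariate keys must match columns in the adata, and the optional train/val/test split is not spelled out. The following is a minimal sketch of both, using only the Python standard library. The helper names missing_covariate_keys and split_indices are hypothetical and not part of inVAE, the 80/10/10 split and the seed are arbitrary assumptions, and the list obs_columns stands in for list(adata.obs.columns).

```python
import random


def missing_covariate_keys(obs_columns, *covar_key_dicts):
    """Return covariate names that are not found among the obs columns."""
    available = set(obs_columns)
    missing = []
    for keys in covar_key_dicts:
        for kind in ("cont", "cat"):
            for name in keys.get(kind, []):
                if name not in available and name not in missing:
                    missing.append(name)
    return missing


def split_indices(n_obs, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle range(n_obs) and cut it into train/val/test index lists."""
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("val_frac and test_frac must be >= 0 and sum to < 1")
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    n_val = int(n_obs * val_frac)
    n_test = int(n_obs * test_frac)
    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]
    return train, val, test


inv_covar_keys = {"cont": [], "cat": ["cell_type", "donor"]}
spur_covar_keys = {"cont": [], "cat": ["site"]}

# Stand-in for list(adata.obs.columns); "site" is deliberately absent here.
obs_columns = ["cell_type", "donor", "batch"]
missing = missing_covariate_keys(obs_columns, inv_covar_keys, spur_covar_keys)
print("missing covariate columns:", missing)

train_idx, val_idx, test_idx = split_indices(1000, val_frac=0.1, test_frac=0.1)
print(len(train_idx), len(val_idx), len(test_idx))

# With a real AnnData object, the index lists can be used to subset cells
# (assumption: a random per-cell split is acceptable for the dataset):
# adata_train = adata[train_idx].copy()
# adata_val = adata[val_idx].copy()
# adata_test = adata[test_idx].copy()
```

A random per-cell split is the simplest choice. When generalization to unseen donors or sites matters, splitting by those groups instead may be more appropriate.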

Dependencies

* scanpy==1.9.3
* torch==2.0.1
* tensorboard==2.13.0
* anndata==0.8.0
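
Because the versions above are exact pins, it can help to compare them against the local environment before training. This is a minimal sketch using the standard library's importlib.metadata. The helper names installed_version, same_release, and compare_to_pins are hypothetical and not part of inVAE, and treating a local build suffix (for example a torch build tagged with +cu117) as matching the pin is an assumption.

```python
from importlib import metadata

# Pins copied from the dependency list above.
PINS = {
    "scanpy": "1.9.3",
    "torch": "2.0.1",
    "tensorboard": "2.13.0",
    "anndata": "0.8.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def same_release(found, pinned):
    """Compare versions, ignoring any local build suffix after '+'."""
    return found is not None and found.split("+")[0] == pinned


def compare_to_pins(pins):
    """Map each package name to (pinned, installed, matches)."""
    report = {}
    for name, pinned in pins.items():
        found = installed_version(name)
        report[name] = (pinned, found, same_release(found, pinned))
    return report


report = compare_to_pins(PINS)
for name, (pinned, found, ok) in report.items():
    status = "ok" if ok else "differs"
    print(f"{name}: pinned {pinned}, installed {found} ({status})")
```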


Citation

H. Aliee, F. Kapl, S. Hediyeh-Zadeh, F. J. Theis. Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity. 2023. https://arxiv.org/abs/2307.00558
