Garfield: Graph-based Contrastive Learning enable Fast Single-Cell Embedding

Garfield

Graph-based Contrastive Learning enable Fast Single-Cell Embedding.

Repository layout

├── Garfield
│   ├── data
│   │   └── gene_anno
│   │       ├── hg19_genes.bed
│   │       ├── hg38_genes.bed
│   │       ├── mm10_genes.bed
│   │       └── mm9_genes.bed
│   ├── __init__.py
│   ├── model
│   │   ├── Garfield_net.py
│   │   ├── GarfieldTrainer.py
│   │   ├── __init__.py
│   │   ├── _layers.py
│   │   ├── _loss.py
│   │   ├── metrics.py
│   │   ├── prepare_Data.py
│   │   ├── _tools.py
│   │   └── _utils.py
│   ├── preprocessing
│   │   ├── _graph.py
│   │   ├── __init__.py
│   │   ├── _pca.py
│   │   ├── preprocess.py
│   │   ├── _qc.py
│   │   ├── read_adata.py
│   │   └── _utils.py
│   ├── _settings.py
│   └── _version.py
├── imgs
│   └── Garfield_framework.png
├── LICENSE
├── README.md
├── requirements.txt
└── setup.py

Installation

Install Garfield from PyPI with:

pip install garfield

Or install directly from GitHub:

pip install git+https://github.com/zhou-1314/Garfield.git

Or clone the repository and install from source:

git clone https://github.com/zhou-1314/Garfield.git
cd Garfield
python setup.py install

Garfield is implemented in the PyTorch framework.
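
A quick way to confirm that the installation succeeded is to check that the packages used in the usage example can be found on the import path. The sketch below uses only the standard library. The helper name `missing_packages` is hypothetical, and the package list (`Garfield`, `torch`, `scanpy`) is an assumption based on the imports shown in this README and the PyTorch note above.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package list, taken from the imports used in this README.
required = ["Garfield", "torch", "scanpy"]
missing = missing_packages(required)
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All required packages were found.")
```

`find_spec` only locates a package without importing it, so the check stays fast even when PyTorch is installed.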

Usage

## load packages
import os
import Garfield as gf
import scanpy as sc
print(gf.__version__)  # show the installed version

## set the working directory
workdir = 'result_garfield_test'
gf.settings.set_workdir(workdir)

gf.settings.set_figure_params(dpi=80,
                              style='white',
                              fig_size=[5,5],
                              rc={'image.cmap': 'viridis'})
## make plots prettier
from matplotlib_inline.backend_inline import set_matplotlib_formats
set_matplotlib_formats('retina')

## Set the parameters for Garfield training
dict_config = dict(
    ## Input options
    data_dir=workdir,  
    project_name='test',  
    adata_list='./example_data/rna_muraro2016.h5ad',  
    profile='RNA',
    data_type=None,  # Paired
    sample_col=None,  

    ## Dataset preprocessing options
    filter_cells_rna=False,  
    min_features=100,
    min_cells=3,
    keep_mt=False,
    normalize=True, 
    target_sum=1e4,
    used_hvg=True,  
    used_scale=True,
    single_n_top_genes=2000,  
    rna_n_top_features=2000, 
    atac_n_top_features=30000,
    metacell=False,  
    metacell_size=1,  
    n_pcs=20,
    n_neighbors=15,  
    svd_solver='arpack',
    method='umap',
    metric='euclidean',  
    resolution_tol=0.1,  
    leiden_runs=1,  
    leiden_seed=None,  
    verbose=True,  

    ## Model options
    gnn_layer=2,
    conv_type='GAT',
    hidden_dims=[128, 128],
    bottle_neck_neurons=20,
    svd_q=5,  # int: rank (default: 5)
    cluster_num=20,
    num_heads=3,
    concat=True,
    used_edge_weight=True,
    used_recon_exp=True,
    used_DSBN=False,
    used_mmd=False,
    patience=5,
    test_split=0.1,
    val_split=0.1,
    batch_size=128,  # int: batch size for model training
    num_neighbors=[3, 3],
    epochs=50,  # int: number of epochs (default: 100)
    dropout=0.2,  # float: dropout rate (default: 0)
    mmd_temperature=0.2,  # MMD regularization
    instance_temperature=1.0,
    cluster_temperature=0.5,
    l2_reg=1e-03,
    gradient_clipping=5,
    learning_rate=0.001,  # float: learning rate
    weight_decay=1e-05,  # float: weight decay

    ## Other options
    monitor_only_val_losses=False,
    outdir=None,  # str: save the model
    load=False,  # bool: load the model
)

## start training
from Garfield.model import GarfieldTrainer
trainer = GarfieldTrainer(dict_config)
trainer.fit()

## visualize embeddings of cells
adata_final = trainer.get_latent_representation()
sc.tl.umap(adata_final)
sc.pl.umap(adata_final, color=['cell_type1'], wspace=0.15, edges=False)
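
When comparing several runs, it can help to derive each configuration from one base dictionary instead of editing `dict_config` in place. The following is a minimal sketch, not part of Garfield: `with_overrides` is a hypothetical helper, and the shortened `base_config` reuses only a few keys from the example above. It rejects keys that are absent from the base, so a typo such as `epoch` instead of `epochs` is caught before training starts.

```python
from copy import deepcopy


def with_overrides(base_config, **overrides):
    """Return a copy of base_config with the given options replaced.

    Hypothetical helper (not part of Garfield). Keys that do not exist
    in base_config raise KeyError instead of being added silently.
    """
    unknown = sorted(set(overrides) - set(base_config))
    if unknown:
        raise KeyError("unknown option(s): " + ", ".join(unknown))
    config = deepcopy(base_config)  # nested lists such as hidden_dims are copied too
    config.update(overrides)
    return config


# Shortened stand-in for dict_config; only a few keys are repeated here.
base_config = dict(
    profile='RNA',
    n_pcs=20,
    hidden_dims=[128, 128],
    epochs=50,
    learning_rate=0.001,
)

quick_run = with_overrides(base_config, epochs=5)
print(quick_run['epochs'], base_config['epochs'])  # prints: 5 50
```

Each resulting dictionary can then be passed to `GarfieldTrainer` in the same way as `dict_config` above.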

Support

Please submit issues or reach out to zhouwg1314@gmail.com.

Authors and acknowledgment

Weige Zhou

Citation

If you have used Garfield in your work, please consider citing:

Weige Zhou (2024). Garfield: Graph-based Contrastive Learning enable Fast Single-Cell Embedding. https://github.com/zhou-1314/Garfield
