Jigsaw-like AggMap: A Robust and Explainable Omics Deep Learning Tool
Installation
Install aggmap with:
# create an aggmap env
conda create -n aggmap python=3.8
conda activate aggmap
pip install --upgrade pip
pip install aggmap
Usage
import pandas as pd
from sklearn.datasets import load_breast_cancer
from aggmap import AggMap, AggMapNet
# Data loading
data = load_breast_cancer()
dfx = pd.DataFrame(data.data, columns=data.feature_names)
dfy = pd.get_dummies(pd.Series(data.target))
# AggMap object definition, fitting, and saving
mp = AggMap(dfx, metric='correlation')
mp.fit(cluster_channels=5, emb_method='umap', verbose=0)
mp.save('agg.mp')
# AggMap visualizations: hierarchical tree, embedding scatter, and grid
mp.plot_tree()
mp.plot_scatter(enabled_data_labels=True, radius=5)
mp.plot_grid(enabled_data_labels=True)
# Transformation of 1D vectors to 3D Fmaps (-1, w, h, c) by AggMap
X = mp.batch_transform(dfx.values, n_jobs=4, scale_method='minmax')
y = dfy.values
# AggMapNet training, validation, early stopping, and saving
clf = AggMapNet.MultiClassEstimator(epochs=50, gpuid=0)
clf.fit(X, y, X_valid=None, y_valid=None)
clf.save_model('agg.model')
# Model explanation by simply-explainer: global, local
simp_explainer = AggMapNet.simply_explainer(clf, mp)
global_simp_importance = simp_explainer.global_explain(clf.X_, clf.y_)
local_simp_importance = simp_explainer.local_explain(clf.X_[[0]], clf.y_[[0]])
# Model explanation by shapley-explainer: global, local
shap_explainer = AggMapNet.shapley_explainer(clf, mp)
global_shap_importance = shap_explainer.global_explain(clf.X_)
local_shap_importance = shap_explainer.local_explain(clf.X_[[0]])
How It Works?
- AggMap flowchart of feature mapping and agglomeration into ordered (spatially correlated) multi-channel feature maps (Fmaps)
a, AggMap flowchart of feature mapping and aggregation into ordered (spatially correlated) channel-split feature maps (Fmaps). b, CNN-based AggMapNet architecture for Fmap learning. c, Proof-of-concept illustration of AggMap restructuring unordered data (randomized MNIST) into clustered channel-split Fmaps (reconstructed MNIST) for CNN-based learning and important-feature analysis. d, Typical biomedical applications of AggMap in restructuring omics data into channel-split Fmaps for multi-channel CNN-based diagnosis and biomarker discovery (explanation saliency map of important features).
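The idea of channel-split Fmaps can be illustrated with a small, dependency-free sketch. It is not AggMap's implementation: it assumes correlation distance between features, swaps in a naive average-linkage clustering for the channel split, and simply lays features out in cluster order instead of using a 2D embedding with grid assignment. All function names here (`pearson`, `correlation_distance`, `agglomerate`, `to_fmap`) are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def correlation_distance(columns):
    """Pairwise distance 1 - r between feature columns."""
    p = len(columns)
    return [[1.0 - pearson(columns[i], columns[j]) for j in range(p)]
            for i in range(p)]

def agglomerate(dist, n_clusters):
    """Naive average-linkage clustering; returns lists of feature indices."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters

def to_fmap(sample, clusters, width):
    """Place one sample on an (h, w, c) grid with one channel per cluster.

    Assumption: features are laid out row by row in cluster order, a
    simplification standing in for a 2D embedding plus grid assignment.
    """
    order = [i for cluster in clusters for i in cluster]
    channel_of = {i: c for c, cluster in enumerate(clusters) for i in cluster}
    height = math.ceil(len(order) / width)
    fmap = [[[0.0] * len(clusters) for _ in range(width)] for _ in range(height)]
    for pos, feat in enumerate(order):
        row, col = divmod(pos, width)
        fmap[row][col][channel_of[feat]] = sample[feat]
    return fmap

# Toy data: 4 samples x 4 features; features 0/2 and 1/3 move together.
samples = [
    [1.0, 5.0, 2.1, 4.9],
    [2.0, 3.0, 4.0, 3.2],
    [3.0, 4.0, 6.2, 4.1],
    [4.0, 1.0, 7.9, 0.8],
]
columns = [list(col) for col in zip(*samples)]
clusters = agglomerate(correlation_distance(columns), n_clusters=2)
fmap = to_fmap(samples[0], clusters, width=2)
```

In this toy case the correlated features end up adjacent on the grid and in the same channel, which is the property a CNN can then exploit.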
Proof-of-Concepts of reconstruction ability on MNIST Dataset
- AggMap can reconstruct the original images from completely randomly permuted (disrupted) MNIST data:
Org1: the original grayscale images (channel = 1). OrgRP1: the randomized images of Org1 (channel = 1). RPAgg1, 5: the images reconstructed from OrgRP1 by AggMap feature restructuring (channel = 1 and 5, respectively; each color represents the features of one channel). RPAgg5-tkb: the original images with the pixels divided into 5 groups according to the 5 channels of RPAgg5 and colored in the same way as RPAgg5.
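To make the "randomly permuted" setting concrete, the following sketch shows one way such disrupted images could be produced. It assumes that a single fixed pixel permutation is applied to every image, so each pixel keeps its values across samples and only its position changes. The helper names (`make_permutation`, `permute`, `invert`) are hypothetical and not part of aggmap.

```python
import random

def make_permutation(n_pixels, seed=0):
    """Return a fixed random ordering of pixel positions."""
    rng = random.Random(seed)
    perm = list(range(n_pixels))
    rng.shuffle(perm)
    return perm

def permute(image, perm):
    """Shuffle a flattened image: output position k takes input pixel perm[k]."""
    return [image[src] for src in perm]

def invert(perm):
    """Build the inverse ordering, so that permute(shuffled, inverse) restores the image."""
    inverse = [0] * len(perm)
    for dst, src in enumerate(perm):
        inverse[src] = dst
    return inverse

# A flattened 4x4 "image" whose pixel values equal their original positions.
image = list(range(16))
perm = make_permutation(len(image), seed=42)
shuffled = permute(image, perm)            # spatial layout destroyed
restored = permute(shuffled, invert(perm))  # only possible when perm is known
```

The inverse here relies on knowing `perm`. A restructuring method such as AggMap does not have that information and instead works from the correlations between features, which a fixed permutation leaves intact.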
The effect of the number of channels on model performance
- Multi-channel Fmaps can notably boost model performance:
The performance of AggMapNet using different numbers of channels on TCGA-T (a) and COV-D (b). For TCGA-T, the average performance over ten-fold cross-validation is reported. For COV-D, five-fold cross-validation was repeated for 5 rounds with different random seeds (25 training runs in total), and the average validation-set performance is reported.
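The repeated cross-validation protocol described for COV-D (5 folds x 5 seeds = 25 training runs) can be sketched as follows. This is a generic illustration, not the authors' evaluation script: `kfold_indices`, `repeated_cv`, and `dummy_train_and_score` are hypothetical names, the splits are not stratified, and the scoring function is a stand-in for actually training a model on each split.

```python
import random
from statistics import mean

def kfold_indices(n_samples, n_folds, seed):
    """Yield (train, valid) index lists for one shuffled k-fold split."""
    rng = random.Random(seed)
    idx = list(range(n_samples))
    rng.shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        valid = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, valid

def repeated_cv(n_samples, n_folds, seeds, train_and_score):
    """Run k-fold CV once per seed; return (mean validation score, run count)."""
    scores = []
    for seed in seeds:
        for train, valid in kfold_indices(n_samples, n_folds, seed):
            scores.append(train_and_score(train, valid))
    return mean(scores), len(scores)

def dummy_train_and_score(train_idx, valid_idx):
    """Hypothetical stand-in: a real version would fit a model on train_idx
    and return its metric on valid_idx. Here it returns the validation fraction."""
    return len(valid_idx) / (len(train_idx) + len(valid_idx))

avg, n_runs = repeated_cv(
    n_samples=100,
    n_folds=5,
    seeds=[0, 1, 2, 3, 4],
    train_and_score=dummy_train_and_score,
)
```

Averaging over several seeds reduces the dependence of the reported score on any single random partition of the data.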
Example for Restructured Fmaps
- The example on WDBC dataset: click here to find out more!