The package was renamed from Biocarta (v0.2.27) to Biocartograph because of an unintentional name clash.
Biocartograph
Creating Cartographic Representations of Biological Data
Installation
pip install biocartograph
Example code
import pandas as pd
import numpy as np
if __name__ == '__main__' :
    from biocartograph.quantification import full_mapping
    #
    adf = pd.read_csv('analytes.tsv',sep='\t',index_col=0)
    #
    # WE DO NOT WANT TO KEEP POTENTIALLY BAD ENTRIES
    # (ROWS OR COLUMNS WITH ZERO STANDARD DEVIATION)
    adf = adf.iloc[ np.inf != np.abs( 1.0/np.std(adf.values,1) ) ,
                    np.inf != np.abs( 1.0/np.std(adf.values,0) ) ].copy()
    #
    # READING IN SAMPLE INFORMATION
    # THIS IS NEEDED FOR THE ALIGNED PCA TO WORK
    jdf = pd.read_csv('journal.tsv',sep='\t',index_col=0)
    jdf = jdf.loc[:,adf.columns.values]
    #
    alignment_label , sample_label = 'Disease' , None
    add_labels = ['Cell-line']
    #
    cmd = 'max'
    # WRITE FILES AND MAKE NOISE
    bVerbose = True
    # CREATE AN OPTIMIZED REPRESENTATION
    bExtreme = True
    # WE MIGHT WANT SOME SPECIFIC INTERSECTIONS OF THE HIERARCHY
    n_clusters = [20,40,60,80,100]
    # USE ALL INFORMATION
    n_components = None
    umap_dimension = 2
    n_neighbors = 20
    local_connectivity = 20.
    transform_seed = 42
    #
    print ( adf , jdf )
    #
    # distance_type = 'correlation,spearman,absolute' # DONT USE THIS
    distance_type = 'covariation' # BECOMES CO-EXPRESSION BASED
    #
    results = full_mapping ( adf , jdf ,
            bVerbose = bVerbose ,
            bExtreme = bExtreme ,
            n_clusters = n_clusters ,
            n_components = n_components ,
            distance_type = distance_type ,
            umap_dimension = umap_dimension ,
            umap_n_neighbors = n_neighbors ,
            umap_local_connectivity = local_connectivity ,
            umap_seed = transform_seed ,
            hierarchy_cmd = cmd ,
            add_labels = add_labels ,
            alignment_label = alignment_label ,
            sample_label = sample_label )
    #
    map_analytes = results[0]
    map_samples = results[1]
    hierarchy_analytes = results[2]
    hierarchy_samples = results[3]
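The filtering step in the example drops analytes (rows) and samples (columns) whose standard deviation is zero, because 1.0/std becomes infinite for them. The following is a minimal sketch of the same idea on a toy table. The helper name `drop_constant_rows_and_columns` and the toy data are illustrative assumptions and are not part of Biocartograph.

```python
import numpy as np
import pandas as pd


def drop_constant_rows_and_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep rows and columns with non-zero standard deviation.

    Both masks are computed on the unfiltered values, as in the example above.
    """
    values = df.to_numpy(dtype=float)
    keep_rows = np.std(values, axis=1) != 0.0  # one flag per analyte
    keep_cols = np.std(values, axis=0) != 0.0  # one flag per sample
    return df.iloc[keep_rows, keep_cols].copy()


# Toy data: 'gene_b' is constant across samples, 's3' is constant across analytes.
toy = pd.DataFrame(
    [[1.0, 2.0, 7.0],
     [7.0, 7.0, 7.0],
     [4.0, 1.0, 7.0]],
    index=['gene_a', 'gene_b', 'gene_c'],
    columns=['s1', 's2', 's3'],
)

filtered = drop_constant_rows_and_columns(toy)
print(filtered)
```

Comparing the standard deviation with zero directly avoids the division-by-zero warning that the reciprocal form can emit, while selecting the same entries for ordinary finite data.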
Alternatively, call it with default values for most parameters:
import pandas as pd
import numpy as np
if __name__ == '__main__' :
    from biocartograph.quantification import full_mapping
    #
    adf = pd.read_csv('analytes.tsv',sep='\t',index_col=0)
    #
    adf = adf.iloc[ np.inf != np.abs( 1.0/np.std(adf.values,1) ) ,
                    np.inf != np.abs( 1.0/np.std(adf.values,0) ) ].copy()
    jdf = pd.read_csv('journal.tsv',sep='\t',index_col=0)
    jdf = jdf.loc[:,adf.columns.values]
    #
    alignment_label , sample_label = 'Disease' , None
    add_labels = ['Cell-line']
    #
    results = full_mapping ( adf , jdf ,
            bVerbose = True ,
            n_clusters = [40,80,120] ,
            add_labels = add_labels ,
            alignment_label = alignment_label )
    #
    map_analytes = results[0]
    map_samples = results[1]
    hierarchy_analytes = results[2]
    hierarchy_samples = results[3]
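The line `jdf = jdf.loc[:,adf.columns.values]` reorders the journal so that its columns match the sample columns of the analyte table. The sketch below illustrates that alignment on toy data. It assumes that labels such as 'Disease' and 'Cell-line' are rows of the journal and that samples are its columns; this layout is inferred from the example code and is not confirmed by the package documentation. The toy values are made up.

```python
import pandas as pd

# Toy analyte table: analytes as rows, samples as columns.
adf = pd.DataFrame(
    [[1.0, 2.0, 3.0],
     [2.0, 1.0, 0.5]],
    index=['gene_a', 'gene_b'],
    columns=['s1', 's2', 's3'],
)

# Toy journal (assumed layout): labels as rows, samples as columns.
# The column order differs from adf and there is one extra sample, 's4'.
jdf = pd.DataFrame(
    [['healthy', 'cancer', 'cancer', 'healthy'],
     ['A549', 'HeLa', 'HeLa', 'A549']],
    index=['Disease', 'Cell-line'],
    columns=['s3', 's1', 's2', 's4'],
)

# Fail early with a readable message if the journal lacks any sample.
missing = adf.columns.difference(jdf.columns)
if len(missing) > 0:
    raise KeyError(f'Samples missing from the journal: {list(missing)}')

# Same selection as in the example: keep and order columns as in adf.
jdf_aligned = jdf.loc[:, adf.columns.values]
print(jdf_aligned)
```

Checking for missing samples before the selection is optional; `.loc` raises a `KeyError` on its own, but an explicit check names the offending samples.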
Plotting the information in the analyte map yields the following figure: Cancer Disease Example
You can also run an alternative algorithm, in which the UMAP coordinates are used directly for clustering, by setting bUseUmap = True:
results = full_mapping ( adf , jdf ,
        bVerbose = True ,
        bUseUmap = True ,
        n_clusters = [40,80,120] ,
        add_labels = add_labels ,
        alignment_label = alignment_label )
with the following results.
Download the zip archive and open the HTML index:
chromium index.html
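The `n_clusters` argument in the examples above requests several cuts of a hierarchy, one per cluster count. How Biocartograph builds and cuts its hierarchy internally is not described here, so the following is only a generic sketch of the concept using SciPy's agglomerative clustering on synthetic 2D coordinates. The linkage method, the synthetic data, and the variable names are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Synthetic 2D coordinates: three well separated groups of ten points each.
rng = np.random.default_rng(42)
coords = np.vstack([
    rng.normal(loc=center, scale=0.1, size=(10, 2))
    for center in (0.0, 5.0, 10.0)
])

# Build one hierarchy (complete linkage is an arbitrary choice for this sketch).
tree = linkage(coords, method='complete')

# Cut the same hierarchy at several cluster counts, analogous to a list
# such as n_clusters = [40,80,120] but small enough for the toy data.
requested = [2, 3]
cuts = {k: fcluster(tree, t=k, criterion='maxclust') for k in requested}

for k, labels in cuts.items():
    print(k, 'clusters requested,', len(set(labels)), 'found')
```

Because every cut comes from the same tree, the finer partitions are nested inside the coarser ones, which is what makes comparing several cluster counts meaningful.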
Other generated solutions
The clustering visualisations were created using Biocartograph and hvplot:
Which groupings correspond to the biomarker variance that describes them? Here are visualisations that address this:
Diseases : cancers, biocartograph gfa Reactome enrichments, biocartograph gfa cluster enrichments, biocartograph treemap cluster 61
Tissues : tissues
Single Cells : single cells, biocartograph gfa enrichment, biocartograph treemap cluster 47
Blood Cells : blood cells, biocartograph gfa enrichment, biocartograph treemap cluster 2
TCGA-BRCA : results for Breast Cancer mRNA-seq, calculated using Biocartograph and a TCGA-derived data set
File details
Details for the file biocartograph-0.6.1.tar.gz.
File metadata
- Download URL: biocartograph-0.6.1.tar.gz
- Upload date:
- Size: 26.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.8.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | de5de5e2f3b4622d07aad5066f13e9a8d602517ff9e4360b803ebc6e4db80d05
MD5 | 06123de6df18220f5fb0730909c5547b
BLAKE2b-256 | 7950803826ff1e34dbb8f06bdbe2200136ec7b7623d8108ffaec0d999daa3698
File details
Details for the file biocartograph-0.6.1-py3-none-any.whl.
File metadata
- Download URL: biocartograph-0.6.1-py3-none-any.whl
- Upload date:
- Size: 29.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.8.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0f7a244e74aa8081396d0151a218eb3f8b8711432a1c6c9144c9c1330eaffbdb
MD5 | 5bf8098d63297d6118631f8ac9cc302f
BLAKE2b-256 | bbfc44dc9185925eeeb4ee9b4109b2e2ba5670e50dc5452f29a23baee99ce183