Package was renamed from Biocarta v0.2.27 to Biocartograph because of an unintentional name clash

Biocartograph

Creating Cartographic Representations of Biological Data

Installation

pip install biocartograph

Example code

import pandas as pd
import numpy  as np

if __name__ == '__main__' :
    from biocartograph.quantification import full_mapping
    #
    adf = pd.read_csv('analytes.tsv',sep='\t',index_col=0)
    #
    # WE DO NOT WANT TO KEEP POTENTIALLY BAD ENTRIES 
    adf = adf.iloc[ np.inf != np.abs( 1.0/np.std(adf.values,1) ) ,
                    np.inf != np.abs( 1.0/np.std(adf.values,0) ) ].copy()
    #
    # READING IN SAMPLE INFORMATION
    # THIS IS NEEDED FOR THE ALIGNED PCA TO WORK
    jdf = pd.read_csv('journal.tsv',sep='\t',index_col=0)
    jdf = jdf.loc[:,adf.columns.values]
    #
    alignment_label , sample_label = 'Disease' , None
    add_labels = ['Cell-line']
    #
    cmd                = 'max'
    # WRITE FILES AND MAKE NOISE
    bVerbose           = True
    # CREATE AN OPTIMIZED REPRESENTATION
    bExtreme           = True
    # WE MIGHT WANT SOME SPECIFIC INTERSECTIONS OF THE HIERARCHY
    n_clusters         = [20,40,60,80,100]
    # USE ALL INFORMATION
    n_components       = None
    umap_dimension     = 2
    n_neighbors        = 20
    local_connectivity = 20.
    transform_seed     = 42
    #
    print ( adf , jdf )
    #
    # distance_type = 'correlation,spearman,absolute' # DONT USE THIS
    distance_type = 'covariation' # BECOMES CO-EXPRESSION BASED
    #
    results = full_mapping ( adf , jdf                  ,
        bVerbose = bVerbose             ,
        bExtreme = bExtreme             ,
        n_clusters = n_clusters         ,
        n_components = n_components     ,
        distance_type = distance_type   ,
        umap_dimension = umap_dimension ,
        umap_n_neighbors = n_neighbors  ,
        umap_local_connectivity = local_connectivity ,
        umap_seed = transform_seed      ,
        hierarchy_cmd = cmd             ,
        add_labels = add_labels         ,
        alignment_label = alignment_label ,
        sample_label = sample_label )
    #
    map_analytes        = results[0]
    map_samples         = results[1]
    hierarchy_analytes  = results[2]
    hierarchy_samples   = results[3]
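
The filter applied to adf in the example above drops analytes (rows) and samples (columns) whose standard deviation is zero, since their inverse standard deviation is infinite. Below is a minimal sketch of the same idea on toy data, written with an explicit comparison instead of the 1.0/np.std idiom. The helper name drop_constant_rows_and_columns and the toy values are illustrative assumptions and not part of the package.

```python
import numpy as np
import pandas as pd

def drop_constant_rows_and_columns(df):
    # Hypothetical helper: keep rows and columns with nonzero standard
    # deviation. Both masks are computed on the unfiltered matrix, as in
    # the README example.
    row_std = np.std(df.values, axis=1)
    col_std = np.std(df.values, axis=0)
    return df.iloc[row_std != 0, col_std != 0].copy()

# Toy analyte table: rows are analytes, columns are samples.
toy = pd.DataFrame(
    [[1.0, 2.0, 5.0],
     [3.0, 1.0, 5.0],
     [5.0, 5.0, 5.0]],   # analyte C is constant across samples
    index=['A', 'B', 'C'],
    columns=['s1', 's2', 's3'])  # sample s3 is constant across analytes

filtered = drop_constant_rows_and_columns(toy)
print(filtered)
```

Analyte C and sample s3 carry no variance, so they are removed before the mapping.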

or just call it, relying on the default values for most parameters:

import pandas as pd
import numpy  as np

if __name__ == '__main__' :
    from biocartograph.quantification import full_mapping
    #
    adf = pd.read_csv('analytes.tsv',sep='\t',index_col=0)
    #
    adf = adf.iloc[ np.inf != np.abs( 1.0/np.std(adf.values,1) ) ,
                    np.inf != np.abs( 1.0/np.std(adf.values,0) ) ].copy()
    jdf = pd.read_csv('journal.tsv',sep='\t',index_col=0)
    jdf = jdf.loc[:,adf.columns.values]
    #
    alignment_label , sample_label = 'Disease' , None
    add_labels = ['Cell-line']
    #
    results = full_mapping ( adf , jdf  ,
        bVerbose = True                 ,
        n_clusters = [40,80,120]        ,
        add_labels = add_labels         ,
        alignment_label = alignment_label )
    #
    map_analytes        = results[0]
    map_samples         = results[1]
    hierarchy_analytes  = results[2]
    hierarchy_samples   = results[3]
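
Both examples align the journal with the analyte table through jdf.loc[:,adf.columns.values]. Below is a small sketch of what that step does, assuming that the journal stores annotation labels (such as Disease and Cell-line) as rows and samples as columns. That layout is inferred from the indexing in the examples, and all names and values here are made up for illustration.

```python
import pandas as pd

# Hypothetical analyte table: rows are analytes, columns are samples.
adf = pd.DataFrame(
    [[1.0, 2.0, 3.0],
     [4.0, 6.0, 5.0]],
    index=['analyte-1', 'analyte-2'],
    columns=['s1', 's2', 's3'])

# Hypothetical journal: rows are annotation labels, columns are samples,
# listed in a different order and with one sample (s4) missing from adf.
jdf = pd.DataFrame(
    [['disease-A', 'control', 'disease-B', 'control'],
     ['line-3', 'line-4', 'line-1', 'line-2']],
    index=['Disease', 'Cell-line'],
    columns=['s3', 's4', 's1', 's2'])

# Same alignment step as in the examples: keep only the samples present
# in adf, in the column order of adf.
jdf = jdf.loc[:, adf.columns.values]
print(jdf)
```

After this step the columns of jdf match the columns of adf one to one, so each sample has its labels in the same position as its measurements.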

Plotting the information in map_analytes yields the following figure: Cancer Disease Example

You can also run an alternative algorithm, in which the UMAP coordinates are used directly for clustering, by setting bUseUmap = True:

    results = full_mapping ( adf , jdf  ,
        bVerbose = True                 ,
        bUseUmap = True                 ,
        n_clusters = [40,80,120]        ,
        add_labels = add_labels         ,
        alignment_label = alignment_label )

with the following results.

Download the zip and open the html index:

chromium index.html
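
If chromium is not installed, the same file can be opened from Python with the standard library webbrowser module. This is a minimal sketch that assumes index.html sits in the current working directory after unpacking; the helper names index_uri and open_index are illustrative.

```python
import webbrowser
from pathlib import Path

def index_uri(path='index.html'):
    # Build a file:// URI from the path, resolved to an absolute path.
    return Path(path).resolve().as_uri()

def open_index(path='index.html'):
    # Ask the system default browser to open the HTML index.
    # Returns True if a browser could be launched.
    return webbrowser.open(index_uri(path))
```

Calling open_index() from the directory containing the unpacked files opens the index in the default browser.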

Other generated solutions

The clustering visualisations were created using Biocartograph and hvplot:

Which groupings correspond to the biomarker variance that describes them? Here are some visualisations of that:

Diseases: cancers, biocartograph gfa Reactome enrichments, biocartograph gfa cluster enrichments, biocartograph treemap cluster 61

Tissues: tissues

Single Cells: single cells, biocartograph gfa enrichment, biocartograph treemap cluster 47

Blood Cells: blood cells, biocartograph gfa enrichment, biocartograph treemap cluster 2

TCGA-BRCA: calculated using Biocartograph and a TCGA-derived data set, with the results for Breast Cancer mRNA-seq
