
The package was renamed from Biocarta v0.2.27 to Biocartograph because of an unintentional name clash.


Biocartograph

Creating Cartographic Representations of Biological Data

Installation

pip install biocartograph

Example code

import pandas as pd
import numpy  as np

if __name__ == '__main__' :
    from biocartograph.quantification import full_mapping
    #
    adf = pd.read_csv('analytes.tsv',sep='\t',index_col=0)
    #
    # WE DO NOT WANT TO KEEP POTENTIALLY BAD ENTRIES 
    adf = adf.iloc[ np.inf != np.abs( 1.0/np.std(adf.values,1) ) ,
                    np.inf != np.abs( 1.0/np.std(adf.values,0) ) ].copy()
    #
    # READING IN SAMPLE INFORMATION
    # THIS IS NEEDED FOR THE ALIGNED PCA TO WORK
    jdf = pd.read_csv('journal.tsv',sep='\t',index_col=0)
    jdf = jdf.loc[:,adf.columns.values]
    #
    alignment_label , sample_label = 'Disease' , None
    add_labels = ['Cell-line']
    #
    cmd                = 'max'
    # WRITE FILES AND MAKE NOISE
    bVerbose           = True
    # CREATE AN OPTIMIZED REPRESENTATION
    bExtreme           = True
    # WE MIGHT WANT SOME SPECIFIC INTERSECTIONS OF THE HIERARCHY
    n_clusters         = [20,40,60,80,100]
    # USE ALL INFORMATION
    n_components       = None
    umap_dimension     = 2
    n_neighbors        = 20
    local_connectivity = 20.
    transform_seed     = 42
    #
    print ( adf , jdf )
    #
    # distance_type = 'correlation,spearman,absolute' # DONT USE THIS
    distance_type = 'covariation' # BECOMES CO-EXPRESSION BASED
    #
    results = full_mapping ( adf , jdf                  ,
        bVerbose = bVerbose             ,
        bExtreme = bExtreme             ,
        n_clusters = n_clusters         ,
        n_components = n_components     ,
        distance_type = distance_type   ,
        umap_dimension = umap_dimension ,
        umap_n_neighbors = n_neighbors  ,
        umap_local_connectivity = local_connectivity ,
        umap_seed = transform_seed      ,
        hierarchy_cmd = cmd             ,
        add_labels = add_labels         ,
        alignment_label = alignment_label ,
        sample_label = None     )
    #
    map_analytes        = results[0]
    map_samples         = results[1]
    hierarchy_analytes  = results[2]
    hierarchy_samples   = results[3]
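
To make the filtering and journal-alignment steps of the example above concrete, here is a minimal sketch on made-up toy data. The analyte names (g1 to g3), sample names (s1 to s5) and label values are invented for illustration; only the filter expression and the `jdf.loc` selection are taken from the example. It assumes, as the example implies, that analytes are rows and samples are columns in the analyte table, and that the journal holds one row per label and one column per sample.

```python
import numpy as np
import pandas as pd

# Toy analyte table (invented values): rows are analytes, columns are samples
adf = pd.DataFrame(
    [[1.0, 2.0, 3.0, 7.0],
     [7.0, 7.0, 7.0, 7.0],   # constant row: zero standard deviation
     [3.0, 1.0, 2.0, 7.0]],
    index=['g1', 'g2', 'g3'],
    columns=['s1', 's2', 's3', 's4'],  # s4 is a constant column
)

# Same filter as in the example: 1/std is infinite exactly where std is zero.
# The division by zero is intentional, so the NumPy warning is silenced here.
with np.errstate(divide='ignore'):
    keep_rows = np.inf != np.abs(1.0 / np.std(adf.values, 1))
    keep_cols = np.inf != np.abs(1.0 / np.std(adf.values, 0))
adf_clean = adf.iloc[keep_rows, keep_cols].copy()

# Toy journal (invented values): one row per label, one column per sample.
# It may contain samples (s4, s5) that are no longer in the analyte table.
jdf = pd.DataFrame(
    [['Cancer', 'Cancer', 'Healthy', 'Healthy', 'Cancer'],
     ['A', 'B', 'A', 'B', 'A']],
    index=['Disease', 'Cell-line'],
    columns=['s1', 's2', 's3', 's4', 's5'],
)

# Keep only the journal columns that match the remaining samples, in order
jdf_aligned = jdf.loc[:, adf_clean.columns.values]

print(adf_clean)
print(jdf_aligned)
```

The constant analyte g2 and the constant sample s4 are dropped, and the journal is reduced to the same samples in the same order as the filtered analyte table.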

or just call it using the default values:

import pandas as pd
import numpy  as np

if __name__ == '__main__' :
    from biocartograph.quantification import full_mapping
    #
    adf = pd.read_csv('analytes.tsv',sep='\t',index_col=0)
    #
    adf = adf.iloc[ np.inf != np.abs( 1.0/np.std(adf.values,1) ) ,
                    np.inf != np.abs( 1.0/np.std(adf.values,0) ) ].copy()
    jdf = pd.read_csv('journal.tsv',sep='\t',index_col=0)
    jdf = jdf.loc[:,adf.columns.values]
    #
    alignment_label , sample_label = 'Disease' , None
    add_labels = ['Cell-line']
    #
    results = full_mapping ( adf , jdf  ,
        bVerbose = True                 ,
        n_clusters = [40,80,120]        ,
        add_labels = add_labels         ,
        alignment_label = alignment_label )
    #
    map_analytes        = results[0]
    map_samples         = results[1]
    hierarchy_analytes  = results[2]
    hierarchy_samples   = results[3]
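
Since the call relies on `alignment_label` and `add_labels` being present in the journal, it can help to check the inputs first. The following is a sketch only: `missing_journal_entries` is a hypothetical helper, not part of biocartograph, and it assumes the journal layout implied by the example (labels as rows, samples as columns). The toy values are invented.

```python
import pandas as pd

def missing_journal_entries(adf, jdf, labels):
    """Hypothetical helper (not part of biocartograph).

    Returns the labels that are absent from the journal rows and the
    analyte-table samples that are absent from the journal columns.
    """
    missing_labels = [label for label in labels if label not in jdf.index]
    missing_samples = [s for s in adf.columns if s not in jdf.columns]
    return missing_labels, missing_samples

# Invented toy data: two analytes, two samples, journal with one label row
adf = pd.DataFrame([[1.0, 2.0], [3.0, 5.0]],
                   index=['g1', 'g2'], columns=['s1', 's2'])
jdf = pd.DataFrame([['Cancer', 'Healthy']],
                   index=['Disease'], columns=['s1', 's2'])

missing_labels, missing_samples = missing_journal_entries(
    adf, jdf, ['Disease', 'Cell-line'])
print(missing_labels, missing_samples)
```

In this toy case the journal lacks the 'Cell-line' row, so that label is reported as missing, while all samples are accounted for.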

Plotting the information of the analyte map yields: [Figure: Cancer Disease Example]

You can also run an alternative algorithm, in which the UMAP coordinates are used directly for clustering, by setting bUseUmap = True

    results = full_mapping ( adf , jdf  ,
        bVerbose = True                 ,
        bUseUmap = True                 ,
        n_clusters = [40,80,120]        ,
        add_labels = add_labels         ,
        alignment_label = alignment_label )

with the following results.

Download the zip and open the html index:

chromium index.html

Other generated solutions

The clustering visualisations were created using Biocartograph and hvplot:

Which groupings correspond to the biomarker variance that describes them? Here are two visualisations of that:

Diseases: cancers, biocartograph gfa Reactome enrichments, biocartograph gfa cluster enrichments, biocartograph treemap cluster 61

Tissues: tissues

Single Cells: single cells, biocartograph gfa enrichment, biocartograph treemap cluster 47

Blood Cells: blood cells, biocartograph gfa enrichment, biocartograph treemap cluster 2

TCGA-BRCA: Calculated using Biocartograph and a TCGA-derived data set, with the results for Breast Cancer mRNA-seq

