
BioLearns: Computational Biology and Bioinformatics Toolbox in Python


biolearns

Website: http://biolearns.com


Installation

  • From PyPI
pip install biolearns -U

Documentation and Tutorials

  • Three examples are shown below. For the full list of tutorials, see our GitHub wiki page:

    Wiki

Disclaimer

Please note that this is a pre-release version of BioLearns, which is still undergoing final testing before its official release. The website, its software, and all content found on it are provided on an "as is" and "as available" basis. BioLearns does not give any warranties, whether express or implied, as to the suitability or usability of the website, its software, or any of its content. BioLearns will not be liable for any loss, whether direct, indirect, special, or consequential, suffered by any party as a result of their use of the libraries or content. Any use of the libraries is at the user's own risk, and the user is solely responsible for any damage to any computer system or loss of data that results from such use. Should you encounter any bugs, glitches, missing functionality, or other problems on the website, please let us know immediately so we can fix them. Your help in this regard is greatly appreciated.

1. Read TCGA Data

Example: Read TCGA Breast invasive carcinoma (BRCA) data

Data is downloaded directly from https://gdac.broadinstitute.org/. The results here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.

from biolearns.dataset import TCGA
brca = TCGA('BRCA')
mRNAseq = brca.mRNAseq
clinical = brca.clinical

TCGA cancer table shortcut:

     Barcode    Cancer full name                                   Version
1    ACC        Adrenocortical carcinoma                           2016_01_28
2    BLCA       Bladder urothelial carcinoma                       2016_01_28
3    BRCA       Breast invasive carcinoma                          2016_01_28
4    CESC       Cervical and endocervical cancers                  2016_01_28
5    CHOL       Cholangiocarcinoma                                 2016_01_28
6    COAD       Colon adenocarcinoma                               2016_01_28
7    COADREAD   Colorectal adenocarcinoma                          2016_01_28
8    DLBC       Lymphoid Neoplasm Diffuse Large B-cell Lymphoma    2016_01_28
9    ESCA       Esophageal carcinoma                               2016_01_28
...  ...        ...                                                ...

2. Gene Co-expression Analysis

We first download and access the mRNAseq data.

from biolearns.dataset import TCGA

brca = TCGA('BRCA')
mRNAseq = brca.mRNAseq

mRNAseq data is noisy. We first filter out the 50% of genes with the lowest mean expression, then filter out the 50% of the remaining genes with the lowest variance.

from biolearns.preprocessing import expression_filter
mRNAseq = expression_filter(mRNAseq, meanq=0.5, varq=0.5)
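The two-step filter described above can be sketched in plain pandas. This is a hypothetical stand-in, not the library's implementation: the function name quantile_filter, the use of strict quantile thresholds, and the toy data are all assumptions made for illustration. It assumes rows are genes and columns are samples.

```python
import numpy as np
import pandas as pd


def quantile_filter(expr: pd.DataFrame, meanq: float = 0.5, varq: float = 0.5) -> pd.DataFrame:
    """Hypothetical stand-in for expression_filter (rows = genes, columns = samples)."""
    # Step 1: keep genes whose mean expression lies above the meanq quantile.
    gene_mean = expr.mean(axis=1)
    kept = expr.loc[gene_mean > gene_mean.quantile(meanq)]
    # Step 2: among the remaining genes, keep those whose variance lies above the varq quantile.
    gene_var = kept.var(axis=1)
    return kept.loc[gene_var > gene_var.quantile(varq)]


# Toy expression matrix: 100 made-up genes measured in 8 made-up samples.
rng = np.random.default_rng(0)
toy = pd.DataFrame(
    rng.gamma(shape=2.0, scale=3.0, size=(100, 8)),
    index=[f"gene{i}" for i in range(100)],
    columns=[f"sample{j}" for j in range(8)],
)
filtered = quantile_filter(toy)
print(toy.shape, "->", filtered.shape)
```

Applying both filters at 0.5 leaves roughly a quarter of the original genes, which is worth keeping in mind when choosing meanq and varq.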

We then use the lmQCM class to create an lmQCM object, lobj.

The gene co-expression analysis is performed by simply calling the fit() function.

from biolearns.coexpression import lmQCM

lobj = lmQCM(mRNAseq)
clusters, genes, eigengene_mat = lobj.fit()
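The name eigengene_mat suggests one summary profile per cluster. A common definition of an eigengene is the first principal component of a cluster's standardised expression matrix. The sketch below illustrates that definition with NumPy on made-up data; whether lmQCM computes its eigengenes in exactly this way is an assumption, and first_pc_eigengene is a hypothetical helper.

```python
import numpy as np


def first_pc_eigengene(cluster_expr: np.ndarray) -> np.ndarray:
    """Return the first principal component across samples (rows = genes)."""
    # Standardise each gene so that no single gene dominates the component.
    centered = cluster_expr - cluster_expr.mean(axis=1, keepdims=True)
    scaled = centered / centered.std(axis=1, keepdims=True)
    # Rows of vt are the right singular vectors, one value per sample.
    _, _, vt = np.linalg.svd(scaled, full_matrices=False)
    return vt[0]


rng = np.random.default_rng(0)
signal = rng.normal(size=20)                    # shared pattern over 20 toy samples
weights = np.array([1.0, 0.8, 1.2, 0.9, 1.1])   # 5 toy genes in one cluster
toy_cluster = weights[:, None] * signal + 0.1 * rng.normal(size=(5, 20))

eig = first_pc_eigengene(toy_cluster)
# The sign of a principal component is arbitrary, so compare absolute correlation.
print(abs(np.corrcoef(eig, signal)[0, 1]))
```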

3. Univariate survival analysis

We first download and access the mRNAseq data, using breast cancer as an example.

from biolearns.dataset import TCGA

brca = TCGA('BRCA')
mRNAseq = brca.mRNAseq

We import logranktest from the survival subpackage and choose the gene "ABLIM3" as the univariate input.

from biolearns.survival import logranktest

r = mRNAseq.loc['ABLIM3'].values

We find the intersection of the univariate, time, and event data:

import numpy as np

clinical = brca.clinical
# The first 12 characters of a TCGA barcode identify the patient.
bcd_m = [b[:12] for b in mRNAseq.columns]
bcd_p = [b[:12] for b in clinical.index]
# idx_m and idx_p point at the first occurrence of each shared patient, in the same (sorted) order.
bcd, idx_m, idx_p = np.intersect1d(bcd_m, bcd_p, return_indices=True)

r = r[idx_m]
t = brca.overall_survival_time[idx_p]
e = brca.overall_survival_event[idx_p]
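The alignment step is easy to get wrong, because the expression columns and the clinical rows are generally in different orders and cover different patients. A minimal sketch on made-up barcodes (none of these identifiers are real TCGA samples) shows how np.intersect1d with return_indices=True yields index arrays that put both data sources in the same patient order:

```python
import numpy as np

# Made-up sample-level barcodes (expression columns) and their values.
sample_barcodes = np.array([
    "TCGA-AA-0001-01A-11R-0000-07",
    "TCGA-AA-0002-01A-11R-0000-07",
    "TCGA-AA-0003-01A-11R-0000-07",
])
expression = np.array([1.5, 2.5, 3.5])

# Made-up patient-level barcodes (clinical rows), in a different order.
patient_barcodes = np.array(["TCGA-AA-0003", "TCGA-AA-0001", "TCGA-ZZ-0009"])
survival_time = np.array([100.0, 200.0, 300.0])

# Truncate sample barcodes to the 12-character patient prefix.
short = np.array([b[:12] for b in sample_barcodes])

# common is sorted; idx_s and idx_p locate each common patient in its source array.
common, idx_s, idx_p = np.intersect1d(short, patient_barcodes, return_indices=True)

r_aligned = expression[idx_s]
t_aligned = survival_time[idx_p]
for patient, value, time in zip(common, r_aligned, t_aligned):
    print(patient, value, time)
```

When a patient has several samples, return_indices keeps only the first occurrence, so any preference between sample types has to be applied before this step.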

We perform log-rank test:

valid = ~np.isnan(t)  # drop patients with a missing survival time
logrank_results, fig = logranktest(r[valid], t[valid], e[valid])
test_statistic, p_value = logrank_results.test_statistic, logrank_results.p_value
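For readers who want to see what a log-rank test computes, here is a minimal two-group sketch in NumPy. It is not the logranktest implementation: the helper logrank_two_groups is hypothetical, the median split of the expression values into "high" and "low" groups is an assumption about how a continuous input might be dichotomised, and the data are made up.

```python
import math

import numpy as np


def logrank_two_groups(time, event, group):
    """Two-sample log-rank test; group is a boolean array (True = group 1)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event)
    group = np.asarray(group, dtype=bool)
    observed = expected = variance = 0.0
    for t in np.unique(time[event == 1]):       # loop over distinct event times
        at_risk = time >= t
        n = at_risk.sum()                       # subjects at risk, both groups
        n1 = (at_risk & group).sum()            # subjects at risk, group 1
        died = (time == t) & (event == 1)
        d = died.sum()                          # events at time t, both groups
        observed += (died & group).sum()
        expected += d * n1 / n                  # expected events in group 1
        if n > 1:                               # hypergeometric variance term
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    statistic = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Toy data: high expressers fail early, low expressers fail late.
toy_r = np.array([9.0, 8.0, 7.0, 1.0, 2.0, 3.0])
toy_t = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
toy_e = np.array([1, 1, 1, 1, 1, 1])

high = toy_r > np.median(toy_r)                 # assumed median split
statistic, p_value = logrank_two_groups(toy_t, toy_e, high)
print(statistic, p_value)
```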

The output figure is returned as fig.


Files for biolearns, version 0.0.60
  • biolearns-0.0.60-py3-none-any.whl (38.6 kB), Wheel, Python version py3
  • biolearns-0.0.60.tar.gz (24.1 kB), Source
