BioLearns: Computational Biology and Bioinformatics Toolbox in Python

biolearns

BioLearns: Computational Biology and Bioinformatics Toolbox in Python http://biolearns.com

Installation

  • From PyPI
pip install biolearns

Documentation and Tutorials

  • Three examples are shown below. For the full list of tutorials, see our GitHub wiki page:

    Wiki

1. Read TCGA Data

Example: Read TCGA Breast invasive carcinoma (BRCA) data

Data is downloaded directly from https://gdac.broadinstitute.org/. The results here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.

from biolearns.dataset.TCGA import TCGACancer
brca = TCGACancer('BRCA')
mRNAseq = brca.mRNAseq
clinical = brca.clinical

TCGA cancer table shortcut:

     Barcode    Cancer full name                                   Version
1    ACC        Adrenocortical carcinoma                           2016_01_28
2    BLCA       Bladder urothelial carcinoma                       2016_01_28
3    BRCA       Breast invasive carcinoma                          2016_01_28
4    CESC       Cervical and endocervical cancers                  2016_01_28
5    CHOL       Cholangiocarcinoma                                 2016_01_28
6    COAD       Colon adenocarcinoma                               2016_01_28
7    COADREAD   Colorectal adenocarcinoma                          2016_01_28
8    DLBC       Lymphoid Neoplasm Diffuse Large B-cell Lymphoma    2016_01_28
9    ESCA       Esophageal carcinoma                               2016_01_28
...  ...        ...                                                ...

2. Gene Co-expression Analysis

We first download and load the mRNAseq data.

from biolearns.dataset.TCGA import TCGACancer

brca = TCGACancer('BRCA')
mRNAseq = brca.mRNAseq

mRNAseq data is noisy. We first discard the 50% of genes with the lowest mean expression, then discard the 50% of the remaining genes with the lowest variance.

from biolearns.preprocessing.filter import expression_filter
mRNAseq = expression_filter(mRNAseq, meanq = 0.5, varq = 0.5)
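To make the two filtering steps concrete, here is a minimal pandas sketch of a mean-then-variance quantile filter on toy data. It is not the biolearns implementation: the function name filter_by_mean_then_variance, the strict ">" comparison, and the toy matrix are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

def filter_by_mean_then_variance(expr, meanq=0.5, varq=0.5):
    """Hypothetical helper: rows are genes, columns are samples."""
    # Step 1: keep genes whose mean expression exceeds the meanq quantile.
    gene_mean = expr.mean(axis=1)
    expr = expr.loc[gene_mean > gene_mean.quantile(meanq)]
    # Step 2: among the survivors, keep genes whose variance exceeds the varq quantile.
    gene_var = expr.var(axis=1)
    return expr.loc[gene_var > gene_var.quantile(varq)]

# Toy expression matrix (made-up values): 100 genes x 8 samples.
rng = np.random.default_rng(0)
toy = pd.DataFrame(
    rng.gamma(2.0, 2.0, size=(100, 8)),
    index=[f"gene{i}" for i in range(100)],
    columns=[f"sample{j}" for j in range(8)],
)
filtered = filter_by_mean_then_variance(toy, meanq=0.5, varq=0.5)
print(toy.shape, "->", filtered.shape)
```

With both quantiles at 0.5, roughly a quarter of the genes remain after the two steps.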

We then create an lmQCM object, lobj, from the filtered data. Calling its fit() method runs the gene co-expression analysis.

from biolearns.coexpression.lmQCM import lmQCM

lobj = lmQCM(mRNAseq)
clusters, genes, eigengene_mat = lobj.fit()
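For readers new to co-expression analysis, the sketch below shows the kind of gene-by-gene correlation matrix that co-expression methods generally start from. It is not lmQCM and says nothing about how lobj.fit() works internally; the toy genes and the choice of Pearson correlation are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (made up): geneA and geneB share a signal, geneC is independent noise.
rng = np.random.default_rng(1)
shared_signal = rng.normal(size=20)
toy = pd.DataFrame(
    {
        "geneA": shared_signal + rng.normal(scale=0.1, size=20),
        "geneB": shared_signal + rng.normal(scale=0.1, size=20),
        "geneC": rng.normal(size=20),
    }
).T  # transpose so that rows are genes and columns are samples

# DataFrame.corr correlates columns, so transpose back to get gene-gene correlations.
gene_corr = toy.T.corr()
print(gene_corr.round(2))
```

Genes that move together across samples (here geneA and geneB) have a correlation close to 1, and a clustering method would tend to group them into the same module.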

3. Univariate survival analysis

We first download and load the mRNAseq and clinical data, using breast cancer as an example.

from biolearns.dataset.TCGA import TCGACancer

brca = TCGACancer('BRCA')
mRNAseq = brca.mRNAseq
clinical = brca.clinical

We import logranktest from the survival subpackage and choose the gene "ABLIM3" as the single input variable.

from biolearns.survival import logranktest

r = mRNAseq.loc['ABLIM3',].values

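We align the expression, survival time, and event data on the patients they share. Barcodes are truncated to their first 12 characters (the patient-level part of a TCGA barcode) before matching.

import numpy as np

bcd_m = [b[:12] for b in mRNAseq.columns]
bcd_p = [b[:12] for b in clinical.index]
# return_indices gives the position of each shared barcode in bcd_m and bcd_p
# (first occurrence only), so r, t, and e end up in the same patient order.
bcd, idx_m, idx_p = np.intersect1d(bcd_m, bcd_p, return_indices=True)

r = r[idx_m]
# Assumes the survival arrays are ordered like clinical.index.
t = brca.overall_survival_time[idx_p]
e = brca.overall_survival_event[idx_p]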
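The barcode alignment step can be checked on toy data. This is a minimal sketch assuming np.intersect1d with return_indices=True; the barcodes and values below are made up and stand in for mRNAseq.columns, clinical.index, and the survival arrays.

```python
import numpy as np

# Made-up sample-level barcodes (expression side) and patient-level barcodes (clinical side).
sample_barcodes = np.array(["TCGA-AA-0003-01A", "TCGA-AA-0001-01A", "TCGA-AA-0002-01A"])
patient_barcodes = np.array(["TCGA-AA-0001", "TCGA-AA-0003", "TCGA-AA-0004"])
expr = np.array([3.0, 1.0, 2.0])     # one value per sample barcode
time = np.array([10.0, 30.0, 40.0])  # one value per patient barcode

# Truncate to the 12-character patient part, then intersect and keep the positions.
short = np.array([b[:12] for b in sample_barcodes])
shared, idx_expr, idx_clin = np.intersect1d(short, patient_barcodes, return_indices=True)

aligned_expr = expr[idx_expr]
aligned_time = time[idx_clin]
print(shared)        # shared patients, in sorted order
print(aligned_expr)  # expression values in that same order
print(aligned_time)  # survival times in that same order
```

Both aligned arrays follow the sorted order of the shared barcodes, so element i of each refers to the same patient.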

We then run the log-rank test, dropping patients with a missing survival time:

valid = ~np.isnan(t)  # patients with a recorded survival time
logrank_results, fig = logranktest(r[valid], t[valid], e[valid])
test_statistic, p_value = logrank_results.test_statistic, logrank_results.p_value

[Figure: output plot returned by logranktest as fig]
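For intuition about what a log-rank test measures, here is a self-contained two-group sketch on toy data. It is not the biolearns implementation: the source does not state how logranktest turns a continuous input into groups, so the median split, the function name two_group_logrank, and all values below are assumptions.

```python
import math
import numpy as np

def two_group_logrank(time, event, group):
    """Hypothetical helper: chi-square log-rank statistic (1 d.o.f.) and its p-value."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)
    observed_minus_expected = 0.0
    variance = 0.0
    for t in np.unique(time[event]):          # each distinct event time
        at_risk = time >= t
        n = at_risk.sum()                     # patients at risk overall
        n1 = (at_risk & group).sum()          # patients at risk in group 1
        died = (time == t) & event
        d = died.sum()                        # events overall at time t
        d1 = (died & group).sum()             # events in group 1 at time t
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:                             # hypergeometric variance term
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    stat = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Toy data (made up): higher expression goes with longer survival, no censoring.
expr = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
time = np.array([2, 3, 4, 5, 9, 10, 11, 12], dtype=float)
event = np.ones(8, dtype=bool)

high = expr > np.median(expr)                 # assumed median split
stat, p_value = two_group_logrank(time, event, high)
print(stat, p_value)
```

A small p-value indicates that the survival curves of the two groups differ more than chance alone would explain.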
