Implementation of the TCGA purification protocol

TCGA paired purification

The tcga_paired_purification package lets researchers take common paired experiments (i.e. cancer samples paired with normal samples) and use them as a basis for removing contamination from non-cancerous components. The package was written specifically to deal with normal-tissue contamination in TCGA datasets.

Download Package

Install the tcga_paired_purification package with pip, either from GitHub:

pip install git+https://github.com/jeffliu6068/tcga_paired_purification.git

or from PyPI:

pip install tcga_paired_purification

Import

Once installed, import the package:

import tcga_paired_purification

Intuition: How DEA Works to Identify Differentially Expressed Genes

The package takes into account three separate pieces of information:

  1. Contaminated expression data (e.g. TCGA cancer RNA-seq)
  2. Mean and standard deviation of the normal expression distribution for each gene
  3. Purity of each contaminated sample (e.g. estimated from copy number variation or histological assessment)

Given the proportion of normal versus cancer cells in each sample, estimated for example from copy number variation, the normal contribution can be removed from the paired cancer expression data, as shown on the TCGA dataset.
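To make the intuition concrete, here is a minimal sketch under a simple linear mixing assumption: observed = purity × cancer + (1 − purity) × normal. This model and the helper name linear_unmix are assumptions made for illustration. They are not taken from the package, whose actual method also uses the standard deviation of the normal distribution and may differ substantially.

```python
import numpy as np
import pandas as pd


def linear_unmix(observed, normal_mean, purity):
    """Hypothetical helper: solve observed = p * cancer + (1 - p) * normal for cancer.

    observed:    DataFrame, genes (rows) x samples (columns)
    normal_mean: Series of per-gene mean normal expression
    purity:      Series of per-sample tumour purity in (0, 1]
    """
    # Align the per-gene and per-sample inputs to the expression matrix.
    normal_mean = normal_mean.reindex(observed.index)
    purity = purity.reindex(observed.columns)

    # Expected normal contribution for every gene/sample pair.
    normal_part = np.outer(normal_mean.to_numpy(), (1.0 - purity).to_numpy())

    # Remove the normal part, then rescale by purity (broadcast across columns).
    cancer = (observed.to_numpy() - normal_part) / purity.to_numpy()

    # Expression cannot be negative, so clip at zero.
    result = pd.DataFrame(cancer, index=observed.index, columns=observed.columns)
    return result.clip(lower=0.0)


# Toy data, invented for this example.
observed = pd.DataFrame(
    {"S1": [10.0, 4.0], "S2": [8.0, 6.0]}, index=["GENE_A", "GENE_B"]
)
normal_mean = pd.Series([2.0, 6.0], index=["GENE_A", "GENE_B"])
purity = pd.Series([0.8, 0.5], index=["S1", "S2"])

estimated_cancer = linear_unmix(observed, normal_mean, purity)
print(estimated_cancer)
```

In this toy case, GENE_A in sample S1 is estimated at 12.0, since 0.8 × 12.0 + 0.2 × 2.0 reproduces the observed value of 10.0.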

Available Tools in the tcga_paired_purification Package

tcga_paired_purification

import tcga_paired_purification as tpp

purified_df = tpp.tcga_paired_purification(input_data_cancer, input_data_normal, purity_df)

  1. input_data_cancer is the input dataframe of cancer expression, with genes as rows and samples as columns

  2. input_data_normal is a dataframe holding the mean and standard deviation of the paired normal distribution for each gene, in columns named 'mean' and 'std'

  3. purity_df is a dataframe with samples as rows and purity as the column
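The three inputs described above could be prepared with pandas roughly as follows. This is a sketch on toy data: the sample and gene labels, the 'purity' column name, and the use of the sample standard deviation (pandas default, ddof=1) are assumptions, since the source only specifies the overall shapes and the 'mean' and 'std' column names.

```python
import pandas as pd

# Toy cancer expression: genes (rows) x samples (columns).
input_data_cancer = pd.DataFrame(
    {"TCGA-01": [5.0, 2.0], "TCGA-02": [7.0, 1.0]},
    index=["GENE_A", "GENE_B"],
)

# Toy paired-normal expression for the same genes.
normal_expression = pd.DataFrame(
    {"N-01": [1.0, 3.0], "N-02": [3.0, 5.0]},
    index=["GENE_A", "GENE_B"],
)

# Per-gene summary of the normal distribution, with 'mean' and 'std' columns.
input_data_normal = pd.DataFrame(
    {
        "mean": normal_expression.mean(axis=1),
        "std": normal_expression.std(axis=1),  # sample std (ddof=1), an assumption
    }
)

# Samples (rows) x purity (column); the column name 'purity' is hypothetical.
purity_df = pd.DataFrame({"purity": [0.8, 0.6]}, index=["TCGA-01", "TCGA-02"])

# Basic alignment checks before calling the package.
assert list(input_data_normal.index) == list(input_data_cancer.index)
assert list(purity_df.index) == list(input_data_cancer.columns)

# With the package installed, the call from the section above would then be:
# import tcga_paired_purification as tpp
# purified_df = tpp.tcga_paired_purification(input_data_cancer, input_data_normal, purity_df)
```

Checking that gene and sample labels line up across the three dataframes before the call helps catch mismatched identifiers early.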

Authors

  • Ta-Chun (Jeff) Liu - jeffliu6068
  • Sir Walter Fred Bodmer FRS FRSE - Supervision

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgments

  • Hat tip to anyone whose code was used
  • Inspiration: Thank you to all who have contributed ideas and expertise to make this possible. Let's advance science together.

This version

0.1
