Implementation of the TCGA purification protocol
TCGA paired purification
The tcga_paired_purification package lets researchers input common paired experiments (i.e. cancer samples paired with normal samples) and use them as a basis for removing contamination from non-cancerous components. The package was written specifically to deal with normal contamination in TCGA datasets.
Download Package
Download the tcga_paired_purification package by:
pip install git+https://github.com/jeffliu6068/tcga_paired_purification.git
or
pip install tcga_paired_purification
Import
Once installed, import the package by:
import tcga_paired_purification
Intuition: How the Purification Works
The package takes three separate pieces of information into account:
- Contaminated expression data (e.g. TCGA cancer RNA-seq)
- Means and standard deviations of the distributions of the normal data
- Purity of the contaminated data (e.g. estimated from copy number variation or histological assessment)
Given the proportion of cancer versus normal cells in each sample (estimated, for example, from copy number variation), the normal contribution can be removed from the paired cancer expression data, as in the TCGA datasets.
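To make the intuition concrete, here is a minimal sketch that assumes a simple linear mixing model: observed expression = purity x cancer expression + (1 - purity) x mean normal expression. This model and the helper name remove_normal_component are assumptions made for illustration only. The README does not state the exact formula the package uses, and the package may also draw on the normal standard deviation, which this sketch ignores.

```python
import numpy as np
import pandas as pd


def remove_normal_component(observed, normal_mean, purity):
    """Hypothetical helper (not part of the package).

    Solves observed = p * cancer + (1 - p) * normal_mean for cancer,
    gene by gene and sample by sample, under an assumed linear mixing model.
    """
    # Align purity to the sample columns and normal means to the gene rows.
    p = purity.reindex(observed.columns).to_numpy()
    mu = normal_mean.reindex(observed.index).to_numpy()
    # Outer product gives the normal contribution per (gene, sample).
    normal_part = np.outer(mu, 1.0 - p)
    cancer = (observed.to_numpy() - normal_part) / p
    return pd.DataFrame(cancer, index=observed.index, columns=observed.columns)


# Toy data: genes (rows) x samples (columns), values chosen so that the
# underlying cancer expression is 10 for GENE_A and 1 for GENE_B.
observed = pd.DataFrame(
    {"s1": [6.0, 3.0], "s2": [8.4, 1.8]},
    index=["GENE_A", "GENE_B"],
)
normal_mean = pd.Series({"GENE_A": 2.0, "GENE_B": 5.0})
purity = pd.Series({"s1": 0.5, "s2": 0.8})

purified = remove_normal_component(observed, normal_mean, purity)
print(purified)
```

Under this assumed model, a sample with low purity gets a larger correction, because more of its observed signal is attributed to normal tissue.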
Available Tools in the tcga_paired_purification Package
tcga_paired_purification
import tcga_paired_purification as tpp
purified_df = tpp.tcga_paired_purification(input_data_cancer, input_data_normal, purity_df)
- input_data_cancer is the input dataframe with genes (rows) x samples (columns)
- input_data_normal is a dataframe holding the mean and standard deviation of the paired normal distribution for each gene, in columns 'mean' and 'std'
- purity_df is a dataframe with samples (rows) x purity (column)
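The three inputs described above could be laid out roughly as follows. This is a sketch with toy values: the gene and sample labels, the purity column name "purity", the use of gene names as the index of input_data_normal, and purity expressed as a fraction between 0 and 1 are all assumptions, since the README only specifies the row/column orientation and the 'mean' and 'std' column names.

```python
import pandas as pd

# Cancer expression: genes (rows) x samples (columns).
input_data_cancer = pd.DataFrame(
    {"sample_1": [6.0, 3.0], "sample_2": [8.4, 1.8]},
    index=["GENE_A", "GENE_B"],
)

# Paired normal distribution per gene, with the documented 'mean' and 'std' columns.
# Indexing by gene name is an assumption.
input_data_normal = pd.DataFrame(
    {"mean": [2.0, 5.0], "std": [0.5, 1.0]},
    index=["GENE_A", "GENE_B"],
)

# Purity: samples (rows) x purity (column). The column name is an assumption.
purity_df = pd.DataFrame(
    {"purity": [0.5, 0.8]},
    index=["sample_1", "sample_2"],
)

# Basic consistency checks before handing the data to the package.
assert input_data_cancer.index.equals(input_data_normal.index)
assert set(input_data_cancer.columns) == set(purity_df.index)
assert purity_df["purity"].between(0, 1).all()

# With the package installed, the documented call would then be:
# import tcga_paired_purification as tpp
# purified_df = tpp.tcga_paired_purification(input_data_cancer, input_data_normal, purity_df)
```

Checking that gene and sample labels line up across the three dataframes before the call helps catch mismatches early.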
Authors
- Ta-Chun (Jeff) Liu - jeffliu6068
- Sir Walter Fred Bodmer FRS FRSE - Supervision
License
This project is licensed under the MIT License - see the LICENSE.md file for details
Acknowledgments
- Hat tip to anyone whose code was used
- Inspiration: Thank you to all who have contributed ideas and expertise to make this possible. Let's advance science together.