A package from the NCI-CPTAC DREAM Proteogenomics Challenge

proteo_estimator

Overview

We present the first data science competition aimed at predicting protein levels from copy number and transcript levels, and phosphorylation levels from protein levels. On new patient samples, the winning models outperform both standard baseline machine learning methods and the simple approach of using transcript levels as a proxy for protein levels. An in-depth analysis revealed associations between commonly predictive genes and essentiality. We provide all submitted models to the community for reuse, along with a web application for exploring the results of this challenge, to support improved large-scale proteogenomic characterization of tumor samples and a better understanding of signaling deregulation.
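To make the "transcript level as proxy" baseline concrete, here is a minimal sketch that scores it gene by gene. The use of Pearson correlation as the score, the function names, and the toy values are all assumptions made for illustration; they are not taken from the package or from the challenge's evaluation code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def transcript_proxy_baseline(rna, protein):
    """Score the baseline that predicts each protein level by its transcript level.

    rna, protein: dicts mapping gene -> list of values over the same samples.
    Returns gene -> correlation between transcript and measured protein levels.
    """
    return {gene: pearson(rna[gene], protein[gene]) for gene in rna if gene in protein}


# Toy values (hypothetical): GENE_A tracks its transcript, GENE_B does not.
rna = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
protein = {"GENE_A": [2.0, 4.0, 6.0, 8.0], "GENE_B": [4.0, 3.0, 2.0, 1.0]}
scores = transcript_proxy_baseline(rna, protein)
```

A trained model would be compared against these per-gene scores on held-out samples; genes where the baseline scores poorly are the ones where a learned predictor has the most room to help.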

Installation

For development release:

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.python.org/pypi proteo_estimator

For production release:

pip install proteo_estimator

Requires Python 3.

Usage

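import proteo_estimator as pr

# Placeholder inputs; replace with your own values. All paths must be absolute.
tumor = "breast"                  # 'breast' or 'ovarian'
rna = "/path/to/rna.tsv"          # TSV table, genes x samples
cna = "/path/to/cna.tsv"          # TSV table, genes x samples
protein = "/path/to/protein.tsv"  # TSV table, genes x samples
output_dir = "/path/to/output"    # prediction.tsv and confidence.tsv are written here

# Subchallenge 2: predicting protein levels from copy number and transcript levels
prediction_file_protein = pr.predict_protein_abundances(
    tumor,
    rna,
    cna,
    output_dir,
    logging=True)

# Subchallenge 3: predicting phospho levels from protein abundance and transcript levels
prediction_file_phospho = pr.predict_phospho(
    tumor,
    rna,
    protein,
    output_dir,
    logging=True)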

predict_protein_abundances

Arguments

| Parameter | Default | Type | Description |
| --- | --- | --- | --- |
| tumor | | str | Tumor type; options are 'breast' and 'ovarian' |
| rna | | str | Absolute file path of the RNA table. The table must be a TSV file of genes x samples |
| cna | | str | Absolute file path of the CNA table. The table must be a TSV file of genes x samples |
| output_dir | | str | Absolute path of the output directory. The prediction table and confidence scores are saved in this directory as prediction.tsv and confidence.tsv |
| logging | True | bool | Print progress to stdout |
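The constraints in the argument list (allowed tumor types, absolute paths, tab-separated tables) can be checked before starting a prediction. The following is a minimal sketch using only the standard library; `check_inputs` and `VALID_TUMORS` are hypothetical names and are not part of proteo_estimator.

```python
import csv
import os

# Tumor types listed in the argument description above.
VALID_TUMORS = {"breast", "ovarian"}


def check_inputs(tumor, table_paths, output_dir):
    """Return a list of problems found; an empty list means the inputs look usable."""
    problems = []
    if tumor not in VALID_TUMORS:
        problems.append(f"tumor must be one of {sorted(VALID_TUMORS)}, got {tumor!r}")
    for path in list(table_paths) + [output_dir]:
        if not os.path.isabs(path):
            problems.append(f"path is not absolute: {path}")
    for path in table_paths:
        if not os.path.isfile(path):
            problems.append(f"table not found: {path}")
            continue
        with open(path, newline="") as handle:
            rows = list(csv.reader(handle, delimiter="\t"))
        widths = {len(row) for row in rows}
        # Expect a header plus at least one data row, every row the same
        # width, and more than one column (i.e. tabs are actually present).
        if len(rows) < 2 or len(widths) != 1 or widths == {1}:
            problems.append(f"not a rectangular tab-separated table: {path}")
    return problems
```

Calling `check_inputs(tumor, [rna, cna], output_dir)` and stopping when the returned list is non-empty surfaces a mistyped path or tumor type before any prediction work begins.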

Return Value

| Output | Type | Description |
| --- | --- | --- |
| prediction_file | str | Path to a tab-separated file of predicted protein levels, shaped genes x samples. The file is saved as prediction.tsv in the directory passed to the parameter "output_dir" |
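The returned path can be loaded with the standard library alone. This sketch assumes the first row of the file holds sample IDs and the first column holds gene IDs, and that missing values appear as empty cells or "NA"; the exact layout of prediction.tsv is not documented here, so treat those details as assumptions. `read_matrix` is a hypothetical helper, not a package function.

```python
import csv


def read_matrix(path):
    """Read a tab-separated genes x samples table.

    Returns (sample_ids, values), where values maps gene ID -> list of floats.
    Assumes a header row of sample IDs and a leading column of gene IDs.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        samples = header[1:]
        values = {}
        for row in reader:
            if not row:
                continue  # skip blank lines
            values[row[0]] = [
                float("nan") if cell in ("", "NA") else float(cell)
                for cell in row[1:]
            ]
    return samples, values
```

For example, `samples, values = read_matrix(prediction_file_protein)` gives the predicted levels for one gene across all samples as `values[gene_id]`.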

predict_phospho

Arguments

| Parameter | Default | Type | Description |
| --- | --- | --- | --- |
| tumor | | str | Tumor type; options are 'breast' and 'ovarian' |
| rna | | str | Absolute file path of the RNA table. The table must be a TSV file of genes x samples |
| protein | | str | Absolute file path of the protein table. The table must be a TSV file of genes x samples |
| output_dir | | str | Absolute path of the output directory. The prediction table and confidence scores are saved in this directory as prediction.tsv and confidence.tsv |
| logging | True | bool | Print progress to stdout |

Return Value

| Output | Type | Description |
| --- | --- | --- |
| prediction_file | str | Path to a tab-separated file of predicted phosphorylation levels, shaped genes x samples. The file is saved as prediction.tsv in the directory passed to the parameter "output_dir" |

Note

Please ensure that the Docker daemon is running in the background. All file paths must be absolute.
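Both requirements in this note can be checked from Python before calling the package. The sketch below assumes the `docker` command-line client is on the PATH and relies on `docker info` exiting with a non-zero status when the daemon cannot be reached; `docker_daemon_running` and `as_absolute` are hypothetical helper names, not part of proteo_estimator.

```python
import os
import shutil
import subprocess


def docker_daemon_running(timeout=10):
    """Return True if `docker info` succeeds, i.e. the daemon is reachable."""
    if shutil.which("docker") is None:
        return False  # Docker CLI is not installed or not on PATH
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def as_absolute(path):
    """Expand '~' and resolve a relative path against the current directory."""
    return os.path.abspath(os.path.expanduser(path))
```

A caller might pass every path through `as_absolute` first and stop with a clear message when `docker_daemon_running()` returns False.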
