
Evaluation of Predictive CapabilitY for ranking biomarker candidates.

Project description


Citing:

Introduction:

This tool was developed to Evaluate the Predictive CapabilitY of each feature to become a biomarker candidate.

Requirements:

  • python3

  • (Optional) virtualenv

Install:

# (Optional) create and activate a virtual environment
python3 -m venv $HOME/.virtualenvs/epcy
source $HOME/.virtualenvs/epcy/bin/activate
# install EPCY from source and check that the command line works
cd [your_epcy_folder]
CFLAGS=-std=c99 pip3 install numpy==1.17.0
python3 setup.py install
epcy -h

Usage:

General:

From source:

cd [your_epcy_folder]
python3 -m epcy -h

After installation:

epcy -h

Generic case:

  • EPCY is designed to work on any quantitative data, provided that the values of each feature are comparable between samples (normalized).

  • To run a comparative analysis, epcy pred needs two tab-separated files:

    • A matrix of normalized quantitative data with one column per sample and an “ID” column to identify each feature.

    • A design table which describes the comparison (see the sketch just below).
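As an illustration, the sketch below (Python with pandas) shows what these two files could look like. Apart from the “ID” column of the matrix and the Query/Ref labels used by EPCY, the column names, sample names and values are assumptions made for this example only.

# Hypothetical sketch of EPCY input files, not taken from the EPCY documentation.
import pandas as pd

# design.tsv: one row per sample; "subgroup" holds the default Query/Ref labels,
# "subgroup2" is an alternative column usable with --subgroup subgroup2 --query A.
# The sample-identifier column name ("sample") is an assumption.
design = pd.DataFrame({
    "sample":    ["S1", "S2", "S3", "S4"],
    "subgroup":  ["Query", "Query", "Ref", "Ref"],
    "subgroup2": ["A", "A", "B", "B"],
})
design.to_csv("design.tsv", sep="\t", index=False)

# exp_matrix.tsv: an "ID" column plus one column of normalized values per sample.
matrix = pd.DataFrame({
    "ID": ["gene1", "gene2", "gene3"],
    "S1": [5.2, 0.1, 7.8],
    "S2": [4.9, 0.3, 8.1],
    "S3": [1.0, 6.4, 7.9],
    "S4": [0.8, 5.9, 8.2],
})
matrix.to_csv("exp_matrix.tsv", sep="\t", index=False)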

# Run epcy on any normalized quantification data
epcy pred -d ./data/small_for_test/design.tsv -m ./data/small_for_test/exp_matrix.tsv -o ./data/small_for_test/default_subgroup
# If your data require a log2 transformation, add --log
epcy pred --log -d ./data/small_for_test/design.tsv -m ./data/small_for_test/exp_matrix.tsv -o ./data/small_for_test/default_subgroup
  • Results will be saved in the predictive_capability.xls file, which is detailed below.

  • You can personalize the design file using --subgroup and --query:

epcy pred_rna -d ./data/small_for_test/design.tsv -m ./data/small_for_test/exp_matrix.tsv -o ./data/small_for_test/subgroup2 --subgroup subgroup2 --query A

Working on RNA sequencing read counts:

  • To run EPCY on read counts that are not normalized, use the pred_rna tool as follows (a rough sketch of the transformation behind --cpm --log is given after the command):

# To run on read counts that are not normalized, add --cpm --log
epcy pred_rna --cpm --log -d ./data/small_for_test/design.tsv -m ./data/small_for_test/exp_matrix.tsv -o ./data/small_for_test/default_subgroup
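For intuition, --cpm --log corresponds roughly to a counts-per-million scaling followed by a log2 transform. The sketch below illustrates that idea in Python; the pseudocount and the exact formula EPCY applies internally are assumptions here.

# Rough illustration of a CPM + log2 transform (not EPCY's actual code).
import numpy as np
import pandas as pd

counts = pd.DataFrame({
    "ID": ["gene1", "gene2"],
    "S1": [120, 3],
    "S2": [95, 7],
}).set_index("ID")

cpm = counts / counts.sum(axis=0) * 1e6  # scale each sample to counts per million
log_cpm = np.log2(cpm + 1)               # log2 with an assumed pseudocount of 1
print(log_cpm)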

Working on kallisto quantification:

  • EPCY can work directly on kallisto quantification using h5 files, to have access to bootstrapped samples. To do so, a kallisto column needs to be added to the design file (to specify, for each sample, the directory path where its abundance.h5 file can be found; see the illustrative sketch after the command) and epcy pred_rna needs to be run as follows:

# To run on kallisto quantification, add --kal (+ --cpm --log)
epcy pred_rna --kal --cpm --log -d ./data/small_leucegene/5_inv16_vs_5/design.tsv -o ./data/small_leucegene/5_inv16_vs_5/
# !!! Be aware that kallisto quantification is at the transcript level, not the gene level
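As an illustration, the kallisto column could be added to the design sketched earlier as shown below; the directory paths are placeholders and, as before, the other column names are assumptions.

# Hypothetical design.tsv with a "kallisto" column giving, for each sample, the
# directory that contains the abundance.h5 file produced by kallisto.
import pandas as pd

design = pd.DataFrame({
    "sample":   ["S1", "S2"],
    "subgroup": ["Query", "Ref"],
    "kallisto": ["./kallisto/S1", "./kallisto/S2"],
})
design.to_csv("design.tsv", sep="\t", index=False)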
  • To run at the gene level, a gff3 file of the genome annotation is needed, to provide the correspondence between transcripts and genes. This file can be downloaded from Ensembl.

# To run on kallisto quantification at the gene level, add --gene --anno [file.gff] (+ --kal --cpm --log)
epcy pred_rna --kal --cpm --log --gene --anno ./data/small_genome/Homo_sapiens.GRCh38.84.reduce.gff3 -d ./data/small_leucegene/5_inv16_vs_5/design.tsv -o ./data/small_leucegene/5_inv16_vs_5/
  • kallisto quantification also allows working with TPM:

# To work with TPM, replace --cpm with --tpm
epcy pred_rna --kal --tpm --log --gene --anno ./data/small_genome/Homo_sapiens.GRCh38.84.reduce.gff3 -d ./data/small_leucegene/5_inv16_vs_5/design.tsv -o ./data/small_leucegene/5_inv16_vs_5/

Output:

predictive_capability.xls

This file is the main output; it contains the evaluation of each feature (genes, proteins, …). It is a tab-separated file with 9 default columns, plus optional columns depending on the flags used (a small ranking sketch follows the column list):

  • Default columns:

    • id: the id of each feature.

    • l2fc: log2 fold change.

    • kernel_mcc: Matthews Correlation Coefficient (MCC) computed by a predictor using kernel density estimation (KDE).

    • kernel_mcc_low, kernel_mcc_high: boundaries of the 90% confidence interval.

    • mean_query: mean of the values of samples specified as Query in design.tsv.

    • mean_ref: mean of the values of samples specified as Ref in design.tsv.

    • bw_query: estimated bandwidth used by the KDE to compute the density of Query samples.

    • bw_ref: estimated bandwidth used by the KDE to compute the density of Ref samples.

  • Using --normal:

    • normal_mcc: MCC computed by a predictor using normal distributions.

  • Using --auc --utest:

    • auc: Area Under the Curve (AUC).

    • u_pv: p-value computed by a Mann-Whitney rank test.

  • Using --ttest:

    • t_pv: p-value computed by a t-test.
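Despite the .xls extension, predictive_capability.xls is a plain tab-separated file, so it can be loaded with standard tools. The sketch below (an illustration, not part of EPCY) ranks features by kernel_mcc to shortlist biomarker candidates.

# Rank features by predictive capability (kernel_mcc), highest first.
import pandas as pd

pred = pd.read_csv("predictive_capability.xls", sep="\t")
top = pred.sort_values("kernel_mcc", ascending=False).head(20)
print(top[["id", "l2fc", "kernel_mcc", "kernel_mcc_low", "kernel_mcc_high"]])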

subgroup_predicted.xls

Using --full, a secondary output file (subgroup_predicted.xls) specifies, for each feature, whether each sample has been correctly predicted. Building a heatmap from this output can help you explore your data (see the sketch below). More details coming soon.
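As a sketch only, assuming the file has one row per feature, one column per sample and an id column (the exact layout is not documented here), a heatmap could be drawn with seaborn:

# Hypothetical heatmap of per-sample prediction results for the first features.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

pred = pd.read_csv("subgroup_predicted.xls", sep="\t").set_index("id")
sns.heatmap(pred.head(50).astype(float), cmap="viridis")  # assumes numeric 0/1 values
plt.xlabel("samples")
plt.ylabel("features")
plt.savefig("subgroup_predicted_heatmap.png")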

Bagging:

To improve the stability and accuracy of the computed MCC, you can add n bagging iterations (using -b n); a toy illustration of the bagging idea follows the command below.

# Be aware that this takes n times longer; using multiprocessing (-t) is a good idea :).
epcy pred_rna -b 4 -t 4 --cpm --log -d ./data/small_for_test/design.tsv -m ./data/small_for_test/exp_matrix.tsv -o ./data/small_for_test/default_subgroup
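For intuition only, bagging repeats an evaluation on resampled data and aggregates the resulting scores; the toy sketch below illustrates that general idea with scikit-learn and is not EPCY's implementation.

# Generic illustration of bagging an MCC estimate (not EPCY's actual code).
import numpy as np
from sklearn.metrics import matthews_corrcoef

rng = np.random.default_rng(42)
y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1, 1, 0])

mccs = []
for _ in range(4):  # -b 4: four bagging iterations
    idx = rng.integers(0, len(y_true), len(y_true))  # resample with replacement
    mccs.append(matthews_corrcoef(y_true[idx], y_pred[idx]))
print(np.mean(mccs))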

