A tool for constructing limited-size diagnostic panels based on methylation data
What is logloss-beraf?
----------------------
A tool for selecting a limited number of informative DNA methylation
regions (i.e. sites) using a combination of several feature selection
methods and an ensemble-based classifier. It is expected to handle highly
unbalanced and heterogeneous data, and it is intended for the design
of diagnostic panels that could be used in routine laboratory practice.
Quick start
-----------
1. Install ``logloss-beraf`` with all the dependencies::

       easy_install logloss-beraf

2. Make a test run. It uses data included in the package::

       logloss_beraf test_run

3. Prepare the input feature and annotation tables (in CSV format). Samples must appear in the same order in both tables.
```
# Methylation data
Feature_1 Feature_2 Feature_3
Sample_0 0.909642 0.823715 0.069785
Sample_1 0.564799 0.199724 0.840741
Sample_2 0.685081 0.489773 0.286591
Sample_3 0.810637 0.006836 0.888038
Sample_4 0.124098 0.347752 0.954853
```
```
# Annotation data
Sample_Name Type
Sample_0 Benign
Sample_1 Pathologic
Sample_2 Benign
Sample_3 Benign
Sample_4 Pathologic
```
4. Train the model
```sh
logloss_beraf train \
    --features path_to_feature_table \
    --features_max_num 10 \
    --min_beta_threshold 0.2 \
    --annotation path_to_annotation_table \
    --sample_name_column "Sample_Name" \
    --class_column "Type" \
    --output_folder path_to_output_folder
```
5. Apply the trained model to an independent dataset
```sh
logloss_beraf apply \
    --features path_to_test_feature_table \
    --model path_to_trained_model \
    --output_folder path_to_output_folder
```
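Because step 3 requires the samples to appear in the same order in both tables, it can help to check this before training. The following is a minimal sketch using only the Python standard library; it is not part of ``logloss-beraf``. It assumes comma-separated files in which the first column of the feature table holds the sample names and the annotation table has a ``Sample_Name`` column, as in the example above. The helper names ``read_sample_names`` and ``first_order_mismatch`` are hypothetical.

```python
import csv
import io


def read_sample_names(csv_text, name_column=None):
    """Return the sample names found in CSV text.

    Assumption: if name_column is None, the first column holds the sample
    names (feature table layout); otherwise the named column is used.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, body = rows[0], rows[1:]
    index = 0 if name_column is None else header.index(name_column)
    return [row[index] for row in body]


def first_order_mismatch(feature_names, annotation_names):
    """Return the first row position where the tables disagree, else None."""
    for position, (feature, annotation) in enumerate(
        zip(feature_names, annotation_names)
    ):
        if feature != annotation:
            return position
    if len(feature_names) != len(annotation_names):
        # One table has extra rows after the shared prefix.
        return min(len(feature_names), len(annotation_names))
    return None


# Small in-memory stand-ins for the two tables (values from the example above).
features_csv = (
    ",Feature_1,Feature_2,Feature_3\n"
    "Sample_0,0.909642,0.823715,0.069785\n"
    "Sample_1,0.564799,0.199724,0.840741\n"
)
annotation_csv = (
    "Sample_Name,Type\n"
    "Sample_0,Benign\n"
    "Sample_1,Pathologic\n"
)

mismatch = first_order_mismatch(
    read_sample_names(features_csv),
    read_sample_names(annotation_csv, "Sample_Name"),
)
print("tables aligned" if mismatch is None else f"first mismatch at row {mismatch}")
```

For real files, the same functions can be fed the file contents read with ``open(path).read()``; if the tables use a different delimiter, the ``csv.reader`` call would need a matching ``delimiter`` argument.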