A novel method for unsupervised patient stratification.
UnPaSt
UnPaSt is a novel method for identification of differentially expressed biclusters.
Requirements:
Python:
- fisher==0.1.10
- jenkspy==0.2.0
- matplotlib-venn==0.11.6
- numba==0.55.2
- numpy==1.22.3
- scikit-learn==1.1.0
- scikit-network==0.25.0
- scipy==1.7.3
- statsmodels==0.13.2
- pandas==1.4.2
- python-louvain==0.15
R:
- WGCNA>=1.70-3
- limma>=3.42.2
Installation
- UnPaSt can be installed using pip or Poetry, run using Docker, or run as a script (see the Examples section). Follow the instructions below for your preferred method. You need to have R and Python 3.8-3.10 installed.
- Using pip:
To install the project using pip, first make sure you have pip installed on your system. If you haven't installed it already, you can find the installation instructions here.
Once pip is installed, you can install UnPaSt by running the following command:
pip install unpast
Run it:
run_unpast -h
Dependencies. To use this package, you need R with the WGCNA and limma packages installed. You can install these dependencies by running the following command after installing unpast:
python -m unpast.install_r_dependencies
# or you can install them directly
R -e "install.packages('BiocManager'); BiocManager::install(c('WGCNA', 'limma'))"
- Installation using Poetry:
To install the package using Poetry, first make sure you have Poetry installed, clone the repo and run:
poetry add unpast
Run it:
poetry run run_unpast -h
Dependencies. To use this package, you need R with the WGCNA and limma packages installed. You can install these dependencies by running the following command after installing unpast:
poetry run python -m unpast.install_r_dependencies
# or you can install them directly
R -e "install.packages('BiocManager'); BiocManager::install(c('WGCNA', 'limma'))"
- Running with Docker:
You can also run the package using Docker. First, pull the Docker image:
docker pull freddsle/unpast:latest
Next, run UnPaSt:
docker run -v /your/data/path/:/user_data/ freddsle/unpast:latest --exprs /user_data/exprs.tsv --out_dir /user_data/out_dir/
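For the pip, Poetry, and script routes, the prerequisites stated above (R and Python 3.8-3.10) can be checked before installing. The following is a minimal sketch, not part of UnPaSt: the helper names are hypothetical, and it assumes that finding an `Rscript` executable on `PATH` is a good enough sign that R is installed.

```python
import shutil
import sys

# Supported range taken from the text above: Python 3.8-3.10 (inclusive).
MIN_PY = (3, 8)
MAX_PY = (3, 10)


def python_version_supported(version=None):
    """Return True if (major, minor) falls within the 3.8-3.10 range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PY <= major_minor <= MAX_PY


def r_available():
    """Return True if an Rscript executable is found on PATH (assumed proxy for R)."""
    return shutil.which("Rscript") is not None


if __name__ == "__main__":
    print("Python version supported:", python_version_supported())
    print("Rscript found on PATH:", r_available())
```

This check does not verify that WGCNA and limma are installed in R; it only covers the interpreter-level requirements.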
Examples
- UnPaSt requires a tab-separated file with features (e.g. genes) in rows, and samples in columns. Feature and sample names must be unique.
cd test
mkdir -p results
# running UnPaSt with default parameters and example data
python ../run_unpast.py --exprs scenario_B500.exprs.tsv.gz --basename results/scenario_B500
# with different binarization and clustering methods
python ../run_unpast.py --exprs scenario_B500.exprs.tsv.gz --basename results/scenario_B500 --binarization ward --clustering Louvain
# help
python run_unpast.py -h
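The input requirements above (tab-separated, features in rows, samples in columns, unique names) can be checked before a run. The sketch below is illustrative and not part of UnPaSt: the function names are hypothetical, and it assumes the first row holds sample names and the first column holds feature names.

```python
import csv
import io
from collections import Counter


def find_duplicates(names):
    """Return the sorted names that occur more than once."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


def check_expression_table(handle):
    """Check a tab-separated expression table for duplicated names.

    ASSUMPTION: the first row holds sample names (after one leading cell
    for the feature-name column) and the first column holds feature names.
    Returns (duplicated_features, duplicated_samples).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    sample_names = header[1:]
    feature_names = [row[0] for row in reader if row]
    return find_duplicates(feature_names), find_duplicates(sample_names)


# Tiny made-up table: feature "g1" and sample "s2" are each repeated.
example = (
    "\ts1\ts2\ts2\n"
    "g1\t0.1\t0.2\t0.3\n"
    "g2\t1.0\t1.1\t1.2\n"
    "g1\t2.0\t2.1\t2.2\n"
)
dup_features, dup_samples = check_expression_table(io.StringIO(example))
print("duplicated features:", dup_features)
print("duplicated samples:", dup_samples)
```

For a gzipped file such as the example data, the same function can be given a handle opened with `gzip.open(path, "rt", newline="")`.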
Outputs
- <basename>.[parameters].biclusters.tsv - a .tsv table with found biclusters, where
  - the first line starts with '#' and stores parameters
  - each following line represents a bicluster
  - the "SNR" column contains the SNR of a bicluster
  - columns "n_genes" and "n_samples" provide the numbers of genes and samples, respectively
  - "gene" and "sample" contain gene and sample names, respectively
  - "gene_indexes" and "sample_indexes" contain 0-based gene and sample indexes in the input matrix.
- binarized expressions, background distributions of SNR for each bicluster size and binarization statistics [if clustering is WGCNA, or '--save_binary' flag is added]
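The layout of the biclusters table described above can be read with the standard library alone. This is a minimal sketch under stated assumptions: `read_biclusters` is a hypothetical helper, the example content is made up, and multi-valued cells (names and indexes) are assumed to be space-separated, which the description above does not specify.

```python
import csv
import io


def read_biclusters(handle):
    """Parse a biclusters table into (parameter_line, list_of_row_dicts).

    The first line starts with '#' and stores parameters; the remaining
    lines form a tab-separated table with a header row.
    """
    first = handle.readline().rstrip("\n")
    if not first.startswith("#"):
        raise ValueError("expected a '#'-prefixed parameter line first")
    params = first.lstrip("#").strip()
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["SNR"] = float(row["SNR"])
        row["n_genes"] = int(row["n_genes"])
        row["n_samples"] = int(row["n_samples"])
        # ASSUMPTION: multi-valued cells are space-separated.
        row["gene"] = row["gene"].split()
        row["sample"] = row["sample"].split()
        row["gene_indexes"] = [int(i) for i in row["gene_indexes"].split()]
        row["sample_indexes"] = [int(i) for i in row["sample_indexes"].split()]
        rows.append(row)
    return params, rows


# Made-up file content using the column names listed above.
example = (
    "# hypothetical parameter line\n"
    "SNR\tn_genes\tn_samples\tgene\tsample\tgene_indexes\tsample_indexes\n"
    "2.5\t2\t3\tg1 g7\ts2 s4 s9\t0 6\t1 3 8\n"
)
params, biclusters = read_biclusters(io.StringIO(example))
for b in biclusters:
    print(b["SNR"], b["gene"], b["sample_indexes"])
```

If the real file uses a different in-cell separator or carries extra columns, only the `split()` calls and column conversions need adjusting.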
About
UnPaSt is an unconstrained version of the DESMOND method (repository, publication).
Major modifications:
- it does not require the network of feature interactions
- UnPaSt clusters individual features instead of pairs of features
- it uses 2-means, hierarchical clustering, or GMM for binarization of individual gene expressions
- the SNR threshold for feature selection is determined automatically; it depends on the bicluster size in samples and a user-defined p-value cutoff
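To make the binarization step above more concrete, the sketch below splits one feature's expression values into two groups with a plain 1-D 2-means (Lloyd's algorithm) and computes an SNR for the resulting split. It is an illustration, not UnPaSt's implementation: the function names are hypothetical, the data are made up, and the SNR formula |mean_in - mean_out| / (sd_in + sd_out) is an assumed definition that the text above does not spell out. The automatic SNR threshold is not reproduced here.

```python
from statistics import mean, pstdev


def two_means_split(values, max_iter=100):
    """Split 1-D values into a low and a high group with 2-means.

    Returns a list of booleans: True where the value is in the high group.
    """
    low_c, high_c = min(values), max(values)
    if low_c == high_c:
        raise ValueError("all values are identical; nothing to split")
    labels = None
    for _ in range(max_iter):
        # Assign each value to the nearer centroid (ties go to the low group).
        new_labels = [abs(v - high_c) < abs(v - low_c) for v in values]
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute both centroids from the current assignment.
        low_c = mean(v for v, is_high in zip(values, labels) if not is_high)
        high_c = mean(v for v, is_high in zip(values, labels) if is_high)
    return labels


def snr(values, in_group):
    """ASSUMED definition: |mean_in - mean_out| / (sd_in + sd_out)."""
    inside = [v for v, flag in zip(values, in_group) if flag]
    outside = [v for v, flag in zip(values, in_group) if not flag]
    return abs(mean(inside) - mean(outside)) / (pstdev(inside) + pstdev(outside))


# Made-up expression values for one feature across seven samples.
expression = [0.0, 0.2, 0.1, 0.3, 5.0, 5.2, 4.8]
high = two_means_split(expression)
print("high-expression samples:", [i for i, flag in enumerate(high) if flag])
print("SNR of the split:", round(snr(expression, high), 2))
```

Initializing the two centroids at the minimum and maximum keeps both groups non-empty in one dimension, which avoids the empty-cluster case that random initialization can hit.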
License
Free for non-profit use. For commercial use, please contact the developers.