
A novel method for unsupervised patient stratification.


UnPaSt

UnPaSt is a novel method for the identification of differentially expressed biclusters.


Requirements:

Python:
    fisher==0.1.10
    jenkspy==0.2.0
    matplotlib-venn==0.11.6
    numba==0.55.2
    numpy==1.22.3
    scikit-learn==1.1.0
    scikit-network==0.25.0
    scipy==1.7.3
    statsmodels==0.13.2
    pandas==1.4.2
    python-louvain==0.15

R:
    WGCNA>=1.70-3
    limma>=3.42.2

Installation

  • UnPaSt can be installed with pip or Poetry, run with Docker, or run as a script (see the Examples section). Follow the instructions below for your preferred method. R and Python 3.8-3.10 must be installed.
  1. Using pip:
    To install the project using pip, first make sure pip is installed on your system. If it is not, follow the pip installation instructions.
    Once pip is installed, you can install UnPaSt by running the following command:

    pip install unpast
    

    Run it:

    run_unpast -h
    

    Dependencies. To use this package, you need R with the WGCNA and limma packages installed. You can install these packages by running the following command after installing unpast:

    python -m unpast.install_r_dependencies
    
    # or install them directly from R
    R -e "install.packages('BiocManager'); BiocManager::install(c('WGCNA', 'limma'))"
    
  2. Installation using Poetry:
    To install the package using Poetry, first make sure Poetry is installed, then clone the repository and run:

    poetry add unpast
    

    Run it:

    poetry run run_unpast -h
    

    Dependencies. To use this package, you need R with the WGCNA and limma packages installed. You can install these packages by running the following command after installing unpast:

    poetry run python -m unpast.install_r_dependencies
    
    # or install them directly from R
    R -e "install.packages('BiocManager'); BiocManager::install(c('WGCNA', 'limma'))"
    
  3. Running with Docker:
    You can also run the package using Docker. First, pull the Docker image:

    docker pull freddsle/unpast:latest
    

    Next, run UnPaSt, mounting your data directory into the container:

    docker run -v /your/data/path/:/user_data/ freddsle/unpast:latest --exprs /user_data/exprs.tsv --out_dir /user_data/out_dir/
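
    Before installing with pip or Poetry, it can help to confirm the prerequisites stated at the top of this section (R and Python 3.8-3.10). The following is a minimal sketch of such a check, not part of UnPaSt itself; the function names `python_supported` and `r_available` are hypothetical, and the check only looks for an executable named `R` on PATH.

```python
import shutil
import sys

# Supported range taken from the installation note (Python 3.8-3.10).
MIN_PY, MAX_PY = (3, 8), (3, 10)


def python_supported(version=None):
    """Return True if the (major, minor) version lies within 3.8-3.10.

    `version` defaults to the running interpreter; hypothetical helper.
    """
    major_minor = tuple(version or sys.version_info)[:2]
    return MIN_PY <= major_minor <= MAX_PY


def r_available():
    """Return True if an executable named 'R' is found on PATH."""
    return shutil.which("R") is not None


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    print("R found on PATH:", r_available())
```

    This does not verify that WGCNA and limma are installed in R; it only checks the interpreter version and the presence of R.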
    

Examples

  • UnPaSt requires a tab-separated file with features (e.g. genes) in rows, and samples in columns. Feature and sample names must be unique.
cd test;
mkdir -p results;

# running UnPaSt with default parameters and example data
python ../run_unpast.py --exprs scenario_B500.exprs.tsv.gz --basename results/scenario_B500

# with different binarization and clustering methods
python ../run_unpast.py --exprs scenario_B500.exprs.tsv.gz --basename results/scenario_B500 --binarization ward --clustering Louvain

# help
python ../run_unpast.py -h
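
The input requirements above (tab-separated, features in rows, samples in columns, unique names) can be checked before a run. The sketch below uses only the standard library and is not part of UnPaSt; `find_duplicate_names` is a hypothetical helper, and it assumes the first row holds sample names with a leading cell for the feature column.

```python
import csv
import gzip
from collections import Counter


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def find_duplicate_names(path):
    """Return (duplicate_features, duplicate_samples) for a TSV matrix.

    Assumes the first row holds sample names (after one leading cell) and
    the first column of every other row holds the feature name.
    """
    with open_text(path) as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        samples = header[1:]  # the first header cell labels the feature column
        features = [row[0] for row in reader if row]
    dup_features = sorted(n for n, c in Counter(features).items() if c > 1)
    dup_samples = sorted(n for n, c in Counter(samples).items() if c > 1)
    return dup_features, dup_samples
```

Both returned lists should be empty for a valid input, for example when calling `find_duplicate_names("scenario_B500.exprs.tsv.gz")` from the test directory.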

Outputs

  • <basename>.[parameters].biclusters.tsv - a .tsv table of the biclusters found, where
    • the first line starts with '#' and stores the parameters
    • each following line represents a bicluster
    • the "SNR" column contains the SNR of a bicluster
    • columns "n_genes" and "n_samples" give the numbers of genes and samples, respectively
    • columns "gene" and "sample" contain gene and sample names, respectively
    • columns "gene_indexes" and "sample_indexes" contain 0-based gene and sample indexes in the input matrix
  • binarized expressions, background distributions of SNR for each bicluster size, and binarization statistics [saved if clustering is WGCNA or the '--save_binary' flag is added]
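
The biclusters table described above can be read with the standard library. The following is a minimal sketch, not UnPaSt's own reader; `read_biclusters` is a hypothetical helper, and it assumes that a tab-separated header row follows the '#' parameter line and that the "SNR", "n_genes" and "n_samples" columns are present under exactly those names.

```python
import csv


def read_biclusters(path):
    """Return (parameter_line, rows) from a biclusters .tsv file.

    Assumed layout: one leading '#' line with parameters, then a
    tab-separated header row, then one row per bicluster.
    """
    with open(path, newline="") as handle:
        first = handle.readline().rstrip("\r\n")
        if not first.startswith("#"):
            raise ValueError("expected the first line to start with '#'")
        rows = list(csv.DictReader(handle, delimiter="\t"))
    for row in rows:
        # Convert the numeric columns named in the output description.
        row["SNR"] = float(row["SNR"])
        row["n_genes"] = int(row["n_genes"])
        row["n_samples"] = int(row["n_samples"])
    return first.lstrip("#").strip(), rows
```

The remaining columns are left as strings, since the separator used inside the name and index columns is not specified here.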

About

UnPaSt is an unconstrained version of the DESMOND method (repository, publication).

Major modifications:

  • it does not require a network of feature interactions
  • it clusters individual features instead of pairs of features
  • it uses 2-means, hierarchical clustering, or GMM to binarize individual gene expressions
  • the SNR threshold for feature selection is determined automatically; it depends on the bicluster size in samples and a user-defined p-value cutoff
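
To illustrate what binarizing a single gene with 2-means means, here is a minimal standard-library sketch. It is not UnPaSt's implementation: `two_means_split` and `snr` are hypothetical helpers, and the SNR formula used, |mean_high - mean_low| / (sd_high + sd_low), is an assumption made for illustration. The size-dependent SNR threshold and the p-value cutoff are not modelled.

```python
from statistics import mean, pstdev


def two_means_split(values, max_iter=100):
    """Split 1-D values into low/high groups with 2-means (Lloyd's algorithm).

    Returns a list of booleans: True where the value is in the high group.
    """
    lo, hi = min(values), max(values)  # initial centroids
    if lo == hi:
        raise ValueError("cannot split constant values")
    labels = []
    for _ in range(max_iter):
        # Assign each value to the nearer centroid (ties go to the low group).
        new_labels = [abs(v - hi) < abs(v - lo) for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        high = [v for v, up in zip(values, labels) if up]
        low = [v for v, up in zip(values, labels) if not up]
        lo, hi = mean(low), mean(high)
    return labels


def snr(values, labels):
    """Signal-to-noise ratio between the two groups.

    Assumed definition: |mean_high - mean_low| / (sd_high + sd_low).
    """
    high = [v for v, up in zip(values, labels) if up]
    low = [v for v, up in zip(values, labels) if not up]
    spread = pstdev(high) + pstdev(low)
    if spread == 0:
        return float("inf")
    return abs(mean(high) - mean(low)) / spread


expression = [0.0, 0.2, 0.1, 5.0, 5.2, 5.1]  # made-up values for one gene
groups = two_means_split(expression)
print(groups, round(snr(expression, groups), 2))
```

A gene with two well-separated groups of samples, as in the made-up values above, gets a high SNR; a gene with overlapping groups gets a low one.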

License

Free for not-for-profit use. For commercial use, please contact the developers.
