
A novel method for unsupervised patient stratification.


UnPaSt

UnPaSt is a novel method for identification of differentially expressed biclusters.


Requirements:

Python:
    fisher==0.1.10
    jenkspy==0.2.0
    matplotlib-venn==0.11.6
    numba==0.55.2
    numpy==1.22.3
    scikit-learn==1.1.0
    scikit-network==0.25.0
    scipy==1.7.3
    statsmodels==0.13.2
    pandas==1.4.2
    python-louvain==0.15

R:
    WGCNA>=1.70-3
    limma>=3.42.2
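
To compare an existing environment against the Python pins listed above, a minimal sketch using only the standard library could look like the following. The helper name `check_pins` is hypothetical and not part of UnPaSt, and whether the published package enforces these exact pins is not stated here, so treat a mismatch as a hint rather than an error.

```python
from importlib import metadata

# Pins copied from the Python requirements list above.
PINNED = {
    "fisher": "0.1.10",
    "jenkspy": "0.2.0",
    "matplotlib-venn": "0.11.6",
    "numba": "0.55.2",
    "numpy": "1.22.3",
    "scikit-learn": "1.1.0",
    "scikit-network": "0.25.0",
    "scipy": "1.7.3",
    "statsmodels": "0.13.2",
    "pandas": "1.4.2",
    "python-louvain": "0.15",
}


def check_pins(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for packages that are missing
    (found is None) or installed at a version other than the pin.

    check_pins is a hypothetical helper, not part of UnPaSt.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The `get_version` parameter exists only so the lookup can be replaced in tests; by default it queries the active environment.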

Installation

  • UnPaSt can be installed with pip or Poetry, run with Docker, or run as a script (see the Examples section). Follow the instructions below for your preferred method. R and Python 3.8-3.10 must be installed.
  1. Using pip:
    To install with pip, first make sure pip is installed on your system. If it is not, you can find the installation instructions here.
    Once pip is installed, install UnPaSt by running the following command:

    pip install unpast
    

    Run it:

    run_unpast -h
    

    Dependencies. This package requires R with the WGCNA and limma libraries. After installing unpast, you can install them by running the following command:

    python -m unpast.install_r_dependencies
    
    # or install them directly
    R -e "install.packages('BiocManager'); BiocManager::install(c('WGCNA', 'limma'))"
    
  2. Installation using Poetry:
    To install the package using Poetry, first make sure Poetry is installed, then clone the repo and run:

    poetry add unpast
    

    Run it:

    poetry run run_unpast -h
    

    Dependencies. This package requires R with the WGCNA and limma libraries. After installing unpast, you can install them by running the following command:

    poetry run python -m unpast.install_r_dependencies
    
    # or install them directly
    R -e "install.packages('BiocManager'); BiocManager::install(c('WGCNA', 'limma'))"
    
  3. Running with Docker:
    You can also run the package using Docker. First, pull the Docker image:

    docker pull freddsle/unpast:latest
    

    Next, run UnPaSt:

    docker run -v /your/data/path/:/user_data/ freddsle/unpast:latest --exprs /user_data/exprs.tsv --out_dir /user_data/out_dir/
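
Before choosing the pip or Poetry route, it may help to confirm the two stated prerequisites: a Python interpreter in the 3.8-3.10 range and an R installation. The sketch below is a hedged illustration using the standard library; the function names are hypothetical, and looking for `Rscript` or `R` on `PATH` is an assumption about how R is installed on your system.

```python
import shutil
import sys

# Inclusive (major, minor) range stated in the installation notes above.
SUPPORTED = ((3, 8), (3, 10))


def python_supported(version=None, supported=SUPPORTED):
    """Return True if the interpreter's (major, minor) lies in the supported range.

    Hypothetical helper; pass a tuple such as (3, 9) to check another version.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    low, high = supported
    return low <= major_minor <= high


def r_available(which=shutil.which):
    """Return True if an R executable is found on PATH (assumed install layout)."""
    return which("Rscript") is not None or which("R") is not None


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    print("R found on PATH:", r_available())
```

This only checks that R is reachable; it does not verify that WGCNA and limma are installed.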
    

Examples

  • UnPaSt requires a tab-separated file with features (e.g. genes) in rows, and samples in columns. Feature and sample names must be unique.
cd test;
mkdir -p results;

# running UnPaSt with default parameters and example data
python ../run_unpast.py --exprs scenario_B500.exprs.tsv.gz --basename results/scenario_B500

# with different binarization and clustering methods
python ../run_unpast.py --exprs scenario_B500.exprs.tsv.gz --basename results/scenario_B500 --binarization ward --clustering Louvain

# help
python ../run_unpast.py -h
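
Because feature and sample names must be unique, it can be useful to check an input file before running UnPaSt. The following is a minimal sketch, not part of UnPaSt: `find_duplicate_names` is a hypothetical helper, and it assumes the first header cell labels the feature column, so that sample names start at the second cell. Files whose header row is one cell shorter than the data rows would need `header[1:]` changed to `header`.

```python
import csv
import gzip
from collections import Counter


def _duplicates(names):
    """Return the sorted list of names that occur more than once."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


def find_duplicate_names(path):
    """Return (duplicate_samples, duplicate_features) for a tab-separated matrix.

    Hypothetical helper. Assumes features are in rows, samples are in columns,
    and the first header cell labels the feature column.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        samples = header[1:]
        # Skip empty lines; the first cell of each row is the feature name.
        features = [row[0] for row in reader if row]
    return _duplicates(samples), _duplicates(features)


if __name__ == "__main__":
    import sys

    dup_samples, dup_features = find_duplicate_names(sys.argv[1])
    print("duplicated sample names:", dup_samples)
    print("duplicated feature names:", dup_features)
```

Two empty lists indicate that the uniqueness requirement is met under the stated header assumption.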

Outputs

  • <basename>.[parameters].biclusters.tsv - a .tsv table with the biclusters found, where
    • the first line starts with '#' and stores parameters
    • each following line represents a bicluster
    • the SNR column contains the SNR of a bicluster
    • columns "n_genes" and "n_samples" provide the numbers of genes and samples, respectively
    • "gene" and "sample" contain gene and sample names, respectively
    • "gene_indexes" and "sample_indexes" contain 0-based gene and sample indexes in the input matrix.
  • binarized expressions, background distributions of SNR for each bicluster size, and binarization statistics [if clustering is WGCNA, or the '--save_binary' flag is added]
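
The layout described above (one '#' parameter line, then a table of biclusters) can be read with the standard library. This is a sketch under assumptions: `read_biclusters` is a hypothetical helper, it assumes the line after the parameter line is a header row naming the columns, and it leaves the name and index columns as raw strings because the separator used inside those cells is not stated here.

```python
import csv
import io


def read_biclusters(handle):
    """Parse a biclusters table from an open text handle.

    Returns (params, rows): the raw parameter string from the leading '#'
    line and one dict per bicluster. Hypothetical helper, not UnPaSt's API.
    """
    first = handle.readline().rstrip("\n")
    if not first.startswith("#"):
        raise ValueError("expected the first line to start with '#'")
    params = first.lstrip("#").strip()

    # Assumes the next line is a tab-separated header row.
    rows = list(csv.DictReader(handle, delimiter="\t"))
    for row in rows:
        row["SNR"] = float(row["SNR"])
        row["n_genes"] = int(row["n_genes"])
        row["n_samples"] = int(row["n_samples"])
    return params, rows


if __name__ == "__main__":
    # Made-up content used only to illustrate the call.
    demo = io.StringIO("#demo parameters\nSNR\tn_genes\tn_samples\n2.5\t3\t10\n")
    params, rows = read_biclusters(demo)
    print(params, rows)
```

With a real output file, pass `open(path, newline="")` instead of the in-memory demo.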

About

UnPaSt is an unconstrained version of the DESMOND method (repository, publication).

Major modifications:

  • does not require a network of feature interactions
  • clusters individual features instead of pairs of features
  • uses 2-means, hierarchical clustering, or GMM to binarize individual gene expressions
  • determines the SNR threshold for feature selection automatically; the threshold depends on the bicluster size in samples and a user-defined p-value cutoff
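
To make the binarization step concrete, here is a minimal sketch of 2-means on the expression values of a single gene, followed by an SNR calculation. It is an illustration, not UnPaSt's implementation: the function names are hypothetical, and the SNR formula used here (absolute difference of group means divided by the sum of group standard deviations) is an assumption, since the exact definition is not given above.

```python
from statistics import mean, pstdev


def two_means_1d(values, max_iter=100):
    """Split 1-D values into two groups with Lloyd's algorithm (k = 2).

    Returns a list of booleans; True marks the higher-mean group.
    Centers start at the minimum and maximum value.
    """
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("all values are identical; nothing to split")
    labels = []
    for _ in range(max_iter):
        # Assign each value to the nearer center (ties go to the low group).
        labels = [abs(v - high) < abs(v - low) for v in values]
        new_high = mean(v for v, up in zip(values, labels) if up)
        new_low = mean(v for v, up in zip(values, labels) if not up)
        if (new_low, new_high) == (low, high):
            break
        low, high = new_low, new_high
    return labels


def snr(values, labels):
    """Assumed SNR: |mean_in - mean_out| / (sd_in + sd_out)."""
    inside = [v for v, up in zip(values, labels) if up]
    outside = [v for v, up in zip(values, labels) if not up]
    spread = pstdev(inside) + pstdev(outside)
    diff = abs(mean(inside) - mean(outside))
    return diff / spread if spread else float("inf")


if __name__ == "__main__":
    expression = [0.0, 0.2, 0.1, 5.0, 5.2, 5.1]  # made-up values for one gene
    groups = two_means_1d(expression)
    print(groups, snr(expression, groups))
```

In this sketch, a gene whose two groups are well separated gets a high SNR; comparing that value with a size-dependent threshold is the part UnPaSt determines automatically and is not reproduced here.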

License

Free for non-profit use. For commercial use, please contact the developers.

