A novel method for unsupervised patient stratification.

Maintainers

freddsle savfod

Project description

UnPaSt

UnPaSt is a novel method for identification of differentially expressed biclusters.

Cite

UnPaSt preprint: https://arxiv.org/abs/2408.00200.

Code: https://github.com/ozolotareva/unpast_paper/

Quick Start

Using UnPaSt online

Run UnPaSt on the CoSy.Bio server.

Local installation

UnPaSt is available on PyPI and can be installed with pip. The commands below install it, download a small synthetic dataset, and run UnPaSt on it:

pip install unpast

wget https://github.com/ozolotareva/unpast/raw/refs/heads/main/unpast/tests/test_input/synthetic_clear_biclusters.tsv
unpast --exprs synthetic_clear_biclusters.tsv

To use the WGCNA method (--clustering WGCNA) instead of the default clustering method, you also need to install the required R packages (see Requirements below).

Running in Docker

UnPaSt is also available as a Docker image with the required R packages preinstalled. To pull the image and run it on example data:

# load image and example data
docker pull freddsle/unpast
wget https://github.com/ozolotareva/unpast/raw/refs/heads/main/unpast/tests/test_input/synthetic_clear_biclusters.tsv

# run UnPaSt in a Docker environment with current directory and user
docker run --rm -it -u $(id -u):$(id -g) -v "$(pwd)":/data \
  freddsle/unpast \
    --exprs /data/synthetic_clear_biclusters.tsv \
    --out_dir /data/results/synthetic_clear_biclusters

To use an earlier Docker image version, replace freddsle/unpast with freddsle/unpast:<version>, where <version> is a specific version tag; see available tags here.

Development setup

Developer mode allows you to run modified UnPaSt code. This is useful for local updates or contributing to the project.

Docker development environment

To run UnPaSt in a Docker container with the latest code from the repository, you can use the following command:

# Clone the repository to get code
git clone https://github.com/ozolotareva/unpast.git
cd unpast

# Define the command to run UnPaSt,
# using unpast.run_unpast to bypass the pre-installed version from the Docker image
command="python -m unpast.run_unpast --exprs unpast/tests/scenario_B500.exprs.tsv.gz --basename results/scenario_B500 --verbose"

# Run UnPaSt using Docker
docker run --rm -it -u $(id -u):$(id -g) -v "$(pwd)":/data --entrypoint bash freddsle/unpast -c "cd /data && $command"

Requirements

UnPaSt requires Python 3.9-3.11 and certain Python and R packages.

Python and R dependencies

Python Dependencies

The Python dependencies are installed automatically when installing via pip (see pyproject.toml).

They include (with recommended versions):

fisher = ">=0.1.9,<=0.1.14"
pandas = "1.3.5"
python-louvain = "0.15"
matplotlib = "3.7.1"
seaborn = "0.11.1"
numba = ">=0.51.2,<=0.55.2"
numpy = "1.22.3"
scikit-learn = "1.2.2"
scikit-network = ">=0.24.0,<0.26.0"
scipy = ">=1.7.1,<=1.7.3"
statsmodels = "0.13.2"
kneed = "0.8.1"
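To see how a local environment compares with the list above, a small check along the following lines may help. It is only a sketch using the standard library: the names `PACKAGES`, `python_supported` and `installed_versions` are made up for this illustration and are not part of UnPaSt, and the script reports installed versions without enforcing the recommended pins.

```python
import sys
from importlib import metadata

# Distribution names copied from the dependency list above.
PACKAGES = [
    "fisher", "pandas", "python-louvain", "matplotlib", "seaborn", "numba",
    "numpy", "scikit-learn", "scikit-network", "scipy", "statsmodels", "kneed",
]


def python_supported(version=None):
    """Return True if the (major, minor) version lies within 3.9-3.11."""
    major, minor = tuple(version or sys.version_info)[:2]
    return (3, 9) <= (major, minor) <= (3, 11)


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the printed versions with the recommended ones by eye is usually enough; pip resolves the actual constraints during installation.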

R Dependencies

UnPaSt uses R packages for certain analyses, in particular for the WGCNA clustering method. Ensure that R is installed with the following packages:

WGCNA (version 1.70-3 or higher)
limma (version 3.42.2 or higher)

Installing R

Ensure that R (version 4.3.1 or higher) is installed on your system. You can download R from CRAN.

It is recommended to use BiocManager for installing R packages:

install.packages("BiocManager")
BiocManager::install("WGCNA")
BiocManager::install("limma")

API Reference

Input

UnPaSt requires a tab-separated file with features (e.g. genes) in rows, and samples in columns.

Feature and sample names must be unique.
At least 2 features and 5 samples are required.
Data must be between-sample normalized.
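The structural requirements above can be checked before a run with a short script such as the one below. This is a sketch, not part of UnPaSt: the function name `check_expression_table` is hypothetical, and it assumes that the first row holds sample names and the first column holds feature names. Between-sample normalization cannot be verified this way and is not checked.

```python
import csv
import gzip


def check_expression_table(path, min_features=2, min_samples=5):
    """Return a list of problems found in a tab-separated expression table."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    if not rows:
        return ["file is empty"]

    header, body = rows[0], rows[1:]
    if body and len(header) == len(body[0]) - 1:
        samples = header      # header has no label for the feature column
    else:
        samples = header[1:]  # assumption: first header cell labels the features
    features = [row[0] for row in body]

    problems = []
    if len(set(samples)) != len(samples):
        problems.append("sample names are not unique")
    if len(set(features)) != len(features):
        problems.append("feature names are not unique")
    if len(features) < min_features:
        problems.append(f"only {len(features)} feature(s); need {min_features}")
    if len(samples) < min_samples:
        problems.append(f"only {len(samples)} sample(s); need {min_samples}")
    return problems
```

An empty list means that none of the checked requirements is violated, for example `check_expression_table("synthetic_clear_biclusters.tsv")` on the file downloaded in Quick Start.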

Recommendations:

It is recommended that UnPaSt be applied to datasets with 20+ samples.
If the cohort is not large (<20 samples), reducing the minimal number of samples in a bicluster (min_n_samples) to 2 is recommended.
If the number of features is small, using the Louvain method for feature clustering instead of WGCNA and/or disabling feature selection by setting the binarization p-value (p-val) to 1 might be helpful.

Examples

Simulated data example: Biclustering of a matrix with 10 000 rows (features) and 200 columns (samples) with four implanted biclusters consisting of 500 features and 10-100 samples each. For more details, see Figure 3 and Methods here.

mkdir -p results;

# running UnPaSt with default parameters and example data
unpast --exprs unpast/tests/scenario_B500.exprs.tsv.gz --basename results/scenario_B500

# with different binarization and clustering methods
unpast --exprs unpast/tests/scenario_B500.exprs.tsv.gz --basename results/scenario_B500 --binarization ward --clustering Louvain

# help
unpast -h
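When several parameter combinations are to be tried, the same commands can be assembled from Python. The sketch below only builds the argument list from flags that appear in the commands above (--exprs, --basename, --binarization, --clustering); the helper names `unpast_command` and `run_unpast_cli` are hypothetical, and running the command assumes that the `unpast` executable is on the PATH.

```python
import shlex
import subprocess


def unpast_command(exprs, basename=None, binarization=None, clustering=None):
    """Build the argument list for one UnPaSt command-line run."""
    cmd = ["unpast", "--exprs", str(exprs)]
    options = (
        ("--basename", basename),
        ("--binarization", binarization),
        ("--clustering", clustering),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def run_unpast_cli(cmd):
    """Print and execute the command; raises CalledProcessError on failure."""
    print(shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

For example, `run_unpast_cli(unpast_command("unpast/tests/scenario_B500.exprs.tsv.gz", basename="results/scenario_B500", binarization="ward", clustering="Louvain"))` corresponds to the second command above.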

Real data example. Analysis of a subset of 200 samples randomly chosen from TCGA-BRCA dataset, including consensus biclustering and visualization: jupyter-notebook.

Outputs

The program creates a folder runs/run_<timestamp>/ with the results of UnPaSt run, where <timestamp> is the date and time of the run in the format YYYYMMDDTHHMMSS.

The folder contains the following files:

run_YYYYMMDDTHHMMSS
├── args.tsv
├── biclusters.tsv 
└── unpast.log
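Because the folder name encodes the run time, the most recent run can be located programmatically. The following is a minimal sketch based only on the naming scheme described above; the helper names `run_timestamp` and `latest_run` are invented for this example, and the default `runs` directory is assumed to be relative to the current working directory.

```python
from datetime import datetime
from pathlib import Path

RUN_PREFIX = "run_"
TIMESTAMP_FORMAT = "%Y%m%dT%H%M%S"  # YYYYMMDDTHHMMSS


def run_timestamp(folder_name):
    """Parse 'run_YYYYMMDDTHHMMSS' into a datetime, or return None."""
    if not folder_name.startswith(RUN_PREFIX):
        return None
    try:
        return datetime.strptime(folder_name[len(RUN_PREFIX):], TIMESTAMP_FORMAT)
    except ValueError:
        return None


def latest_run(runs_dir="runs"):
    """Return the path of the most recent run folder, or None if there is none."""
    dated = []
    for path in Path(runs_dir).glob(RUN_PREFIX + "*"):
        stamp = run_timestamp(path.name)
        if path.is_dir() and stamp is not None:
            dated.append((stamp, path))
    return max(dated, key=lambda item: item[0])[1] if dated else None
```

For instance, `latest_run() / "biclusters.tsv"` would point at the result table of the newest run, provided at least one run folder exists.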

The file biclusters.tsv contains the identified biclusters, with one bicluster per line. The format of this file is as follows:

- The first line starts with # and stores the parameters of the UnPaSt run.
- The second line contains the column headers.
- Each subsequent line represents a bicluster with the following columns:
  - SNR: Signal-to-noise ratio of the bicluster, calculated as the average SNR of its features.
  - n_genes: Number of genes in the bicluster.
  - n_samples: Number of samples in the bicluster.
  - genes: Space-separated list of gene names.
  - samples: Space-separated list of sample names.
  - direction: Indicates whether the bicluster consists of up-regulated ("UP"), down-regulated ("DOWN"), or both types of genes ("BOTH").
  - genes_up, genes_down: Space-separated lists of up- and down-regulated genes, respectively.
  - gene_indexes: 0-based indexes of the genes in the input matrix.
  - sample_indexes: 0-based indexes of the samples in the input matrix.
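The layout described above can be read with a few lines of plain Python. The sketch below is an illustration under stated assumptions rather than an official reader: `read_biclusters` is a hypothetical name, fields are assumed to be tab-separated, and only the columns documented as space-separated lists are split. The index columns are left as strings because their exact encoding is not specified here.

```python
# Columns documented above as space-separated lists.
LIST_COLUMNS = ("genes", "samples", "genes_up", "genes_down")


def read_biclusters(path):
    """Return (parameter_line, rows) read from a biclusters.tsv-style file."""
    with open(path, encoding="utf-8") as handle:
        lines = [line.rstrip("\r\n") for line in handle if line.strip()]

    params = None
    if lines and lines[0].startswith("#"):
        params = lines.pop(0).lstrip("#").strip()  # first line: run parameters
    if not lines:
        return params, []

    header = lines[0].split("\t")                  # second line: column headers
    rows = []
    for line in lines[1:]:                         # one bicluster per line
        row = dict(zip(header, line.split("\t")))
        for column in LIST_COLUMNS:
            if column in row:
                row[column] = row[column].split()
        rows.append(row)
    return params, rows
```

Each returned row is a dictionary keyed by column header, so `row["genes"]` and `row["samples"]` are Python lists, while numeric fields such as SNR remain strings and can be converted as needed.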

The files args.tsv and unpast.log contain the parameters used for the run and the log of the run respectively.

Along with the biclustering result, if save mode is used, UnPaSt saves the intermediate results of feature binarization. The files are stored in the binarization/ subfolder and include:

binarization
├── bin_args.tsv
├── bin_background.tsv
├── bin_res.tsv
└── bin_stats.tsv

with the files:

bin_args.tsv contains the subset of parameters used for binarization.
bin_background.tsv contains background distributions of SNR values for each evaluated bicluster size.
bin_res.tsv contains binarized input data.
bin_stats.tsv provides binarization statistics for each processed feature.

The binarization files can be used to restart UnPaSt with the same input and seed from the feature clustering step and skip time-consuming feature binarization.

Versions

UnPaSt version used in PathoPlex paper: UnPaSt_PathoPlex.zip

Release history

0.1.11 (this version): Jul 16, 2025
0.1.10: Oct 25, 2024
0.1.9.6.3 (yanked): Oct 10, 2024
0.1.9.4: Feb 28, 2024
0.1.9.3 (yanked): Feb 28, 2024
0.1.9.2 (yanked): Feb 27, 2024
0.1.9 (yanked): Feb 27, 2024
0.1.8: Oct 14, 2023
0.1.7 (yanked): Oct 13, 2023
0.1.6: May 15, 2023
0.1.5 (yanked): May 9, 2023

Download files

Source Distribution

unpast-0.1.11.tar.gz (20.7 MB)

Uploaded Jul 16, 2025 Source

Built Distribution

unpast-0.1.11-py3-none-any.whl (20.7 MB)

Uploaded Jul 16, 2025 Python 3

File details

Details for the file unpast-0.1.11.tar.gz.

File metadata

Download URL: unpast-0.1.11.tar.gz
Upload date: Jul 16, 2025
Size: 20.7 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for unpast-0.1.11.tar.gz
Algorithm	Hash digest
SHA256	`1329e57dc61b6f96a36a70a48e497094915c9e84cf35f3703064f8f4741d927d`
MD5	`6d02848549522496813c89da9dbaa7b0`
BLAKE2b-256	`6ce5d9f98a23a12b7936ddf9c575bcddcfebb981a3bd678da6606eca1bc06bf5`

See more details on using hashes here.
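A downloaded file can be compared against the SHA256 digest listed above with the standard library. This is a generic sketch, not something shipped with UnPaSt: the function names `sha256_of` and `matches_sha256` are made up for the example, and the file is assumed to be in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be `matches_sha256("unpast-0.1.11.tar.gz", "1329e57dc61b6f96a36a70a48e497094915c9e84cf35f3703064f8f4741d927d")`; a result of False suggests a corrupted or altered download.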

Provenance

The following attestation bundles were made for unpast-0.1.11.tar.gz:

Publisher: publish.yml on ozolotareva/unpast

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: unpast-0.1.11.tar.gz
- Subject digest: 1329e57dc61b6f96a36a70a48e497094915c9e84cf35f3703064f8f4741d927d
- Sigstore transparency entry: 276602684
- Sigstore integration time: Jul 16, 2025
Source repository:
- Permalink: ozolotareva/unpast@876252eff52ebe364e78535d3988c157c65be561
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ozolotareva
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@876252eff52ebe364e78535d3988c157c65be561
- Trigger Event: workflow_dispatch

File details

Details for the file unpast-0.1.11-py3-none-any.whl.

File metadata

Download URL: unpast-0.1.11-py3-none-any.whl
Upload date: Jul 16, 2025
Size: 20.7 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for unpast-0.1.11-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1b881e3589ce7b2ceca865a7868954cc56a00e9a768353b638d11d58089a4b00`
MD5	`823f672e2de9a2ca5078e6b65a7b50ea`
BLAKE2b-256	`a5774a369e8399fb0730aff5d84d281106b4b07babcaba2ae9064d32a8087d46`

See more details on using hashes here.

Provenance

The following attestation bundles were made for unpast-0.1.11-py3-none-any.whl:

Publisher: publish.yml on ozolotareva/unpast

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: unpast-0.1.11-py3-none-any.whl
- Subject digest: 1b881e3589ce7b2ceca865a7868954cc56a00e9a768353b638d11d58089a4b00
- Sigstore transparency entry: 276602701
- Sigstore integration time: Jul 16, 2025
Source repository:
- Permalink: ozolotareva/unpast@876252eff52ebe364e78535d3988c157c65be561
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ozolotareva
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@876252eff52ebe364e78535d3988c157c65be561
- Trigger Event: workflow_dispatch
