MutClust: Mutual rank-based coexpression and clustering.

MutClust: Efficient and Scalable Mutual Rank-Based Coexpression Clustering

MutClust is a Python tool for efficient and scalable mutual rank-based gene coexpression analyses. The clustering analysis is conducted using ClusterONE, as described in Wisecaver et al. 2017. MutClust is still under development.


Features

  • Mutual Rank Analysis: Compute mutual rank (MR) from Pearson correlations on your gene expression matrix.
  • ClusterONE Clustering: Identify gene coexpression clusters from filtered/weighted MR networks.
  • Fast: Multi-threaded, sparse matrix operations for speed on large datasets.

Installation

Recommended

Create and activate the recommended conda environment:

conda env create -f environment.yml
conda activate mutclust

Alternative

Step 1: Make sure that ClusterONE is available from the command line:

conda install bioconda::clusterone

Step 2a: Install MutClust from PyPI:

pip install mutclust

Step 2b: Or clone the repository from GitHub:

git clone https://github.com/eporetsky/mutclust.git
cd mutclust
pip install .

Usage

1. Calculate Mutual Rank (MR)

mutclust mr -i expr.tsv -o results.mrs.tsv.gz --mr-threshold 100 --threads 4 [--log2]

Argument        Short   Description                                    Default
--input         -i      Path to the RNA-seq dataset (.tsv/.tsv.gz)     Required
--output        -o      Output file for mutual rank pairs              Required
--mr-threshold  -m      MR threshold for reporting gene pairs          100
--threads       -t      Number of CPU threads (correlation)            4
--log2          (none)  If set, applies log2(x+1) before calculation   Off

  • Input: Genes as rows, samples as columns (TSV, row index 'geneID').
  • Output: Gzipped tab-separated file containing Gene1, Gene2, MR.
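To make the MR step concrete, here is a minimal pure-Python sketch of the usual mutual rank definition: for genes a and b, MR is the geometric mean of b's rank in a's correlation list and a's rank in b's list. It is an illustration on a toy matrix, not MutClust's implementation, which the README describes as multi-threaded and sparse. The function names, the toy data, the exclusion of self-correlation from the ranking, and the inclusive threshold are all assumptions made for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("constant expression profile; filter such genes first")
    return cov / math.sqrt(var_x * var_y)

def mutual_ranks(expr, mr_threshold=100.0):
    """Return (gene1, gene2, MR) tuples with MR <= mr_threshold.

    expr maps geneID -> list of expression values (one per sample).
    """
    genes = list(expr)
    rank = {}
    for a in genes:
        # Correlation of gene a with every other gene (self excluded: an assumption).
        corr = {b: pearson(expr[a], expr[b]) for b in genes if b != a}
        # Rank 1 is the most positively correlated partner of gene a.
        ordered = sorted(corr, key=corr.get, reverse=True)
        rank[a] = {b: position + 1 for position, b in enumerate(ordered)}

    pairs = []
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            # Mutual rank: geometric mean of the two directional ranks.
            mr = math.sqrt(rank[a][b] * rank[b][a])
            if mr <= mr_threshold:
                pairs.append((a, b, mr))
    return pairs

# Toy matrix: genes as keys, four samples each (hypothetical values).
expr = {
    "GeneA": [1.0, 2.0, 3.0, 4.0],
    "GeneB": [2.0, 4.1, 6.2, 7.9],
    "GeneC": [4.0, 3.0, 2.0, 1.0],
    "GeneD": [1.0, 3.0, 2.0, 4.0],
}
pairs = mutual_ranks(expr, mr_threshold=2.0)
for gene1, gene2, mr in pairs:
    print(f"{gene1}\t{gene2}\t{mr:.3f}")
```

A low MR means each gene sits near the top of the other's correlation list, so the threshold keeps only reciprocally strong pairs. The dense all-by-all loop here scales quadratically with the number of genes and is only suitable for small examples.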

2. Cluster Genes (with ClusterONE)

mutclust cls -i results.mrs.tsv.gz -o results.cls.tsv --e_value 10

Argument   Short  Description                                    Default
--input    -i     Path to Mutual Rank (MR) pairs (.tsv/.tsv.gz)  Required
--output   -o     Output file for clusters (.tsv)                Required
--e_value  -e     Exponential decay constant for edge weighting  10

  • The tool filters/weights MR pairs and calls ClusterONE for clustering.
  • Output: Tab-separated file (written to the --output path) listing clusters with p-value < 0.1, with columns clusterID, geneID, pval.
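The edge weighting step can be sketched as follows. This README does not state the exact formula MutClust uses, so the sketch assumes the exponential decay transform from Wisecaver et al. 2017, weight = exp(-(MR - 1) / x), with x playing the role of --e_value. The function names and the 0.01 minimum-weight cutoff are also assumptions for illustration.

```python
import math

def mr_to_weight(mr, e_value=10.0):
    """Convert a mutual rank to an edge weight in (0, 1] (assumed formula)."""
    return math.exp(-(mr - 1.0) / e_value)

def weighted_edges(pairs, e_value=10.0, min_weight=0.01):
    """Weight (gene1, gene2, MR) tuples and drop edges below min_weight.

    min_weight is a hypothetical cutoff, not a documented MutClust option.
    """
    edges = []
    for gene1, gene2, mr in pairs:
        weight = mr_to_weight(mr, e_value)
        if weight >= min_weight:
            edges.append((gene1, gene2, weight))
    return edges

# Hypothetical MR pairs: one tight pair and one weak pair.
example_pairs = [("GeneA", "GeneB", 1.0), ("GeneA", "GeneC", 100.0)]
edges = weighted_edges(example_pairs, e_value=10.0)
for gene1, gene2, weight in edges:
    print(f"{gene1}\t{gene2}\t{weight:.4f}")
```

Under this assumed formula, a smaller decay constant makes weights fall off faster with MR, which yields a sparser network and tends to favor smaller, tighter clusters. A larger constant retains more distant pairs.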

Example Workflow

mutclust mr -i data/myexpr.tsv -o out.mrs.tsv.gz --mr-threshold 100 --threads 72 --log2
mutclust cls -i out.mrs.tsv.gz -o out.clusters.tsv --e_value 10
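For scripted pipelines, the same two steps can be driven from Python. The sketch below only assembles and runs the command lines shown above, using the flags documented in this README. The helper names and the output naming scheme are hypothetical, and it assumes the mutclust executable is on PATH.

```python
import shlex
import subprocess

def mr_command(expr_path, mr_path, mr_threshold=100, threads=4, log2=False):
    """Build the argument list for `mutclust mr`."""
    cmd = ["mutclust", "mr", "-i", expr_path, "-o", mr_path,
           "--mr-threshold", str(mr_threshold), "--threads", str(threads)]
    if log2:
        cmd.append("--log2")
    return cmd

def cls_command(mr_path, clusters_path, e_value=10):
    """Build the argument list for `mutclust cls`."""
    return ["mutclust", "cls", "-i", mr_path, "-o", clusters_path,
            "--e_value", str(e_value)]

def run_workflow(expr_path, prefix, e_value=10, **mr_options):
    """Run the MR step, then clustering; stop at the first failing command."""
    mr_path = f"{prefix}.mrs.tsv.gz"
    clusters_path = f"{prefix}.clusters.tsv"
    commands = (
        mr_command(expr_path, mr_path, **mr_options),
        cls_command(mr_path, clusters_path, e_value=e_value),
    )
    for cmd in commands:
        print("Running:", shlex.join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return clusters_path

if __name__ == "__main__":
    run_workflow("data/myexpr.tsv", "out", threads=72, log2=True)
```

Passing the arguments as a list avoids shell quoting problems with unusual file names, and check=True prevents the clustering step from running on a missing or partial MR file.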

Input Format

Expression file:

geneID\tSample1\tSample2
GeneA \t1.1    \t2.2
GeneB \t4.2    \t3.7
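A file in this layout can be written and sanity-checked with the standard library. The sketch below is a hypothetical helper pair, not part of MutClust: it writes a tab-separated matrix with a geneID header column, then verifies that every row has one numeric value per sample. It checks only the layout described here, and MutClust's own parser may be stricter or more lenient.

```python
import csv
import io

def write_expression_tsv(handle, samples, rows):
    """Write genes as rows and samples as columns, tab-separated."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["geneID", *samples])
    for gene, values in rows.items():
        if len(values) != len(samples):
            raise ValueError(f"{gene}: expected {len(samples)} values")
        writer.writerow([gene, *values])

def check_expression_tsv(handle):
    """Validate the layout and return (number of genes, number of samples)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if not header or header[0] != "geneID":
        raise ValueError("first header column must be 'geneID'")
    n_genes = 0
    for row in reader:
        if len(row) != len(header):
            raise ValueError(f"{row[0] if row else '?'}: wrong number of columns")
        for value in row[1:]:
            float(value)  # raises ValueError on non-numeric entries
        n_genes += 1
    return n_genes, len(header) - 1

# Round trip through an in-memory buffer; use open(path, "w", newline="") for files.
buffer = io.StringIO()
write_expression_tsv(
    buffer,
    ["Sample1", "Sample2"],
    {"GeneA": [1.1, 2.2], "GeneB": [4.2, 3.7]},
)
buffer.seek(0)
n_genes, n_samples = check_expression_tsv(buffer)
print(f"{n_genes} genes x {n_samples} samples")
```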

Note: MutClust may be limited to Linux because of its dependency on pynetcor.


Coming Soon

  • Generate cluster gene annotation
  • Calculate cluster GO term enrichment
  • Calculate cluster eigengene data
  • Add a MutClust Dockerfile
  • Add unit testing

License

MIT License. See LICENSE file for details.


Contributing

Suggestions, pull requests, and issues are welcome!

