MutClust: Mutual rank-based coexpression and clustering.
MutClust: Efficient and Scalable Mutual Rank-Based Coexpression Clustering
MutClust is a Python tool for efficient and scalable mutual rank-based gene coexpression analyses. The clustering analysis is conducted using ClusterONE, as described in Wisecaver et al. 2017. MutClust is still under development.
Features
- Mutual Rank Analysis: Compute mutual rank (MR) from Pearson correlations on your gene expression matrix.
- ClusterONE Clustering: Identify gene coexpression clusters from filtered/weighted MR networks.
- Fast: Multi-threaded, sparse matrix operations for speed on large datasets.
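To make the mutual rank idea concrete, here is a minimal pure-Python sketch. It uses the common definition MR(a, b) = sqrt(rank of b among a's partners × rank of a among b's partners), with partners ranked by descending Pearson correlation. The function names `pearson` and `mutual_ranks` are hypothetical and are not part of the MutClust API. Ranking on the signed correlation with arbitrary tie-breaking is an assumption, and MutClust's own multi-threaded, sparse implementation may differ.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mutual_ranks(expr):
    """Return {(gene1, gene2): MR} for every gene pair in expr.

    expr maps a gene ID to its list of expression values across samples.
    """
    genes = sorted(expr)
    corr = {
        a: {b: pearson(expr[a], expr[b]) for b in genes if b != a}
        for a in genes
    }
    # rank[a][b] is the position of b when a's partners are sorted by
    # descending correlation (1 = most strongly correlated partner).
    rank = {}
    for a in genes:
        ordered = sorted(corr[a], key=lambda b: corr[a][b], reverse=True)
        rank[a] = {b: i + 1 for i, b in enumerate(ordered)}
    # MR is the geometric mean of the two directional ranks.
    return {
        (a, b): math.sqrt(rank[a][b] * rank[b][a])
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
    }
```

This all-pairs version needs memory quadratic in the number of genes, so it is only suitable for illustrating the calculation on small inputs.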
Installation
Recommended
Create and activate the recommended conda environment, which installs MutClust:
conda env create -f environment.yml
conda activate mutclust
Alternative
Step 1: Make sure that ClusterONE is available from the command line:
conda install bioconda::clusterone
Step 2a: Install MutClust from PyPI:
pip install mutclust
Step 2b: Or clone the repository from GitHub:
git clone https://github.com/eporetsky/mutclust.git
cd mutclust
pip install .
Usage
1. Calculate Mutual Rank (MR)
mutclust mr -i expr.tsv -o results.mrs.tsv.gz --mr-threshold 100 --threads 4 [--log2]
| Argument | Short | Description | Default |
|---|---|---|---|
| --input | -i | Path to the RNA-seq dataset (.tsv/.tsv.gz) | Required |
| --output | -o | Output file for mutual rank pairs | Required |
| --mr-threshold | -m | MR threshold for reporting gene pairs | 100 |
| --threads | -t | Number of CPU threads (correlation) | 4 |
| --log2 | | If set, applies log2(x+1) before calculation | Off |
- Input: Genes as rows, samples as columns (TSV, row index 'geneID').
- Output: Gzipped tab-separated file with the columns Gene1, Gene2, MR.
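The MR output can be inspected with the Python standard library alone. The sketch below reads the three documented columns from a gzipped TSV. The function name `read_mr_pairs` and its `max_mr` argument are hypothetical. Because this README does not say whether the file starts with a header row, the reader simply skips any row whose third field is not numeric.

```python
import csv
import gzip


def read_mr_pairs(path, max_mr=None):
    """Yield (gene1, gene2, mr) tuples from a gzipped TSV of MR pairs.

    Rows with fewer than three fields or a non-numeric third field
    (for example a header row) are skipped.
    """
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 3:
                continue
            try:
                mr = float(row[2])
            except ValueError:
                continue  # assumed to be a header such as Gene1/Gene2/MR
            if max_mr is None or mr <= max_mr:
                yield row[0], row[1], mr
```

For example, `list(read_mr_pairs("results.mrs.tsv.gz", max_mr=10))` would keep only the most tightly coexpressed pairs.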
2. Cluster Genes (with ClusterONE)
mutclust cls -i results.mrs.tsv.gz -o results.cls.tsv --e_value 10
| Argument | Short | Description | Default |
|---|---|---|---|
| --input | -i | Path to Mutual Rank (MR) pairs (.tsv/.tsv.gz) | Required |
| --output | -o | Output file for clusters (.tsv) | Required |
| --e_value | -e | Exponential decay constant for edge weighting | 10 |
- The tool filters/weights MR pairs and calls ClusterONE for clustering.
- Output: clusters.tsv, a tab-separated file listing clusters with p-value < 0.1. Columns: clusterID, geneID, pval.
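The role of the exponential decay constant can be sketched as follows. The transform used here, weight = exp(-(MR - 1) / e), and the 0.01 minimum weight follow the approach described in Wisecaver et al. 2017. Treating them as what `mutclust cls` does internally is an assumption, and the function names `mr_to_weight` and `weight_edges` are hypothetical.

```python
import math


def mr_to_weight(mr, e_value=10.0):
    """Map a mutual rank to an edge weight in (0, 1]; MR = 1 gives weight 1."""
    return math.exp(-(mr - 1.0) / e_value)


def weight_edges(pairs, e_value=10.0, min_weight=0.01):
    """Convert (gene1, gene2, mr) tuples into weighted edges.

    Edges whose weight falls below min_weight are dropped (assumed cutoff).
    """
    edges = []
    for gene1, gene2, mr in pairs:
        weight = mr_to_weight(mr, e_value)
        if weight >= min_weight:
            edges.append((gene1, gene2, weight))
    return edges
```

Under this form, a smaller decay constant makes weights fall off faster with increasing MR, which yields a sparser network, while a larger constant retains more distant pairs.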
Example Workflow
mutclust mr -i data/myexpr.tsv -o out.mrs.tsv.gz --mr-threshold 100 --threads 72 --log2
mutclust cls -i out.mrs.tsv.gz -o out.clusters.tsv --e_value 10
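The same two-step workflow can be driven from a Python script. This sketch only assembles the command lines from the flags documented above and runs them with `subprocess`. The helper names `build_commands` and `run_workflow` and the output naming scheme are hypothetical.

```python
import shutil
import subprocess


def build_commands(expr_path, prefix, mr_threshold=100, threads=4,
                   log2=False, e_value=10):
    """Return the argument lists for the `mr` step and then the `cls` step."""
    mr_out = f"{prefix}.mrs.tsv.gz"
    cls_out = f"{prefix}.clusters.tsv"
    mr_cmd = ["mutclust", "mr", "-i", expr_path, "-o", mr_out,
              "--mr-threshold", str(mr_threshold), "--threads", str(threads)]
    if log2:
        mr_cmd.append("--log2")
    cls_cmd = ["mutclust", "cls", "-i", mr_out, "-o", cls_out,
               "--e_value", str(e_value)]
    return [mr_cmd, cls_cmd]


def run_workflow(expr_path, prefix, **options):
    """Run both steps in order, stopping at the first failure."""
    if shutil.which("mutclust") is None:
        raise RuntimeError("mutclust is not available on PATH")
    for cmd in build_commands(expr_path, prefix, **options):
        subprocess.run(cmd, check=True)
```

Calling `run_workflow("data/myexpr.tsv", "out", threads=72, log2=True)` issues the same two commands as the example above.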
Input Format
Expression file:
geneID\tSample1\tSample2
GeneA\t1.1\t2.2
GeneB\t4.2\t3.7
...
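Before running a large job, it can help to confirm that an expression file matches this layout. The following sketch parses a tab-separated table with a `geneID` first column and can optionally apply the log2(x+1) transform that `--log2` describes. The function name `read_expression` is hypothetical, and the checks it performs (unique gene IDs, consistent column counts, numeric values) are assumptions about a well-formed input, not documented MutClust behavior.

```python
import csv
import math


def read_expression(handle, log2=False):
    """Parse a tab-separated expression table from an open text handle.

    Returns (sample_names, {geneID: [values]}). Raises ValueError on a
    malformed header, a ragged row, a duplicate gene ID, or a
    non-numeric value.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if header[0] != "geneID":
        raise ValueError(f"first column must be 'geneID', got {header[0]!r}")
    samples = header[1:]
    expr = {}
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # tolerate blank lines
        gene = row[0].strip()
        if len(row) - 1 != len(samples):
            raise ValueError(f"line {line_no}: expected {len(samples)} values")
        if gene in expr:
            raise ValueError(f"line {line_no}: duplicate gene ID {gene!r}")
        values = [float(v) for v in row[1:]]
        if log2:
            values = [math.log2(v + 1.0) for v in values]
        expr[gene] = values
    return samples, expr
```

A file on disk can be checked with `with open("expr.tsv", newline="") as fh: read_expression(fh)`.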
Note: MutClust may be limited to Linux because it depends on pynetcor.
Coming Soon
- Generate cluster gene annotation
- Calculate cluster GO term enrichment
- Calculate cluster eigengene data
- Add a MutClust Dockerfile
- Add unit testing
License
MIT License. See LICENSE file for details.
Contributing
Suggestions, pull requests, and issues are welcome!
File details
Details for the file mutclust-0.1.4.tar.gz.
File metadata
- Download URL: mutclust-0.1.4.tar.gz
- Upload date:
- Size: 6.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.8.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3ba5e98c950edcbaeaa0b8008cf5375d923704c637a31fef7a97463e7e9efe4e |
| MD5 | fea4fa34ed5315a4f1f649e6cbac2360 |
| BLAKE2b-256 | 14d5be7f0c1021dbc65392220bcd2b8c21c2219a7691f227c678491661111906 |
File details
Details for the file mutclust-0.1.4-py3-none-any.whl.
File metadata
- Download URL: mutclust-0.1.4-py3-none-any.whl
- Upload date:
- Size: 7.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.8.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 32afe8468e921094c8dbb34a83ccaa03289b523af68598e42ef01ae1cf505185 |
| MD5 | c080504a0d840395a1e54f688572d9be |
| BLAKE2b-256 | 15351414434356704498f2c6eed9bf79dd1c31086b4d72182a343f4c9e7de627 |