PORTIA: Fast and Accurate Inference of Gene Regulatory Networks through Robust Precision Matrix Estimation
PORTIA
Lightning-fast Gene Regulatory Network (GRN) inference tool.
PORTIA builds on power transforms and covariance matrix inversion to approximate GRNs, and is orders of magnitude faster than other existing tools (as of August 2021).
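The general idea of scoring gene pairs from an inverted covariance matrix can be illustrated with a minimal NumPy sketch. This is not PORTIA's implementation: the function name precision_scores, the log1p transform (used here as a simple stand-in for a proper power transform), and the shrinkage value are all assumptions made for illustration. The scores it produces are symmetric, unlike the directed scores PORTIA returns.

```python
import numpy as np

def precision_scores(X, shrinkage=0.1):
    """Illustrative only: map an (n_experiments, n_genes) matrix to
    (n_genes, n_genes) absolute partial correlations in [0, 1]."""
    X = np.asarray(X, dtype=float)
    # Stand-in for a power transform: log1p compresses the right tail.
    Z = np.log1p(X - X.min(axis=0))
    # Standardize each gene (column); the epsilon guards constant columns.
    Z = (Z - Z.mean(axis=0)) / (Z.std(axis=0) + 1e-12)
    n_genes = Z.shape[1]
    cov = np.cov(Z, rowvar=False)
    # Shrink towards the identity so the matrix stays well-conditioned.
    cov = (1.0 - shrinkage) * cov + shrinkage * np.eye(n_genes)
    precision = np.linalg.inv(cov)
    # Normalize the precision matrix into partial correlations.
    d = np.sqrt(np.diag(precision))
    scores = np.abs(-precision / np.outer(d, d))
    np.fill_diagonal(scores, 0.0)  # ignore self-links
    return scores

# Synthetic data: 20 experiments, 5 genes (values are made up).
rng = np.random.default_rng(0)
X = rng.gamma(shape=2.0, scale=1.0, size=(20, 5))
scores = precision_scores(X)
```

The shrinkage step is one common way to keep the covariance matrix invertible when there are fewer experiments than genes; other regularization schemes would serve the same purpose.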
How to use it
Install the dependencies:
pip3 install -r requirements.txt
To use the end-to-end inference algorithm, install the dependencies from requirements-etel.txt instead.
Install the package:
python3 setup.py install
In Python, create an empty dataset:
import portia as pt
dataset = pt.GeneExpressionDataset()
Microarray experiments can be added with the GeneExpressionDataset.add method. data must be an iterable (list, NumPy array, etc).
for exp_id, data in enumerate(your_data):
    dataset.add(pt.Experiment(exp_id, data))
Gene knock-out experiments can be encoded using the knockout optional parameter.
dataset.add(pt.Experiment(exp_id, data, knockout=[gene_idx]))
where gene_idx is the (0-based) index of the gene being knocked out. Dual/multiple knock-out experiments are supported, but they bring no benefit to the inference process.
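When knock-outs are recorded by gene name rather than by index, the arguments for each experiment can be prepared ahead of time. The sketch below is one possible way to do it; the helper build_experiment_args and the layout of the knockouts dictionary (experiment index mapped to a list of gene names) are hypothetical and not part of PORTIA.

```python
def build_experiment_args(expression_rows, gene_names, knockouts=None):
    """Yield (exp_id, data, knockout_indices) triples.

    knockouts (hypothetical layout) maps an experiment index to the
    names of the genes knocked out in that experiment.
    """
    knockouts = knockouts or {}
    position = {name: i for i, name in enumerate(gene_names)}
    for exp_id, data in enumerate(expression_rows):
        if len(data) != len(gene_names):
            raise ValueError(f"experiment {exp_id}: expected {len(gene_names)} values")
        # Translate gene names into 0-based indices (KeyError if unknown).
        ko = [position[g] for g in knockouts.get(exp_id, [])]
        yield exp_id, list(data), ko

gene_names = ["G1", "G2", "G3"]
rows = [[1.0, 2.0, 3.0], [0.5, 0.0, 2.5]]  # made-up expression values
args = list(build_experiment_args(rows, gene_names, knockouts={1: ["G2"]}))

# Each triple could then be fed to the dataset, passing knockout only
# when the index list is non-empty:
#   dataset.add(pt.Experiment(exp_id, data, knockout=ko))
```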
Run PORTIA on your dataset:
M_bar = pt.run(dataset, method='fast')
The output M_bar is a matrix in which each element M_bar[i, j] is a score in the range [0, 1] reflecting the confidence that gene i is a regulator of target gene j. A whitelist of putative transcription factors can be specified with the tf_idx argument, which must be a list of (0-based) gene indices.
M_bar = pt.run(dataset, tf_idx=tf_idx, method='fast')
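If the transcription factors are known by name, tf_idx can be derived from the gene list. A minimal sketch, assuming gene_names is ordered like the columns of your data; the helper tf_indices is hypothetical, and names absent from gene_names are silently skipped here.

```python
def tf_indices(gene_names, tf_names):
    """Return the 0-based indices of the genes listed in tf_names."""
    tf_set = set(tf_names)
    return [i for i, name in enumerate(gene_names) if name in tf_set]

gene_names = ["G1", "G2", "G3", "G4"]
tf_idx = tf_indices(gene_names, ["G4", "G2", "GX"])  # "GX" is not in gene_names
```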
The mode of regulation (the sign of each regulatory link) can be retrieved by passing the return_sign argument. When it is set to True, both the inferred network and the sign matrix are returned. The sign matrix S has the same shape as M_bar, where 1 stands for activation, -1 stands for inhibition, and 0 stands for no (self-)regulation.
M_bar, S = pt.run(dataset, tf_idx=tf_idx, method='fast', return_sign=True)
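The two matrices can be read together to label each link. The sketch below is an illustration only: the helper describe_edges and the 0.5 threshold are assumptions, and the small matrices are made-up values rather than PORTIA output. Indexing with [i][j] works for both nested lists and NumPy arrays.

```python
def describe_edges(M_bar, S, gene_names, threshold=0.5):
    """List (regulator, target, score, mode) for links scoring at least threshold."""
    labels = {1: "activation", -1: "inhibition", 0: "none"}
    edges = []
    for i, regulator in enumerate(gene_names):
        for j, target in enumerate(gene_names):
            score = float(M_bar[i][j])
            if i != j and score >= threshold:
                edges.append((regulator, target, score, labels[int(S[i][j])]))
    # Strongest links first.
    edges.sort(key=lambda e: e[2], reverse=True)
    return edges

# Made-up 3-gene example.
names = ["A", "B", "C"]
M = [[0.0, 0.9, 0.2], [0.1, 0.0, 0.7], [0.6, 0.3, 0.0]]
signs = [[0, 1, -1], [1, 0, -1], [-1, 1, 0]]
edges = describe_edges(M, signs, names)
```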
Finally, rank and store the results in a text file. gene_names is the list of your genes, provided in the correct order.
with open('your_destination/results.txt', 'w') as f:
    for gene_a, gene_b, score in pt.rank_scores(M_bar, gene_names, limit=10000):
        f.write(f'{gene_a}\t{gene_b}\t{score}\n')
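A file written in this three-column, tab-separated layout can be loaded back with the standard library. This is a minimal sketch; the helper read_ranked_edges is hypothetical, and the demonstration uses a throwaway file with made-up gene names and scores.

```python
import csv
import os
import tempfile

def read_ranked_edges(path):
    """Read (regulator, target, score) rows from a tab-separated results file."""
    with open(path, newline="") as f:
        return [(a, b, float(s)) for a, b, s in csv.reader(f, delimiter="\t")]

# Tiny demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results.txt")
    with open(path, "w") as f:
        f.write("G1\tG2\t0.91\n")
        f.write("G3\tG1\t0.47\n")
    edges = read_ranked_edges(path)
```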
Real examples on the DREAM datasets are provided in the scripts/ folder.
Source Distribution
File details
Details for the file portia-grn-0.0.15.tar.gz.
File metadata
- Download URL: portia-grn-0.0.15.tar.gz
- Upload date:
- Size: 17.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.7.1 importlib_metadata/3.10.0 pkginfo/1.8.2 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1a2bcd68a4c72313ed1e063ab99b7ea69bc679b2cbecefbf3875b7f3f180c8b6 |
| MD5 | 55cc438d5f576a1ba3b745748a596fb6 |
| BLAKE2b-256 | ec9e5c79b867b6263cbd5e1742f41223006a382645c730fbbbc17aedeff9e052 |