Compute genetic distance matrices from RNA-seq data.
RNA-clique
This is the repository for RNA-clique, a tool for computing pairwise genetic distances from RNA-seq data. The software accepts as input assembled transcriptomes from two or more samples and produces as its output a matrix containing pairwise distances ranging from 0 to 1.
Installation
This software is written in Python. It additionally requires NCBI BLAST+ and several Python libraries. Guides are provided for installation on specific systems. To install on other systems, see the requirements.
Installation guides
Basic usage
To run RNA-clique on your assembled transcriptomes, first make sure that your data are in a format understood by RNA-clique.
Then, run rna-clique with the directories containing your transcriptomes, an
output directory, and a setting for the number of top genes to select.
rna-clique -O my_rna_clique_out -n 50000 \
    path/to/transcriptome1_dir \
    path/to/transcriptome2_dir \
    path/to/transcriptome3_dir ...
RNA-clique produces an output matrix at my_rna_clique_out/matrix.h5. To see it
in a human-readable format, use export_matrix.
python -m rna_clique.export_matrix -m my_rna_clique_out/matrix.h5
More details about the usage of RNA-clique can be found in the Command-line usage guide.
Downstream analyses
The export_matrix program prints the calculated matrix to the standard
output, so you can use redirection or pipes to save the results to a file. You
could then use the matrix in any downstream application capable of loading
arbitrary matrices from files.
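If you would rather drive the export from a script than from a shell, the redirection described above can be sketched in Python with the standard subprocess module. This is only a sketch: the helper names build_export_command and save_exported_matrix are made up for illustration and are not part of RNA-clique, and the only RNA-clique invocation used is the export_matrix command shown earlier.

```python
import subprocess
import sys


def build_export_command(matrix_path):
    """Return the export_matrix invocation from this README as an argument list.

    sys.executable stands in for "python" so that the same interpreter
    (and therefore the same installed rna_clique package) is used.
    """
    return [
        sys.executable,
        "-m",
        "rna_clique.export_matrix",
        "-m",
        str(matrix_path),
    ]


def save_exported_matrix(matrix_path, out_path):
    """Run export_matrix and write its standard output to out_path.

    This mirrors shell redirection (command > out_path). check=True raises
    subprocess.CalledProcessError if the command exits with an error.
    """
    with open(out_path, "w") as out_file:
        subprocess.run(
            build_export_command(matrix_path),
            stdout=out_file,
            check=True,
        )
```

Calling save_exported_matrix("my_rna_clique_out/matrix.h5", "distances") would then leave the human-readable matrix in a file named distances, provided RNA-clique is installed in the current environment.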
For example, if you output the matrix to a file named distances, you could
load the matrix in R using the following code:
dis <- as.matrix(read.table("distances", sep=" "))
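A rough Python counterpart to the R snippet above is sketched below, using only the standard library. It assumes that the exported file holds nothing but whitespace-separated numbers, one matrix row per line, with no header row or row labels; check your own export before relying on that. The function names are hypothetical. The sanity check uses the 0 to 1 range stated at the top of this README and additionally assumes the matrix is symmetric, as one would expect of pairwise distances.

```python
from pathlib import Path


def load_distance_matrix(path):
    """Parse a whitespace-separated numeric matrix into a list of rows.

    Assumption: the file contains only numbers (no header, no row labels).
    Blank lines are skipped.
    """
    rows = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            rows.append([float(field) for field in line.split()])
    return rows


def check_distance_matrix(matrix, tol=1e-9):
    """Raise ValueError unless the matrix is square, symmetric, and in [0, 1]."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix is not square")
    for i in range(n):
        for j in range(n):
            value = matrix[i][j]
            if not -tol <= value <= 1 + tol:
                raise ValueError(f"entry ({i}, {j}) is outside [0, 1]: {value}")
            if abs(value - matrix[j][i]) > tol:
                raise ValueError(f"entries ({i}, {j}) and ({j}, {i}) differ")
```

For a file named distances, load_distance_matrix("distances") returns the rows as lists of floats, and check_distance_matrix raises an error early if the file was not parsed as intended.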
Using RNA-clique in Python code
You can use RNA-clique directly from your Python code. For example,
from rna_clique.rna_clique import rna_clique
from pathlib import Path
out_dir = Path("rna_clique_out")
out_dir.mkdir(exist_ok=True)
# Get the SampleSimilarity object and a dict mapping paths to their sample
# names.
sim, path_to_sample = rna_clique(
    [
        Path("path/to/transcriptome1_dir"),
        Path("path/to/transcriptome2_dir"),
        Path("path/to/transcriptome3_dir"),
    ],
    out_dir_1=out_dir / "od1",
    out_dir_2=out_dir / "od2",
    cache_dir=out_dir / "db_cache",
    output_graph=out_dir / "graph.pkl",
    output_matrix=out_dir / "matrix.h5",
    top_genes=50000,
)
print(sim.get_dissimilarity_df())
For information on finer-grained control via RNA-clique's Python API, see the API guide.
License
All code is licensed under the MIT license, which may be found at LICENSE at the root of this repository.
A machine-readable copyright file in Debian format may also be found at copyright.
Citation
If you use RNA-clique for your work, please cite "RNA-clique: a method for computing genetic distances from RNA-seq data".
@article{tapia2024rna,
  title={{RNA-clique: a method for computing genetic distances from RNA-seq data}},
  author={Tapia, Andrew C and Jaromczyk, Jerzy W and Moore, Neil and Schardl, Christopher L},
  journal={BMC Bioinformatics},
  volume={25},
  year={2024},
  publisher={BioMed Central},
  keywords={pub}
}
Additional documentation