Compute genetic distance matrices from RNA-seq data.

RNA-clique

This is the repository for RNA-clique, a tool for computing pairwise genetic distances from RNA-seq data. The software accepts as input assembled transcriptomes from two or more samples and produces as its output a matrix containing pairwise distances ranging from 0 to 1.

Installation

This software is written in Python and additionally requires NCBI BLAST+ and several Python libraries. Installation guides are provided for specific systems. For other systems, see the requirements.

Installation guides

Basic usage

To run RNA-clique on your assembled transcriptomes, first make sure that your data are in a format understood by RNA-clique.

Then, run rna-clique with an output directory (-O), the number of top genes to select (-n), and the directories containing your transcriptomes.

rna-clique -O my_rna_clique_out -n 50000 \
           path/to/transcriptome1_dir \
           path/to/transcriptome2_dir \
           path/to/transcriptome3_dir ...

RNA-clique produces an output matrix at my_rna_clique_out/matrix.h5. To see it in a human-readable format, use export_matrix.

python -m rna_clique.export_matrix -m my_rna_clique_out/matrix.h5 
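If you prefer to drive the export step from a Python script, the command above can be wrapped with the standard subprocess module. This is only a sketch: the helper names build_export_command and export_matrix_to_file are hypothetical and are not part of RNA-clique, and the only option used is the -m flag shown above.

```python
import subprocess
import sys


def build_export_command(matrix_path):
    """Build the export_matrix invocation shown above as an argument list."""
    return [
        sys.executable, "-m", "rna_clique.export_matrix",
        "-m", str(matrix_path),
    ]


def export_matrix_to_file(matrix_path, text_path):
    """Run export_matrix and write its standard output to text_path.

    Hypothetical helper: requires rna_clique to be installed in the
    interpreter that is running this script.
    """
    with open(text_path, "w") as out:
        subprocess.run(build_export_command(matrix_path), stdout=out, check=True)


# Example call (not executed here):
# export_matrix_to_file("my_rna_clique_out/matrix.h5", "distances")
```

Passing check=True makes the script raise an error if export_matrix exits with a non-zero status, rather than silently leaving an empty or partial file.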

More details about the usage of RNA-clique can be found in the Command-line usage guide.

Downstream analyses

The export_matrix program prints the calculated matrix to the standard output, so you can use redirection or pipes to save the results to a file. You could then use the matrix in any downstream application capable of loading arbitrary matrices from files.

For example, if you output the matrix to a file named distances, you could load the matrix in R using the following code:

dis <- as.matrix(read.table("distances", sep=" "))
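The same file could be read in Python. The sketch below assumes, as the R call above does, that the file holds only space-separated numbers with no row or column labels; that layout is an assumption about the export format, and parse_distance_matrix is a hypothetical helper name.

```python
def parse_distance_matrix(lines):
    """Parse whitespace-separated numeric lines into a square list of rows.

    Assumes no header row and no row labels; blank lines are skipped.
    """
    rows = [[float(x) for x in line.split()] for line in lines if line.strip()]
    n = len(rows)
    if any(len(row) != n for row in rows):
        raise ValueError("expected a square matrix")
    return rows


# Example call (not executed here), reading the file named "distances":
# with open("distances") as handle:
#     dis = parse_distance_matrix(handle)
```

The squareness check is a cheap guard against accidentally loading a file that includes labels or was truncated.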

Using RNA-clique in Python code

You can use RNA-clique directly from your Python code. For example:

from pathlib import Path

from rna_clique.rna_clique import rna_clique

out_dir = Path("rna_clique_out")
out_dir.mkdir(exist_ok=True)
# Get the SampleSimilarity object and a dict mapping paths to their sample
# names.
sim, path_to_sample = rna_clique(
    [
        Path("path/to/transcriptome1_dir"),
        Path("path/to/transcriptome2_dir"),
        Path("path/to/transcriptome3_dir"),
    ],
    out_dir_1=out_dir / "od1",
    out_dir_2=out_dir / "od2",
    cache_dir=out_dir / "db_cache",
    output_graph=out_dir / "graph.pkl",
    output_matrix=out_dir / "matrix.h5",
    top_genes=50000,
)
# Print the pairwise dissimilarity (distance) matrix.
print(sim.get_dissimilarity_df())

For information on finer-grained control via RNA-clique's Python API, see the API guide.
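Once a distance matrix is available in Python, a simple downstream check is to list each sample's nearest neighbor. The sketch below is an illustration, not part of RNA-clique: it assumes the matrix is held as a list of rows with a parallel list of sample names. If the dissimilarity object is a pandas DataFrame df, df.to_numpy().tolist() and list(df.index) would yield those two inputs. The sample names and values are toy data.

```python
def nearest_neighbors(names, matrix):
    """Map each sample name to (closest other sample, distance).

    Needs at least two samples; ties are broken by sample name.
    """
    result = {}
    for i, name in enumerate(names):
        others = [(matrix[i][j], names[j]) for j in range(len(names)) if j != i]
        dist, closest = min(others)
        result[name] = (closest, dist)
    return result


# Toy values for illustration only; not real RNA-clique output.
names = ["A", "B", "C"]
matrix = [
    [0.0, 0.1, 0.4],
    [0.1, 0.0, 0.3],
    [0.4, 0.3, 0.0],
]
print(nearest_neighbors(names, matrix))
```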

License

All code is licensed under the MIT license, which may be found in the LICENSE file at the root of this repository.

A machine-readable copyright file in Debian format may also be found at copyright.

Citation

If you use RNA-clique for your work, please cite "RNA-clique: a method for computing genetic distances from RNA-seq data".

@article{tapia2024rna,
  title={{RNA-clique: a method for computing genetic distances from RNA-seq data}},
  author={Tapia, Andrew C and Jaromczyk, Jerzy W and Moore, Neil and Schardl, Christopher L},
  journal={BMC Bioinformatics},
  volume={25},
  year={2024},
  publisher={BioMed Central},
  keywords={pub}
}

Additional documentation
