Utilities for constructing and manipulating models for non-local structural dependencies in genomic sequences
Quasinet
Description
Infer non-local structural dependencies in genomic sequences. Genomic sequences are essentially compressed encodings of phenotypic information. This package provides a novel set of tools for extracting the long-range structural dependencies in genotypic data that define phenotypic outcomes. The key capabilities implemented here are as follows:
- Computing the q-net from a database of nucleic acid sequences. The q-net is a family of conditional inference trees that captures the predictability of each nucleotide position given the rest of the genome. [Figures: COVID-19, influenza]
- Computing a structure-aware, evolution-adaptive notion of distance between genomes, which is demonstrably more biologically relevant than the standard edit distance.
- Drawing samples in silico that have a high probability of being biologically correct. For example, given a database of HIV sequences, we can generate a new genomic sequence that has a high probability of being a valid encoding of an HIV virion. [Figure: the q-net constructed for the long-term non-progressor clinical phenotype in HIV-1 infection]
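To make the three capabilities above concrete, here is a toy sketch in plain Python. It is not the package's algorithm: instead of conditional inference trees over the whole genome, each position is predicted from a single neighboring position, and the distance is assumed to be the mean square root of the Jensen-Shannon divergence between the predicted distributions. All names (`fit_toy_model`, `predict`, `toy_distance`, `toy_sample`) are hypothetical and exist only for illustration.

```python
import random
from collections import Counter, defaultdict
from math import log2, sqrt

ALPHABET = "ACGT"

def fit_toy_model(seqs):
    """Count, for each position i, the symbols seen at i given the symbol at i-1.

    Toy stand-in for a q-net: one conditioning position instead of a tree
    over all other positions (an assumption made for brevity).
    """
    n = len(seqs[0])
    if any(len(s) != n for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    model = []
    for i in range(n):
        j = (i - 1) % n  # conditioning position; wraps around for i = 0
        counts = defaultdict(Counter)
        for s in seqs:
            counts[s[j]][s[i]] += 1
        model.append((j, counts))
    return model

def predict(model, seq, i, pseudocount=1.0):
    """Smoothed distribution over ALPHABET at position i, given the rest of seq."""
    j, counts = model[i]
    c = counts.get(seq[j], Counter())
    total = sum(c.values()) + pseudocount * len(ALPHABET)
    return [(c[a] + pseudocount) / total for a in ALPHABET]

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * log2(a / b) for a, b in zip(x, y) if a > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def toy_distance(seq1, seq2, model):
    """Mean over positions of sqrt(JS divergence) between predicted distributions."""
    n = len(model)
    total = 0.0
    for i in range(n):
        d = js_divergence(predict(model, seq1, i), predict(model, seq2, i))
        total += sqrt(max(0.0, d))  # guard against tiny negative rounding error
    return total / n

def toy_sample(model, start, steps, rng):
    """Resample randomly chosen positions from the model's predicted distributions."""
    seq = list(start)
    for _ in range(steps):
        i = rng.randrange(len(seq))
        seq[i] = rng.choices(list(ALPHABET), weights=predict(model, seq, i))[0]
    return "".join(seq)

database = ["ACGT", "ACGA", "TCGT", "ACGT"]
model = fit_toy_model(database)
print(toy_distance("ACGT", "TCGA", model))
print(toy_sample(model, "ACGT", steps=20, rng=random.Random(0)))
```

Unlike the edit distance, this kind of distance compares what the model expects at each position given the context of each sequence, so two substitutions of equal edit cost can contribute differently.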
Installation
To install with pip:
pip install quasinet
To install with conda:
conda install quasinet
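After installing, one way to confirm that the package is visible to the current Python environment is to query its installed version with the standard library. This is a minimal sketch; `installed_version` is a hypothetical helper name, not part of quasinet.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Prints the installed version, or None if quasinet is not installed here.
print(installed_version("quasinet"))
```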
Dependencies
- scikit-learn
- scipy
- numpy
- numba
- pandas
- joblib
- biopython
Usage
from quasinet import qnet
# initialize qnet
myqnet = qnet.Qnet()
# train the qnet
myqnet.fit(X)
# compute qdistance
qdist = qnet.qdistance(seq1, seq2, myqnet, myqnet)
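The snippet above does not define `X`, `seq1`, or `seq2`. A plausible reading, stated here as an assumption rather than documented behavior, is that `X` holds one aligned sequence per row and one position per column, and that `seq1` and `seq2` are single rows of the same length. The sketch below prepares data in that shape from FASTA-formatted text using only the standard library; `read_fasta` and `to_matrix` are hypothetical helpers, not part of quasinet.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header to sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {h: "".join(parts) for h, parts in records.items()}

def to_matrix(seqs):
    """Turn aligned sequences into rows of single-character symbols."""
    seqs = list(seqs)
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("expected a non-empty set of equal-length sequences")
    return [list(s) for s in seqs]

fasta_text = """>strain_a
ACG
T
>strain_b
acga
"""
records = read_fasta(fasta_text)
X = to_matrix(records.values())   # rows = sequences, columns = positions
seq1, seq2 = X[0], X[1]
print(X)
```

If the library expects NumPy arrays, which is likely given its dependencies but is an assumption here, wrapping the result as `numpy.array(X)` would be the natural next step.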
Authors
You can reach the ZED lab at: zed.uchicago.edu
Hashes for quasinet-0.0.49-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 6a325fe474c55d7325cc52899fc116b95570eab8f716b740e968f0236414d741 |
MD5 | 480a13871a77816aa3ebe828d7894376 |
BLAKE2b-256 | acf29f0f5f1d7798844e1b58df725dbb8b502a7563f8c652a7ef571cc090de56 |