Utilities for constructing and manipulating models for non-local structural dependencies in genomic sequences
Quasinet
Description
Infer non-local structural dependencies in genomic sequences. Genomic sequences are essentially compressed encodings of phenotypic information. This package provides a novel set of tools to extract the long-range structural dependencies in genotypic data that define phenotypic outcomes. The key capabilities implemented here are as follows:
- Compute the Quasinet (Q-net) given a database of nucleic acid sequences. The Q-net is a family of conditional inference trees that capture the predictability of each nucleotide position given the rest of the genome. The constructed Q-nets for COVID-19 and Influenza A H1N1 HA 2008-9 are shown below.
COVID-19 | INFLUENZA |
---|---|
- Compute a structure-aware, evolution-adaptive notion of distance between genomes, which is demonstrably more biologically relevant than the standard edit distance.
- Draw samples in silico that have a high probability of being biologically correct. For example, given a database of influenza sequences, we can generate a new genomic sequence that has a high probability of being a valid influenza sequence.
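For context, the standard edit distance that the Q-net distance is compared against can be sketched as below. This is a plain Levenshtein implementation written for illustration; it is not part of quasinet, and the function name `edit_distance` is hypothetical. It treats every position alike, which is exactly the limitation a structure-aware distance is meant to address.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning `a` into `b`."""
    # prev[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


# Two toy sequences (made up) differing by one substitution.
print(edit_distance("ACGTAC", "ACGTTC"))
```

Every mismatch costs the same here regardless of where it occurs in the genome, whereas the distance described above weighs differences using the trained Q-net.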
Installation
To install with pip:
pip install quasinet
NOTE: To reproduce the results of the paper referenced below, install version 0.0.58 instead: pip install quasinet==0.0.58
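Because the paper requires a specific release, it can help to confirm which version is actually installed. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` and the constant `PAPER_VERSION` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

PAPER_VERSION = "0.0.58"  # release named in the note above


def installed_version(package="quasinet"):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("quasinet is not installed")
elif found != PAPER_VERSION:
    print(f"quasinet {found} is installed; the paper used {PAPER_VERSION}")
else:
    print(f"quasinet {found} matches the version used in the paper")
```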
Dependencies
- scikit-learn
- scipy
- numpy
- numba
- pandas
- joblib
- biopython
Usage
from quasinet import qnet
# initialize qnet
myqnet = qnet.Qnet()
# train the qnet
myqnet.fit(X)
# compute qdistance
qdist = qnet.qdistance(seq1, seq2, myqnet, myqnet)
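The snippet above leaves X, seq1, and seq2 undefined. The following is a minimal sketch of how they might be prepared, assuming that X is a 2D array with one row per aligned sequence and one column per position, and that seq1 and seq2 are single rows of the same length. The helper names `sequences_to_matrix` and `train_and_compare` and the toy sequences are hypothetical; the quasinet calls are the ones shown in the usage snippet.

```python
def sequences_to_matrix(seqs):
    """Split equal-length aligned sequences into rows of single characters."""
    seqs = [s.upper() for s in seqs]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    return [list(s) for s in seqs]


def train_and_compare(rows):
    """Fit a Q-net on `rows` and return the q-distance between the
    first two rows. Requires numpy and quasinet; the imports are local
    so that the helper above can be used without them."""
    import numpy as np
    from quasinet import qnet

    X = np.array(rows)    # assumed layout: (n_sequences, n_positions)
    myqnet = qnet.Qnet()  # constructed as in the usage snippet
    myqnet.fit(X)
    return qnet.qdistance(X[0], X[1], myqnet, myqnet)


# Toy aligned sequences, made up for illustration only.
aligned = ["ACGTAC", "ACGTTC", "ACCTAC", "TCGTAC"]
rows = sequences_to_matrix(aligned)
print(len(rows), len(rows[0]))
```

Constructor arguments may differ between releases, and a real Q-net needs far more than four sequences to learn anything meaningful, so treat `train_and_compare` as an outline and defer to the documentation for the installed version.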
Examples
Examples are located here.
Documentation
For more documentation, see here.
Papers
For reference, please check out our paper:
Authors
You can reach the ZED lab at: zed.uchicago.edu
Hashes for quasinet-0.0.79-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | e5fa2969110699065b00d58d741ad4d26271912a627633c6cdc85cc42925611b |
MD5 | 2f59cabf5e81e8fde0a6aa64c44e85e9 |
BLAKE2b-256 | c94ac76d84e55d45695e6e39bb2d410f93b0eaea448fe858a3d40bcac2659210 |
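A downloaded wheel can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard library; the function names `sha256_of_file` and `wheel_matches` are hypothetical, and the file path is supplied on the command line.

```python
import hashlib
import sys

# SHA256 digest listed for quasinet-0.0.79-py3-none-any.whl
EXPECTED_SHA256 = (
    "e5fa2969110699065b00d58d741ad4d26271912a627633c6cdc85cc42925611b"
)


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def wheel_matches(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path to the downloaded wheel as the first argument.
    print("OK" if wheel_matches(sys.argv[1]) else "MISMATCH")
```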