Perform k-mer abundance analysis in DNA sequences
Project description
every-motif-ever
every-motif-ever (eme) is a Python package for k-mer abundance analysis in DNA sequences. eme is designed for fast, efficient analysis of short k-mers (tested with k-mers up to length 10).
While eme can be used for general-purpose k-mer analysis, it was developed to analyse data from Systematic Evolution of Ligands by EXponential enrichment coupled with High-Throughput sequencing (HT-SELEX) in a Pythonic way. By default, for every k-mer, eme quantifies the fraction of reads containing that k-mer in a non-redundant manner. After quantification, basic position frequency matrices (PFMs) are generated for the top 50 k-mers. To generate a different number of PFMs, set the top keyword argument to the desired number.
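The two quantities described above can be illustrated with a minimal, standard-library sketch. It is not eme's implementation: the function names `kmer_read_fractions` and `position_frequency_matrix` are hypothetical, "non-redundant" is assumed here to mean that a k-mer is counted at most once per read, and the PFM is the generic definition (base counts per position over equal-length sequences) rather than the specific way eme selects sequences for each top k-mer.

```python
from collections import Counter


def kmer_read_fractions(reads, k=5):
    """Return {kmer: fraction of reads that contain it at least once}."""
    containing = Counter()
    for read in reads:
        read = read.upper()
        # A set keeps each k-mer once per read, so repeats inside a single
        # read do not inflate the count (assumed meaning of "non-redundant").
        kmers = {read[i:i + k] for i in range(len(read) - k + 1)}
        containing.update(kmers)
    total = len(reads)
    return {kmer: n / total for kmer, n in containing.items()}


def position_frequency_matrix(aligned, alphabet="ACGT"):
    """Count each base at each position of equal-length sequences."""
    length = len(aligned[0])
    pfm = {base: [0] * length for base in alphabet}
    for seq in aligned:
        for pos, base in enumerate(seq.upper()):
            if base in pfm:  # skip ambiguous bases such as N
                pfm[base][pos] += 1
    return pfm


# Toy data, made up for illustration only.
reads = ["ACGTACGT", "TTACGTTT", "GGGGGGGG"]
fractions = kmer_read_fractions(reads, k=4)
# "ACGT" occurs twice in the first read but is counted once for that read.

pfm = position_frequency_matrix(["ACGT", "ACGA", "TCGT"])
```

Counting per read rather than per occurrence keeps a k-mer that repeats within one read from dominating the ranking, which seems to be the intent of quantifying the fraction of reads.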
Installation
pip install https://github.com/kashyapchhatbar/every-motif-ever/archive/refs/tags/v0.1.tar.gz
Usage
Basic Usage
from eme.eme import kmer_fraction_from_file as kf
# By default, the k-mer size is k=5 and
# the number of PFMs generated is top=50
counts, fraction, pfm_models = kf("data/random.fa.gz")
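The input in the call above appears, from its extension, to be a gzip-compressed FASTA file. As a rough sketch of what such an input contains, the following standard-library code reads one sequence per record. The helper name `read_fasta_sequences` and the demo file are hypothetical, and eme's own reader may behave differently.

```python
import gzip
import os
import tempfile


def read_fasta_sequences(path):
    """Yield one sequence string per FASTA record in a gzipped file."""
    chunks = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A header line starts a new record; emit the previous one.
                if chunks:
                    yield "".join(chunks)
                chunks = []
            elif line:
                # Sequence lines may be wrapped, so collect and join them.
                chunks.append(line)
    if chunks:
        yield "".join(chunks)


# Write a tiny demo file (made-up data) and read it back.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.fa.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write(">read1\nACGTAC\nGT\n>read2\nTTACGTTT\n")
    sequences = list(read_fasta_sequences(demo_path))
```

Each yielded string corresponds to one read, which is the unit over which the per-k-mer fractions are computed.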
Tutorial for HT-SELEX analysis
Jupyter notebooks detailing the usage of eme for HT-SELEX analysis are hosted in a separate repository.
Built Distribution
Hashes for eme_selex-0.1-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 837e19e191e58efc551a6016f36a2928aecd7ae69ca4b93e58b1d6d4f39c0922
MD5 | e36e85965fa16f586ca2bf1484ce3e37
BLAKE2b-256 | 86f62d886854ba406a51580e8905bdbd3ea487491d7f79f105bc7150161f4d8b