
Perform k-mer abundance analysis in DNA sequences

Project description

every-motif-ever

every-motif-ever (eme) is a Python package for k-mer abundance analysis of DNA sequences. It was developed for fast, efficient analysis of short k-mers (tested with k-mers up to length 10).

While eme can be used for general-purpose k-mer analysis, the motivation for developing it was to analyse data from Systematic Evolution of Ligands by EXponential enrichment coupled with High-Throughput sequencing (HT-SELEX) in a Pythonic way. By default, for every k-mer, eme quantifies the fraction of reads containing that k-mer in a non-redundant manner. After quantification, a basic position frequency matrix (PFM) is generated for each of the top 50 k-mers. To generate more or fewer PFMs, set the top keyword argument to the desired number.
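As a rough illustration of the quantification described above, the following sketch computes, for each k-mer, the fraction of reads that contain it at least once, and then builds a simple PFM from a set of weighted k-mers. It is not eme's implementation. The function names kmer_read_fractions and position_frequency_matrix are hypothetical, and the sketch assumes that "non-redundant" means each read is counted at most once per k-mer and that the k-mers passed to the PFM step are already aligned and of equal length.

```python
from collections import Counter


def kmer_read_fractions(reads, k=5):
    """Return {k-mer: fraction of reads containing it at least once}.

    Hypothetical helper, not part of eme.
    """
    containing = Counter()
    n_reads = 0
    for read in reads:
        n_reads += 1
        read = read.upper()
        # A set keeps each k-mer once per read, so repeats inside a single
        # read do not inflate the count (assumed meaning of "non-redundant").
        seen = {read[i:i + k] for i in range(len(read) - k + 1)}
        # Skip k-mers with ambiguous bases such as N.
        containing.update(kmer for kmer in seen if set(kmer) <= set("ACGT"))
    if n_reads == 0:
        return {}
    return {kmer: n / n_reads for kmer, n in containing.items()}


def position_frequency_matrix(weighted_kmers):
    """Build a PFM (one row per base) from {k-mer: weight}.

    Assumes all k-mers have the same length and are already aligned.
    """
    if not weighted_kmers:
        raise ValueError("at least one k-mer is required")
    k = len(next(iter(weighted_kmers)))
    pfm = {base: [0.0] * k for base in "ACGT"}
    for kmer, weight in weighted_kmers.items():
        for pos, base in enumerate(kmer):
            pfm[base][pos] += weight
    return pfm


reads = ["ACGTACGT", "TTACGTT", "GGGGGGG"]
fractions = kmer_read_fractions(reads, k=4)
pfm = position_frequency_matrix({"ACGT": 2, "ACGA": 1})
```

In this toy input, ACGT occurs twice in the first read but that read is counted only once, which is the behaviour the set-based step is meant to show.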

Installation

pip install https://github.com/kashyapchhatbar/every-motif-ever/archive/refs/tags/v0.1.tar.gz

Usage

Basic Usage

from eme.eme import kmer_fraction_from_file as kf

# Defaults: k-mer size k=5 and number of PFMs top=50
counts, fraction, pfm_models = kf("data/random.fa.gz")
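
To try the call above without the repository's data/random.fa.gz, a small input file can be generated with the standard library. The sketch below assumes the input is a gzip-compressed FASTA file with one record per read, which is inferred only from the .fa.gz file name. The helpers write_random_fasta and read_fasta_sequences are hypothetical and are not part of eme.

```python
import gzip
import random


def write_random_fasta(path, n_reads=100, read_length=20, seed=0):
    """Write n_reads random DNA sequences to a gzipped FASTA file."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    with gzip.open(path, "wt") as handle:
        for i in range(n_reads):
            seq = "".join(rng.choice("ACGT") for _ in range(read_length))
            handle.write(f">read_{i}\n{seq}\n")


def read_fasta_sequences(path):
    """Yield the sequences from a gzipped FASTA file, one per record."""
    with gzip.open(path, "rt") as handle:
        chunks = []
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if chunks:
                    yield "".join(chunks)
                chunks = []
            elif line:
                chunks.append(line)
        if chunks:
            yield "".join(chunks)
```

A file written with write_random_fasta("example.fa.gz") could then be passed to kf in place of data/random.fa.gz, together with the k and top keyword arguments mentioned above.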

Tutorial for HT-SELEX analysis

Jupyter notebooks detailing the usage of eme for HT-SELEX analysis are hosted in a separate repository.

