
Tool to calculate a k-mer pattern partition from position specific k-mer counts.

Project description

kmerPaPa

Tool to calculate a "k-mer pattern partition" from position specific k-mer counts. This can for instance be used to train a mutation rate model.

Requirements

kmerPaPa requires Python 3.7 or above.

Installation

kmerPaPa can be installed from PyPI using pip:

pip install kmerpapa

Usage

To train a mutation rate model, the input data should specify the number of times each k-mer is observed mutated and unmutated. One option is to provide one file with the mutated k-mer counts (positive) and one file with the k-mer counts for the whole genome (background). We can then run kmerpapa like this:

kmerpapa --positive test_data/mutated_5mers.txt --background test_data/background_5mers.txt --penalty_values 3 5 7
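
To make the relationship between the two inputs concrete, here is a minimal Python sketch of how unmutated counts could be derived from positive and background counts. It assumes the background count for a k-mer includes its mutated occurrences. The function name `unmutated_counts` and the example counts are hypothetical, and this is not taken from kmerPaPa's own code or file format.

```python
def unmutated_counts(positive, background):
    """Derive unmutated counts as background minus positive for each k-mer.

    Assumption (not confirmed by the kmerPaPa documentation): the background
    count of a k-mer includes the occurrences that were observed mutated.
    """
    result = {}
    for kmer, total in background.items():
        mutated = positive.get(kmer, 0)
        if mutated > total:
            raise ValueError(f"{kmer}: more mutated than background occurrences")
        result[kmer] = total - mutated
    return result


# Made-up illustrative counts, not real data.
positive = {"ACGTA": 3, "TTCGA": 1}
background = {"ACGTA": 100, "TTCGA": 50, "GGGCC": 20}

unmutated = unmutated_counts(positive, background)
print(unmutated)
```

A k-mer that appears only in the background, such as `GGGCC` here, is treated as never observed mutated.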

The kmerpapa command above first uses cross validation to find the best penalty value among 3, 5 and 7. It then finds the optimal k-mer pattern partition using that penalty value. If both a list of penalty values and a list of pseudo-counts are specified, all combinations of values are tested during cross validation:

kmerpapa --positive test_data/mutated_5mers.txt --background test_data/background_5mers.txt --penalty_values 3 5 6 --pseudo_counts 0.5 1 10
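
As an illustration of what "all combinations" means for the command above, the following Python sketch enumerates the (penalty, pseudo-count) pairs implied by the two lists. It only shows the size and shape of the search grid. The variable names are illustrative, and the order in which kmerpapa actually evaluates the combinations is not specified here.

```python
from itertools import product

# Values taken from the example command above.
penalty_values = [3, 5, 6]
pseudo_counts = [0.5, 1, 10]

# Cartesian product: every penalty value paired with every pseudo-count.
grid = list(product(penalty_values, pseudo_counts))

for penalty, pseudo_count in grid:
    print(f"penalty={penalty}, pseudo_count={pseudo_count}")

print(f"{len(grid)} combinations in total")
```

With three penalty values and three pseudo-counts, cross validation covers 3 × 3 = 9 combinations, so each value added to either list increases the number of combinations to evaluate.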

Download files

Source Distribution

kmerpapa-0.2.0.tar.gz (17.4 kB)

Uploaded Source

Built Distribution

kmerpapa-0.2.0-py3-none-any.whl (21.7 kB)

Uploaded Python 3
