kmer_counter

Count kmers in regions or at SNVs or at indel breakpoints.

Requirements

kmer_counter requires Python 3.7 or above.

Installation

With pip:

pip install kmer_counter

With pipx:

pipx install kmer_counter

Usage

Counting k-mers at SNVs

To count the 3-mers at SNVs do:

kmer_counter snv {genome}.2bit {snv_file}

Here {snv_file} should be a VCF-like text file whose first four columns are Chromosome, Position, Ref_Allele and Alt_Allele. For example:

chr1  1000000  A G
chr1  1000200  G C
chr1  1000300  A T
chr1  1000500  C G

Comment or header lines starting with "#" are allowed and will be ignored. Any additional columns are also allowed and ignored, so a VCF file is also a valid input file. The Ref_Allele column should match the reference genome provided by the 2bit file.

2bit files can be downloaded from: https://hgdownload.cse.ucsc.edu/goldenpath/{genome}/bigZips/{genome}.2bit where {genome} is a valid UCSC genome assembly name (e.g. "hg38").
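The input rules above can be illustrated with a small Python sketch. This is not kmer_counter's own code: the function names (parse_snvs, count_snv_3mers), the in-memory genome dictionary standing in for a 2bit file, the 1-based positions and the (context, alt) count keys are all assumptions made for illustration, and only the four-column layout shown above is handled.

```python
from collections import Counter


def parse_snvs(lines):
    """Yield (chrom, pos, ref, alt) from VCF-like lines.

    Lines starting with '#' and blank lines are skipped; columns after
    the fourth are ignored. Hypothetical helper, not part of kmer_counter.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, ref, alt = line.split()[:4]
        yield chrom, int(pos), ref, alt


def count_snv_3mers(snv_lines, genome):
    """Count the reference 3-mer around each SNV, keyed by (3-mer, alt).

    `genome` is a plain dict {chromosome: sequence} used here instead of
    a 2bit file. Positions are assumed to be 1-based.
    """
    counts = Counter()
    for chrom, pos, ref, alt in parse_snvs(snv_lines):
        seq = genome[chrom]
        i = pos - 1  # convert the assumed 1-based position to a 0-based index
        if i < 1 or i + 2 > len(seq):
            continue  # no full 3-mer available at the sequence ends
        if seq[i] != ref:
            raise ValueError(f"Ref_Allele mismatch at {chrom}:{pos}")
        counts[(seq[i - 1 : i + 2], alt)] += 1
    return counts


toy_genome = {"chr1": "ACGTA"}
toy_snvs = ["#comment line", "chr1 2 C T", "chr1\t4\tT\tG\textra_column"]
print(count_snv_3mers(toy_snvs, toy_genome))
```

The mismatch check mirrors the requirement that Ref_Allele agrees with the reference genome; how the tool itself reports such mismatches is not stated here.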

Counting k-mers in genomic regions

To count all 5-mers in the regions of a BED file called {regions}.bed do:

kmer_counter background --bed {regions}.bed --radius 2 {genome}.2bit

By default, all k-mers where the middle base is not A or C are reverse complemented before being counted. This behaviour can be changed using the "--reverse_complement_method" option. If we instead want to count 4-mers, we can use the "--before_after" option to give the number of bases before and after the central position:

kmer_counter background --bed {regions}.bed --before_after 2 1 {genome}.2bit

When this option is used, the default is to count all k-mers as they are, without reverse complementing any of them.
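The two counting modes described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the tool's implementation: count_kmers and reverse_complement are hypothetical helpers, the sequence is a plain string of A, C, G and T rather than regions read from a BED and 2bit file, and a radius of 2 is modelled as 2 bases before and 2 bases after the central position.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA string of A, C, G, T."""
    return kmer.translate(_COMPLEMENT)[::-1]


def count_kmers(seq, before, after, canonical):
    """Count k-mers with `before` bases left and `after` bases right
    of each central position (k = before + 1 + after).

    If `canonical` is true, k-mers whose central base is not A or C
    are reverse complemented before being counted.
    """
    counts = Counter()
    for i in range(before, len(seq) - after):
        kmer = seq[i - before : i + after + 1]
        if canonical and kmer[before] not in "AC":
            kmer = reverse_complement(kmer)
        counts[kmer] += 1
    return counts


# 5-mers (2 bases on each side), reverse complemented when the middle is G or T
print(count_kmers("ACGTA", before=2, after=2, canonical=True))
# 4-mers (2 before, 1 after), counted as they are
print(count_kmers("ACGTA", before=2, after=1, canonical=False))
```

Note that reverse complementing an asymmetric k-mer moves the central base to a different offset, which may be one reason to count such k-mers without reverse complementing them.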

Counting k-mers at indels

To count one of the possible insertion breakpoint 4-mers for each insertion in a VCF-like file with variants do:

kmer_counter indel -r 2 --sample {genome}.2bit {variants} ins

And for deletion start breakpoints:

kmer_counter indel -r 2 --sample {genome}.2bit {variants} del_start

This will produce 2 counts for each deletion: one for the k-mer at the start breakpoint and one for the reverse complement of the k-mer at the end breakpoint.
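As a rough illustration of the two counts per deletion, the Python sketch below extracts both breakpoint k-mers from a toy sequence. The coordinate conventions are assumptions and may differ from what kmer_counter does: the deletion is given by a 0-based index of its first deleted base plus a length, and a breakpoint k-mer is taken as `radius` bases on each side of the breakpoint (so a radius of 2 gives a 4-mer). The function name deletion_breakpoint_kmers is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA string of A, C, G, T."""
    return kmer.translate(_COMPLEMENT)[::-1]


def deletion_breakpoint_kmers(seq, start, length, radius):
    """Return (start k-mer, reverse complement of end k-mer) for a deletion.

    Assumed conventions (not taken from kmer_counter): `start` is the
    0-based index of the first deleted base, and each k-mer spans
    `radius` bases on either side of its breakpoint.
    """
    end = start + length  # index of the first base after the deletion
    start_kmer = seq[start - radius : start + radius]
    end_kmer = seq[end - radius : end + radius]
    return start_kmer, reverse_complement(end_kmer)


# Deleting "CGG" (indices 3..5) from a toy reference sequence
print(deletion_breakpoint_kmers("TTACGGATCC", start=3, length=3, radius=2))
```

Reverse complementing the end k-mer means both breakpoints are read in the same orientation relative to the deleted segment, which is one plausible motivation for the behaviour described above.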
