kmertools is a k-mer based feature extraction tool designed to support metagenomics and other bioinformatics analytics.

kmertools: DNA Vectorisation Tool

License: GPL v3

$$\   $$\                                   $$$$$$$$\                     $$\           
$$ | $$  |                                  \__$$  __|                    $$ |          
$$ |$$  / $$$$$$\$$$$\   $$$$$$\   $$$$$$\     $$ |    $$$$$$\   $$$$$$\  $$ | $$$$$$$\ 
$$$$$  /  $$  _$$  _$$\ $$  __$$\ $$  __$$\    $$ |   $$  __$$\ $$  __$$\ $$ |$$  _____|
$$  $$<   $$ / $$ / $$ |$$$$$$$$ |$$ |  \__|   $$ |   $$ /  $$ |$$ /  $$ |$$ |\$$$$$$\  
$$ |\$$\  $$ | $$ | $$ |$$   ____|$$ |         $$ |   $$ |  $$ |$$ |  $$ |$$ | \____$$\ 
$$ | \$$\ $$ | $$ | $$ |\$$$$$$$\ $$ |         $$ |   \$$$$$$  |\$$$$$$  |$$ |$$$$$$$  |
\__|  \__|\__| \__| \__| \_______|\__|         \__|    \______/  \______/ \__|\_______/ 

Overview

kmertools is a k-mer based feature extraction tool designed to support metagenomics and other bioinformatics analytics. This tool leverages k-mer analysis to vectorize DNA sequences, facilitating the use of these vectors in various AI/ML applications.

Features

  • Oligonucleotide Frequency Vectors: Generate frequency vectors for oligonucleotides.
  • Minimiser Binning: Efficiently bin sequences using minimisers to reduce data complexity.
  • Chaos Game Representation (CGR): Compute CGR vectors for DNA sequences based on k-mers or whole sequence transformation.
  • Coverage Histograms: Create coverage histograms to analyze the depth of sequencing reads.
  • Python Bindings: Use kmertools functionality from Python via import pykmertools as kt.
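
To make two of the features above concrete, here is a minimal pure-Python sketch of an oligonucleotide (k-mer) frequency vector and of Chaos Game Representation coordinates. It does not use pykmertools and is not kmertools' implementation. The function names, the lexicographic ACGT ordering, the single-strand counting, and the CGR corner assignment are all assumptions chosen for illustration.

```python
from itertools import product


def kmer_frequency_vector(seq: str, k: int = 2) -> list[float]:
    """Normalised k-mer frequencies in lexicographic ACGT order (illustrative only)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:  # skip windows containing N or other symbols
            counts[pos] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]


def cgr_points(seq: str) -> list[tuple[float, float]]:
    """Chaos Game Representation: move halfway towards the corner of each base.

    The corner assignment below is one common convention, assumed here.
    """
    corners = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in corners:
            continue  # ignore ambiguous bases
        cx, cy = corners[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    print(kmer_frequency_vector("ACGTACGT", k=2))
    print(cgr_points("ACGT"))
```

A vector for k-mers of length k has 4^k entries in this sketch, so k = 2 gives 16 values per sequence. Tools in this space often merge a k-mer with its reverse complement, which this sketch does not do.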

Installation

Option 1: from bioconda (recommended)

You can install kmertools from Bioconda at https://anaconda.org/bioconda/kmertools. Make sure you have conda installed.

# create conda environment and install kmertools
conda create -n kmertools -c bioconda kmertools

# activate environment
conda activate kmertools

Option 2: from PyPI

You can install kmertools from PyPI at https://pypi.org/project/pykmertools/.

pip install pykmertools
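
After the pip install, one way to confirm that the package is visible to your interpreter is sketched below. It uses only the Python standard library and never calls into pykmertools itself. The helper names is_installed and installed_version are hypothetical.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("pykmertools importable:", is_installed("pykmertools"))
    print("pykmertools version:", installed_version("pykmertools"))
```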

Option 3: from sources

You can install kmertools directly from the source by cloning the repository and using Rust's package manager cargo.

git clone https://github.com/anuradhawick/kmertools.git
cd kmertools
cargo build --release

Now add the binary to your PATH. To make the change permanent, modify ~/.bashrc or ~/.zshrc.

# to add to current terminal
export PATH="$PATH:$(pwd)/target/release/"

# to save to ~/.bashrc
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.bashrc
source ~/.bashrc

# to save to ~/.zshrc for Mac
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.zshrc
source ~/.zshrc

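To build the Python bindings, run maturin from either the pip or the conda directory. Choose one of the two options below.

# option A: build from the pip directory
cd pip
maturin build --release

# option B: build from the conda directory instead
cd conda
maturin build --release

Now move back to the parent directory with cd .. and install the built wheel. Replace <VERSION> with the version you built; the platform tag in the file name (manylinux_2_34_x86_64 here) depends on your system.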

pip install target/wheels/pykmertools-<VERSION>-cp39-abi3-manylinux_2_34_x86_64.whl
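
Because the wheel file name depends on the version and platform, it can help to look the file up instead of typing it. The following is a small sketch using only the standard library. The helper name find_wheels is hypothetical, and the target/wheels directory is taken from the command above.

```python
from pathlib import Path


def find_wheels(wheel_dir: str = "target/wheels", package: str = "pykmertools") -> list[Path]:
    """List wheels for the package in wheel_dir, sorted by file name."""
    directory = Path(wheel_dir)
    if not directory.is_dir():
        return []  # nothing built yet, or run from the wrong directory
    return sorted(directory.glob(f"{package}-*.whl"))


if __name__ == "__main__":
    wheels = find_wheels()
    if not wheels:
        print("No wheel found; run maturin build --release first.")
    for wheel in wheels:
        # Print the install command for each wheel instead of running it.
        print(f"pip install {wheel}")
```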

Test the installation

After setting up, run the following command to print out the kmertools help message.

kmertools --help
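
The same check can be scripted, for example in a pipeline that should fail early when the binary is missing. This is a sketch using only the standard library. The helper name kmertools_help is hypothetical, and the only command-line flag used is --help, as shown above.

```python
import shutil
import subprocess
from typing import Optional


def kmertools_help(binary: str = "kmertools") -> Optional[str]:
    """Return the help text of the binary, or None if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "--help"], capture_output=True, text=True, check=False)
    # Some tools print help to stderr, so fall back to it when stdout is empty.
    return result.stdout or result.stderr


if __name__ == "__main__":
    text = kmertools_help()
    if text is None:
        print("kmertools was not found on PATH; check the installation steps.")
    else:
        print(text)
```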

Help

Please read our comprehensive Wiki.

Authors

  • Anuradha Wickramarachchi
  • Vijini Mallawaarachchi

Citation

If you use kmertools, please cite it as follows.

@software{Wickramarachchi_kmertools_DNA_Vectorisation,
  author = {Wickramarachchi, Anuradha and Mallawaarachchi, Vijini},
  title = {{kmertools: DNA Vectorisation Tool}},
  url = {https://github.com/anuradhawick/kmertools},
  version = {0.1.4}
}

Please refer to the Wiki for citations of relevant algorithms.

Support and contributions

Please get in touch via author websites or GitHub issues. Thanks!

Download files

No source distribution is available for this release. Built distributions for version 0.2.1 (CPython 3.9+, abi3 wheels):

  • pykmertools-0.2.1-cp39-abi3-win_amd64.whl (834.1 kB): Windows x86-64
  • pykmertools-0.2.1-cp39-abi3-musllinux_1_2_x86_64.whl (1.3 MB): musllinux, musl 1.2+, x86-64
  • pykmertools-0.2.1-cp39-abi3-musllinux_1_2_aarch64.whl (1.3 MB): musllinux, musl 1.2+, ARM64
  • pykmertools-0.2.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB): manylinux, glibc 2.17+, x86-64
  • pykmertools-0.2.1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.1 MB): manylinux, glibc 2.17+, ARM64
  • pykmertools-0.2.1-cp39-abi3-macosx_11_0_arm64.whl (1.0 MB): macOS 11.0+, ARM64
  • pykmertools-0.2.1-cp39-abi3-macosx_10_12_x86_64.whl (1.0 MB): macOS 10.12+, x86-64
