kmertools: DNA Vectorisation Tool

License: GPL v3

$$\   $$\                                   $$$$$$$$\                     $$\           
$$ | $$  |                                  \__$$  __|                    $$ |          
$$ |$$  / $$$$$$\$$$$\   $$$$$$\   $$$$$$\     $$ |    $$$$$$\   $$$$$$\  $$ | $$$$$$$\ 
$$$$$  /  $$  _$$  _$$\ $$  __$$\ $$  __$$\    $$ |   $$  __$$\ $$  __$$\ $$ |$$  _____|
$$  $$<   $$ / $$ / $$ |$$$$$$$$ |$$ |  \__|   $$ |   $$ /  $$ |$$ /  $$ |$$ |\$$$$$$\  
$$ |\$$\  $$ | $$ | $$ |$$   ____|$$ |         $$ |   $$ |  $$ |$$ |  $$ |$$ | \____$$\ 
$$ | \$$\ $$ | $$ | $$ |\$$$$$$$\ $$ |         $$ |   \$$$$$$  |\$$$$$$  |$$ |$$$$$$$  |
\__|  \__|\__| \__| \__| \_______|\__|         \__|    \______/  \______/ \__|\_______/ 

Overview

kmertools is a k-mer based feature extraction tool designed to support metagenomics and other bioinformatics analytics. This tool leverages k-mer analysis to vectorize DNA sequences, facilitating the use of these vectors in various AI/ML applications.
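To make the idea of vectorising DNA concrete, here is a minimal pure-Python sketch of a k-mer frequency vector. It does not call kmertools or pykmertools; the function name kmer_frequency_vector is hypothetical, and details such as the k-mer ordering, the handling of ambiguous bases, and the absence of reverse-complement merging are assumptions made for illustration only.

```python
from itertools import product


def kmer_frequency_vector(sequence: str, k: int = 3) -> list[float]:
    """Return normalised k-mer frequencies in lexicographic k-mer order.

    Illustrative helper only; not part of the kmertools API.
    """
    alphabet = "ACGT"
    # Fixed ordering of all 4**k possible k-mers, so every sequence
    # is mapped onto the same vector axes.
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = [0] * len(index)
    seq = sequence.upper()
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        # Windows containing ambiguous bases (for example N) are skipped.
        if kmer in index:
            counts[index[kmer]] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


vector = kmer_frequency_vector("ACGTACGTNACG", k=2)
print(len(vector))  # 16 dimensions for k=2
```

Vectors built this way have a fixed length of 4**k regardless of sequence length, which is what makes them convenient inputs for machine-learning models.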

Features

  • Oligonucleotide Frequency Vectors: Generate frequency vectors for oligonucleotides.
  • Minimiser Binning: Efficiently bin sequences using minimisers to reduce data complexity.
  • Chaos Game Representation (CGR): Compute CGR vectors for DNA sequences based on k-mers or whole sequence transformation.
  • Coverage Histograms: Create coverage histograms to analyze the depth of sequencing reads.
  • Python Binding: Import kmertools functionality in Python with import pykmertools as kt.
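Two of the features above, minimiser binning and Chaos Game Representation, can be sketched in a few lines of plain Python. This is an illustration of the general techniques under stated assumptions, not kmertools' implementation: the helper names are hypothetical, the minimiser is chosen by plain lexicographic order (real tools often use hashing and canonical k-mers), and the CGR corner assignment (A, C, G, T) is one common convention.

```python
from collections import defaultdict


def minimiser(sequence: str, m: int) -> str:
    """Return the lexicographically smallest m-mer of the sequence."""
    if not 0 < m <= len(sequence):
        raise ValueError("m must be between 1 and the sequence length")
    return min(sequence[i:i + m] for i in range(len(sequence) - m + 1))


def bin_by_minimiser(sequences: list[str], m: int) -> dict[str, list[str]]:
    """Group sequences that share the same minimiser into one bin."""
    bins = defaultdict(list)
    for seq in sequences:
        bins[minimiser(seq, m)].append(seq)
    return dict(bins)


def cgr_points(sequence: str) -> list[tuple[float, float]]:
    """Trace a sequence through the unit square, one point per base."""
    # Assumed corner layout; other layouts are equally valid.
    corners = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in sequence.upper():
        if base not in corners:
            continue  # skip ambiguous bases such as N
        cx, cy = corners[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


print(bin_by_minimiser(["GATTACA", "TTACAGG"], 3))
print(cgr_points("AG"))
```

Sequences that overlap tend to share minimisers, which is why binning by minimiser reduces the number of comparisons needed downstream.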

Installation

Option 1: from bioconda (recommended)

You can install kmertools from Bioconda at https://anaconda.org/bioconda/kmertools. Make sure you have conda installed.

# create conda environment and install kmertools
conda create -n kmertools -c bioconda kmertools

# activate environment
conda activate kmertools

Option 2: from PyPI

You can install kmertools from PyPI at https://pypi.org/project/pykmertools/.

pip install pykmertools

Option 3: from sources

You can install kmertools directly from the source by cloning the repository and using Rust's package manager cargo.

git clone https://github.com/anuradhawick/kmertools.git
cd kmertools
cargo build --release

Now add the binary to your PATH (you may modify ~/.bashrc or ~/.zshrc).

# to add to current terminal
export PATH=$PATH:$(pwd)/target/release/

# to save to ~/.bashrc
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.bashrc
source ~/.bashrc

# to save to ~/.zshrc for Mac
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.zshrc
source ~/.zshrc

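To install the Python bindings, build a wheel with maturin from either the pip or the conda directory. Run only one of the two alternatives below, starting from the repository root.

# option A: pip directory
cd pip
maturin build --release

# option B: conda directory
cd conda
maturin build --release

Now move back to the parent directory using cd .. and install the built wheel. The wheel file name depends on the version and the platform you built on; the name below is an example from a Linux x86-64 build.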

pip install target/wheels/pykmertools-<VERSION>-cp39-abi3-manylinux_2_34_x86_64.whl
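Because the wheel file name varies by version and platform, it can help to look up what maturin actually produced instead of typing the name by hand. The sketch below assumes the wheels land in target/wheels under the repository root, as in the command above; the function name find_built_wheels is hypothetical.

```python
from pathlib import Path


def find_built_wheels(repo_root: str = ".") -> list[Path]:
    """List pykmertools wheels under <repo_root>/target/wheels, sorted by name."""
    wheel_dir = Path(repo_root) / "target" / "wheels"
    if not wheel_dir.is_dir():
        return []  # nothing has been built yet
    return sorted(wheel_dir.glob("pykmertools-*.whl"))


for wheel in find_built_wheels():
    # Print a ready-to-run install command for each wheel found.
    print(f"pip install {wheel}")
```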

Test the installation

After setting up, run the following command to print out the kmertools help message.

kmertools --help
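The same check can be scripted. The following sketch only uses the Python standard library: it looks for a kmertools executable on PATH and for an importable pykmertools module, without calling any kmertools function. The name installation_status is hypothetical.

```python
import importlib.util
import shutil


def installation_status() -> dict[str, bool]:
    """Report whether the CLI and the Python bindings can be found."""
    return {
        # True when a kmertools executable is reachable through PATH.
        "cli": shutil.which("kmertools") is not None,
        # True when "import pykmertools" would find the module.
        "python": importlib.util.find_spec("pykmertools") is not None,
    }


print(installation_status())
```

If "cli" is False after building from source, the PATH export step was most likely not applied to the current shell.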

Help

Please read our comprehensive Wiki.

Authors

Citation

If you use kmertools, please cite it as follows.

@software{Wickramarachchi_kmertools_DNA_Vectorisation,
  author = {Wickramarachchi, Anuradha and Mallawaarachchi, Vijini},
  title = {{kmertools: DNA Vectorisation Tool}},
  url = {https://github.com/anuradhawick/kmertools},
  version = {0.1.4}
}

Please refer to the Wiki for citations of relevant algorithms.

Support and contributions

Please get in touch via author websites or GitHub issues. Thanks!

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • pykmertools-0.2.0-cp39-abi3-win_amd64.whl (834.1 kB): CPython 3.9+, Windows x86-64
  • pykmertools-0.2.0-cp39-abi3-musllinux_1_2_x86_64.whl (1.3 MB): CPython 3.9+, musllinux (musl 1.2+), x86-64
  • pykmertools-0.2.0-cp39-abi3-musllinux_1_2_aarch64.whl (1.3 MB): CPython 3.9+, musllinux (musl 1.2+), ARM64
  • pykmertools-0.2.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB): CPython 3.9+, manylinux (glibc 2.17+), x86-64
  • pykmertools-0.2.0-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.1 MB): CPython 3.9+, manylinux (glibc 2.17+), ARM64
  • pykmertools-0.2.0-cp39-abi3-macosx_11_0_arm64.whl (1.0 MB): CPython 3.9+, macOS 11.0+, ARM64
  • pykmertools-0.2.0-cp39-abi3-macosx_10_12_x86_64.whl (1.0 MB): CPython 3.9+, macOS 10.12+, x86-64

File details

File hashes

Hashes for pykmertools-0.2.0-cp39-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 328937df31acde3884d26905e58c471b4c541fe51f3e333a774346db3b53227b
MD5 8823b3e0effcfae3092e4ad4adb92ddb
BLAKE2b-256 b9ba1234d2984998a4079b76c64b38e9a937517014b69f08e9a976bfd01d7f25

Hashes for pykmertools-0.2.0-cp39-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 0d79b727e1edc5be3827225f8ec173eeeccb33917a63a86e3e92aaec361ad026
MD5 a08d21852a8929ed660f5700c7eedc52
BLAKE2b-256 3ed3deca6dd39d6063607e6e6063af50ee833df8a8055616d3ee08522a18d8bc

Hashes for pykmertools-0.2.0-cp39-abi3-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 1292851bab936da1c5d6f5c31801f8bd217c4339395491939f39bb3b453a0c83
MD5 e85405d6810f23ab8d7c55f8ffb504c3
BLAKE2b-256 b99bf24aafa46cef51d48d0792a23abde1eb47f00609a46c1c3109b075f52259

Hashes for pykmertools-0.2.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d22f2aca26f125e6c0f0e7f28fb932ace17dd28725e32cbe303b783482b4a711
MD5 2de047d633b29b2206011f6be1d0e72d
BLAKE2b-256 04bb156ee67cb10d6e88889c1b2452f77d39c6929f98f351f61885b131bf08d7

Hashes for pykmertools-0.2.0-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 6eddc7dd8f967c74b087e531a20f1b75a6df1f2941f81c00000f4915dc5229f2
MD5 22ad47df3167b2b0b87a2e052eee2ba5
BLAKE2b-256 226f19650cc8b0015b76ee2631b98a23e81c806856de254fcd6676c269cc1ed5

Hashes for pykmertools-0.2.0-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 0502e1041cf7a9fd0678d70657ba67d5b4db2af378999fd3c533c6fb3740762b
MD5 8eb3553056d3a0caefc6bca41f2b7e0e
BLAKE2b-256 a425334e65d6b8226b1c86ec838a22977c5f3995cd833e24665ffd82bb80a65d

Hashes for pykmertools-0.2.0-cp39-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 72b091561a27d6d6bb60cb848d5b31db4289026ece35d367f2194a9793f37f2f
MD5 d1fc57cda0f3068882affd42bc3fca44
BLAKE2b-256 df5eb6631dd5360a884ad8e34ed2f4a5bc15cf348ac73d9780bde0ab7d1f3a7e
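
The digests listed above can be used to check a downloaded wheel before installing it. The sketch below relies only on Python's hashlib; the helper names digests and file_matches are hypothetical. BLAKE2b-256 is computed as BLAKE2b with a 32-byte digest.

```python
import hashlib
from pathlib import Path


def digests(data: bytes) -> dict[str, str]:
    """Compute the three digest types listed for each file."""
    return {
        "SHA256": hashlib.sha256(data).hexdigest(),
        "MD5": hashlib.md5(data).hexdigest(),
        # BLAKE2b-256 is BLAKE2b truncated to a 32-byte (256-bit) digest.
        "BLAKE2b-256": hashlib.blake2b(data, digest_size=32).hexdigest(),
    }


def file_matches(path: str, algorithm: str, expected: str) -> bool:
    """Return True when the file's digest equals the expected hex string."""
    actual = digests(Path(path).read_bytes())[algorithm]
    return actual == expected.strip().lower()


# Example call, assuming the wheel was downloaded to the current directory:
# file_matches("pykmertools-0.2.0-cp39-abi3-win_amd64.whl", "SHA256", "<digest from the list above>")
```

Comparing the SHA256 value is usually sufficient; MD5 is listed for completeness and should not be relied on for security.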
