kmertools: DNA Vectorisation Tool

License: GPL v3

$$\   $$\                                   $$$$$$$$\                     $$\           
$$ | $$  |                                  \__$$  __|                    $$ |          
$$ |$$  / $$$$$$\$$$$\   $$$$$$\   $$$$$$\     $$ |    $$$$$$\   $$$$$$\  $$ | $$$$$$$\ 
$$$$$  /  $$  _$$  _$$\ $$  __$$\ $$  __$$\    $$ |   $$  __$$\ $$  __$$\ $$ |$$  _____|
$$  $$<   $$ / $$ / $$ |$$$$$$$$ |$$ |  \__|   $$ |   $$ /  $$ |$$ /  $$ |$$ |\$$$$$$\  
$$ |\$$\  $$ | $$ | $$ |$$   ____|$$ |         $$ |   $$ |  $$ |$$ |  $$ |$$ | \____$$\ 
$$ | \$$\ $$ | $$ | $$ |\$$$$$$$\ $$ |         $$ |   \$$$$$$  |\$$$$$$  |$$ |$$$$$$$  |
\__|  \__|\__| \__| \__| \_______|\__|         \__|    \______/  \______/ \__|\_______/ 

Overview

kmertools is a k-mer based feature extraction tool designed to support metagenomics and other bioinformatics analytics. This tool leverages k-mer analysis to vectorize DNA sequences, facilitating the use of these vectors in various AI/ML applications.
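To make "vectorizing" a DNA sequence concrete, here is a minimal pure-Python sketch of a k-mer frequency vector. It is an illustration only and does not use or reproduce the kmertools implementation: the function name `kmer_frequency_vector`, the lexicographic ACGT ordering, and the choice to skip k-mers containing non-ACGT symbols are all assumptions made for this example. In particular, it does not merge reverse complements, which real tools often do.

```python
from itertools import product


def kmer_frequency_vector(seq: str, k: int = 2) -> list[float]:
    """Return normalised counts of all 4**k k-mers in lexicographic ACGT order."""
    alphabet = "ACGT"
    # Fix the vector layout up front: AA, AC, AG, AT, CA, ...
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = [0] * len(index)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:  # skip k-mers containing N or other symbols
            counts[pos] += 1
    total = sum(counts)
    # Normalise so that sequences of different lengths are comparable
    return [c / total for c in counts] if total else [0.0] * len(counts)


vec = kmer_frequency_vector("ACGTACGT", k=2)
print(len(vec))  # one entry per possible 2-mer
```

A fixed-length vector like this can be fed directly to clustering or classification code regardless of the original sequence length.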

Features

  • Oligonucleotide Frequency Vectors: Generate frequency vectors for oligonucleotides.
  • Minimiser Binning: Efficiently bin sequences using minimisers to reduce data complexity.
  • Chaos Game Representation (CGR): Compute CGR vectors for DNA sequences based on k-mers or whole sequence transformation.
  • Coverage Histograms: Create coverage histograms to analyze the depth of sequencing reads.
  • Python Bindings: Use kmertools functionality from Python via the pykmertools package (import pykmertools as kt).
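The minimiser and CGR features listed above can be illustrated with a short pure-Python sketch. This is not the kmertools code or API: the function names are hypothetical, the minimiser is taken as the lexicographically smallest m-mer (without hashing or reverse complements), and the CGR corner layout (A bottom-left, C top-left, G top-right, T bottom-right) is one common convention assumed here.

```python
def minimiser(kmer: str, m: int) -> str:
    """Lexicographically smallest m-mer inside a k-mer."""
    return min(kmer[i:i + m] for i in range(len(kmer) - m + 1))


def bin_by_minimiser(seq: str, k: int, m: int) -> dict[str, list[str]]:
    """Group the k-mers of seq into bins keyed by their minimiser."""
    bins: dict[str, list[str]] = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        bins.setdefault(minimiser(kmer, m), []).append(kmer)
    return bins


# Assumed corner layout of the unit square for the chaos game
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq: str) -> list[tuple[float, float]]:
    """Whole-sequence CGR: move halfway towards the corner of each base."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # ignore ambiguous bases such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


print(bin_by_minimiser("GATTACA", k=5, m=3))
print(cgr_points("AG"))
```

Overlapping k-mers tend to share a minimiser, which is why binning by minimiser keeps neighbouring k-mers together while reducing the number of distinct keys.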

Installation

Option 1: from Bioconda (recommended)

You can install kmertools from Bioconda at https://anaconda.org/bioconda/kmertools. Make sure you have conda installed.

# create conda environment and install kmertools
conda create -n kmertools -c bioconda kmertools

# activate environment
conda activate kmertools

Option 2: from PyPI

You can install kmertools from PyPI at https://pypi.org/project/pykmertools/.

pip install pykmertools

Option 3: from sources

You can build kmertools from source by cloning the repository and using Rust's package manager, cargo.

git clone https://github.com/anuradhawick/kmertools.git
cd kmertools
cargo build --release

Now add the binary to your PATH (to make the change permanent, modify ~/.bashrc or ~/.zshrc).

# to add to the current terminal session
export PATH="$PATH:$(pwd)/target/release/"

# to save to ~/.bashrc
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.bashrc
source ~/.bashrc

# to save to ~/.zshrc (the default shell on macOS)
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.zshrc
source ~/.zshrc

Test the installation

After setting up, run the following command to print out the kmertools help message.

kmertools --help
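If you prefer to run the same check from a script, the following sketch uses only the Python standard library. The helper name `cli_help` is hypothetical; it assumes nothing about kmertools beyond the `--help` flag shown above.

```python
import shutil
import subprocess
from typing import Optional


def cli_help(binary: str = "kmertools") -> Optional[str]:
    """Return the --help text of the binary, or None if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    # check=False: report whatever the tool prints instead of raising
    result = subprocess.run(
        [path, "--help"], capture_output=True, text=True, check=False
    )
    return result.stdout


help_text = cli_help()
print("kmertools not found on PATH" if help_text is None else help_text)
```

A result of None usually means the PATH change has not been picked up yet; open a new terminal or re-source your shell configuration.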

Help

Please read our comprehensive Wiki.

Authors

Citation

If you use kmertools, please cite it as follows.

@software{Wickramarachchi_kmertools_DNA_Vectorisation,
  author = {Wickramarachchi, Anuradha and Mallawaarachchi, Vijini},
  title = {{kmertools: DNA Vectorisation Tool}},
  url = {https://github.com/anuradhawick/kmertools},
  version = {0.1.0}
}

Please refer to the Wiki for citations of relevant algorithms.

Support and contributions

Please get in touch via author websites or GitHub issues. Thanks!


