kmertools: DNA Vectorisation Tool

$$\   $$\                                   $$$$$$$$\                     $$\           
$$ | $$  |                                  \__$$  __|                    $$ |          
$$ |$$  / $$$$$$\$$$$\   $$$$$$\   $$$$$$\     $$ |    $$$$$$\   $$$$$$\  $$ | $$$$$$$\ 
$$$$$  /  $$  _$$  _$$\ $$  __$$\ $$  __$$\    $$ |   $$  __$$\ $$  __$$\ $$ |$$  _____|
$$  $$<   $$ / $$ / $$ |$$$$$$$$ |$$ |  \__|   $$ |   $$ /  $$ |$$ /  $$ |$$ |\$$$$$$\  
$$ |\$$\  $$ | $$ | $$ |$$   ____|$$ |         $$ |   $$ |  $$ |$$ |  $$ |$$ | \____$$\ 
$$ | \$$\ $$ | $$ | $$ |\$$$$$$$\ $$ |         $$ |   \$$$$$$  |\$$$$$$  |$$ |$$$$$$$  |
\__|  \__|\__| \__| \__| \_______|\__|         \__|    \______/  \______/ \__|\_______/ 

Overview

kmertools is a k-mer based feature extraction tool designed to support metagenomics and other bioinformatics analytics. It uses k-mer analysis to vectorise DNA sequences so that the resulting vectors can be used in AI/ML applications.

NEW: kmertools is now available on bioconda at https://anaconda.org/bioconda/kmertools.

Features

  • Oligonucleotide Frequency Vectors: Generate frequency vectors for oligonucleotides.
  • Minimiser Binning: Efficiently bin sequences using minimisers to reduce data complexity.
  • Chaos Game Representation (CGR): Compute CGR vectors for DNA sequences based on k-mers or whole sequence transformation.
  • Coverage Histograms: Create coverage histograms to analyze the depth of sequencing reads.
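
To make the first two features concrete, here is a minimal Python sketch of what an oligonucleotide frequency vector and a minimiser are. It is an illustration only, not the kmertools implementation: the function names are hypothetical, the sketch does not merge reverse complements, and it picks the minimiser by plain lexicographic order, whereas a real tool may use a different ordering or hashing scheme.

```python
from collections import Counter
from itertools import product


def kmer_frequency_vector(seq: str, k: int) -> list[float]:
    """Normalised counts of all 4**k k-mers, in lexicographic ACGT order.

    Hypothetical helper for illustration; k-mers containing characters
    other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


def minimiser(seq: str, k: int) -> str:
    """Lexicographically smallest k-mer of seq (a simplified minimiser)."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    return min(seq[i:i + k] for i in range(len(seq) - k + 1))


vec = kmer_frequency_vector("ACGTACGT", 2)
print(len(vec))                  # 16 entries, one per possible 2-mer
print(minimiser("TTGACCA", 3))   # ACC
```

Sequences that share a minimiser can be placed in the same bin, which is the idea behind minimiser binning.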

Installation

Option 1: from bioconda (recommended)

You can install kmertools from Bioconda at https://anaconda.org/bioconda/kmertools. Make sure you have conda installed.

# create conda environment and install kmertools
conda create -n kmertools -c bioconda kmertools

# activate environment
conda activate kmertools

Option 2: from sources

You can install kmertools directly from the source by cloning the repository and using Rust's package manager cargo.

git clone https://github.com/anuradhawick/kmertools.git
cd kmertools
cargo build --release

Now add the binary to your PATH (you may modify ~/.bashrc or ~/.zshrc).

# to add to current terminal
export PATH="$PATH:$(pwd)/target/release/"

# to save to ~/.bashrc
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.bashrc
source ~/.bashrc

# to save to ~/.zshrc (default shell on macOS)
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.zshrc
source ~/.zshrc

Test the installation

After setting up, run the following command to print out the kmertools help message.

kmertools --help
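
If you prefer to run the same check from a script, the following Python sketch looks for the binary on PATH and then runs the `--help` command shown above. The `help_text` function is a hypothetical helper written for this example; it assumes nothing about kmertools beyond the `--help` flag.

```python
import shutil
import subprocess
from typing import Optional


def help_text(tool: str = "kmertools") -> Optional[str]:
    """Return the output of `<tool> --help`, or None if the tool is not on PATH."""
    path = shutil.which(tool)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--help"], capture_output=True, text=True, check=False
    )
    # Some tools print help to stderr, so fall back to it if stdout is empty.
    return result.stdout or result.stderr


text = help_text()
print("kmertools not found on PATH" if text is None else text)
```

A `None` result usually means the PATH export from the previous step has not been applied in the current shell.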

Help

Please read our comprehensive Wiki.

Authors

Anuradha Wickramarachchi and Vijini Mallawaarachchi

Citation

If you use kmertools, please cite it as follows.

@software{Wickramarachchi_kmertools_DNA_Vectorisation,
  author = {Wickramarachchi, Anuradha and Mallawaarachchi, Vijini},
  title = {{kmertools: DNA Vectorisation Tool}},
  url = {https://github.com/anuradhawick/kmertools},
  version = {0.1.0}
}

Please refer to the Wiki for citations of relevant algorithms.

Support and contributions

Please get in touch via author websites or GitHub issues. Thanks!

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

All wheels target CPython 3.9+ (abi3).

  • pykmertools-0.1.2-cp39-abi3-win_amd64.whl (196.2 kB): Windows x86-64
  • pykmertools-0.1.2-cp39-abi3-musllinux_1_2_x86_64.whl (516.8 kB): musllinux, musl 1.2+, x86-64
  • pykmertools-0.1.2-cp39-abi3-musllinux_1_2_aarch64.whl (531.1 kB): musllinux, musl 1.2+, ARM64
  • pykmertools-0.1.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (347.2 kB): manylinux, glibc 2.17+, x86-64
  • pykmertools-0.1.2-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (353.5 kB): manylinux, glibc 2.17+, ARM64
  • pykmertools-0.1.2-cp39-abi3-macosx_11_0_arm64.whl (300.5 kB): macOS 11.0+, ARM64
  • pykmertools-0.1.2-cp39-abi3-macosx_10_12_x86_64.whl (301.5 kB): macOS 10.12+, x86-64

File hashes

pykmertools-0.1.2-cp39-abi3-win_amd64.whl
  SHA256       b339837b5e85de0a19707795d3798aedcbc16eeb62b10f480a1efb524262c0a8
  MD5          f9b039a1ce4846500d7d259072bc33de
  BLAKE2b-256  6213d89685ab7c7ec68e1fd3136f4f4be9e929f3c2209346a5ab0281f0ee8aab

pykmertools-0.1.2-cp39-abi3-musllinux_1_2_x86_64.whl
  SHA256       15a968f8eb74b78b22da1f58f193bd22771764127883705f790afe4152aae43c
  MD5          8697e697b11668d7a2bd02ee6885b551
  BLAKE2b-256  89ae30d96ae7f2d5c8d39599d3e503b5d9ce45d65a8dcb1de8db113e872ef2ba

pykmertools-0.1.2-cp39-abi3-musllinux_1_2_aarch64.whl
  SHA256       1f82bf87427c908b0eef8c870e3d482ea2876463c934414d4b4852a3d55a2bbf
  MD5          086b94419846d6d1dad1a59fd0daefff
  BLAKE2b-256  dddca6b641253b512028a463a3ef352d5cdc69098180048ba8a26b34c8c992a7

pykmertools-0.1.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
  SHA256       5464cf9f22cf4d61d3411e78ac5182376302cc6678d5365a424a46254c7b703c
  MD5          f1dba1bad6873df57c917f4182e2324c
  BLAKE2b-256  66b7978d19d4295cebff2195382b1ec79e8040452f6625e6ff8c0eaab92781f1

pykmertools-0.1.2-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
  SHA256       0953f247f03492b7cf6cedff42ff81165508ab08fca76c0697c912b56acbb461
  MD5          e7ba0b987d3dc50ced82c0482e227f8f
  BLAKE2b-256  6c7beee96cf23aff6328cba17329dbf0e486e27836dca81c5b36ea3a75ceb992

pykmertools-0.1.2-cp39-abi3-macosx_11_0_arm64.whl
  SHA256       93357c2b8325793bce351d487e3952dff269c30ae93a1dc522e0ba757c7fcec6
  MD5          e7059c09dc042ab5528eb1b11918abbb
  BLAKE2b-256  b16057ec815d4c545ca822c843f5b298512084c6e0c104babe92120daa904d17

pykmertools-0.1.2-cp39-abi3-macosx_10_12_x86_64.whl
  SHA256       04a14e2000e7f426e2b34223e8a1f824b93f6df80561a48407ac392cee7ceb9a
  MD5          f652da177b24cbeccc6ffd852c75e2a2
  BLAKE2b-256  cd8795214f7ae55eab5c237868e427c2fe93b5e72ded8bd7809298f7d9e595c2
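
A downloaded wheel can be checked against the SHA256 values listed above. The sketch below shows one way to do this with Python's standard `hashlib` module. The helper names `sha256_of` and `matches_sha256` are hypothetical, and the demo hashes a small temporary file rather than a real wheel so that it runs anywhere.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a temporary file containing b"abc"; in practice, pass the path of
# the downloaded wheel and the SHA256 value published for that file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"abc")
    demo_ok = matches_sha256(
        sample,
        "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
    )
print(demo_ok)  # True
```

A mismatch means the file differs from the published one and should be downloaded again.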
