kmertools is a k-mer based feature extraction tool designed to support metagenomics and other bioinformatics analytics.

Project description

kmertools: DNA Vectorisation Tool

$$\   $$\                                   $$$$$$$$\                     $$\           
$$ | $$  |                                  \__$$  __|                    $$ |          
$$ |$$  / $$$$$$\$$$$\   $$$$$$\   $$$$$$\     $$ |    $$$$$$\   $$$$$$\  $$ | $$$$$$$\ 
$$$$$  /  $$  _$$  _$$\ $$  __$$\ $$  __$$\    $$ |   $$  __$$\ $$  __$$\ $$ |$$  _____|
$$  $$<   $$ / $$ / $$ |$$$$$$$$ |$$ |  \__|   $$ |   $$ /  $$ |$$ /  $$ |$$ |\$$$$$$\  
$$ |\$$\  $$ | $$ | $$ |$$   ____|$$ |         $$ |   $$ |  $$ |$$ |  $$ |$$ | \____$$\ 
$$ | \$$\ $$ | $$ | $$ |\$$$$$$$\ $$ |         $$ |   \$$$$$$  |\$$$$$$  |$$ |$$$$$$$  |
\__|  \__|\__| \__| \__| \_______|\__|         \__|    \______/  \______/ \__|\_______/ 

Overview

kmertools is a k-mer based feature extraction tool designed to support metagenomics and other bioinformatics analytics. This tool leverages k-mer analysis to vectorize DNA sequences, facilitating the use of these vectors in various AI/ML applications.
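To make the idea of vectorising DNA with k-mers concrete, here is a minimal pure-Python sketch of a normalised k-mer frequency vector. It does not use the kmertools or pykmertools API; the function name kmer_frequency_vector and its parameters are hypothetical, and kmertools itself may order, normalise, or merge k-mers differently.

```python
from itertools import product


def kmer_frequency_vector(seq: str, k: int = 3) -> list[float]:
    """Return normalised k-mer frequencies in lexicographic k-mer order.

    Illustrative only: hypothetical helper, not part of kmertools.
    """
    seq = seq.upper()
    # Fixed ordering of all 4**k k-mers so every sequence maps to the same axes.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i : i + k])
        # k-mers containing ambiguous bases such as N are skipped.
        if pos is not None:
            counts[pos] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(kmers)


vector = kmer_frequency_vector("ACGTACGTNACGT", k=2)
print(len(vector))  # one dimension per possible 2-mer
```

Many tools also merge each k-mer with its reverse complement to halve the dimensionality; this sketch keeps the two strands separate for simplicity.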

Features

  • Oligonucleotide Frequency Vectors: Generate frequency vectors for oligonucleotides.
  • Minimiser Binning: Efficiently bin sequences using minimisers to reduce data complexity.
  • Chaos Game Representation (CGR): Compute CGR vectors for DNA sequences based on k-mers or whole sequence transformation.
  • Coverage Histograms: Create coverage histograms to analyze the depth of sequencing reads.
  • Python Binding: Use kmertools functionality from Python via import pykmertools as kt.
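Two of the features listed above, minimisers and Chaos Game Representation, can be sketched in a few lines of pure Python. This is an illustration of the general techniques under simple assumptions (lexicographic ordering for minimisers, the classic corner layout A, C, G, T for CGR), not the kmertools implementation; the names minimisers, cgr_points, and CORNERS are hypothetical.

```python
# Assumed corner layout for the unit square (the classic CGR convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def minimisers(seq: str, k: int, w: int) -> list[tuple[int, str]]:
    """Return (position, k-mer) of the smallest k-mer in each window of w k-mers.

    Ties are broken by taking the leftmost k-mer; consecutive duplicates are merged.
    """
    kmers = [seq[i : i + k] for i in range(len(seq) - k + 1)]
    picked: list[tuple[int, str]] = []
    for start in range(len(kmers) - w + 1):
        window = kmers[start : start + w]
        offset = min(range(w), key=lambda j: window[j])  # leftmost smallest
        hit = (start + offset, window[offset])
        if not picked or picked[-1] != hit:
            picked.append(hit)
    return picked


def cgr_points(seq: str) -> list[tuple[float, float]]:
    """Return the CGR trajectory: each base moves halfway to its corner."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        corner = CORNERS.get(base)
        if corner is None:
            continue  # skip ambiguous bases such as N
        x, y = (x + corner[0]) / 2, (y + corner[1]) / 2
        points.append((x, y))
    return points


print(minimisers("GATTACA", k=3, w=3))
print(cgr_points("AG"))
```

Binning reads by their minimisers groups sequences that share a representative k-mer, which is why the technique reduces data complexity; the CGR points can be histogrammed on a grid to obtain a fixed-length vector.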

Installation

Option 1: from bioconda (recommended)

You can install kmertools from Bioconda at https://anaconda.org/bioconda/kmertools. Make sure you have conda installed.

# create conda environment and install kmertools
conda create -n kmertools -c bioconda kmertools

# activate environment
conda activate kmertools

Option 2: from PyPI

You can install kmertools from PyPI at https://pypi.org/project/pykmertools/.

pip install pykmertools

Option 3: from sources

You can install kmertools directly from the source by cloning the repository and using Rust's package manager cargo.

git clone https://github.com/anuradhawick/kmertools.git
cd kmertools
cargo build --release

Now add the binary to your PATH (you may modify ~/.bashrc or ~/.zshrc).

# to add to current terminal
export PATH=$PATH:$(pwd)/target/release/

# to save to ~/.bashrc
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.bashrc
source ~/.bashrc

# to save to ~/.zshrc for Mac
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.zshrc
source ~/.zshrc

Test the installation

After setting up, run the following command to print out the kmertools help message.

kmertools --help
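The same check can be scripted. The sketch below only tests whether a kmertools executable is on PATH and whether a pykmertools module can be located; it calls nothing inside either of them. The helper name installation_status is hypothetical, and which of the two is present may depend on the installation option you chose.

```python
import importlib.util
import shutil


def installation_status() -> dict[str, bool]:
    """Report whether the CLI is on PATH and the Python module can be found."""
    return {
        "kmertools (CLI)": shutil.which("kmertools") is not None,
        "pykmertools (Python)": importlib.util.find_spec("pykmertools") is not None,
    }


for name, found in installation_status().items():
    print(f"{name}: {'found' if found else 'not found'}")
```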

Help

Please read our comprehensive Wiki.

Authors

Citation

If you use kmertools, please cite it as follows.

@software{Wickramarachchi_kmertools_DNA_Vectorisation,
  author = {Wickramarachchi, Anuradha and Mallawaarachchi, Vijini},
  title = {{kmertools: DNA Vectorisation Tool}},
  url = {https://github.com/anuradhawick/kmertools},
  version = {0.1.0}
}

Please refer to the Wiki for citations of relevant algorithms.

Support and contributions

Please get in touch via author websites or GitHub issues. Thanks!

Download files

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

pykmertools-0.1.3-cp39-abi3-win_amd64.whl (757.6 kB)

CPython 3.9+ Windows x86-64

pykmertools-0.1.3-cp39-abi3-musllinux_1_2_x86_64.whl (1.2 MB)

CPython 3.9+ musllinux: musl 1.2+ x86-64

pykmertools-0.1.3-cp39-abi3-musllinux_1_2_aarch64.whl (1.2 MB)

CPython 3.9+ musllinux: musl 1.2+ ARM64

pykmertools-0.1.3-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)

CPython 3.9+ manylinux: glibc 2.17+ x86-64

pykmertools-0.1.3-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.0 MB)

CPython 3.9+ manylinux: glibc 2.17+ ARM64

pykmertools-0.1.3-cp39-abi3-macosx_11_0_arm64.whl (918.6 kB)

CPython 3.9+ macOS 11.0+ ARM64

pykmertools-0.1.3-cp39-abi3-macosx_10_12_x86_64.whl (934.3 kB)

CPython 3.9+ macOS 10.12+ x86-64

File details

Details for the file pykmertools-0.1.3-cp39-abi3-win_amd64.whl.

File hashes

Hashes for pykmertools-0.1.3-cp39-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 1d70c9b4931622d52654690f4a8471cf1dceafba7feee264d0f997930e19fe2c
MD5 f699b80483d68b1fb7a134e2cf710e74
BLAKE2b-256 ee8a034e716b31b601cd796a4ecb9f77b16b3c55e8e6554f97172e682b009ca6

See more details on using hashes here.
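A downloaded wheel can be checked against the SHA256 digest listed for it. The following is a minimal sketch using Python's standard hashlib module; the helper names sha256_of and matches are hypothetical, and the wheel path assumes the file was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with the published one (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Assumed local download location; the digest is the SHA256 value listed above.
wheel = Path("pykmertools-0.1.3-cp39-abi3-win_amd64.whl")
expected = "1d70c9b4931622d52654690f4a8471cf1dceafba7feee264d0f997930e19fe2c"
if wheel.exists():
    print("OK" if matches(wheel, expected) else "MISMATCH")
```

The same functions work for any of the wheels listed on this page by swapping in the corresponding file name and digest.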

File details

Details for the file pykmertools-0.1.3-cp39-abi3-musllinux_1_2_x86_64.whl.

File hashes

Hashes for pykmertools-0.1.3-cp39-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 92e76cf15f032e3cc901a5155c072726ad1fa82887faf245821d21454968e063
MD5 5ba8b03d9973ae6bdc41899393d0ba3c
BLAKE2b-256 72fe0cad9a99ff8bfdadc029ce947e4a4fe823e8955aa08314abde05fd69c055

File details

Details for the file pykmertools-0.1.3-cp39-abi3-musllinux_1_2_aarch64.whl.

File hashes

Hashes for pykmertools-0.1.3-cp39-abi3-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 41b967b2c6c4d751cca8a261b6791807deb199a3a08343879808f1b6325db7ec
MD5 11be5c65dfb8039db149a2c5991eb231
BLAKE2b-256 633c9bb528e902cd63e43809fe0737c0d7a3a29479ecaa53d5e97eb21841e1ac

File details

Details for the file pykmertools-0.1.3-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pykmertools-0.1.3-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 103b85c220747dba4f188d947bd666dca134df4324faed31c6bd9aff97718ef5
MD5 17aaa3e6d105e384057d2a658cb67f7e
BLAKE2b-256 adc98d1abc761401efca328ef617de336004839dad8d6c2eddfd659bf661dd49

File details

Details for the file pykmertools-0.1.3-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for pykmertools-0.1.3-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 a7f6a60d286f7992c81addef3347a705ddabb3adc13316a166281999858958ff
MD5 bf5080cc2803a3904ebf54bb4aa45687
BLAKE2b-256 87718ec5ed5f954e66dc8b081c8ff6f6e6a7371c1731c27fa50a87c48e4356f5

File details

Details for the file pykmertools-0.1.3-cp39-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for pykmertools-0.1.3-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 7d0fa1425bb68e5deafbac834538b10acb531f84632677302d4c95ec6441776d
MD5 45b2ef75e98fcab58ac5507b3c5fefbe
BLAKE2b-256 525353c82de2ba00df211cb51e3ece9b8f60a2b1bc20195015684253ccbca2bc

File details

Details for the file pykmertools-0.1.3-cp39-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for pykmertools-0.1.3-cp39-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 db17b180e0c17caf7227ffa31a16fcf577ef6c53a3103cdf0f141b0c385753b7
MD5 d5e0226b59c604cb5f7890045a8f105c
BLAKE2b-256 635a674c0488f7129897e53983393771d940567ecdb0caaf4157bb0684ea30b0
