
Extracts unique k-mers into an efficient on-disk data structure and allows querying it

Project description

A tool to store the k-mers that are unique to a dataset.

To install:

pip install kmeruniq 

Two commands:

  • kmeruniq build
  • kmeruniq query

Check the help of each command with the -h flag.

Usage example of the CLI

kmeruniq build -k 21 --fof my_file.fof --output path/to/index -e --shard 20 
kmeruniq query --query_str ACGAAACGTACATTCACACACACACATAGAGAAGGAGAGCAGCACACACA --index-path path/to/index
kmeruniq query --query_file path/to/some/data.fa --index-path path/to/index

The output is the result of a Counter: one count for each value.
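To make the Counter output more concrete, here is a minimal sketch of how such a tally could be computed. It is not the kmeruniq implementation: the function `tally_values` and the toy dictionary are hypothetical, and the sketch assumes that the result counts, for each value stored in the index, how many query k-mers map to it.

```python
from collections import Counter


def tally_values(index, kmers):
    """Count how many of the given k-mers map to each value in `index`.

    `index` is any dict-like object with a .get() method (hypothetical
    stand-in for a real index); k-mers that are absent are skipped.
    """
    counts = Counter()
    for kmer in kmers:
        value = index.get(kmer)  # None when the k-mer is not in the index
        if value is not None:
            counts[value] += 1
    return counts


# Toy stand-in for a built index: k-mer -> label (illustrative data only)
toy_index = {"ACG": "chr1", "CGT": "chr1", "GTA": "chr2"}

query = "ACGTAC"
k = 3
# Plain sliding window over the query; canonical forms are ignored here
kmers = [query[i:i + k] for i in range(len(query) - k + 1)]

result = tally_values(toy_index, kmers)
print(result)
```

With this toy data, two query k-mers map to chr1 and one to chr2; the fourth k-mer (TAC) is not in the dictionary and is ignored.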

The fof format

A file of files, that is, a line-separated list of file paths.

/path/to/foo.fa
/path/to/bar.fa
/path/to/barrr.fa.gz

Each line can also specify a label for the file. Two files with the same label are treated as merged. For instance:

/path/to/foo.fa     ;chr1
/path/to/bar.fa     ;chr2
/path/to/barrr.fa.gz   ;chr1

Here foo.fa and barrr.fa.gz will be merged together within the index.

Note that files can be gzip-compressed.
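The fof layout described above can be read with a few lines of Python. This is only a sketch, not the parser used by kmeruniq: the names `parse_fof` and `open_text` are hypothetical, and it assumes that `;` separates the path from the label, that surrounding whitespace is ignored, and that a file without a label forms its own group (here keyed by its path).

```python
import gzip
from collections import defaultdict


def parse_fof(lines):
    """Group file paths by label from fof-style lines.

    Assumption: a line is either "path" or "path ;label".
    """
    groups = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        path, sep, label = line.partition(";")
        path = path.strip()
        # Assumption: an unlabeled file is its own group, keyed by its path
        label = label.strip() if sep else path
        groups[label].append(path)
    return dict(groups)


def open_text(path):
    """Open a possibly gzip-compressed file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


example = [
    "/path/to/foo.fa     ;chr1",
    "/path/to/bar.fa     ;chr2",
    "/path/to/barrr.fa.gz   ;chr1",
]
groups = parse_fof(example)
print(groups)
```

In this example the chr1 group contains foo.fa and barrr.fa.gz, which matches the merge behaviour described above.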

Usage example of the Python API:

The Kmer and DNA datatypes are the ones used by vizibridge.

from kmeruniq.index import Index
from vizibridge import Kmer, DNA

idx = Index("path/to/my/index")
# idx is a dict-like object keyed by kmer and valued by the annotation

idx["ACG..ACG"] # some kmer of the appropriate size. 

for kmer in idx:
    print(idx[kmer]) # print the value of each kmer

dna = DNA("ACG....ACGT") # long sequence of DNA from somewhere

for kmer in dna.enum_canonical_kmer(idx.k):
    print(idx.get(kmer)) # print the value associated with kmer, or None if kmer is not in the index
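For readers unfamiliar with canonical k-mers, the following pure-Python sketch illustrates the usual idea behind a canonical enumeration. It does not use vizibridge, and the helper names are hypothetical. It assumes the common convention that the canonical form of a k-mer is the lexicographically smaller of the k-mer and its reverse complement; vizibridge may use a different convention or encoding.

```python
# Translation table mapping each base to its complement
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Yield the canonical form of each k-mer of `seq`, left to right.

    Assumption: canonical = min(k-mer, reverse complement) in
    lexicographic order.
    """
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        yield min(kmer, reverse_complement(kmer))


kmers = list(canonical_kmers("ACGTT", 3))
print(kmers)
```

Under this convention a k-mer and its reverse complement share one canonical form, so a sequence and its opposite strand produce the same set of lookups.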

Download files

Source Distribution

kmeruniq-0.5b0.tar.gz (6.7 kB)

Built Distribution

kmeruniq-0.5b0-py3-none-any.whl (11.0 kB, Python 3)

