
Extracts unique kmers into an efficient on-disk data structure and allows querying it


A tool to store the kmers that are unique to a dataset.

To install:

pip install kmeruniq 

Two commands:

  • kmeruniq build
  • kmeruniq query

Check the help of each command with the -h flag.

Usage example of the CLI

kmeruniq build -k 21 --fof my_file.fof --output path/to/index -e --shard 20 
kmeruniq query --query_str ACGAAACGTACATTCACACACACACATAGAGAAGGAGAGCAGCACACACA --index-path path/to/index
kmeruniq query --query_file path/to/some/data.fa --index-path path/to/index

The output is the result of a Counter, that is, a count for each value.
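To make the shape of that output concrete, here is a minimal pure-Python sketch of tallying values with collections.Counter. It does not use kmeruniq: toy_index and tally_labels are hypothetical stand-ins, the sketch ignores canonical kmers, and whether the real command also counts kmers missing from the index is not stated here, so misses are simply skipped.

```python
from collections import Counter

# Hypothetical stand-in for an index: maps a kmer to the label of the
# dataset it is unique to. The real index is an on-disk structure.
toy_index = {
    "ACG": "chr1",
    "CGA": "chr1",
    "GAT": "chr2",
}


def tally_labels(sequence, index, k):
    """Count, per label, how many kmers of `sequence` are found in `index`."""
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        label = index.get(sequence[i:i + k])
        if label is not None:  # assumption: kmers absent from the index are skipped
            counts[label] += 1
    return counts


result = tally_labels("ACGATT", toy_index, k=3)
print(result)
```

In this toy case the query contains two kmers labelled chr1 and one labelled chr2; the fourth kmer, ATT, is not in the index.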

The fof format

A file of files, that is, a line-separated list of file paths.

/path/to/foo.fa
/path/to/bar.fa
/path/to/barrr.fa.gz

Each line can also specify a label for the file, after a ";". Two files with the same label are treated as merged. For instance:

/path/to/foo.fa     ;chr1
/path/to/bar.fa     ;chr2
/path/to/barrr.fa.gz   ;chr1

Here foo.fa and barrr.fa.gz will be merged together within the index.

Note that files can be gzip-compressed.
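The grouping rule can be sketched in a few lines of Python. This is only an illustration of the format as described, not the parser kmeruniq uses: parse_fof is a hypothetical helper, and treating an unlabeled file as its own group keyed by its path is an assumption.

```python
from collections import defaultdict


def parse_fof(lines):
    """Group file paths by label from the lines of a file of files."""
    groups = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first ";" only: path on the left, label on the right.
        path, sep, label = line.partition(";")
        path = path.strip()
        # Assumption: a file without a label forms its own group.
        label = label.strip() if sep else path
        groups[label].append(path)
    return dict(groups)


example = [
    "/path/to/foo.fa     ;chr1",
    "/path/to/bar.fa     ;chr2",
    "/path/to/barrr.fa.gz   ;chr1",
]
groups = parse_fof(example)
for label, paths in groups.items():
    print(label, paths)
```

Files that share the label chr1 end up in the same list, which mirrors the merge behaviour described for the index.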

Usage example of the Python API:

The Kmer and DNA datatypes are the ones used by vizibridge.

from kmeruniq.index import Index
from vizibridge import Kmer, DNA

idx = Index("path/to/my/index")
# idx is a dict-like object keyed by kmer and valued by the annotation

idx["ACG..ACG"]  # some kmer of the appropriate size

for kmer in idx:
    print(idx[kmer])  # print the value of each kmer

dna = DNA("ACG....ACGT")  # long sequence of DNA from somewhere

for kmer in dna.enum_canonical_kmer(idx.k):
    print(idx.get(kmer))  # print the value associated with kmer, or None if the kmer is not in the index
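For readers unfamiliar with canonical kmers, here is a pure-Python sketch of the usual definition: each kmer is replaced by the smaller of itself and its reverse complement, so both strands map to the same key. reverse_complement and canonical_kmers are hypothetical helpers written for this illustration; vizibridge may order kmers differently when choosing the canonical form, so this is not a description of its implementation.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Yield, for each kmer of `seq`, the lexicographically smaller of the
    kmer and its reverse complement (assumed ordering, see above)."""
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        yield min(kmer, reverse_complement(kmer))


kmers = list(canonical_kmers("ACGTT", 3))
print(kmers)
```

Here ACG and CGT are reverse complements of each other, so both collapse to ACG, while GTT becomes AAC.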
