
Extract unique k-mers into an efficient on-disk data structure that can be queried

Project description

A tool to store the k-mers unique to a dataset.

To install:

pip install kmeruniq 

Two commands:

  • kmeruniq build
  • kmeruniq query

Run either command with the -h flag to see its help.

Usage example of the CLI

kmeruniq build -k 21 --fof my_file.fof --output path/to/index -e --shard 20 
kmeruniq query --query_str ACGAAACGTACATTCACACACACACATAGAGAAGGAGAGCAGCACACACA --index-path path/to/index
kmeruniq query --query_file path/to/some/data.fa --index-path path/to/index

The output is the result of a Counter: a count for each value.
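As a rough illustration of what a Counter over values looks like, here is a minimal sketch. It does not use kmeruniq itself: `count_values` and `toy_index` are hypothetical names, a plain dict stands in for the index, and skipping k-mers that are absent from the index is an assumption rather than documented behaviour.

```python
from collections import Counter


def count_values(index, sequence, k):
    """Tally the index values hit by each k-mer of `sequence`."""
    hits = Counter()
    for i in range(len(sequence) - k + 1):
        value = index.get(sequence[i:i + k])
        if value is not None:  # assumption: k-mers missing from the index are skipped
            hits[value] += 1
    return hits


# Hypothetical toy index mapping k-mers to labels.
toy_index = {"ACG": "chr1", "CGT": "chr1", "GTT": "chr2"}

result = count_values(toy_index, "ACGTTA", k=3)
print(result)
```

With this toy index, two k-mers of the query map to chr1 and one maps to chr2, so the Counter holds one count per label.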

The fof format

A file of files (fof), that is, a line-separated list of file paths.

/path/to/foo.fa
/path/to/bar.fa
/path/to/barrr.fa.gz

Each line can also specify a label for the file. Two files with the same label are treated as merged. For instance:

/path/to/foo.fa     ;chr1
/path/to/bar.fa     ;chr2
/path/to/barrr.fa.gz   ;chr1

Here foo.fa and barrr.fa.gz will be merged together within the index.

Note that files can be gzip-compressed.
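The grouping described above can be sketched in a few lines of Python. This is not kmeruniq's own parser: `parse_fof` and `open_text` are hypothetical helpers, and the handling of whitespace, blank lines and unlabeled files (each treated here as its own group) is an assumption.

```python
import gzip
from collections import defaultdict


def parse_fof(text):
    """Group the file paths of a fof by label."""
    groups = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        path, sep, label = line.partition(";")
        path = path.strip()
        # Assumption: a file without a label forms its own group.
        label = label.strip() if sep else path
        groups[label].append(path)
    return dict(groups)


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


fof = """/path/to/foo.fa     ;chr1
/path/to/bar.fa     ;chr2
/path/to/barrr.fa.gz   ;chr1
"""

groups = parse_fof(fof)
print(groups)
```

Files sharing the label chr1 end up in the same list, which mirrors the merge described for foo.fa and barrr.fa.gz.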

Usage example of the Python API:

The Kmer and DNA datatypes are the ones provided by vizibridge.

from kmeruniq.index import Index
from vizibridge import Kmer, DNA

idx = Index("path/to/my/index")
# idx is a dict-like object: keys are kmers, values are their annotations

idx["ACG..ACG"]  # look up some kmer of the appropriate size

for kmer in idx:
    print(idx[kmer])  # print the value of each kmer

dna = DNA("ACG....ACGT")  # a long DNA sequence from somewhere

for kmer in dna.enum_canonical_kmer(idx.k):
    print(idx.get(kmer))  # print the value associated with kmer, or None if it is not in the index
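For readers unfamiliar with canonical k-mers, the sketch below shows one common convention in plain Python: a k-mer and its reverse complement are represented by whichever of the two is lexicographically smaller. Whether vizibridge uses this exact ordering is an assumption, and `reverse_complement`, `canonical` and `enum_canonical_kmers` are hypothetical helpers, not part of either library.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick the lexicographically smaller of a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def enum_canonical_kmers(seq, k):
    """Yield the canonical form of every k-mer of `seq`, in order."""
    for i in range(len(seq) - k + 1):
        yield canonical(seq[i:i + k])


kmers = list(enum_canonical_kmers("ACGTT", 3))
print(kmers)
```

Under this convention a sequence and its reverse complement yield the same set of canonical k-mers, so a lookup does not depend on which strand was read.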

