Extract unique k-mers into an efficient on-disk data structure and query them
A tool to store the k-mers that are unique to a dataset.
To install:
pip install kmeruniq
Two commands:
kmeruniq build
kmeruniq query
Check the help for each command with the -h flag.
Usage example of the CLI
kmeruniq build -k 21 --fof my_file.fof --output path/to/index -e --shard 20
kmeruniq query --query_str ACGAAACGTACATTCACACACACACATAGAGAAGGAGAGCAGCACACACA --index-path path/to/index
kmeruniq query --query_file path/to/some/data.fa --index-path path/to/index
The output is the result of a Counter, that is, a count for each value.
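To illustrate what a Counter-style result looks like, here is a minimal sketch. It assumes the index behaves like a mapping from k-mer to value and that k-mers absent from the index are skipped. The `toy_index` dict and the `tally_values` helper are hypothetical stand-ins and are not part of kmeruniq; the real command may canonicalize k-mers or format its output differently.

```python
from collections import Counter


def tally_values(sequence, index, k):
    """Count how many k-mers of `sequence` map to each value in `index`."""
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        value = index.get(sequence[i:i + k])
        if value is not None:  # assumption: k-mers absent from the index are ignored
            counts[value] += 1
    return counts


# Hypothetical toy index: plain strings stand in for real k-mers.
toy_index = {"ACG": "chr1", "CGT": "chr1", "GTA": "chr2"}
result = tally_values("ACGTAC", toy_index, 3)
print(result)
```

In this toy run, two k-mers map to chr1, one maps to chr2, and the last k-mer (TAC) is not in the index.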
The fof format
A file of files, that is, a line-separated list of file paths:
/path/to/foo.fa
/path/to/bar.fa
/path/to/barrr.fa.gz
Each line can also specify a label for the file. Two files with the same label are treated as merged. For instance:
/path/to/foo.fa ;chr1
/path/to/bar.fa ;chr2
/path/to/barrr.fa.gz ;chr1
Here foo.fa and barrr.fa.gz will be merged together within the index.
Note that files can be gzip-compressed.
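The grouping rule above can be sketched in a few lines of Python. This is not kmeruniq's own parser: `parse_fof` is a hypothetical helper, and the choice to key an unlabeled file by its own path is an assumption made only for this illustration.

```python
from collections import defaultdict


def parse_fof(lines):
    """Group file paths by label, following the 'path ;label' layout shown above."""
    groups = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        path, sep, label = line.partition(";")
        path = path.strip()
        # Assumption: a line without a label forms its own group, keyed by its path.
        label = label.strip() if sep else path
        groups[label].append(path)
    return dict(groups)


example = [
    "/path/to/foo.fa ;chr1",
    "/path/to/bar.fa ;chr2",
    "/path/to/barrr.fa.gz ;chr1",
]
groups = parse_fof(example)
for label, paths in groups.items():
    print(label, paths)
```

Files sharing a label end up in the same list, which mirrors the merge behaviour described for chr1.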
Usage example of the Python API:
The Kmer and DNA datatypes are the ones used by vizibridge.
from kmeruniq.index import Index
from vizibridge import Kmer, DNA

idx = Index("path/to/my/index")
# idx is a dict-like object keyed by k-mer and valued by the annotation
idx["ACG..ACG"]  # some k-mer of the appropriate size
for kmer in idx:
    print(idx[kmer])  # print the value of each k-mer
dna = DNA("ACG....ACGT")  # long sequence of DNA from somewhere
for kmer in dna.enum_canonical_kmer(idx.k):
    print(idx.get(kmer))  # print the value associated with the k-mer, or None if it is not in the index
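For readers unfamiliar with canonical k-mers, the sketch below shows one common convention in plain Python: each k-mer is replaced by the lexicographically smaller of itself and its reverse complement, so both strands of a sequence yield the same keys. This is an assumption about what enum_canonical_kmer does; vizibridge may use a different ordering or encoding. The helpers `reverse_complement` and `canonical_kmers` are hypothetical and work on plain strings rather than vizibridge types.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Yield the canonical form of every k-mer of `seq`, in order of appearance."""
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumed convention: canonical = lexicographic minimum of the two strands.
        yield min(kmer, reverse_complement(kmer))


kmers = list(canonical_kmers("ACGTT", 3))
print(kmers)
```

Under this convention, a sequence and its reverse complement produce the same multiset of canonical k-mers, which is why an index keyed this way can be queried without knowing the strand.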