PyO3 bindings and Python interface to lightmotif, a library for platform-accelerated biological motif scanning using position weight matrices.
🎼🧬 lightmotif
A lightweight platform-accelerated library for biological motif scanning using position weight matrices.
🗺️ Overview
Motif scanning with position weight matrices (also known as position-specific scoring matrices) is a robust method for identifying motifs of fixed length inside a biological sequence. Such matrices can be used to identify transcription factor binding sites in DNA, or protease cleavage sites in polypeptides. Position weight matrices are often visualized as sequence logos.
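To make the scoring concrete, here is a minimal pure-Python sketch of how a log-odds matrix can be built from aligned sites and slid along a sequence. It is an illustration only and not the lightmotif implementation: the helper names (log_odds_matrix, scan) are hypothetical, and the pseudocount handling and uniform background are assumptions chosen for simplicity.

```python
import math

def log_odds_matrix(sites, alphabet="ACGT", pseudocount=0.1):
    """Build one {symbol: log2 odds} row per motif position (illustrative)."""
    background = 1.0 / len(alphabet)  # assumed uniform background
    total = len(sites) + pseudocount * len(alphabet)
    matrix = []
    for j in range(len(sites[0])):
        column = [site[j] for site in sites]
        matrix.append({
            symbol: math.log2((column.count(symbol) + pseudocount) / total / background)
            for symbol in alphabet
        })
    return matrix

def scan(matrix, sequence):
    """Score every window of the sequence that has the motif length."""
    m = len(matrix)
    return [
        sum(matrix[j][sequence[i + j]] for j in range(m))
        for i in range(len(sequence) - m + 1)
    ]

sites = ["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"]
matrix = log_odds_matrix(sites)
# Embed the first site in a toy target, 4 symbols from the start
target = "AAAA" + sites[0] + "CCCC"
scores = scan(matrix, target)
best = max(range(len(scores)), key=scores.__getitem__)
print(best, round(scores[best], 2))
```

Each window costs one table look-up per motif position, which is the work that the techniques listed further down aim to speed up.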
The lightmotif library provides a Python module to run very efficient searches for a motif encoded in a position weight matrix. The position scanning combines several techniques to allow high-throughput processing of sequences:
- Compile-time definition of alphabets and matrix dimensions.
- Sequence symbol encoding for fast table look-ups, as implemented in HMMER[1] or MEME[2].
- Striped sequence matrices to process several positions in parallel, inspired by Michael Farrar[3].
- Vectorized matrix row look-up using permute instructions of AVX2.
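The encoding and striping ideas in the list above can be sketched in pure Python. This is only a conceptual model under stated assumptions, not the library's code: the function names, the "ACGTN" alphabet, the column-contiguous layout and the use of extra wrap rows are all illustrative choices, and plain Python lists stand in for SIMD registers.

```python
ALPHABET = "ACGTN"  # assumption: N doubles as unknown symbol and padding
CODE = {symbol: code for code, symbol in enumerate(ALPHABET)}

def encode(sequence):
    """Map each symbol to a small integer usable as a table index."""
    return [CODE.get(ch, CODE["N"]) for ch in sequence.upper()]

def stripe_columns(encoded, lanes, wrap):
    """Lay the sequence out so that each column holds a contiguous chunk.

    The `wrap` extra rows repeat the start of the next column, so a motif
    window never has to cross from one column to another.
    """
    rows = -(-len(encoded) // lanes)  # ceiling division
    padded = encoded + [CODE["N"]] * (rows * lanes + wrap - len(encoded))
    return [[padded[c * rows + r] for c in range(lanes)] for r in range(rows + wrap)]

def score_striped(weights, striped, lanes):
    """Score `lanes` positions per step; each inner list models one SIMD op."""
    m = len(weights)
    scores = []
    for r in range(len(striped) - (m - 1)):
        lane_scores = [0.0] * lanes
        for j in range(m):
            lane_scores = [s + weights[j][code]
                           for s, code in zip(lane_scores, striped[r + j])]
        scores.append(lane_scores)
    return scores

def unstripe(scores, n_positions):
    """Read the striped scores back in sequence order."""
    rows = len(scores)
    return [scores[i % rows][i // rows] for i in range(n_positions)]

def score_naive(weights, encoded):
    """Reference implementation: one window at a time."""
    m = len(weights)
    return [sum(weights[j][encoded[i + j]] for j in range(m))
            for i in range(len(encoded) - m + 1)]

# Toy weights favouring "ATG"; columns follow ALPHABET order (A, C, G, T, N)
weights = [[1.0, -1.0, -1.0, -1.0, 0.0],
           [-1.0, -1.0, -1.0, 1.0, 0.0],
           [-1.0, -1.0, 1.0, -1.0, 0.0]]
encoded = encode("ATGTCCCAACAACGATACCCCGAGCCCATCGCCGTCATCGGCTCGGCATGCAGATTCCCAGGCG")
striped = stripe_columns(encoded, lanes=8, wrap=len(weights) - 1)
result = unstripe(score_striped(weights, striped, lanes=8),
                  len(encoded) - len(weights) + 1)
assert result == score_naive(weights, encoded)
```

In this model every row update touches all lanes with the same operation, which is the property that lets a vectorized backend process several positions in parallel.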
This is the Python version; there is a Rust crate available as well.
🔧 Installing
lightmotif can be installed directly from PyPI, which hosts some pre-built wheels for most mainstream platforms, as well as the code required to compile from source with Rust:
$ pip install lightmotif
In the event you have to compile the package from source, all the required Rust libraries are vendored in the source distribution, and a Rust compiler will be set up automatically if there is none on the host machine.
💡 Example
The motif interface should be mostly compatible with the Bio.motifs module from Biopython. The notable difference is that the calculate method of PSSM objects expects a striped sequence instead.
import lightmotif
# Create a count matrix from an iterable of sequences
motif = lightmotif.create(["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"])
# Create a PSSM with 0.1 pseudocounts and uniform background frequencies
pwm = motif.counts.normalize(0.1)
pssm = pwm.log_odds()
# Encode the target sequence into a striped matrix
seq = "ATGTCCCAACAACGATACCCCGAGCCCATCGCCGTCATCGGCTCGGCATGCAGATTCCCAGGCG"
striped = lightmotif.stripe(seq)
# Compute scores using the fastest backend implementation for the host machine
scores = pssm.calculate(striped)
⏱️ Benchmarks
Benchmarks use the MX000001 motif from PRODORIC[4], and the complete genome of an Escherichia coli K12 strain.
Benchmarks were run on an i7-10710U CPU running @1.10GHz, compiled with --target-cpu=native.
lightmotif (avx2): 5,335,999 ns/iter (+/- 3,532,171) = 829.6 MiB/s
Bio.motifs: 346,620,369 ns/iter (+/- 35,120,487) = 12.8 MiB/s
MOODS.scan: 161,808,252 ns/iter (+/- 8,677,959) = 27.4 MiB/s
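The throughput figures follow from dividing the sequence size by the time per iteration. The short sketch below redoes that arithmetic; the genome length of 4,641,652 bp is an assumption (the size of the E. coli K-12 MG1655 reference, since the exact strain is not named here), and the helper name is hypothetical.

```python
# Assumption: the benchmarked genome is 4,641,652 bp long, one byte per base
GENOME_LENGTH = 4_641_652
MIB = 1024 ** 2

# Timings reported above, in nanoseconds per iteration
timings_ns = {
    "lightmotif (avx2)": 5_335_999,
    "Bio.motifs": 346_620_369,
    "MOODS.scan": 161_808_252,
}

def throughput_mib_s(ns_per_iter, n_bytes=GENOME_LENGTH):
    """Convert a time per full scan into MiB processed per second."""
    return (n_bytes / MIB) / (ns_per_iter * 1e-9)

fastest = timings_ns["lightmotif (avx2)"]
for name, ns in timings_ns.items():
    print(f"{name}: {throughput_mib_s(ns):.1f} MiB/s, "
          f"{ns / fastest:.1f}x the lightmotif time")
```

Under that assumed genome length, the computed values round to the throughputs listed above.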
💭 Feedback
⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
📋 Changelog
This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.
⚖️ License
This library is provided under the open-source MIT license.
This project was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
📚 References
- [1] Eddy, Sean R. ‘Accelerated Profile HMM Searches’. PLOS Computational Biology 7, no. 10 (20 October 2011): e1002195. doi:10.1371/journal.pcbi.1002195.
- [2] Grant, Charles E., Timothy L. Bailey, and William Stafford Noble. ‘FIMO: Scanning for Occurrences of a given Motif’. Bioinformatics 27, no. 7 (1 April 2011): 1017–18. doi:10.1093/bioinformatics/btr064.
- [3] Farrar, Michael. ‘Striped Smith–Waterman Speeds Database Searches Six Times over Other SIMD Implementations’. Bioinformatics 23, no. 2 (15 January 2007): 156–61. doi:10.1093/bioinformatics/btl582.
- [4] Dudek, Christian-Alexander, and Dieter Jahn. ‘PRODORIC: State-of-the-Art Database of Prokaryotic Gene Regulation’. Nucleic Acids Research 50, no. D1 (7 January 2022): D295–302. doi:10.1093/nar/gkab1110.