PyO3 bindings and Python interface to lightmotif, a library for platform-accelerated biological motif scanning using position weight matrices.
🎼🧬 lightmotif
A lightweight platform-accelerated library for biological motif scanning using position weight matrices.
🗺️ Overview
Motif scanning with position weight matrices (also known as position-specific scoring matrices) is a robust method for identifying motifs of fixed length inside a biological sequence. These matrices can be used to identify transcription factor binding sites in DNA, or protease cleavage sites in polypeptides. Position weight matrices are often viewed as sequence logos.
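As a rough illustration of what scanning means, the sketch below scores every window of a sequence against a small log-odds matrix in plain Python. The matrix values and the scan helper are invented for this example and are not part of lightmotif.

```python
# Toy log-odds matrix for a motif of length 3: one dict per motif position.
# The values are made up for illustration only.
PSSM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]


def scan(sequence, pssm):
    """Return the score of every window of len(pssm) symbols in sequence."""
    width = len(pssm)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Sum the matrix entries selected by each symbol of the window.
        scores.append(sum(row[symbol] for row, symbol in zip(pssm, window)))
    return scores


scores = scan("TTACGTT", PSSM)
# Index of the highest-scoring window, i.e. the most likely motif position.
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores[best])
```

The cost of this naive loop grows with the sequence length times the motif length, which is the work that the techniques listed below aim to speed up.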
The lightmotif library provides a Python module to run very efficient searches for a motif encoded in a position weight matrix. The position scanning combines several techniques to allow high-throughput processing of sequences:
- Compile-time definition of alphabets and matrix dimensions.
- Sequence symbol encoding for fast table look-ups, as implemented in HMMER[1] or MEME[2].
- Striped sequence matrices to process several positions in parallel, inspired by Michael Farrar[3].
- Vectorized matrix row look-up using permute instructions of AVX2.
This is the Python version; a Rust crate is available as well.
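The symbol encoding and striped layout from the list above can be sketched in plain Python. This is only an illustration of the idea: the alphabet order, the lane count and the helper names are assumptions, and the exact memory layout used by lightmotif may differ.

```python
ALPHABET = "ACGTN"  # assumed symbol order, for illustration only
LANES = 4           # hypothetical lane count; a real SIMD register holds more


def encode(sequence):
    """Map each symbol to its index so matrix rows can be looked up by integer."""
    index = {symbol: i for i, symbol in enumerate(ALPHABET)}
    return [index[symbol] for symbol in sequence]


def stripe(encoded, lanes=LANES, pad=ALPHABET.index("N")):
    """Arrange positions so that row r holds positions r, r + rows, r + 2*rows, ..."""
    rows = -(-len(encoded) // lanes)  # ceiling division
    padded = encoded + [pad] * (rows * lanes - len(encoded))
    return [[padded[lane * rows + r] for lane in range(lanes)] for r in range(rows)]


striped = stripe(encode("ACGTACGTAC"))
for row in striped:
    print(row)
```

With this layout, one row groups positions that are far apart in the sequence, so a vector instruction can score several windows at once without the lanes depending on each other.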
🔧 Installing
lightmotif can be installed directly from PyPI, which hosts some pre-built wheels for most mainstream platforms, as well as the code required to compile from source with Rust:
$ pip install lightmotif
In the event you have to compile the package from source, all the required Rust libraries are vendored in the source distribution, and a Rust compiler will be set up automatically if there is none on the host machine.
💡 Example
The motif interface should be mostly compatible with the Bio.motifs module from Biopython. The notable difference is that the calculate method of PSSM objects expects a striped sequence rather than a plain sequence.
import lightmotif
# Create a count matrix from an iterable of sequences
motif = lightmotif.create(["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"])
# Create a PSSM with 0.1 pseudocounts and uniform background frequencies
pwm = motif.counts.normalize(0.1)
pssm = pwm.log_odds()
# Encode the target sequence into a striped matrix
seq = "ATGTCCCAACAACGATACCCCGAGCCCATCGCCGTCATCGGCTCGGCATGCAGATTCCCAGGCG"
striped = lightmotif.stripe(seq)
# Compute scores using the fastest backend implementation for the host machine
scores = pssm.calculate(striped)
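For readers unfamiliar with the count matrix, PWM and PSSM steps used above, here is a plain-Python sketch of the usual arithmetic. It assumes a four-letter alphabet, a uniform background and base-2 logarithms (the convention used by Biopython); the helper names are hypothetical and lightmotif's internals may differ, for instance by including an ambiguity symbol.

```python
import math

ALPHABET = "ACGT"  # assumption: no ambiguity symbol in this sketch


def count_matrix(sequences):
    """Count how often each symbol occurs at each motif position."""
    counts = [{s: 0 for s in ALPHABET} for _ in range(len(sequences[0]))]
    for sequence in sequences:
        for column, symbol in zip(counts, sequence):
            column[symbol] += 1
    return counts


def normalize(counts, pseudocount):
    """Turn counts into per-position frequencies, adding a pseudocount to each cell."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(ALPHABET)
        pwm.append({s: (n + pseudocount) / total for s, n in column.items()})
    return pwm


def log_odds(pwm, background=None):
    """Convert frequencies to log2 odds against the background frequencies."""
    background = background or {s: 1 / len(ALPHABET) for s in ALPHABET}
    return [
        {s: math.log2(p / background[s]) for s, p in column.items()}
        for column in pwm
    ]


counts = count_matrix(["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"])
pwm = normalize(counts, 0.1)
pssm = log_odds(pwm)
print(round(pssm[0]["G"], 3))
```

The pseudocount keeps symbols that were never observed at a position from getting a score of minus infinity.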
⏱️ Benchmarks
Benchmarks use the MX000001 motif from PRODORIC[4], and the complete genome of an Escherichia coli K12 strain. Benchmarks were run on an i7-10710U CPU running @1.10GHz, compiled with --target-cpu=native.
lightmotif (avx2): 5,479,884 ns/iter (+/- 3,370,523) = 807.8 MiB/s
Bio.motifs: 334,359,765 ns/iter (+/- 11,045,456) = 13.2 MiB/s
MOODS.scan: 182,710,624 ns/iter (+/- 9,459,257) = 24.2 MiB/s
pymemesuite.fimo: 239,694,118 ns/iter (+/- 7,444,620) = 18.5 MiB/s
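The MiB/s figures follow from the time per iteration and the size of the scanned sequence. The sketch below reproduces that arithmetic and derives relative timings. The genome length is an assumption (the 4,641,652 bp of E. coli K-12 MG1655, GenBank U00096.3, at one byte per base), since the exact strain is not stated here.

```python
# ns/iter figures copied from the benchmark output above.
TIMINGS_NS = {
    "lightmotif (avx2)": 5_479_884,
    "Bio.motifs": 334_359_765,
    "MOODS.scan": 182_710_624,
    "pymemesuite.fimo": 239_694_118,
}

# Assumption: length of E. coli K-12 MG1655 (U00096.3), one byte per base.
GENOME_BYTES = 4_641_652


def throughput_mib_s(nanoseconds, n_bytes=GENOME_BYTES):
    """Convert a time per full-genome scan into MiB processed per second."""
    seconds = nanoseconds / 1e9
    return n_bytes / seconds / 2**20


baseline = TIMINGS_NS["lightmotif (avx2)"]
for name, ns in TIMINGS_NS.items():
    ratio = ns / baseline
    print(f"{name}: {throughput_mib_s(ns):.1f} MiB/s, {ratio:.1f}x the lightmotif time")
```

Under that assumption the computed throughputs agree with the reported ones, and the other tools take roughly 33 to 61 times as long per scan as the AVX2 backend on this machine.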
💭 Feedback
⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
📋 Changelog
This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.
⚖️ License
This library is provided under the open-source MIT license.
This project was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
📚 References
- [1] Eddy, Sean R. ‘Accelerated Profile HMM Searches’. PLOS Computational Biology 7, no. 10 (20 October 2011): e1002195. doi:10.1371/journal.pcbi.1002195.
- [2] Grant, Charles E., Timothy L. Bailey, and William Stafford Noble. ‘FIMO: Scanning for Occurrences of a given Motif’. Bioinformatics 27, no. 7 (1 April 2011): 1017–18. doi:10.1093/bioinformatics/btr064.
- [3] Farrar, Michael. ‘Striped Smith–Waterman Speeds Database Searches Six Times over Other SIMD Implementations’. Bioinformatics 23, no. 2 (15 January 2007): 156–61. doi:10.1093/bioinformatics/btl582.
- [4] Dudek, Christian-Alexander, and Dieter Jahn. ‘PRODORIC: State-of-the-Art Database of Prokaryotic Gene Regulation’. Nucleic Acids Research 50, no. D1 (7 January 2022): D295–302. doi:10.1093/nar/gkab1110.