PyO3 bindings and Python interface to lightmotif, a library for platform-accelerated biological motif scanning using position weight matrices.
🎼🧬 lightmotif
A lightweight platform-accelerated library for biological motif scanning using position weight matrices.
🗺️ Overview
Motif scanning with position weight matrices (also known as position-specific scoring matrices) is a robust method for identifying motifs of fixed length inside a biological sequence. It can be used to identify transcription factor binding sites in DNA, or protease cleavage sites in polypeptides. Position weight matrices are often visualized as sequence logos.
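As a rough illustration of what scanning with a position-specific scoring matrix involves, the sketch below slides a small matrix along a sequence and sums one score per motif position. It is plain Python, not the lightmotif implementation; the `scan` helper and the toy matrix values are made up for this example.

```python
# Toy log-odds matrix for a motif of length 3 over the DNA alphabet.
# The values are invented for illustration only.
ALPHABET = "ACGT"
PSSM = [
    #  A     C     G     T
    [ 1.0, -1.0, -1.0, -1.0],  # motif position 0 prefers A
    [-1.0,  1.0, -1.0, -1.0],  # motif position 1 prefers C
    [-1.0, -1.0,  1.0, -1.0],  # motif position 2 prefers G
]

def scan(sequence, pssm, alphabet=ALPHABET):
    """Return one score per window of len(pssm) symbols (hypothetical helper)."""
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    encoded = [index[symbol] for symbol in sequence]
    width = len(pssm)
    scores = []
    for start in range(len(encoded) - width + 1):
        # Sum the matrix entry selected by each symbol of the window.
        scores.append(sum(pssm[j][encoded[start + j]] for j in range(width)))
    return scores

scores = scan("TTACGTT", PSSM)
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, best)
```

The window starting at index 2 (`ACG`) matches the preferred symbol at every motif position, so it receives the highest score.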
The lightmotif library provides a Python module to run very efficient searches for a motif encoded in a position weight matrix. The position scanning combines several techniques to allow high-throughput processing of sequences:
- Compile-time definition of alphabets and matrix dimensions.
- Sequence symbol encoding for fast table look-ups, as implemented in HMMER[1] or MEME[2].
- Striped sequence matrices to process several positions in parallel, inspired by Michael Farrar[3].
- Vectorized matrix row look-up using permute instructions of AVX2.
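The symbol encoding and striping ideas from the list above can be sketched in plain Python. This is only an illustration of the general striped layout described by Farrar[3]. The `encode` and `stripe` helpers, the `ACGTN` symbol order, the lane count, and the padding symbol are assumptions for this example, and the actual memory layout used by lightmotif may differ.

```python
import math

def encode(sequence, alphabet="ACGTN"):
    """Map each symbol to a small integer usable as a table index (hypothetical helper)."""
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    return [index[symbol] for symbol in sequence]

def stripe(encoded, lanes, pad):
    """Lay out positions column by column, so that each row holds `lanes`
    positions that are far apart in the sequence (hypothetical helper)."""
    rows = math.ceil(len(encoded) / lanes)
    matrix = [[pad] * lanes for _ in range(rows)]
    for i, code in enumerate(encoded):
        # Position i goes to row (i mod rows), column (i div rows).
        matrix[i % rows][i // rows] = code
    return matrix

encoded = encode("ACGTACGTAC")
striped = stripe(encoded, lanes=4, pad=4)  # pad with the index of "N"
for row in striped:
    print(row)
```

With this layout, consecutive rows hold consecutive sequence positions in every column, so a vectorized implementation can score one motif position for all lanes of a row with a single look-up.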
This is the Python version; a Rust crate is available as well.
🔧 Installing
lightmotif can be installed directly from PyPI, which hosts pre-built wheels for most mainstream platforms, as well as the code required to compile from source with Rust:
$ pip install lightmotif
If you have to compile the package from source, all the required Rust libraries are vendored in the source distribution, and a Rust compiler will be set up automatically if there is none on the host machine.
💡 Example
The motif interface should be mostly compatible with the Bio.motifs module from Biopython. The notable difference is that the calculate method of PSSM objects expects a striped sequence instead of a plain sequence.
import lightmotif
# Create a count matrix from an iterable of sequences
motif = lightmotif.create(["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"])
# Create a PSSM with 0.1 pseudocounts and uniform background frequencies
pwm = motif.counts.normalize(0.1)
pssm = pwm.log_odds()
# Encode the target sequence into a striped matrix
seq = "ATGTCCCAACAACGATACCCCGAGCCCATCGCCGTCATCGGCTCGGCATGCAGATTCCCAGGCG"
striped = lightmotif.stripe(seq)
# Compute scores using the fastest backend implementation for the host machine
scores = pssm.calculate(striped)
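To make the steps of the example above more concrete, here is a plain-Python sketch of the conventional arithmetic behind a count matrix, a pseudocount-normalized weight matrix, and a log-odds matrix. It does not use lightmotif. The helper names, the base-2 logarithm, and the exact pseudocount formula are assumptions that follow the usual Bio.motifs conventions, and lightmotif may differ in details.

```python
import math

def count_matrix(sequences, alphabet="ACGT"):
    """Count each symbol at each motif position (hypothetical helper)."""
    width = len(sequences[0])
    counts = [{symbol: 0 for symbol in alphabet} for _ in range(width)]
    for sequence in sequences:
        for j, symbol in enumerate(sequence):
            counts[j][symbol] += 1
    return counts

def normalize(counts, pseudocount):
    """Add a pseudocount to every cell, then turn each column into frequencies."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(column)
        pwm.append({s: (n + pseudocount) / total for s, n in column.items()})
    return pwm

def log_odds(pwm, background=None):
    """Log2 ratio of each frequency to the background (uniform by default)."""
    result = []
    for column in pwm:
        bg = background or {s: 1 / len(column) for s in column}
        result.append({s: math.log2(p / bg[s]) for s, p in column.items()})
    return result

counts = count_matrix(["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"])
pwm = normalize(counts, 0.1)
pssm = log_odds(pwm)
print(pwm[0]["G"], pssm[0]["G"])
```

Both input sequences start with `G`, so the first column gives `G` a frequency of 2.1 / 2.4 and a positive log-odds score, while the other three symbols receive negative scores.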
⏱️ Benchmarks
Benchmarks use the MX000001 motif from PRODORIC[4], and the complete genome of an Escherichia coli K12 strain. Benchmarks were run on an i7-10710U CPU running at 1.10 GHz, with code compiled with --target-cpu=native.
lightmotif (avx2): 5,335,999 ns/iter (+/- 3,532,171) = 829.6 MiB/s
Bio.motifs: 346,620,369 ns/iter (+/- 35,120,487) = 12.8 MiB/s
MOODS.scan: 161,808,252 ns/iter (+/- 8,677,959) = 27.4 MiB/s
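To compare the reported timings more directly, the short snippet below computes how long each tool takes relative to lightmotif. It only re-uses the mean ns/iter figures listed above and ignores the reported deviations.

```python
# Mean time per iteration in nanoseconds, copied from the figures above.
timings_ns = {
    "lightmotif (avx2)": 5_335_999,
    "Bio.motifs": 346_620_369,
    "MOODS.scan": 161_808_252,
}

baseline = timings_ns["lightmotif (avx2)"]
# Ratio of each tool's mean time to the lightmotif mean time.
ratios = {name: t / baseline for name, t in timings_ns.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the lightmotif time")
```

On these figures, Bio.motifs takes roughly 65 times as long as lightmotif and MOODS.scan roughly 30 times as long.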
💭 Feedback
⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
📋 Changelog
This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.
⚖️ License
This library is provided under the open-source MIT license.
This project was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
📚 References
- [1] Eddy, Sean R. ‘Accelerated Profile HMM Searches’. PLOS Computational Biology 7, no. 10 (20 October 2011): e1002195. doi:10.1371/journal.pcbi.1002195.
- [2] Grant, Charles E., Timothy L. Bailey, and William Stafford Noble. ‘FIMO: Scanning for Occurrences of a given Motif’. Bioinformatics 27, no. 7 (1 April 2011): 1017–18. doi:10.1093/bioinformatics/btr064.
- [3] Farrar, Michael. ‘Striped Smith–Waterman Speeds Database Searches Six Times over Other SIMD Implementations’. Bioinformatics 23, no. 2 (15 January 2007): 156–61. doi:10.1093/bioinformatics/btl582.
- [4] Dudek, Christian-Alexander, and Dieter Jahn. ‘PRODORIC: State-of-the-Art Database of Prokaryotic Gene Regulation’. Nucleic Acids Research 50, no. D1 (7 January 2022): D295–302. doi:10.1093/nar/gkab1110.