Cython bindings and Python interface to HMMER3.

🐍🟡♦️🟦 pyHMMER

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is maintained by members of the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python module, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end-user machine.
  • no intermediate files: Everything happens in memory, in Python objects you control, making it easier to format your inputs for HMMER without writing them to a file. Output retrieval is also done in memory, through instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, so you can build a Sequence object yourself and pass it to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles with some sequences).
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs in disk storage is typically not slower than directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmsearch uses a different parallelisation strategy from the hmmsearch binary in HMMER, which helps get the most out of multiple CPUs.

This library is still a work-in-progress, and in a very experimental stage, but it should already pack enough features to run simple biological analyses involving hmmsearch.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts some pre-built CPython wheels for x86-64 Linux, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.
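
As a rough illustration of the platform notes above, a script could check the current platform before attempting an install. This is only a sketch: the `describe_support` helper and its category strings are hypothetical, and the mapping simply restates this section rather than querying PyPI.

```python
import platform


def describe_support(system: str, machine: str) -> str:
    """Classify a platform according to the notes in this section.

    `describe_support` is a hypothetical helper, not part of pyhmmer.
    """
    system = system.lower()
    machine = machine.lower()
    if system.startswith("win"):
        return "unsupported"  # Windows is not supported by HMMER
    if machine.startswith(("arm", "aarch64")):
        return "unsupported"  # Arm is not supported by HMMER
    if system == "linux" and machine in ("x86_64", "amd64"):
        return "wheel"  # pre-built CPython wheels are hosted on PyPI
    if machine.startswith(("ppc", "powerpc")):
        return "source-untested"  # should compile, but is not tested in CI
    return "source"  # falls back to compiling from source with Cython


if __name__ == "__main__":
    # Report the category for the machine running this script.
    print(describe_support(platform.system(), platform.machine()))
```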

A bioconda package is planned when this package exits the alpha status.

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch, and obtain an iterable over TopHits that can be used for further sorting/querying in Python:

import pyhmmer

with pyhmmer.easel.SequenceFile("938293.PRJEB85.HG003687.faa") as file:
    alphabet = file.guess_alphabet()
    sequences = [seq.digitize(alphabet) for seq in file]

with pyhmmer.plan7.HMMFile("Pfam.hmm") as hmms:
    all_hits = list(pyhmmer.hmmsearch(hmms, sequences, cpus=4))

Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable. Note that for optimal performance, you should pass the number of physical cores to the cpus argument of the pyhmmer.hmmsearch function, as HMMER requires too many SIMD registers to benefit from hyperthreading.
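
The pattern described above, a pool of threads producing one result per query, can be sketched with the standard library alone. This is not pyhmmer's implementation: `fake_search`, `search_all`, and `guess_physical_cores` are hypothetical stand-ins, and halving the logical CPU count assumes two hardware threads per core, which does not hold on every machine.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, Iterator, List, Sequence


def guess_physical_cores(logical: int, threads_per_core: int = 2) -> int:
    """Estimate physical cores, ASSUMING `threads_per_core` threads per core."""
    return max(1, logical // threads_per_core)


def fake_search(query: str, targets: Sequence[str]) -> List[str]:
    """Stand-in for a real search: return the targets containing `query`."""
    return [target for target in targets if query in target]


def search_all(
    queries: Iterable[str], targets: Sequence[str], cpus: int
) -> Iterator[List[str]]:
    """Yield one list of hits per query, in the order of the queries."""
    with ThreadPoolExecutor(max_workers=cpus) as pool:
        # Executor.map preserves input order, so result i belongs to query i.
        yield from pool.map(lambda query: fake_search(query, targets), queries)


# os.cpu_count() reports logical CPUs and may return None.
cpus = guess_physical_cores(os.cpu_count() or 1)
targets = ["MKVLAA", "GGHMKV", "TTTPLL"]
all_hits = list(search_all(["MKV", "PLL", "XXX"], targets, cpus=cpus))
print(all_hits)
```

The stand-in search is pure Python, so this shows the shape of the interface (an iterator with one entry per query) rather than any speedup.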

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-8550U CPU running at 1.80GHz, using a FASTA file containing 2100 protein sequences (tests/data/seqs/938293.PRJEB85.HG003687.faa) and a subset of the Pfam HMM library containing 2873 domains. Commands were run 20 times.

Command                      # CPUs  mean (s)  σ (ms)  min (s)  max (s)  Speedup
python -m pyhmmer hmmsearch  4       20.706    316     19.960   42.457   x1.00
python -m pyhmmer hmmsearch  2       24.076    842     22.289   21.118   x1.16
hmmsearch                    2       35.046    161     34.734   35.183   x1.69
hmmsearch                    4       37.721    78      37.605   37.847   x1.82
python -m pyhmmer hmmsearch  1       39.022    1346    36.081   40.644   x1.88
hmmsearch                    1       44.360    243     44.184   45.018   x2.14
hmmscan                      2       102.248   381     101.479  102.765  x4.93
hmmscan                      4       106.779   375     106.197  107.482  x5.15
hmmscan                      1       107.945   326     107.460  108.502  x5.21
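
The last column appears to be each command's mean runtime divided by the fastest mean (the 4-CPU pyhmmer run), so larger values mean slower commands. The arithmetic can be checked with a few lines of Python; the mean values are copied from the table, while the `relative_time` helper is a hypothetical name used only for this sketch.

```python
# Mean runtimes in seconds, keyed by (command, number of CPUs).
means = {
    ("python -m pyhmmer hmmsearch", 4): 20.706,
    ("python -m pyhmmer hmmsearch", 2): 24.076,
    ("hmmsearch", 2): 35.046,
    ("hmmsearch", 4): 37.721,
    ("python -m pyhmmer hmmsearch", 1): 39.022,
    ("hmmsearch", 1): 44.360,
    ("hmmscan", 2): 102.248,
    ("hmmscan", 4): 106.779,
    ("hmmscan", 1): 107.945,
}


def relative_time(mean_s: float, baseline_s: float) -> float:
    """Return how many times longer `mean_s` is than `baseline_s`."""
    return mean_s / baseline_s


# The fastest mean serves as the x1.00 reference.
baseline = min(means.values())

# Print the rows from fastest to slowest with their ratio to the baseline.
for (command, cpus), mean_s in sorted(means.items(), key=lambda item: item[1]):
    print(f"{command} ({cpus} CPUs): x{relative_time(mean_s, baseline):.2f}")
```

Because the published means are rounded, the recomputed ratios may differ from the table in the last digit.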

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
