🐍🟡♦️🟦 pyHMMER

Cython bindings and Python interface to HMMER3.

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is maintained by members of the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python module, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end-user machine.
  • no intermediate files: Everything happens in memory, in Python objects you control, making it easier to pass your inputs to HMMER without needing to write them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a Sequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles on some sequences).
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs stored on disk is typically faster than directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmsearch uses a different parallelisation strategy from the hmmsearch binary shipped with HMMER, which helps get the most out of multiple CPUs.

This library is still a work in progress and at an experimental stage, but it should already pack enough features to run biological analyses involving hmmsearch or phmmer.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts some pre-built CPython wheels for x86-64 Linux, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.

A Bioconda package is also available, but only for Linux:

$ conda install -c bioconda pyhmmer
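
The platform notes above can be turned into a quick pre-install check. The following is only a sketch of one reading of those notes: the function name install_route and its three return labels are hypothetical, and treating non-Linux UNIX systems on x86-64 as "source" is an assumption rather than something stated here.

```python
import platform


def install_route(system=None, machine=None):
    """Guess how pyhmmer could be installed on a platform (hypothetical helper).

    Returns "wheel", "source" or "unsupported", following the notes above.
    """
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()

    is_x86_64 = machine in {"x86_64", "amd64"}
    is_powerpc = machine.startswith(("ppc", "powerpc"))

    # Windows and architectures such as Arm are listed as unsupported by HMMER.
    if system == "windows" or not (is_x86_64 or is_powerpc):
        return "unsupported"
    # Pre-built wheels are only mentioned for x86-64 Linux.
    if system == "linux" and is_x86_64:
        return "wheel"
    # Assumption: everything else that HMMER supports compiles from source.
    return "source"


print(install_route())  # result for the current machine
```

Calling install_route("Linux", "ppc64le") returns "source", which matches the remark that PowerPC builds are untested in CI but expected to work.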

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch, and obtain an iterable over TopHits that can be used for further sorting/querying in Python:

import pyhmmer

# Load the target sequences and digitize them with the guessed alphabet.
with pyhmmer.easel.SequenceFile("938293.PRJEB85.HG003687.faa") as file:
    alphabet = file.guess_alphabet()
    sequences = [seq.digitize(alphabet) for seq in file]

# Search every HMM of the library against the in-memory sequences.
with pyhmmer.plan7.HMMFile("Pfam.hmm") as hmms:
    all_hits = list(pyhmmer.hmmsearch(hmms, sequences, cpus=4))

Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable.
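
As a sketch of the "further sorting/querying" step, the helper below flattens an iterable of TopHits-like containers into a ranked list. It assumes each hit exposes name, score and evalue attributes (check the API reference for your pyhmmer version); rank_hits and the stand-in objects are hypothetical and only there so the snippet runs without any HMMER data.

```python
from types import SimpleNamespace


def rank_hits(all_hits, max_evalue=1e-5):
    """Return (name, score, evalue) for hits at or below max_evalue, best score first."""
    rows = []
    for top_hits in all_hits:       # one TopHits-like container per query HMM
        for hit in top_hits:        # assumed attributes: name, score, evalue
            if hit.evalue <= max_evalue:
                rows.append((hit.name, hit.score, hit.evalue))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Stand-in data (made up for illustration); with pyhmmer, pass the
# all_hits list built in the example above instead.
fake_hits = [
    [SimpleNamespace(name=b"seq1", score=120.5, evalue=1e-30),
     SimpleNamespace(name=b"seq2", score=12.0, evalue=0.5)],
    [SimpleNamespace(name=b"seq3", score=250.0, evalue=1e-70)],
]

for name, score, evalue in rank_hits(fake_hits):
    print(name.decode(), score, evalue)
```

Because the helper only relies on iteration and attribute access, it stays independent of how the hits were produced.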

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU running at 1.10 GHz with 6 physical / 12 logical cores, using a FASTA file containing 2,100 protein sequences extracted from the genome of Anaerococcus provencensis (938293.PRJEB85.HG003687.faa) and version 33.1 of the Pfam HMM library, which contains 18,259 domains. Commands were run 4 times on a warm SSD. Solid lines show the times for pressed HMMs, and dashed lines the times for HMMs in text format.

[Figure: Benchmarks]

Raw numbers can be found in the benches folder. They suggest that phmmer should be run with the number of logical cores, while hmmsearch should be run with the number of physical cores (or fewer). A possible explanation for this observation is that the HMMER platform-specific code requires too many SIMD registers per thread to benefit from simultaneous multi-threading.
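
The recommendation above can be written down as a small helper for choosing a cpus value. This is a minimal sketch: suggested_cpus is a hypothetical function, and the physical core count has to be supplied by the caller because the Python standard library only reports logical cores (os.cpu_count).

```python
import os


def suggested_cpus(program, physical_cores, logical_cores=None):
    """Suggest a thread count following the benchmark observation above."""
    if logical_cores is None:
        # os.cpu_count() counts logical cores and may return None.
        logical_cores = os.cpu_count() or 1
    if program == "phmmer":
        # phmmer appeared to benefit from simultaneous multi-threading.
        return max(1, logical_cores)
    if program == "hmmsearch":
        # hmmsearch appeared to do best with physical cores only.
        return max(1, min(physical_cores, logical_cores))
    raise ValueError(f"no recommendation for {program!r}")


# Core counts of the benchmark machine described above.
print(suggested_cpus("hmmsearch", physical_cores=6, logical_cores=12))  # 6
print(suggested_cpus("phmmer", physical_cores=6, logical_cores=12))     # 12
```

The returned value could then be passed as the cpus argument shown in the example section. These numbers come from a single machine, so they are best treated as a starting point.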

🔍 See Also

If, despite all the advantages listed earlier, you would rather use HMMER through its CLI, this package will not be of great help. You should then check the hmmer-py package developed by Danilo Horta at the EMBL-EBI.

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated with, sponsored, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
