
Machine-learning prediction of residues driving homotypic transmembrane interactions.


https://raw.githubusercontent.com/bojigu/thoipapy/develop/thoipapy/docs/THOIPA_banner.png

THOIPApy

The Transmembrane HOmodimer Interface Prediction Algorithm (THOIPA) is a machine learning method for the analysis of protein-protein interactions.

THOIPA predicts transmembrane homodimer interface residues from evolutionary sequence information.

THOIPA helps predict potential homotypic transmembrane interface residues, which can then be verified experimentally. THOIPA also aids in the energy-based modelling of transmembrane homodimers.


How does thoipapy work?

  • downloads protein homologues with BLAST

  • extracts residue properties (e.g. residue conservation and polarity)

  • trains a machine learning classifier

  • validates the prediction performance

  • creates heatmaps of residue properties and THOIPA prediction
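The residue-property step above can be illustrated with a minimal sketch. This is not THOIPA's own feature extraction: it assumes an entropy-based conservation score and uses the Kyte-Doolittle hydropathy scale as a stand-in for polarity. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
# Assumption: used here only as a stand-in polarity scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def column_conservation(column):
    """Return 1 - normalised Shannon entropy of one alignment column.

    Gaps ("-") are ignored. 1.0 means fully conserved.
    """
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(20)


def residue_features(alignment):
    """Compute per-residue features for the query (first) sequence.

    alignment: list of equal-length aligned sequences, query first.
    """
    query = alignment[0]
    features = []
    for i, aa in enumerate(query):
        column = [seq[i] for seq in alignment]
        features.append({
            "residue": aa,
            "conservation": column_conservation(column),
            "hydropathy": KYTE_DOOLITTLE.get(aa),
        })
    return features


# Toy alignment (made up for illustration): query plus two homologues.
toy_alignment = ["GLVV", "GLIV", "GAVV"]
for row in residue_features(toy_alignment):
    print(row)
```

A table of this kind, with one row per residue and one column per property, is the sort of input a residue-level classifier can be trained on.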

Installation

pip install thoipapy

THOIPA has only been tested on Linux, because it relies on external dependencies such as FreeContact, Phobius, CD-HIT and rate4site. For predictions only, a dockerised version is available that runs on Windows or macOS. Please see the THOIPA webserver for the latest information.
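Before running the full pipeline on Linux, it may help to check that the external tools are on your PATH. The sketch below is not part of thoipapy. The executable names are assumptions and may differ on your system.

```python
import shutil

# Assumption: typical executable names; adjust to match your installation.
ASSUMED_EXECUTABLES = {
    "FreeContact": "freecontact",
    "Phobius": "phobius.pl",
    "CD-HIT": "cd-hit",
    "rate4site": "rate4site",
}


def find_external_tools(executables=ASSUMED_EXECUTABLES):
    """Map each tool name to its full path on PATH, or None if not found."""
    return {tool: shutil.which(exe) for tool, exe in executables.items()}


if __name__ == "__main__":
    for tool, path in find_external_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```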

Dependencies

We recommend the Anaconda Python distribution, which contains all the required Python modules (numpy, scipy, pandas, biopython and matplotlib). THOIPApy is currently tested with Python 3.8.5. The requirements.txt file contains a snapshot of compatible dependencies.
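To confirm that the modules listed above are available in your environment, a quick check along these lines can be used. The helper name is hypothetical. Note that biopython is imported under the name Bio.

```python
import importlib.util

# Package name -> import name (biopython is imported as "Bio").
REQUIRED_MODULES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "biopython": "Bio",
    "matplotlib": "matplotlib",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the package names whose import name cannot be found."""
    return [
        package
        for package, import_name in modules.items()
        if importlib.util.find_spec(import_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing packages:", ", ".join(missing) if missing else "none")
```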

Development status

The code has been extensively updated and annotated for public release. However, it is released “as is”, with some known issues, limitations and legacy code.

Usage as a standalone predictor

from thoipapy.thoipa import get_md5_checksum, run_THOIPA_prediction
from thoipapy.utils import make_sure_path_exists

# protein name
protein_name = "ERBB3"
# transmembrane domain (TMD) sequence, and the full sequence that contains it
TMD_seq = "MALTVIAGLVVIFMMLGGTFL"
full_seq = "MVQNECRPCHENCTQGCKGPELQDCLGQTLVLIGKTHLTMALTVIAGLVVIFMMLGGTFLYWRGRRIQNKRAMRRYLERGESIEPLDPSEKANKVLA"
# output folder for the prediction files
out_dir = "/path/to/your/desired/output/folder"
make_sure_path_exists(out_dir)
# checksum of the TMD and full sequences, passed on to the prediction
md5 = get_md5_checksum(TMD_seq, full_seq)
run_THOIPA_prediction(protein_name, md5, TMD_seq, full_seq, out_dir)
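Before starting a prediction, it can be useful to sanity-check the two sequences. The helper below is a hypothetical sketch, not part of thoipapy. It assumes that the TMD should occur exactly once in the full sequence and that both sequences use the 20 standard amino acid letters.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def locate_tmd(tmd_seq, full_seq):
    """Return the (start, end) position of the TMD in the full sequence.

    Positions are 1-based and inclusive. Raises ValueError if either
    sequence contains unexpected characters, or if the TMD is missing
    or occurs more than once.
    """
    for name, seq in (("TMD_seq", tmd_seq), ("full_seq", full_seq)):
        unexpected = set(seq) - AMINO_ACIDS
        if unexpected:
            raise ValueError(f"{name} contains unexpected characters: {sorted(unexpected)}")
    start = full_seq.find(tmd_seq)
    if start == -1:
        raise ValueError("TMD_seq was not found in full_seq")
    if full_seq.find(tmd_seq, start + 1) != -1:
        raise ValueError("TMD_seq occurs more than once in full_seq")
    return start + 1, start + len(tmd_seq)


TMD_seq = "MALTVIAGLVVIFMMLGGTFL"
full_seq = "MVQNECRPCHENCTQGCKGPELQDCLGQTLVLIGKTHLTMALTVIAGLVVIFMMLGGTFLYWRGRRIQNKRAMRRYLERGESIEPLDPSEKANKVLA"
start, end = locate_tmd(TMD_seq, full_seq)
print(f"TMD spans residues {start} to {end} of the full sequence")
```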

Example Output

  • the output includes a csv showing the THOIPA prediction for each residue, as well as a heatmap figure as a summary

  • below is a heatmap showing the THOIPA prediction, and underlying conservation, relative polarity, and coevolution

https://raw.githubusercontent.com/bojigu/thoipapy/master/thoipapy/docs/standalone_heatmap_example.png
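The per-residue csv can be post-processed with standard tools, for example to list the highest-scoring residues for experimental follow-up. The sketch below uses only the Python standard library. The column names and the rows are made up for illustration, so check the header of your own output file and adjust score_column accordingly.

```python
import csv
import io


def top_scoring_rows(handle, score_column, n=3):
    """Return the n rows with the highest value in score_column."""
    rows = list(csv.DictReader(handle))
    rows.sort(key=lambda row: float(row[score_column]), reverse=True)
    return rows[:n]


# Made-up data with hypothetical column names, for illustration only.
EXAMPLE_CSV = """residue_num,residue_name,score
1,M,0.10
2,A,0.45
3,L,0.30
4,T,0.80
"""

top = top_scoring_rows(io.StringIO(EXAMPLE_CSV), "score", n=2)
for row in top:
    print(row["residue_num"], row["residue_name"], row["score"])
```

For a real output file, replace the io.StringIO object with an open file handle, e.g. open(csv_path, newline="").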

Create your own machine learning predictor

  • THOIPA can be retrained on any dataset of your choice

  • the original set of training sequences and other resources are available via the Open Science Framework

  • the THOIPA feature extraction, feature selection, and training pipeline is fully automated

  • contact us for an introduction to the THOIPA software pipeline and settings

python path/to/thoipapy/run.py -s home/user/thoipa/THOIPA_settings.xlsx
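If you prefer to launch the pipeline from a Python script, the command above can be wrapped with subprocess. This is a minimal sketch with hypothetical helper names. It assumes only the -s flag shown above and checks that the settings file exists before starting.

```python
import subprocess
import sys
from pathlib import Path


def build_run_command(run_py, settings_xlsx):
    """Build the command list: <python> run.py -s <settings file>."""
    return [sys.executable, str(Path(run_py)), "-s", str(Path(settings_xlsx))]


def run_pipeline(run_py, settings_xlsx):
    """Run the pipeline, raising an error if it exits with a non-zero status."""
    if not Path(settings_xlsx).is_file():
        raise FileNotFoundError(f"settings file not found: {settings_xlsx}")
    return subprocess.run(build_run_command(run_py, settings_xlsx), check=True)


if __name__ == "__main__":
    # Show the command without running it; the paths are placeholders.
    print(build_run_command("path/to/thoipapy/run.py", "THOIPA_settings.xlsx"))
```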

License

THOIPApy is free software distributed under the permissive MIT License.

Contribute

  • Contributors are welcome.

  • For feedback or troubleshooting, please email us directly and open an issue on GitHub.

Contact

https://raw.githubusercontent.com/bojigu/thoipapy/develop/thoipapy/docs/signac_seine_bei_samois_mt.png https://raw.githubusercontent.com/bojigu/thoipapy/develop/thoipapy/docs/signac_notredame_bz.png

Citation

Yao Xiao, Bo Zeng, Nicola Berner, Dmitrij Frishman, Dieter Langosch, and Mark George Teese (2020) Experimental determination and data-driven prediction of homotypic transmembrane domain interfaces, Computational and Structural Biotechnology Journal
