CathPy - Python Bioinformatics Toolkit for CATH (Protein Classification).

cathpy

cathpy is a Bioinformatics toolkit written in Python. It is developed and maintained by the Orengo Group at UCL and is used for maintaining the CATH protein structure database (and associated research).

Getting Started

The easiest way to use this code is by installing the latest version into a virtual environment via pip:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install cathpy
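
As a quick sanity check that the install landed in the active environment, the installed version can be looked up from Python. This is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is made up for this example and is not part of cathpy.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "cathpy" is the distribution name used in the pip command above.
    version = installed_version("cathpy")
    if version is None:
        print("cathpy is not installed in this environment")
    else:
        print(f"cathpy {version} is installed")
```

If this reports that cathpy is not installed, the most likely cause is that the virtual environment was not activated before running pip.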

If everything is installed and working properly then the following should work:

$ cath-align-summary -d tests/data/funfams/
file aln_len seq_count dops gap_per
tests/data/funfams/1.10.8.10-ff-14534.reduced.sto                          69     51  61.53  12.53
tests/data/funfams/1.10.8.10-ff-15516.reduced.sto                          66    429 100.00  13.04
tests/data/funfams/1.10.8.10-ff-5069.reduced.sto                           59     14   7.81   3.15
tests/data/funfams/1.10.8.10-ff-15593.reduced.sto                          63    203  95.88  17.70
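
If you want to work with these numbers in a script, the summary text can be parsed with a few lines of Python. The sketch below is not part of cathpy: the function name `parse_align_summary` is hypothetical, and it assumes the output is whitespace-separated with the five columns shown above and that file paths contain no spaces.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int, float]]

# Two rows copied from the example output above.
SAMPLE = """\
file aln_len seq_count dops gap_per
tests/data/funfams/1.10.8.10-ff-14534.reduced.sto   69   51   61.53  12.53
tests/data/funfams/1.10.8.10-ff-15516.reduced.sto   66  429  100.00  13.04
"""


def parse_align_summary(text: str) -> List[Row]:
    """Parse the summary text into one dict per alignment file."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows: List[Row] = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            raise ValueError(f"unexpected number of columns: {line!r}")
        raw = dict(zip(header, fields))
        rows.append({
            "file": raw["file"],
            "aln_len": int(raw["aln_len"]),
            "seq_count": int(raw["seq_count"]),
            "dops": float(raw["dops"]),
            "gap_per": float(raw["gap_per"]),
        })
    return rows


if __name__ == "__main__":
    for row in parse_align_summary(SAMPLE):
        print(row["file"], row["seq_count"], row["dops"])
```

To use it on real output, capture the command's standard output (for example by redirecting it to a file) and pass the text to the function.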

Now go and have a look at the documentation.

Contributing

There are many ways to contribute, all of which are most welcome.

  • If something is not clear, then you have identified a gap in the documentation. Please let us know by raising a new issue.
  • If it looks like you should be able to do something that you can't, then you have identified either a new feature request or a documentation gap. Please let us know by raising a new issue.
  • If you have noticed some unexpected behaviour, you may have found a bug. Please let us know by raising a new issue.

When you do raise an issue, it is extremely helpful if you first check that a similar issue has not already been registered. It would also be great if you can be as clear, concise and specific as possible. If you are reporting a potential bug, please try to provide steps that will allow us to reproduce the unexpected behaviour.

If you accompany your issue with a Pull Request that actually addresses the documentation gap, feature request or bug, then you may well be eligible for doughnuts.

Development

If you are developing, then this is the general recommended flow:

Get access to the latest version of the code and create a new branch (with a descriptive summary of your new feature/bugfix):

$ git clone git@github.com:UCL/cathpy.git
$ cd cathpy
$ git checkout -b my-awesome-new-feature

Install the code (as an editable package) into a virtual environment:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -e .

Write your tests, make your changes then make sure your tests (and all the other tests) still pass:

$ vim tests/my_new_feature_test.py
$ vim cathpy/my_new_feature.py
$ pytest
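
As an illustration of what such a change might look like, here is a minimal sketch of a small function together with its tests. Everything in it is hypothetical: `gap_percentage` is not an existing cathpy function and is not claimed to match how cathpy computes its own statistics. In a real change the function would live in `cathpy/my_new_feature.py` and the tests in `tests/my_new_feature_test.py`; they are shown in one block here so the example runs on its own. pytest collects functions whose names start with `test_` and treats a failed `assert` as a test failure.

```python
# Hypothetical example only: the function would normally go in
# cathpy/my_new_feature.py and the tests in tests/my_new_feature_test.py.

def gap_percentage(sequences):
    """Return the percentage of gap characters ('-' or '.') in aligned sequences."""
    total = sum(len(seq) for seq in sequences)
    if total == 0:
        raise ValueError("no alignment positions to count")
    gaps = sum(seq.count("-") + seq.count(".") for seq in sequences)
    return 100.0 * gaps / total


def test_gap_percentage_counts_both_gap_characters():
    # 2 gap characters out of 8 positions
    assert gap_percentage(["AC-T", "A.GT"]) == 25.0


def test_gap_percentage_without_gaps_is_zero():
    assert gap_percentage(["ACGT", "ACGT"]) == 0.0


def test_gap_percentage_rejects_empty_input():
    try:
        gap_percentage([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

Writing the tests first and watching them fail before implementing the function is a useful check that the tests actually exercise the new code.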

Then push your new branch back to GitHub (setting its upstream on the first push) and raise a pull request through the web interface:

$ git push -u origin my-awesome-new-feature

FAQ

What is cathpy?

cathpy is a Python package that contains bioinformatics tools and libraries used in CATH (the protein structure classification resource at UCL).

Hmmm.. that sounds like Yet Another Python Bioinformatics Toolkit?

Well it is... sort of.

Should I be using it?

If you are looking for a general Bioinformatics toolkit, you should look at BioPython first.

The cathpy project does contain some generic functionality that may overlap with BioPython, but we are definitely not trying to rewrite that library. It has been published mainly for internal use (within CATH), and has been released as open source in case others find the tools helpful.

External software

This code base contains external tools that are not written and maintained by the authors of this project. If you use the results of these tools, please reference the relevant papers.

GroupSim

Characterization and Prediction of Residues Determining Protein Functional Specificity. Capra JA and Singh M (2008).
Bioinformatics, 24(13): 1473-1480, 2008.

Scorecons

Scoring residue conservation. Valdar WSJ (2002).
Proteins: Structure, Function, and Genetics. 48(2): 227-241, 2002.

References

The most recent paper describing the CATH protein structure database:

CATH: expanding the horizons of structure-based functional annotations for genome sequences. Sillitoe I, et al (2018)
Nucleic Acids Research, Volume 47, Issue D1, 08 January 2019, Pages D280–D284, https://doi.org/10.1093/nar/gky1097
