A library for searching for PDB structures using the official APIs.

pdbsearch is a Python library for searching for PDB structures using the RCSB web services.

Example

>>> import pdbsearch
>>> codes = pdbsearch.search(limit=5, ligand_name="CU")
>>> codes
['3HW7', '2WKO', '2WOF', '2WOH', '2WO0']

Installing

pip

pdbsearch can be installed using pip (you may need to use pip3):

$ pip install pdbsearch

If you get permission errors, try using sudo:

$ sudo pip install pdbsearch

Development

The repository for pdbsearch, containing the most recent iteration, is hosted on GitHub. To clone it directly, use:

$ git clone https://github.com/samirelanduk/pdbsearch.git

Requirements

pdbsearch requires requests.

Testing

To test a local version of pdbsearch, cd to the pdbsearch directory and run:

$ python -m unittest discover tests

You can opt to only run unit tests or integration tests:

$ python -m unittest discover tests.unit
$ python -m unittest discover tests.integration

Overview

Returning all PDB Codes

You can get all PDB codes without any particular search expression like so:

>>> import pdbsearch
>>> codes = pdbsearch.search(limit=None)
>>> len(codes)
174994

This will take a few seconds, and requires downloading a rather large JSON object over the network. Generally it is better to paginate the results:

>>> first_ten_codes = pdbsearch.search(limit=10)
>>> second_ten_codes = pdbsearch.search(start=10, limit=10)
>>> third_ten_codes = pdbsearch.search(start=20, limit=10)
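
The pagination pattern above can be wrapped in a small generator. This is a minimal sketch, not part of pdbsearch: iter_codes and fake_search are hypothetical names, and the sketch assumes only that the search callable accepts start and limit and returns a list of codes, as in the examples above. It also assumes that a page shorter than the requested size marks the end of the results.

```python
from typing import Callable, Iterator, List


def iter_codes(search: Callable[..., List[str]], page_size: int = 100) -> Iterator[str]:
    """Yield codes page by page from a search(start=..., limit=...) callable."""
    start = 0
    while True:
        page = search(start=start, limit=page_size)
        yield from page
        # Assumption: a short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        start += page_size


# Hypothetical stand-in for pdbsearch.search so the sketch runs offline.
def fake_search(start=0, limit=10):
    codes = [f"{n}ABC" for n in range(1, 10)] * 3  # 27 made-up codes
    return codes[start:start + limit]


all_codes = list(iter_codes(fake_search, page_size=10))
print(len(all_codes))
```

With the real library installed, passing pdbsearch.search in place of fake_search should work the same way, provided the assumptions above hold.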

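You can sort the results by any of the terms at https://search.rcsb.org/structure-search-attributes.html. Results are returned in descending order by default; prefix the term with - to sort in ascending order: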

>>> most_recent_codes = pdbsearch.search(sort="rcsb_accession_info.deposit_date")
>>> earliest_codes = pdbsearch.search(sort="-rcsb_accession_info.deposit_date")

As these are somewhat cumbersome, some of them have a shorthand:

>>> pdbsearch.search(limit=5, sort="code")
['9XIM', '9XIA', '9WGA', '9RUB', '9RSA']
>>> pdbsearch.search(limit=5, sort="-resolution")
['3NIR', '5D8V', '1EJG', '3P4J', '5NW3']

You can sort by multiple criteria:

>>> pdbsearch.search(limit=5, sort=["-atoms", "released"])
['1ANP', '6UOU', '6UOW', '1Q7O', '6QTF']
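
To make the sort convention concrete, here is a rough sketch of how a sort argument could be turned into the sort_by/direction entries used by the RCSB search API. The function name sort_options is hypothetical, and the way pdbsearch does this internally is not shown in this document. The direction rule (no prefix means descending, a leading - means ascending) is inferred from the example outputs above, where sort="code" returns the highest codes first.

```python
from typing import Dict, List, Union


def sort_options(sort: Union[str, List[str]]) -> List[Dict[str, str]]:
    """Normalise a sort term, or a list of them, into sort_by/direction dicts."""
    terms = [sort] if isinstance(sort, str) else list(sort)
    options = []
    for term in terms:
        # Assumption inferred from the examples: "-" prefix = ascending, otherwise descending.
        ascending = term.startswith("-")
        options.append({
            "sort_by": term[1:] if ascending else term,
            "direction": "asc" if ascending else "desc",
        })
    return options


single = sort_options("-rcsb_accession_info.deposit_date")
multiple = sort_options(["-atoms", "released"])
print(single)
print(multiple)
```

Shorthands such as atoms and released are passed through unchanged here; mapping them to full RCSB attribute names is left out because that mapping is not documented above.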

Search Criteria

You can search by passing keywords to the search function:

>>> pdbsearch.search(limit=5, ligand_name="ZN")
['3HW7', '3I7I', '3I7G', '2WFX', '2WGT']

You can modify the operator used with double underscores:

>>> pdbsearch.search(limit=5, ligand_name__in=["ZN", "CU"])
['3HW7', '3I7I', '3I7G', '2WFX', '2WGT']
>>> pdbsearch.search(limit=5, resolution__lt=2)
['3HW3', '3I83', '3HVS', '3HW4', '3HW5']
>>> pdbsearch.search(limit=5, atoms__within=[200, 300])
['2WH9', '2WPY', '395D', '396D', '2X8Q']

The keywords used so far are shorthands, but you can search by any of the terms in the list linked above by replacing each dot in the term with a double underscore:

>>> pdbsearch.search(limit=5, citation__rcsb_authors="Sula, A.")
['4CAH', '4CAI', '4X8A', '4X88', '4X89']
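
The double-underscore convention can be illustrated with a small parser. This is only a sketch of the idea, assuming that a trailing segment is treated as an operator when it matches a known operator name and that the remaining segments are joined with dots. split_keyword and KNOWN_OPERATORS are hypothetical names, and the operator set contains only the three operators that appear in the examples above.

```python
from typing import Optional, Tuple

# Only the operators shown in the examples above; the real library may accept more.
KNOWN_OPERATORS = {"in", "lt", "within"}


def split_keyword(keyword: str) -> Tuple[str, Optional[str]]:
    """Split a search keyword into (attribute path, operator or None)."""
    parts = keyword.split("__")
    operator = None
    # Treat the last segment as an operator only if it is a recognised one.
    if len(parts) > 1 and parts[-1] in KNOWN_OPERATORS:
        operator = parts.pop()
    # Remaining double underscores stand in for dots in the attribute name.
    return ".".join(parts), operator


print(split_keyword("ligand_name__in"))
print(split_keyword("citation__rcsb_authors"))
print(split_keyword("atoms__within"))
```

Single underscores, as in ligand_name, are left untouched because only the double underscore acts as a separator.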

If you use more than one term, they will be combined with AND operators:

>>> pdbsearch.search(limit=5, ligand_name="ZN", atoms__within=[200, 300])
['3WUP', '3ZNF', '2YTA', '2YTB', '2YSV']
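
For context, the RCSB search API expresses an AND of several criteria as a group node containing terminal nodes. The sketch below builds such a query as a plain dictionary. The helper names terminal and and_group are hypothetical, and whether pdbsearch produces exactly this structure is an assumption; the attribute names and operators are taken from the RCSB attribute list rather than from the library.

```python
import json
from typing import Any, Dict, List


def terminal(attribute: str, operator: str, value: Any) -> Dict[str, Any]:
    """Build one text-service criterion (a terminal node)."""
    return {
        "type": "terminal",
        "service": "text",
        "parameters": {"attribute": attribute, "operator": operator, "value": value},
    }


def and_group(nodes: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine criteria so that every one of them must match."""
    return {"type": "group", "logical_operator": "and", "nodes": nodes}


query = and_group([
    terminal("citation.rcsb_authors", "exact_match", "Sula, A."),
    terminal("rcsb_entry_info.resolution_combined", "less", 2),
])
print(json.dumps(query, indent=2))
```

Keeping each criterion as its own node means adding another keyword only appends one more entry to the nodes list.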

Changelog

Release 0.4.0

24 July 2022

  • Updated library for v2 of the RCSB search API.

Release 0.3.0

29 May 2021

  • Added search criteria.

  • Added AND chaining for search criteria.

Release 0.2.0

25 April 2021

  • Added ability to sort results.

  • Created shorthand system for common sort criteria.

Release 0.1.0

2 March 2021

  • Started library.

  • Added ability to fetch all PDB codes.

  • Basic pagination.
