Python client for CellBase

PyCellBase

  • This Python package makes use of the exhaustive RESTful Web service API that has been implemented for the CellBase database.

  • It lets you query and retrieve a wealth of biological information from a single database, saving a lot of time.

  • As all information is integrated, queries about different biological topics can be made easily and their results can be linked together.

  • Currently Homo sapiens, Mus musculus and others, for a total of 48 species, are available, and many more will be included soon.

  • More information about this package is available in the Python client section of the CellBase Wiki.

Installation

Cloning

PyCellBase can be cloned to your local machine by running the following in your terminal:

$ git clone https://github.com/opencb/cellbase.git

Once you have downloaded the project you can install the library:

$ cd cellbase/clients/python
$ python setup.py install

Usage

Getting started

The first step is to import the module and initialize the CellBaseClient:

>>> from pycellbase.cbclient import CellBaseClient
>>> cbc = CellBaseClient()

The second step is to create the specific client for the data we want to query (in this example we want to obtain information for a gene):

>>> gc = cbc.get_gene_client()

Now you can start querying the CellBase RESTful service by providing a query ID:

>>> tfbs_responses = gc.get_tfbs('BRCA1')  # Obtaining TFBS for this gene

Responses are returned as JSON-formatted data, so fields can be accessed by key:

>>> tfbs_responses = gc.get_tfbs('BRCA1')
>>> tfbs_responses[0]['result'][0]['tfName']
'E2F4'

>>> transcript_responses = gc.get_transcript('BRCA1')
>>> 'Number of transcripts: %d' % (len(transcript_responses[0]['result']))
'Number of transcripts: 27'

>>> for tfbs_response in gc.get_tfbs('BRCA1,BRCA2,LDLR'):
...     print('Number of TFBS for "%s": %d' % (tfbs_response['id'], len(tfbs_response['result'])))
Number of TFBS for "BRCA1": 175
Number of TFBS for "BRCA2": 43
Number of TFBS for "LDLR": 141
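The response layout used above (a list with one entry per queried ID, each holding an 'id' and a 'result' list) can be summarized with a small helper. The following is a minimal sketch that runs on hand-made sample data rather than a live query; `count_results` and the sample values are hypothetical and not part of PyCellBase.

```python
def count_results(responses):
    """Map each queried ID to the number of entries in its 'result' list."""
    return {response['id']: len(response['result']) for response in responses}


# Hand-made stand-in for what a multi-ID query returns (not real CellBase data)
sample_responses = [
    {'id': 'BRCA1', 'result': [{'tfName': 'E2F4'}, {'tfName': 'CTCF'}]},
    {'id': 'BRCA2', 'result': [{'tfName': 'E2F4'}]},
]

counts = count_results(sample_responses)
for gene_id, n_results in counts.items():
    print('Number of TFBS for "%s": %d' % (gene_id, n_results))
```

With a real client, the same helper could presumably be applied directly to the list returned by a call such as gc.get_tfbs('BRCA1,BRCA2,LDLR').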

Data can be accessed specifying comma-separated IDs or a list of IDs:

>>> tfbs_responses = gc.get_tfbs('BRCA1')
>>> len(tfbs_responses)
1

>>> tfbs_responses = gc.get_tfbs('BRCA1,BRCA2')
>>> len(tfbs_responses)
2

>>> tfbs_responses = gc.get_tfbs(['BRCA1', 'BRCA2'])
>>> len(tfbs_responses)
2
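To illustrate why the comma-separated string and the list forms are interchangeable, here is a minimal sketch of how such input could be normalized before a request is built. `normalize_ids` is a hypothetical helper written for this example; it is not the library's actual implementation.

```python
def normalize_ids(query_ids):
    """Return the IDs as one comma-separated string.

    Accepts either a string that is already comma-separated
    or any iterable of individual IDs.
    """
    if isinstance(query_ids, str):
        return query_ids
    return ','.join(str(query_id) for query_id in query_ids)


as_string = normalize_ids('BRCA1,BRCA2')
as_list = normalize_ids(['BRCA1', 'BRCA2'])
print(as_string == as_list)
```

Under this assumption both calls end up describing the same two IDs, which matches the two responses returned in the examples above.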

If a resource is available in the CellBase web services but this Python package has no method for it, the CellBaseClient can be used to build the URL of interest and query the RESTful service:

>>> tfbs_responses = cbc.get(category='feature', subcategory='gene', query_id='BRCA1', resource='tfbs')
>>> tfbs_responses[0]['result'][0]['tfName']
'E2F4'
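The generic call above takes the pieces of a REST path as keyword arguments. As a rough sketch of how such pieces might be joined into a URL, consider the helper below. `build_url` is hypothetical, and the 'webservices/rest' path segment is an assumption about the server layout, not something stated in this document.

```python
def build_url(host, version, species, category, subcategory, query_id, resource):
    """Join the components of a query into one URL (illustrative only)."""
    parts = [
        host.rstrip('/'),
        'webservices', 'rest',  # assumed path prefix, may differ on a real server
        version,
        species,
        category,
        subcategory,
        query_id,
        resource,
    ]
    return '/'.join(parts)


url = build_url(
    host='http://bioinfo.hpc.cam.ac.uk:80/cellbase',
    version='v4',
    species='hsapiens',
    category='feature',
    subcategory='gene',
    query_id='BRCA1',
    resource='tfbs',
)
print(url)
```

The host, version and species values are taken from the default configuration shown in the Configuration section; the remaining arguments mirror the cbc.get call.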

Optional filters and extra options can be added as key-value parameters (value can be a comma-separated string or a list):

>>> tfbs_responses = gc.get_tfbs('BRCA1')
>>> len(tfbs_responses[0]['result'])
175

>>> tfbs_responses = gc.get_tfbs('BRCA1', include='name,id')  # Return only name and id
>>> len(tfbs_responses[0]['result'])
175

>>> tfbs_responses = gc.get_tfbs('BRCA1', include=['name', 'id'])  # Return only name and id
>>> len(tfbs_responses[0]['result'])
175

>>> tfbs_responses = gc.get_tfbs('BRCA1', **{'include': 'name,id'})  # Return only name and id
>>> len(tfbs_responses[0]['result'])
175

>>> tfbs_responses = gc.get_tfbs('BRCA1', limit=100)  # Limit to 100 results
>>> len(tfbs_responses[0]['result'])
100

>>> tfbs_responses = gc.get_tfbs('BRCA1', skip=100)  # Skip first 100 results
>>> len(tfbs_responses[0]['result'])
75
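The limit and skip options can be combined to walk through a large result set page by page. The sketch below shows the idea against an in-memory stand-in, so it runs without a network connection; `fetch_all` and `fake_fetch_page` are hypothetical names introduced for this example.

```python
def fetch_all(fetch_page, page_size=100):
    """Collect every result by requesting consecutive pages with skip/limit."""
    results = []
    skip = 0
    while True:
        page = fetch_page(skip=skip, limit=page_size)
        results.extend(page)
        # A page shorter than page_size means there is nothing left to fetch
        if len(page) < page_size:
            break
        skip += page_size
    return results


# Stand-in for a remote resource holding 175 entries (the BRCA1 count above)
FAKE_RESULTS = list(range(175))


def fake_fetch_page(skip, limit):
    """Return the slice a server would send back for these skip/limit values."""
    return FAKE_RESULTS[skip:skip + limit]


all_results = fetch_all(fake_fetch_page)
print(len(all_results))
```

With a real client, fetch_page could presumably wrap a call such as gc.get_tfbs('BRCA1', skip=skip, limit=limit) and return the 'result' list of the first response.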

What can I ask for?

The best way to find out which data can be retrieved by each client is to check either the RESTful web services section of the CellBase Wiki or the CellBase web services themselves.

Configuration

Configuration stores the REST services host, API version and species.

Getting the default configuration:

>>> from pycellbase.cbconfig import ConfigClient
>>> ConfigClient().get_default_configuration()
{'version': 'v4',
 'species': 'hsapiens',
 'rest': {'hosts': ['http://bioinfo.hpc.cam.ac.uk:80/cellbase']}}

Showing the configuration parameters being used at the moment:

>>> cbc.show_configuration()
{'host': 'bioinfo.hpc.cam.ac.uk:80/cellbase',
 'version': 'v4',
 'species': 'hsapiens'}

A custom configuration can be passed to CellBaseClient using a ConfigClient object. JSON and YAML files are supported:

>>> from pycellbase.cbconfig import ConfigClient
>>> from pycellbase.cbclient import CellBaseClient

>>> cc = ConfigClient('config.json')
>>> cbc = CellBaseClient(cc)
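The contents of such a configuration file are not shown here. As a sketch, and assuming the file mirrors the keys of the default configuration ('version', 'species' and 'rest' with a 'hosts' list), a config.json could be written and checked with the standard library like this:

```python
import json
import os
import tempfile

# Assumed layout: the same keys as the default configuration shown earlier
config = {
    'version': 'v4',
    'species': 'hsapiens',
    'rest': {'hosts': ['http://bioinfo.hpc.cam.ac.uk:80/cellbase']},
}

# Write the file into a temporary directory so the example leaves no clutter
config_path = os.path.join(tempfile.mkdtemp(), 'config.json')
with open(config_path, 'w') as handle:
    json.dump(config, handle, indent=2)

# Read it back to confirm the file holds valid JSON with the expected keys
with open(config_path) as handle:
    loaded = json.load(handle)

print(sorted(loaded))
```

The resulting path could then be handed to ConfigClient in place of 'config.json'.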

A custom configuration can also be passed as a dictionary:

>>> from pycellbase.cbconfig import ConfigClient
>>> from pycellbase.cbclient import CellBaseClient

>>> custom_config = {'rest': {'hosts': ['bioinfo.hpc.cam.ac.uk:80/cellbase']}, 'version': 'v4', 'species': 'hsapiens'}
>>> cc = ConfigClient(custom_config)
>>> cbc = CellBaseClient(cc)

If you want to change the configuration on the fly you can directly modify the ConfigClient object:

>>> cc = ConfigClient()
>>> cbc = CellBaseClient(cc)

>>> cbc.show_configuration()['version']
'v4'
>>> cc.version = 'v3'
>>> cbc.show_configuration()['version']
'v3'
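One way this behaviour can arise is that the client keeps a reference to the configuration object instead of copying its values. The following sketch demonstrates that pattern with two hypothetical stand-in classes, `SketchConfig` and `SketchClient`; it does not reproduce the actual PyCellBase classes.

```python
class SketchConfig:
    """Hypothetical stand-in for a configuration object."""

    def __init__(self, version='v4'):
        self.version = version


class SketchClient:
    """Hypothetical stand-in for a client that reads its config lazily."""

    def __init__(self, config):
        # Keep a reference to the same object, not a copy of its values
        self._config = config

    def show_configuration(self):
        # Values are looked up at call time, so later changes are visible
        return {'version': self._config.version}


config = SketchConfig()
client = SketchClient(config)

before = client.show_configuration()['version']
config.version = 'v3'
after = client.show_configuration()['version']
print(before, after)
```

Because the lookup happens on every call, changing the attribute on the configuration object is enough; the client does not need to be recreated.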
