A REST client for OpenCGA REST web services

PyOpenCGA

This Python client package makes use of the comprehensive RESTful web services API implemented for the OpenCGA platform. OpenCGA is an open-source project that implements a high-performance, scalable and secure platform for genomic data analysis and visualisation.

OpenCGA is built to scale to petabytes of data and is designed and implemented to work with a few million genomes. It is built on top of three main components: Catalog, Variant and Alignment Storage, and Analysis.

More information about this project is available in the OpenCGA Docs.

Installation

Cloning

PyOpenCGA can be cloned to your local machine by running the following command in your terminal:

$ git clone https://github.com/opencb/opencga.git

Once you have downloaded the project, you can install the library. We recommend installing it inside a virtual environment:

$ cd opencga/opencga-client/src/main/python/pyOpenCGA
$ python setup.py install

Pip install

Run the following command in the shell:

$ pip install pyopencga

Usage

Import pyOpenCGA package

The first step is to import the ClientConfiguration and OpenCGAClient from pyOpenCGA:

>>> from pyopencga.opencga_config import ClientConfiguration
>>> from pyopencga.opencga_client import OpenCGAClient

Setting up server host configuration

The second step is to create a ClientConfiguration instance by passing either the path to a client-configuration.yml file or a configuration dictionary containing the host to point to:

>>> # Option 1: load the configuration from a YAML file
>>> config = ClientConfiguration('/opt/opencga/conf/client-configuration.yml')
>>> # Option 2: pass a configuration dictionary
>>> config = ClientConfiguration({
...     "rest": {
...         "host": "http://bioinfo.hpc.cam.ac.uk/opencga-demo"
...     }
... })

Log in to OpenCGA host server

With this configuration you can initialize the OpenCGAClient, and log in:

>>> oc = OpenCGAClient(config)
>>> oc.login('user')

For scripting or Jupyter Notebooks, it is preferable to load user credentials from an external JSON file.
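As an illustration of keeping credentials out of the script itself, the following is a minimal sketch that reads a user name and password from a JSON file. The helper name `load_credentials`, the file name `credentials.json` and the keys `user` and `password` are assumptions made for this example; pyOpenCGA does not prescribe any particular file layout.

```python
import json
import os
import tempfile


def load_credentials(path):
    """Read a JSON file and return a (user, password) tuple.

    The key names 'user' and 'password' are an assumption of this sketch.
    """
    with open(path) as handle:
        data = json.load(handle)
    return data["user"], data["password"]


# Demonstration with a throwaway file; in practice the file would live
# outside the repository and be readable only by its owner.
with tempfile.TemporaryDirectory() as tmp_dir:
    credentials_path = os.path.join(tmp_dir, "credentials.json")
    with open(credentials_path, "w") as handle:
        json.dump({"user": "demo_user", "password": "demo_password"}, handle)
    user, password = load_credentials(credentials_path)

print(user)
```

The loaded values could then be handed to the login call shown above instead of being typed into the script or notebook.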

Once you are logged in, the session token must be passed to any new client so that it can access the host server:

>>> token = oc.token
>>> print(token)
eyJhbGciOi...

>>> oc = OpenCGAClient(configuration=config, token=token)

Examples

The next step is to get an instance of the clients we may want to use:

>>> projects = oc.projects # Project client
>>> studies = oc.studies   # Study client
>>> samples = oc.samples # Sample client
>>> cohorts = oc.cohorts # Cohort client

Now you can start querying the OpenCGA RESTful service with pyOpenCGA:

>>> for project in projects.search(owner='user').get_results():
...    print(project['id'])
project1
project2
[...]

There are two different ways to access the query response data:

>>> foo_client.method().get_results() # Iterates over all the results of all the QueryResults
>>> foo_client.method().get_responses() # Iterates over all the responses
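To make the difference between the two accessors concrete, here is a small sketch that works on a plain dictionary rather than on a live server. It assumes a REST response shaped as a list of responses, each holding its own list of results; the functions `iter_responses` and `iter_results` and the `mock_response` data are hypothetical stand-ins written for this example, not pyOpenCGA code.

```python
def iter_responses(rest_response):
    """Yield each response block (one per query) in turn."""
    for response in rest_response.get("responses", []):
        yield response


def iter_results(rest_response):
    """Yield every result, flattening across all response blocks."""
    for response in iter_responses(rest_response):
        for result in response.get("results", []):
            yield result


# Hypothetical payload with two response blocks.
mock_response = {
    "responses": [
        {"results": [{"id": "project1"}, {"id": "project2"}]},
        {"results": [{"id": "project3"}]},
    ]
}

result_ids = [result["id"] for result in iter_results(mock_response)]
response_sizes = [len(response["results"]) for response in iter_responses(mock_response)]

print(result_ids)
print(response_sizes)
```

Iterating over results is usually what you want when only the records matter; iterating over responses keeps the per-query grouping.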

Data can be accessed specifying comma-separated IDs or a list of IDs:

>>> samples = 'NA12877,NA12878,NA12879'
>>> samples_list = ['NA12877','NA12878','NA12879']
>>> sc = oc.samples

>>> for result in sc.info(query_id=samples, study='user@project1:study1').get_results():
...     print(result['id'], result['attributes']['OPENCGA_INDIVIDUAL']['disorders'])
NA12877 [{'id': 'OMIM6500', 'name': "Chron's Disease"}]
NA12878 []
NA12879 [{'id': 'OMIM6500', 'name': "Chron's Disease"}]

>>> for result in sc.info(query_id=samples_list, study='user@project1:study1').get_results():
...     print(result['id'], result['attributes']['OPENCGA_INDIVIDUAL']['disorders'])
NA12877 [{'id': 'OMIM6500', 'name': "Chron's Disease"}]
NA12878 []
NA12879 [{'id': 'OMIM6500', 'name': "Chron's Disease"}]

Optional filters and extra options can be added as key-value parameters (where the values can be a comma-separated string or a list).
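The equivalence between a list and a comma-separated string can be pictured with the sketch below. It is only an illustration of the idea, assuming that list values end up joined with commas before being sent; `to_query_value` and `build_query` are hypothetical helpers and do not describe pyOpenCGA's internals.

```python
def to_query_value(value):
    """Join lists and tuples with commas; convert anything else to a string."""
    if isinstance(value, (list, tuple)):
        return ",".join(str(item) for item in value)
    return str(value)


def build_query(**params):
    """Build a flat dictionary of query parameters, skipping unset (None) values."""
    return {
        key: to_query_value(value)
        for key, value in params.items()
        if value is not None
    }


# Both spellings of the same filter produce identical parameters.
as_list = build_query(id=["NA12877", "NA12878"], limit=5, include=None)
as_string = build_query(id="NA12877,NA12878", limit=5)

print(as_list)
print(as_list == as_string)
```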

What can I ask for?

The best way to find out which data can be retrieved by each client is to check the OpenCGA web services Swagger documentation.


