A REST client for OpenCGA REST web services
PyOpenCGA
This Python client package uses the comprehensive RESTful web services API implemented for the OpenCGA platform. OpenCGA is an open-source project that implements a high-performance, scalable and secure platform for genomic data analysis and visualisation.
OpenCGA targets Big Data analysis and visualisation in current genomics: it is designed to scale to petabytes of data and to work with a few million genomes. It is built on top of three main components: Catalog, Variant and Alignment Storage, and Analysis.
More information about this project is available in the OpenCGA Docs.
Installation
Cloning
PyOpenCGA can be cloned to your local machine by running the following command in your terminal:
$ git clone https://github.com/opencb/opencga.git
Once you have downloaded the project you can install the library from the develop branch. We recommend installing it inside a virtual environment:
$ cd opencga
$ git checkout develop
$ cd opencga-client/src/main/python/pyOpenCGA
$ python setup.py install
Pip install
Run the following command in the shell:
$ pip install pyopencga
Usage
Import pyOpenCGA package
The first step is to import the ConfigClient and OpenCGAClient from pyOpenCGA:
>>> from pyopencga.opencga_config import ConfigClient
>>> from pyopencga.opencga_client import OpenCGAClient
Setting up server host configuration
The second step is to set up the OpenCGA host server configuration. You can get a basic configuration dictionary by specifying your OpenCGA server host:
>>> host = 'http://bioinfodev.hpc.cam.ac.uk/opencga-test' # Use a server host where you have an account
>>> cc = ConfigClient()
>>> config_dict = cc.get_basic_config_dict(host)
>>> print(config_dict)
{'version': 'v1', 'rest': {'hosts': ['http://bioinfodev.hpc.cam.ac.uk/opencga-test']}}
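The printed dictionary shows the shape of a basic configuration. As a minimal sketch, the same structure can be built by hand. `build_basic_config` is a hypothetical helper used only for illustration, not part of pyOpenCGA, and its 'v1' default is simply copied from the output above.

```python
def build_basic_config(host, version='v1'):
    """Return a dict with the same shape as the output printed above.

    Hypothetical helper for illustration; not part of pyOpenCGA.
    """
    return {'version': version, 'rest': {'hosts': [host]}}


config_dict = build_basic_config('http://bioinfodev.hpc.cam.ac.uk/opencga-test')
print(config_dict)
```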
Log in to OpenCGA host server
With this configuration you can initialize the OpenCGAClient and log into an OpenCGA user account, specifying a user and password:
>>> oc = OpenCGAClient(configuration=config_dict, user='user_id', pwd='user_password')
For scripting or Jupyter Notebooks, it is preferable to load user credentials from an external JSON file.
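One way to keep credentials out of a script is sketched here. The file name `credentials.json` and its keys `user` and `pwd` are assumptions chosen to mirror the keyword arguments used in the login call; pyOpenCGA does not prescribe this layout, and `load_credentials` is a hypothetical helper. The example writes a throwaway file so that it runs end to end, and the client call is left as a comment because it needs a reachable server.

```python
import json
import os
import tempfile


def load_credentials(path):
    """Read 'user' and 'pwd' from a JSON file (hypothetical layout)."""
    with open(path) as handle:
        data = json.load(handle)
    missing = {'user', 'pwd'} - set(data)
    if missing:
        raise KeyError('missing credential fields: {}'.format(sorted(missing)))
    return {'user': data['user'], 'pwd': data['pwd']}


# Create a throwaway credentials file for demonstration purposes only.
tmp_dir = tempfile.mkdtemp()
cred_path = os.path.join(tmp_dir, 'credentials.json')
with open(cred_path, 'w') as handle:
    json.dump({'user': 'user_id', 'pwd': 'user_password'}, handle)

credentials = load_credentials(cred_path)

# With a real server and account, the values would be passed on like this:
# oc = OpenCGAClient(configuration=config_dict, **credentials)
```

In practice the credentials file would live outside version control, with file permissions restricted to the current user.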
Once you are logged in, you must pass the session token to the client so that it can propagate your access to the host server:
>>> token = oc.session_id
>>> print(token)
eyJhbGciOi...
>>> oc = OpenCGAClient(configuration=config_dict, session_id=token)
Examples
The next step is to create the specific client for the data we want to query:
>>> projects = oc.projects # Query for projects
>>> studies = oc.studies # Query for studies
>>> samples = oc.samples # Query for samples
>>> cohorts = oc.cohorts # Query for cohorts
Now you can start querying the OpenCGA RESTful service with pyOpenCGA:
>>> for project in projects.search(owner='user_id').results():
...     print(project['id'])
project1
project2
[...]
There are four different ways to access the query response data:
>>> foo_client.method().first() # Returns the first QueryResult
>>> foo_client.method().result(position=0) # Returns the result from all QueryResults in a given position
>>> foo_client.method().results() # Iterates over all the results of all the QueryResults
>>> foo_client.method().response # Returns the raw response of the QueryResponse
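To make the difference between these accessors concrete, the sketch below works on a plain dictionary rather than on a live client. It assumes that the raw response holds a list of QueryResults under a `response` key and that each QueryResult holds its items under a `result` key; that payload shape, the helper names, and the reading of `result(position=...)` as an index into the flattened results are all assumptions made for illustration, not pyOpenCGA's actual implementation.

```python
def first_query_result(raw):
    """Return the first QueryResult dict of the raw response."""
    return raw['response'][0]


def iter_results(raw):
    """Yield every result from every QueryResult, in order."""
    for query_result in raw['response']:
        for result in query_result['result']:
            yield result


def result_at(raw, position=0):
    """Return the result at `position` in the flattened sequence of results."""
    return list(iter_results(raw))[position]


# Hypothetical payload: two QueryResults holding three results in total.
raw = {
    'response': [
        {'id': 'q1', 'result': [{'id': 'project1'}, {'id': 'project2'}]},
        {'id': 'q2', 'result': [{'id': 'project3'}]},
    ]
}

ids = [result['id'] for result in iter_results(raw)]
print(ids)
```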
Data can be accessed by specifying either comma-separated IDs or a list of IDs:
>>> samples = 'NA12877,NA12878,NA12879'
>>> samples_list = ['NA12877','NA12878','NA12879']
>>> sc = oc.samples
>>> for result in sc.info(query_id=samples, study='user@project1:study1').results():
...     print(result['id'], result['attributes']['OPENCGA_INDIVIDUAL']['disorders'])
NA12877 [{'id': 'OMIM6500', 'name': "Chron's Disease"}]
NA12878 []
NA12879 [{'id': 'OMIM6500', 'name': "Chron's Disease"}]
>>> for result in sc.info(query_id=samples_list, study='user@project1:study1').results():
...     print(result['id'], result['attributes']['OPENCGA_INDIVIDUAL']['disorders'])
NA12877 [{'id': 'OMIM6500', 'name': "Chron's Disease"}]
NA12878 []
NA12879 [{'id': 'OMIM6500', 'name': "Chron's Disease"}]
Optional filters and extra options can be added as key-value parameters (where the values can be a comma-separated string or a list).
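Because a value may arrive in either form, a REST client typically normalises it before building the request. The helper below is a sketch of that idea; `to_csv_param` is a hypothetical name, and this is not pyOpenCGA's internal code.

```python
def to_csv_param(value):
    """Return a comma-separated string for a string or a list/tuple of IDs."""
    if isinstance(value, str):
        return value
    return ','.join(str(item) for item in value)


samples = 'NA12877,NA12878,NA12879'
samples_list = ['NA12877', 'NA12878', 'NA12879']

# Both forms end up as the same query parameter value.
same = to_csv_param(samples) == to_csv_param(samples_list)
print(same)
```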
What can I ask for?
The best way to find out which data can be retrieved for each client is to check the OpenCGA web services Swagger documentation.
File details
Details for the file pyopencga-1.4.0.tar.gz.
File metadata
- Download URL: pyopencga-1.4.0.tar.gz
- Upload date:
- Size: 26.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.6.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | af054d92635bcc728cf55809b28ea471c0fbca8b3b334fd096d144b83dd3fd8f
MD5 | 5112d8fbcf8d8f5f275be6645157f95a
BLAKE2b-256 | ac63573027d19d58ad9a03642776dcd3ac2cd77a86411396dc63d4a8de28bd4c
File details
Details for the file pyopencga-1.4.0-py3-none-any.whl.
File metadata
- Download URL: pyopencga-1.4.0-py3-none-any.whl
- Upload date:
- Size: 43.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.6.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | fd4374fbfd5854f2b7c8345fd0c852a0f693842ca7962433c454690fb8c7b844
MD5 | 47f11a18d6408cfc5148c6871fd2205d
BLAKE2b-256 | 1bf05c59e3b1d7859b8b83eb93eb871227bea7b7c40a70a7a4cb7a282108a6d8