A Python library interface to the International Cancer Genome Consortium's Web Portal

Project description

The ICGC REST client is a simple Python module that lets you access the International Cancer Genome Consortium (ICGC) web portal (<https://dcc.icgc.org/>) directly from Python, with a minimum of coding effort.

It lets you write queries in our Portal Query Language (PQL) that fetch data from the ICGC web portal as JSON objects. From there, you can process and analyze the data in those objects with ordinary Python code.

Here’s an example that shows you how easy it is to get started!

"""
query.py

This script demonstrates running a simple PQL query against the ICGC data
portal with the icgc module.
"""
from __future__ import absolute_import, print_function

import icgc


def run():
    """
    Demonstrate PQL by displaying one result of each request type as JSON output
    """
    for request_type in icgc.request_types():
        response = icgc.query(request_type=request_type,
                              pql='select(*),limit(1)')
        print(request_type, "===\n\n", response)


if __name__ == '__main__':
    run()
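
The query strings used above are plain text, so they can be assembled programmatically before being passed to icgc.query. The following is a minimal sketch, not part of the icgc module: the helper names eq and build_pql are hypothetical, and it only produces the clause forms that appear in this description (select, eq and limit, joined by commas). Whether the portal accepts a particular combination of clauses is an assumption you should check against the PQL documentation.

```python
def eq(field, value):
    """Build an equality clause such as eq(donor.primarySite,"Brain")."""
    return 'eq({},"{}")'.format(field, value)


def build_pql(filters=None, fields="*", limit=None):
    """
    Join PQL clauses with commas, in the style of 'select(*),limit(1)'.

    filters is an optional list of (field, value) pairs; each pair becomes
    one eq(...) clause. This helper is hypothetical, not an icgc function.
    """
    clauses = ["select({})".format(fields)]
    for field, value in (filters or []):
        clauses.append(eq(field, value))
    if limit is not None:
        clauses.append("limit({})".format(limit))
    return ",".join(clauses)


print(build_pql(limit=1))
print(build_pql([("donor.primarySite", "Brain")], limit=5))
```

The resulting string would then be passed as the pql argument, as in icgc.query(request_type=request_type, pql=build_pql(limit=1)).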

Here’s a simple program that demonstrates how Python can be used with the icgc module to automate decision making: in this case, choosing which files to download from the ICGC web portal.

from __future__ import absolute_import, division, print_function
import icgc

KB = 1024
MB = 1024 * KB


def run():
    """
    Show an example of a PQL download with automated decision making.

    We download up to a maximum of 10 MB of data from the portal, of any type
    that will fit within our download limit, and save the results as a
    tarfile named 'test.tar'.
    """
    pql = 'eq(donor.primarySite,"Brain")'

    # Find which items are available that match our PQL query, and how big
    # each of the result files is.

    sizes = icgc.download_size(pql)
    print("Sizes are: {}".format(sizes))

    # We'll only include a file in our tarfile if the running total stays
    # below our 10 MB limit. Our tarfile size calculation is approximate:
    # the files inside the tarfile get compressed, so the tarfile that we
    # download might be smaller than we calculate.

    max_size = 10 * MB
    current_size = 0

    includes = []
    for k in sizes:
        item_size = sizes[k]
        if current_size + item_size < max_size:
            includes.append(k)
            current_size += item_size

    print("Including items {}".format(includes))
    print("Approximate download size={:.2f} MB".format(current_size / MB))

    # Download the information, and save the results in the file "test.tar"
    icgc.download(pql, includes, "test")


if __name__ == "__main__":
    run()
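
The selection loop in the script above can be tried without contacting the portal by pulling it out into a function. This is a sketch under assumptions: the function name select_within_budget is hypothetical, and the item names and sizes in example_sizes are made up for illustration. Real values come from icgc.download_size(pql), which the script treats as a mapping from item name to size in bytes.

```python
MB = 1024 * 1024


def select_within_budget(sizes, max_size):
    """
    Pick items in iteration order while the running total stays below
    max_size, mirroring the loop in the download script.

    Returns the list of included item names and their combined size.
    """
    included = []
    total = 0
    for name in sizes:
        item_size = sizes[name]
        if total + item_size < max_size:
            included.append(name)
            total += item_size
    return included, total


# Hypothetical sizes in bytes, for illustration only.
example_sizes = {"donor": 2 * MB, "ssm": 9 * MB, "sample": 3 * MB}
included, total = select_within_budget(example_sizes, 10 * MB)
print("Including items {}".format(included))
print("Approximate download size={:.2f} MB".format(total / MB))
```

Because items are taken in iteration order, a large item seen early can crowd out several smaller ones. Iterating over sorted(sizes, key=sizes.get) instead would favour the smallest items first, if fitting in as many items as possible matters more than their order.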

Installation

You can install icgc using pip by running:

pip install icgc

If you prefer, you can also download the source code from the URL below.

Contribute

If you’d like to contribute to this project, it’s hosted on GitHub.

See https://github.com/icgc-dcc/icgc-python

Download files

Source Distribution

icgc-0.1.3.tar.gz (6.4 kB)

Uploaded Source

Built Distribution

icgc-0.1.3-py2.py3-none-any.whl (7.8 kB)

Uploaded Python 2 Python 3

File details

Details for the file icgc-0.1.3.tar.gz.

File metadata

  • Download URL: icgc-0.1.3.tar.gz
  • Size: 6.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for icgc-0.1.3.tar.gz
Algorithm Hash digest
SHA256 e6ef9f0815126f6c388ce144937015f1f8f2572d12099695c22963010c53506f
MD5 7bb5725d5deed5eb08bbad26725f8058
BLAKE2b-256 a135fc95fed0e78e981f8a184b2f809af0397a9d1ccb6b946e8b547b77c71a88
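
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch: the helper names sha256_of_file and matches_published_hash are hypothetical, and the file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage, assuming icgc-0.1.3.tar.gz is in the current directory:
# matches_published_hash(
#     "icgc-0.1.3.tar.gz",
#     "e6ef9f0815126f6c388ce144937015f1f8f2572d12099695c22963010c53506f",
# )
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a 6.4 kB archive but makes the helper reusable for larger downloads.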

File details

Details for the file icgc-0.1.3-py2.py3-none-any.whl.

File hashes

Hashes for icgc-0.1.3-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 014371acb2165deb92e6a4b4ebd9cb77eaf00711a33b783ddb9384897b863020
MD5 ea4b7a65d326f1c2b9a6f9bb54f25658
BLAKE2b-256 170526e025827c7bb6ca56513adedb6eeff5facd7f570c5443c49a72fb4295eb
