Package to query and download data from an index of ImagingDataCommons

idc-index

About

idc-index is a Python package for querying basic metadata and downloading DICOM files hosted by the NCI Imaging Data Commons (IDC).

👷 🚧 This package is in its early development stages. Its functionality and API will change. Stay tuned for updates and documentation, and please share your feedback by opening issues in this repository or by starting a discussion in the IDC User forum. 🚧

Usage

There are no prerequisites - just install the package ...

$ pip install idc-index

... and download files corresponding to any collection, DICOM PatientID/Study/Series as follows:

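from idc_index import index

# Create a client backed by the index bundled with the package
client = index.IDCClient()

# List the IDs of all collections available in the index
all_collection_ids = client.get_collections()

# Download every file in the "rider_pilot" collection into downloadDir
client.download_from_selection(collection_id="rider_pilot", downloadDir="/some/dir")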
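A typo in a collection ID is easy to make, so it can help to check the ID against the output of get_collections() before starting a download. The helper below is a minimal sketch, not part of idc-index: the name download_collection is hypothetical, and it assumes that get_collections() returns an iterable of collection ID strings and that the download directory may need to be created beforehand.

```python
import os


def download_collection(client, collection_id, download_dir):
    """Download one collection after checking that its ID is known.

    `client` is expected to behave like index.IDCClient(); this helper only
    calls get_collections() and download_from_selection(), as used above.
    """
    # Assumption: get_collections() returns an iterable of collection ID strings.
    available = list(client.get_collections())
    if collection_id not in available:
        raise ValueError(f"Unknown collection_id: {collection_id!r}")

    # Assumption: the client may not create the target directory itself.
    os.makedirs(download_dir, exist_ok=True)
    client.download_from_selection(
        collection_id=collection_id, downloadDir=download_dir
    )
    return download_dir


# Typical use with the real client (requires idc-index and network access):
#   from idc_index import index
#   download_collection(index.IDCClient(), "rider_pilot", "/some/dir")
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object before running a real download.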

... or run queries against the "mini" index of Imaging Data Commons data!

from idc_index import index

client = index.IDCClient()

query = """
SELECT
  collection_id,
  STRING_AGG(DISTINCT(Modality)) as modalities,
  STRING_AGG(DISTINCT(BodyPartExamined)) as body_parts
FROM
  index
GROUP BY
  collection_id
ORDER BY
  collection_id ASC
"""

client.sql_query(query)
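To see what this kind of aggregation produces without installing anything, the same query shape can be run on a toy table. The sketch below swaps in Python's built-in sqlite3 module instead of the engine behind sql_query, so it uses SQLite's GROUP_CONCAT in place of STRING_AGG. The table name toy_index and all rows are made up for illustration and do not reflect real IDC content.

```python
import sqlite3

# Hypothetical rows: (collection_id, Modality, BodyPartExamined)
rows = [
    ("collection_a", "CT", "CHEST"),
    ("collection_a", "CT", "CHEST"),
    ("collection_a", "MR", "CHEST"),
    ("collection_b", "PT", "HEAD"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE toy_index "
    "(collection_id TEXT, Modality TEXT, BodyPartExamined TEXT)"
)
conn.executemany("INSERT INTO toy_index VALUES (?, ?, ?)", rows)

# Same structure as the query above: one row per collection, with the
# distinct modalities and body parts collapsed into comma-separated strings.
query = """
SELECT
  collection_id,
  GROUP_CONCAT(DISTINCT Modality) AS modalities,
  GROUP_CONCAT(DISTINCT BodyPartExamined) AS body_parts
FROM
  toy_index
GROUP BY
  collection_id
ORDER BY
  collection_id ASC
"""

# SQLite does not guarantee the order of concatenated values, so sort them.
summary = {
    collection_id: (sorted(modalities.split(",")), sorted(body_parts.split(",")))
    for collection_id, modalities, body_parts in conn.execute(query)
}

for collection_id, (modalities, body_parts) in summary.items():
    print(collection_id, modalities, body_parts)
```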

Details of the attributes included in the index are in the release notes.

Tutorial

This package was first presented at the 2023 Annual Meeting of the Radiological Society of North America (RSNA), in the Deep Learning Lab IDC session.

Please check out this tutorial notebook for an introduction to using idc-index to navigate IDC data.

Resources

  • Imaging Data Commons Portal can be used to explore the content of IDC from a web browser.
  • s5cmd is a highly efficient, open source, multi-platform S3 client that we use for downloading IDC data, which is hosted in public AWS and GCS buckets. It is distributed on PyPI as s5cmd.
  • SlicerIDCBrowser is a 3D Slicer extension that relies on idc-index to search and download IDC data.

Acknowledgment

This software is maintained by the IDC team, which has been funded in whole or in part with Federal funds from the NCI, NIH, under task order no. HHSN26110071 under contract no. HHSN261201500003l.

If this package helped your research, we would appreciate it if you could cite the IDC paper below.

Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence. RadioGraphics (2023). https://doi.org/10.1148/rg.230180

This version

0.4.0

