TileDB-backed data structure for radiology data at scale

RadiObject

What? A TileDB-backed data structure for radiology data at scale.

Why? NIfTI and DICOM files must be read from local disk and do not support partial reads. TileDB enables cloud-native storage (S3), efficient partial reads, and hierarchical organization of multi-volume datasets.

Thoughts

Installation

pip install radiobject

Quick Start

from radiobject import RadiObject

# Create from NIfTI files using images dict (recommended)
radi = RadiObject.from_niftis(
    uri="./my-dataset",
    images={
        "CT": "./imagesTr/*.nii.gz",      # Glob pattern
        "seg": "./labelsTr",               # Directory path
    },
    validate_alignment=True,               # Ensure matching subjects across collections
    obs_meta=metadata_df,                  # Optional subject-level metadata (a DataFrame you supply)
)

# Access data (pandas-like)
vol = radi.CT.iloc[0]            # First CT volume
data = vol[100:200, :, :]        # Partial read (only loads needed tiles)

# Filtering (returns views)
older = radi.filter("age > 40")        # Query expression
first_ten = radi.head(10)              # First 10 subjects
first_ten.materialize("./subset")      # Write the view to storage

Works with local paths or S3 URIs (s3://bucket/dataset).
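To make the idea behind validate_alignment concrete, here is a minimal sketch of what "matching subjects across collections" could mean. It is not RadiObject's implementation. The helper names (subject_id, check_alignment), the file names, and the convention of deriving a subject ID from the file name are all assumptions made for illustration.

```python
from pathlib import Path


def subject_id(path):
    """Derive a subject ID by stripping .nii.gz / .nii (an assumed naming convention)."""
    name = Path(path).name
    for suffix in (".nii.gz", ".nii"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def check_alignment(collections):
    """Map each misaligned collection name to the sorted subject IDs it is missing."""
    ids = {name: {subject_id(p) for p in paths} for name, paths in collections.items()}
    all_ids = set().union(*ids.values())
    return {
        name: sorted(all_ids - found)
        for name, found in ids.items()
        if all_ids - found
    }


# Hypothetical file lists: the "seg" collection lacks a label for the second subject.
collections = {
    "CT": ["imagesTr/BRATS_001.nii.gz", "imagesTr/BRATS_002.nii.gz"],
    "seg": ["labelsTr/BRATS_001.nii.gz"],
}
print(check_alignment(collections))
```

An empty result means every collection contains the same set of subjects. A non-empty result names the collections that are missing subjects, which is the kind of mismatch an alignment check would be expected to flag.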

How It Works

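Compressed NIfTI (.nii.gz) files must be decompressed in full before any part of a volume can be read, whereas TileDB reads only the tiles that overlap the requested region. According to the project's benchmarks, this makes partial reads 200-660x faster.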
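The effect of tiling can be illustrated with simple arithmetic. The sketch below counts how many tiles a requested region overlaps. It is pure Python and uses neither TileDB nor RadiObject. The volume shape and tile extents are illustrative assumptions, not RadiObject defaults, and tiles_touched is a hypothetical helper.

```python
from math import ceil


def tiles_touched(shape, tile, region):
    """Return (tiles overlapping the region, total tiles in the volume).

    shape  : volume size per dimension
    tile   : tile extent per dimension (hypothetical values)
    region : one half-open (start, stop) pair per dimension, like a Python slice
    """
    touched = 1
    total = 1
    for dim, ext, (start, stop) in zip(shape, tile, region):
        if not (0 <= start < stop <= dim):
            raise ValueError(f"region {start}:{stop} does not fit a dimension of size {dim}")
        first = start // ext          # index of the first tile overlapped
        last = (stop - 1) // ext      # index of the last tile overlapped
        touched *= last - first + 1
        total *= ceil(dim / ext)
    return touched, total


# Assumed example: a 240 x 240 x 155 volume stored as one tile per axial slice.
shape = (240, 240, 155)
tile = (240, 240, 1)
single_slice = ((0, 240), (0, 240), (77, 78))

touched, total = tiles_touched(shape, tile, single_slice)
print(f"{touched} of {total} tiles read")
```

With this layout, reading one axial slice touches a single tile, while a gzip-compressed file would have to be decompressed from the start. The tile shape matters: with cubic 64-voxel tiles, a slab such as [100:200, :, :] of the same volume would overlap most of the tiles, so tile extents are best chosen to match the expected access pattern.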

Sample Data

Download sample datasets for tutorials and testing:

# Install download dependencies
pip install "radiobject[download]"

# Download BraTS brain tumor data (for tutorials 00-04)
python scripts/download_dataset.py msd-brain-tumour

# List all available datasets
python scripts/download_dataset.py --list

Documentation

License

MIT
