Tools for handling DICOM based whole scan images

wsidicom

wsidicom is a Python package for reading DICOM WSI. The aims of the project are:

  • An easy-to-use interface for reading and writing WSI DICOM images and annotations, either from file or through DICOMWeb.
  • Support the latest and upcoming DICOM standards.
  • Platform independent installation via PyPI.

Installing wsidicom

wsidicom is available on PyPI:

pip install wsidicom

And through conda:

conda install -c conda-forge wsidicom

Important note

Please note that this is an early release and the API is not frozen yet. Function names and functionality are subject to change.

Requirements

wsidicom uses pydicom, numpy, Pillow (with jpeg and jpeg2000 plugins), and dicomweb-client.

Limitations

Levels are required to have a scale factor of (close to) 2 and the same tile size.

Only the JPEGBaseline8Bit, JPEG2000 and JPEG2000Lossless transfer syntaxes are supported.

Optical path identifiers need to be unique across the file set.
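
As an illustration of the scale requirement, the following is a minimal sketch, not wsidicom's own implementation. It assumes the requirement means that each level's pixel spacing is close to a power of two times the base level's spacing. The function name level_index and the tolerance value are hypothetical.

```python
import math


def level_index(base_spacing: float, spacing: float, tolerance: float = 0.05) -> int:
    """Return the pyramid level for a pixel spacing.

    Raises ValueError if the spacing is not close to a power of two
    times the base spacing. The tolerance is an assumed value.
    """
    exact = math.log2(spacing / base_spacing)
    level = round(exact)
    if abs(exact - level) > tolerance:
        raise ValueError(f"Scale {spacing / base_spacing:.3f} is not close to a power of two")
    return level


# Pixel spacings in mm/px, base level first (made-up example values).
spacings_mm = [0.00025, 0.0005, 0.00101, 0.004]
levels = [level_index(spacings_mm[0], spacing) for spacing in spacings_mm]
print(levels)  # [0, 1, 2, 4]
```

In this example level 3 is simply absent, which this sketch accepts; a scale such as 3 relative to the base would raise an error.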

Basic usage

Load a WSI dataset from files in folder.

from wsidicom import WsiDicom
slide = WsiDicom.open(path_to_folder)

Or load a WSI dataset from opened streams.

from wsidicom import WsiDicom

slide = WsiDicom.open([file_stream_1, file_stream_2, ... ])

Or load a WSI dataset from DICOMWeb.

from wsidicom import WsiDicom, WsiDicomWebClient
from requests.auth import HTTPBasicAuth

auth = HTTPBasicAuth('username', 'password')
client = WsiDicomWebClient.create_client(
    'dicom_web_hostname',
    '/qido',
    '/wado',
    auth
)
slide = WsiDicom.open_web(
    client,
    "study uid to open",
    # A single series uid, or a list of series uids:
    # ["series uid 1 to open", "series uid 2 to open"]
    "series uid to open"
)

Alternatively, if you have already created an instance of dicomweb_client.DICOMwebClient, that may be used to create the WsiDicomWebClient like so:

from dicomweb_client import DICOMwebClient
from wsidicom import WsiDicomWebClient

dw_client = DICOMwebClient(url)
client = WsiDicomWebClient(dw_client)

Then proceed to call WsiDicom.open_web() with this as in the first example.

Use as a context manager.

from wsidicom import WsiDicom
with WsiDicom.open(path_to_folder) as slide:
    ...

Read a 200x200 px region starting from px 1000, 1000 at level 6.

region = slide.read_region((1000, 1000), 6, (200, 200))

Read a 2000x2000 px region starting from px 1000, 1000 at level 4 using 4 threads.

region = slide.read_region((1000, 1000), 4, (2000, 2000), threads=4)

Read 3x3 mm region starting at 0, 0 mm at level 6.

region_mm = slide.read_region_mm((0, 0), 6, (3, 3))

Read 3x3 mm region starting at 0, 0 mm with pixel spacing 0.01 mm/px.

region_mpp = slide.read_region_mpp((0, 0), 0.01, (3, 3))
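
The relation between millimetres, pixel spacing and pixels in the call above can be sketched as follows. This is plain arithmetic rather than wsidicom code, and the helper name mm_to_pixels is hypothetical. It assumes the output image is sampled at exactly the requested pixel spacing.

```python
from typing import Tuple


def mm_to_pixels(size_mm: Tuple[float, float], pixel_spacing_mm: float) -> Tuple[int, int]:
    """Convert a (width, height) in mm to pixels at the given spacing in mm/px."""
    width_mm, height_mm = size_mm
    return (round(width_mm / pixel_spacing_mm), round(height_mm / pixel_spacing_mm))


# A 3x3 mm region at 0.01 mm/px.
size_px = mm_to_pixels((3, 3), 0.01)
print(size_px)  # (300, 300)
```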

Read a thumbnail of the whole slide with maximum dimensions 200x200 px.

thumbnail = slide.read_thumbnail((200, 200))

Read an overview image (if available).

overview = slide.read_overview()

Read a label image (if available).

label = slide.read_label()

Read (decoded) tile from position 1, 1 in level 6.

tile = slide.read_tile(6, (1, 1))

Read (encoded) tile from position 1, 1 in level 6.

tile_bytes = slide.read_encoded_tile(6, (1, 1))
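
To relate tile positions to pixel regions, here is a small sketch that lists the tiles overlapping a pixel region. It is not part of wsidicom. It assumes tiles are indexed as (column, row) from the top-left corner starting at 0, and the tile size of 512 px is a made-up example value.

```python
from typing import List, Tuple


def tiles_covering(
    start: Tuple[int, int], size: Tuple[int, int], tile_size: int
) -> List[Tuple[int, int]]:
    """Return (column, row) positions of all tiles that overlap the region."""
    x, y = start
    width, height = size
    first_column, first_row = x // tile_size, y // tile_size
    # Subtract 1 so that a region ending exactly on a tile edge
    # does not include the next tile.
    last_column = (x + width - 1) // tile_size
    last_row = (y + height - 1) // tile_size
    return [
        (column, row)
        for row in range(first_row, last_row + 1)
        for column in range(first_column, last_column + 1)
    ]


# A 200x200 px region starting at (1000, 1000) with 512 px tiles.
tiles = tiles_covering((1000, 1000), (200, 200), 512)
print(tiles)  # [(1, 1), (2, 1), (1, 2), (2, 2)]
```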

Close files

slide.close()

Saving files

An opened WsiDicom instance can be saved to a new path using the save() method. The produced files will:

  • Be fully tiled. Any sparse tiles will be replaced with a blank tile whose color depends on the photometric interpretation.
  • Have a basic offset table (or optionally an extended offset table or no offset table).
  • Not be concatenated.

The frames are copied as-is, i.e. without re-compression.

with WsiDicom.open(path_to_folder) as slide:
    slide.save(path_to_output)

The output folder must already exist. Be careful to specify a unique folder to avoid mixing files from different images.
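
For readers unfamiliar with the basic offset table mentioned above, the following sketch shows how such a table can be computed for encapsulated pixel data. It is an illustration of the DICOM encoding and not wsidicom's writer. It assumes each frame is stored as a single fragment, and the function name basic_offset_table is hypothetical.

```python
import struct
from typing import Sequence

ITEM_HEADER_LENGTH = 8  # item tag (4 bytes) + item length (4 bytes)


def basic_offset_table(frames: Sequence[bytes]) -> bytes:
    """Return the basic offset table value for the given encoded frames.

    Each offset is measured from the first byte of the first item
    following the table to the item tag of the frame, and is stored
    as a 32-bit unsigned little-endian integer.
    """
    offsets = []
    position = 0
    for frame in frames:
        offsets.append(position)
        # Item values are padded to an even number of bytes.
        padded_length = len(frame) + (len(frame) % 2)
        position += ITEM_HEADER_LENGTH + padded_length
    return struct.pack(f"<{len(offsets)}I", *offsets)


# Three dummy frames of 10, 7 and 4 bytes.
table = basic_offset_table([b"\x00" * 10, b"\x00" * 7, b"\x00" * 4])
print(struct.unpack("<3I", table))  # (0, 18, 34)
```

Because the offsets are 32-bit, the basic offset table cannot address pixel data larger than 4 GiB, which is one reason an extended offset table can be preferable for large images.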

Settings

wsidicom can be configured with the settings variable. For example, set the parsing of files to strict:

from wsidicom import settings
settings.strict_uid_check = True
settings.strict_attribute_check = True

Annotation usage

Annotations are structured in a hierarchy:

  • AnnotationInstance Represents a collection of AnnotationGroups. All the groups have the same frame of reference, i.e. annotations are from the same WSI stack.
  • AnnotationGroup Represents a group of annotations. All annotations in the group are of the same type (e.g. PointAnnotation) and have the same label, description, category, and type. The category and type are codes that are used to define the annotated feature. A good resource for working with codes is available here.
  • Annotation Represents an annotation. An Annotation has a geometry (currently Point, Polyline, Polygon) and an optional list of Measurements.
  • Measurement Represents a measurement for an Annotation. A Measurement consists of a type code (e.g. "Area"), a value and a unit code (e.g. "mm").

Codes that are defined in the draft of Supplement 222 can be created using the create(source, type) function of the ConceptCode class.

Load a WSI dataset from files in folder.

from wsidicom import WsiDicom
slide = WsiDicom.open(path_to_folder)

Create a point annotation at x=10.0, y=20.0 mm.

from wsidicom import Annotation, Point
point_annotation = Annotation(Point(10.0, 20.0))

Create a point annotation with a measurement.

from wsidicom import ConceptCode, Measurement
# A measurement is defined by a type code ('Area'), a value (25.0) and a unit code ('Pixels').
area = ConceptCode.measurement('Area')
pixels = ConceptCode.unit('Pixels')
measurement = Measurement(area, 25.0, pixels)
point_annotation_with_measurement = Annotation(Point(10.0, 20.0), [measurement])

Create a group of the annotations.

from wsidicom import PointAnnotationGroup
# Supplement 222 requires groups to have a label, a category and a type.
group = PointAnnotationGroup(
    annotations=[point_annotation, point_annotation_with_measurement],
    label='group label',
    categorycode=ConceptCode.category('Tissue'),
    typecode=ConceptCode.type('Nucleus'),
    description='description'
)

Create a collection of annotation groups.

from wsidicom import AnnotationInstance
annotations = AnnotationInstance([group], 'volume', slide.uids)

Save the collection to file.

annotations.save('path_to_dicom_dir/annotation.dcm')

Reopen the slide and access the annotation instance.

slide = WsiDicom.open(path_to_folder)
annotations = slide.annotations

Setup environment for development

Requires poetry installed in the virtual environment.

git clone https://github.com/imi-bigpicture/wsidicom.git
cd wsidicom
poetry install

To watch unit tests use:

poetry run pytest-watch -- -m unittest

The integration tests use test images from nema.org that need to be downloaded. The location of the test images can be changed from the default tests\testdata\slides using the environment variable WSIDICOM_TESTDIR. Download the images using the supplied script:

python .\tests\download_test_images.py

If the files are already downloaded the script will validate the checksums.
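
The two mechanisms described above, an environment variable with a default and checksum validation, can be sketched as follows. This is not the code of the supplied script. The helper names are hypothetical, and the use of SHA-256 is an assumption, since the hash algorithm used by the script is not stated here.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def get_test_data_dir() -> Path:
    """Return the test image folder, honouring WSIDICOM_TESTDIR if set."""
    default = Path("tests", "testdata", "slides")
    return Path(os.environ.get("WSIDICOM_TESTDIR", default))


def checksum_matches(path: Path, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True if the file at path has the expected checksum."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as file:
        # Read in chunks so that large image files are not loaded at once.
        for chunk in iter(lambda: file.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex


# Demonstration with a temporary file instead of a real test image.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "sample.bin"
    sample.write_bytes(b"wsidicom")
    expected = hashlib.sha256(b"wsidicom").hexdigest()
    is_valid = checksum_matches(sample, expected)
    is_invalid = checksum_matches(sample, "0" * 64)
print(is_valid, is_invalid)  # True False
```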

To run integration tests:

poetry run pytest -m integration

Data structure

In wsidicom, a WSI DICOM pyramid is represented by a hierarchy of objects of different classes, starting from the bottom:

  • WsiDicomFile, represents a WSI DICOM file, used for accessing WsiDicomFileImageData and WsiDataset.
  • WsiDicomFileImageData, represents the image data in one or several WSI DICOM files.
  • WsiDataset, represents the image metadata in one or several WSI DICOM files.
  • WsiInstance, represents image data and image metadata.
  • Level, represents a group of instances with the same image size, i.e. of the same level.
  • Levels, represents a group of levels, i.e. the pyramidal structure.
  • WsiDicom, represents a collection of levels, labels and overviews.
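
The grouping of instances into levels by image size can be sketched with plain Python. The InstanceInfo type and the group_by_size function are hypothetical stand-ins for illustration and are not wsidicom classes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class InstanceInfo(NamedTuple):
    """Stand-in for an instance, holding only a name and an image size."""
    name: str
    size: Tuple[int, int]  # (width, height) in pixels


def group_by_size(instances: Iterable[InstanceInfo]) -> Dict[Tuple[int, int], List[str]]:
    """Group instance names by image size, largest image first."""
    groups: Dict[Tuple[int, int], List[str]] = defaultdict(list)
    for instance in instances:
        groups[instance.size].append(instance.name)
    # Sort by width so that the base level (largest image) comes first.
    return dict(sorted(groups.items(), key=lambda item: item[0][0], reverse=True))


instances = [
    InstanceInfo("half size, optical path A", (2048, 2048)),
    InstanceInfo("full size, optical path A", (4096, 4096)),
    InstanceInfo("full size, optical path B", (4096, 4096)),
]
levels = group_by_size(instances)
for size, names in levels.items():
    print(size, names)
```

Here the two full-size instances end up in the same group, mirroring how a Level collects instances of the same image size.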

Labels and overviews are structured similarly to levels, but with somewhat different properties and restrictions. For DICOMWeb the WsiDicomFile* classes are replaced with WsiDicomWeb* classes.

A Source is used to create WsiInstances, either from files (WsiDicomFileSource) or DICOMWeb (WsiDicomWebSource), and can be used to initiate a WsiDicom object. The easiest way to create a source is with the open() and open_web() helper functions, e.g.:

slide = WsiDicom.open(path_to_folder)

Code structure

  • wsidicom.py - Main class with methods to open DICOM WSI objects.
  • source.py - Metaclass Source for serving WsiInstances to WsiDicom.
  • series - Series implementations Levels, Labels, and Overview.
  • group - Group implementations, e.g. Level.
  • instance - Instance implementations WsiInstance and WsiDataset, the metaclass ImageData and ImageData implementations WsiDicomImageData and PillowImageData.
  • file - Implementation for reading and writing DICOM WSI files.
  • web - Implementation for reading DICOM WSI from DICOMWeb.
  • graphical_annotations - Handling graphical annotations.
  • conceptcode.py - Handling of DICOM concept codes.
  • config.py - Handles configuration settings.
  • errors.py - Custom errors.
  • geometry.py - Classes for geometry handling.
  • optical.py - Handles optical paths.
  • uid.py - Handles DICOM uids.
  • stringprinting.py - For nicer string printing of objects.

Adding support for other file formats

Support for other formats (or methods to access DICOM data) can be implemented by creating a new Source implementation that creates WsiInstances for the implemented formats. A format-specific implementation of ImageData is likely needed to access the WSI image data. Additionally, a WsiDataset needs to be created that returns matching metadata for the WSI.

The implemented Source can then create an instance from the implemented ImageData (and a method returning a WsiDataset):

image_data = MyImageData('path_to_image_file')
dataset = create_dataset_from_image_data(image_data)
instance = WsiInstance(dataset, image_data)

The source should arrange the created instances and return them at the level_instances, label_instances, and overview_instances properties. WsiDicom can then open the source object and arrange the instances into levels etc as described in 'Data structure'.
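
How a source might arrange its instances can be sketched without wsidicom installed. The classes below are stand-ins: StandInInstance replaces WsiInstance, and MySource does not inherit from the real Source class, whose exact interface is not shown here. Sorting by the image type values VOLUME, LABEL and OVERVIEW is an assumption about how the instances are told apart.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class StandInInstance:
    """Stand-in for WsiInstance, holding only an image type and a size."""
    image_type: str  # assumed to be "VOLUME", "LABEL" or "OVERVIEW"
    size: Tuple[int, int]


class MySource:
    """Hypothetical source that sorts created instances by image type."""

    def __init__(self, instances: Iterable[StandInInstance]) -> None:
        self._instances = list(instances)

    def _with_type(self, image_type: str) -> List[StandInInstance]:
        return [i for i in self._instances if i.image_type == image_type]

    @property
    def level_instances(self) -> List[StandInInstance]:
        return self._with_type("VOLUME")

    @property
    def label_instances(self) -> List[StandInInstance]:
        return self._with_type("LABEL")

    @property
    def overview_instances(self) -> List[StandInInstance]:
        return self._with_type("OVERVIEW")


source = MySource(
    [
        StandInInstance("VOLUME", (4096, 4096)),
        StandInInstance("VOLUME", (2048, 2048)),
        StandInInstance("LABEL", (300, 200)),
    ]
)
print(len(source.level_instances), len(source.label_instances), len(source.overview_instances))
```

A real implementation would return WsiInstance objects from these properties and follow the actual Source base class in source.py.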

Other DICOM python tools

Contributing

We welcome any contributions to help improve this tool for the WSI DICOM community!

We recommend first creating an issue before creating potential contributions to check that the contribution is in line with the goals of the project. To submit your contribution, please issue a pull request on the imi-bigpicture/wsidicom repository with your changes for review.

Our aim is to provide constructive and positive code reviews for all submissions. The project relies on gradual typing and roughly follows PEP8. However, we are not dogmatic. Most important is that the code is easy to read and understand.

Acknowledgement

wsidicom: Copyright 2021 Sectra AB, licensed under Apache 2.0.

This project is part of a project that has received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No 945358. This Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA. IMI website: <www.imi.europa.eu>
