Skip to main content

Tools for handling DICOM based whole scan images

Project description

wsidicom

wsidicom is a Python package for reading DICOM WSI file sets. The aims with the project are:

  • Easy to use interface for reading and writing WSI DICOM images and annotations using the DICOM Media Storage Model.
  • Support the latest and upcomming DICOM standards.
  • Platform independent installation via PyPI.

Installing wsidicom

wsidicom is available on PyPI:

$ pip install wsidicom

Important note

Please note that this is an early release and the API is not frozen yet. Function names and functionality is prone to change.

Requirements

wsidicom uses pydicom, numpy and Pillow (with jpeg and jpeg2000 plugins).

Limitations

Levels are required to have (close to) 2 factor scale and same tile size.

Only JPEGBaseline8Bit, JPEG2000 and JPEG2000Lossless transfer syntax is supported.

Optical path identifiers needs to be unique across file set.

Basic usage

Load a WSI dataset from files in folder.

from wsidicom import WsiDicom
slide = WsiDicom.open(path_to_folder)

Read a 200x200 px region starting from px 1000, 1000 at level 6.

region = slide.read_region((1000, 1000), 6, (200, 200))

Read 3x3 mm region starting at 0, 0 mm at level 6.

region_mm = slide.read_region_mm((0, 0), 6, (3, 3))

Read 3x3 mm region starting at 0, 0 mm with pixel spacing 0.01 mm/px.

region_mpp = slide.read_region_mpp((0, 0), 0.01, (3, 3))

Read a thumbnail of the whole slide with maximum dimensions 200x200 px.

thumbnail = slide.read_thumbnail(200, 200)

Read an overview image (if available).

overview = slide.read_overview()

Read a label image (if available).

label = slide.read_label()

Read (decoded) tile from position 1, 1 in level 6.

tile = slide.read_tile(6, (1, 1))

Read (encoded) tile from position 1, 1 in level 6.

tile_bytes = slide.read_encoded_tile(6, (1, 1))

Close files

slide.close()

Settings

wsidicom can be configured with the settings variable. For example, set the parsing of files to strict:

from wsidicom import settings
settings.strict_uid_check = True
settings._strict_attribute_check = True

Data structure

A WSI DICOM pyramid is in wsidicom represented by a hierarchy of objects of different classes, starting from bottom:

  • WsiDicomFile, represents a WSI DICOM file, used for accessing DicomImageData and WsiDataset.
  • DicomImageData, represents the image data in one or several WSI DICOM files.
  • WsiDataset, represents the image metadata in one or several WSI DICOM files.
  • WsiInstance, represents image data and image metadata.
  • WsiDicomLevel, represents a group of instances with the same image size, i.e. of the same level.
  • WsiDicomLevels, represents a group of levels, i.e. the pyrimidal structure.
  • WsiDicom, represents a collection of levels, labels and overviews.

Labels and overviews are structured similarly to levels, but with somewhat different properties and restrictions.

The structure is easiest created using the open() helper functions, e.g. to create a WsiDicom-object:

slide = WsiDicom.open(path_to_folder)

But the structure can also be created manually from the bottom:

file = WsiDicomFile(path_to_file)
instance = WsiInstance(file.dataset, DicomImageData(files))
level = WsiDicomLevel([instance])
levels = WsiDicomLevels([level])
slide = WsiDicom([levels])

Adding support for other file formats.

By subclassing ImageData and implementing the required properties (transfer_syntax, image_size, tile_size, and pixel_spacing) and methods (get_tile() and close()) wsidicom can be used to access wsi images in other file formats than DICOM. In addition to a ImageData-object, image data, specified in a DICOM dataset, must also be created. For example, assuming a implementation of MyImageData exists that takes a path to a image file as argument and create_dataset() produces a DICOM dataset (see is_wsi_dicom() of WsiDataset for required attributes), WsiInstancees could be created for each pyramidal level, label, or overview:

image_data = MyImageData('path_to_image_file')
dataset = create_dataset()
instance = WsiInstance(dataset, image_data)

The created instances can then be arranged into levels etc, and opened as a WsiDicom-object as described in 'Data structure'.

Annotation usage

Annotations are structured in a hierarchy:

  • AnnotationInstance Represents a collection of AnnotationGroups. All the groups have the same frame of reference, i.e. annotations are from the same wsi stack.
  • AnnotationGroup Represents a group of annotations. All annotations in the group are of the same type (e.g. PointAnnotation), have the same label, description and category and type. The category and type are codes that are used to define the annotated feature. A good resource for working with codes is avaiable here.
  • Annotation Represents a annotation. An Annotation has a geometry (currently Point, Polyline, Polygon) and an optional list of Measurements.
  • Measurement Represents a measurement for an Annotation. A Measurement consists of a type-code (e.g. "Area"), a value and a unit-code ("mm")

Codes that are defined in the 222-draft can be created using the create(source, type) function of the ConceptCode-class.

Load a WSI dataset from files in folder.

from wsidicom import WsiDicom
slide = WsiDicom.open(path_to_folder)

Create a point annotation at x=10.0, y=20.0 mm.

from wsidicom import Annotation, Point
point_annotation = Annotation(Point(10.0, 20.0))

Create a point annotation with a measurement.

from wsidicom import ConceptCode, Measurement
# A measurement is defined by a type code ('Area'), a value (25.0) and a unit code ('Pixels).
area = ConceptCode.measurement('Area')
pixels = ConceptCode.unit('Pixels')
measurement = Measurement(area, 25.0, pixels)
point_annotation_with_measurment = Annotation(Point(10.0, 20.0), [measurement])

Create a group of the annotations.

from wsidicom import PointAnnotationGroup
# The 222 suplement requires groups to have a label, a category and a type
group = PointAnnotationGroup(
    annotations=[point_annotation, point_annotation_with_measurment],
    label='group label',
    categorycode=ConceptCode.category('Tissue'),
    typecode=ConceptCode.type('Nucleus'),
    description='description'
)

Create a collection of annotation groups.

from wsidicom import AnnotationInstance
annotations = AnnotationInstance([group], 'volume', slide.uids)

Save the collection to file.

annotations.save('path_to_dicom_dir/annotation.dcm')

Reopen the slide and access the annotation instance.

slide = WsiDicom.open(path_to_folder)
annotations = slide.annotations

Setup environment for development

Requires poetry installed in the virtual environment.

$ git clone https://github.com/imi-bigpicture/wsidicom.git
$ poetry install

To watch unit tests use:

$ poetry run pytest-watch -- -m unittest

The integration tests uses test images from nema.org thats needs to be downloaded. The location of the test images can be changed from the default tests\testdata\slides using the enviroment variable WSIDICOM_TESTDIR. Download the images using the supplied script:

$ python .\tests\download_test_images.py

If the files are already downloaded the script will validate the checksums.

To run integration tests:

$ poetry run pytest -m integration

Other DICOM python tools

Contributing

We welcome any contributions to help improve this tool for the WSI DICOM community!

We recommend first creating an issue before creating potential contributions to check that the contribution is in line with the goals of the project. To submit your contribution, please issue a pull request on the imi-bigpicture/wsidicom repository with your changes for review.

Our aim is to provide constructive and positive code reviews for all submissions. The project relies on gradual typing and roughly follows PEP8. However, we are not dogmatic. Most important is that the code is easy to read and understand.

Acknowledgement

wsidicom: Copyright 2021 Sectra AB, licensed under Apache 2.0.

This project is part of a project that has received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No 945358. This Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA. IMI website: www.imi.europa.eu

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

wsidicom-0.3.2.tar.gz (62.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

wsidicom-0.3.2-py3-none-any.whl (65.6 kB view details)

Uploaded Python 3

File details

Details for the file wsidicom-0.3.2.tar.gz.

File metadata

  • Download URL: wsidicom-0.3.2.tar.gz
  • Upload date:
  • Size: 62.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.5 CPython/3.8.9 Windows/10

File hashes

Hashes for wsidicom-0.3.2.tar.gz
Algorithm Hash digest
SHA256 d4824fcd1ed3756e44639cf1192b1ae6fc1430d2767465e3afd283ee2edc47e1
MD5 84c98ae3009c56f1f5da3c29d48fc50e
BLAKE2b-256 2ea3dedffe7ed06fb6da13bab4b07de7bd03f999facd3c424953c08edaef6323

See more details on using hashes here.

File details

Details for the file wsidicom-0.3.2-py3-none-any.whl.

File metadata

  • Download URL: wsidicom-0.3.2-py3-none-any.whl
  • Upload date:
  • Size: 65.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.5 CPython/3.8.9 Windows/10

File hashes

Hashes for wsidicom-0.3.2-py3-none-any.whl
Algorithm Hash digest
SHA256 f01c3ade6b382cc597d1a11b35f28d097b2bb68aa108f7e8ff0e8f52ac027c56
MD5 34c1dc7f86820319c88f22953d8de9b8
BLAKE2b-256 ae22d83d2e58c0e81e7fe6df60b32594d0488d6f8a7c594126815a607c2dfa5a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page