Medical image analysis toolkit
scikit-rt
A suite of tools for loading, plotting, and analysing medical imaging data.
This work was supported by Cancer Research UK RadNet Cambridge [C17918/A28870].
1. Images
The Image class can be used to read and write medical images from DICOM, nifti, or NumPy format. These images can be plotted and compared.
To load the Image class:
from skrt import Image
Images will be processed into a consistent format:
- The Image.data property contains a numpy array, which stores (y, x, z) in (row, column, slice) respectively. Note that numpy elements are indexed in order (row, column, slice); so if you did Image.data[i, j, k], i would correspond to the y index, j to the x index, and k to the z index.
- The Image.affine property contains a 4x4 matrix that can convert a (row, column, slice) index to an (x, y, z) position. Its upper-left 3x3 block will always be diagonal, so (0, 0) contains the x voxel size etc., and (0, 3) contains the x origin.
- The voxel_size and origin properties are the diagonal and the fourth column (column index 3) of the affine matrix, respectively; they give voxel sizes and origin position in order (x, y, z).
- The n_voxels property contains the number of voxels in the (x, y, z) directions (same as Image.data.shape, but with 0 and 1 swapped).
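As a rough illustration of how these properties relate to one another, the sketch below uses plain NumPy rather than scikit-rt itself. The helper index_to_position and the example values are hypothetical, and the sketch assumes that the index is reordered to (column, row, slice) before the affine is applied, so that x follows the column index.

```python
import numpy as np

# Hypothetical image array with 4 rows (y), 6 columns (x), and 3 slices (z).
data = np.zeros((4, 6, 3))

# Affine with voxel sizes on the diagonal and the origin in the last column.
affine = np.array([
    [1.5, 0.0, 0.0, -100.0],
    [0.0, 1.5, 0.0, -100.0],
    [0.0, 0.0, 3.0, 40.0],
    [0.0, 0.0, 0.0, 1.0],
])

voxel_size = tuple(np.diag(affine)[:3])  # (x, y, z) voxel sizes in mm
origin = tuple(affine[:3, 3])            # (x, y, z) origin in mm

# Same as data.shape, but with axes 0 and 1 swapped to give (x, y, z) order.
n_voxels = (data.shape[1], data.shape[0], data.shape[2])


def index_to_position(row, column, slice_idx):
    """Convert a (row, column, slice) index to an (x, y, z) position in mm."""
    # Assumption: x varies with column and y with row, so reorder first.
    x, y, z, _ = affine @ np.array([column, row, slice_idx, 1.0])
    return (x, y, z)


print(voxel_size, origin, n_voxels)
print(index_to_position(2, 1, 0))
```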
In the standard dicom-style configuration (Left, Posterior, Superior):
- The x-axis increases with column index (i.e. from left to right along each row), and points towards the patient's left (i.e. towards the heart, away from the liver).
- The y-axis increases with row index (i.e. down each column), and points from the patient's front to back (posterior).
- The z-axis increases along the slice index, and points from the patient's feet to head (superior).
A canonical nifti-style array and affine can be obtained by running Image.get_nifti_array_and_affine(). By convention, this points in the same z direction but has x and y axes reversed (Right, Anterior, Superior). In the affine matrix, the x and y origins are therefore defined as being at the opposite end of the scale.
Note that positions can also be specified in terms of slice number:
- For x and y, slice number is just numpy array index + 1 (slice number ranges from 1 to n_voxels, whereas array index ranges from 0 to n_voxels - 1).
- For z, by convention the slice number increases from 1 at the head to n_voxels at the feet, so it runs in the opposite direction to the array index (convert as n_voxels[2] - idx).
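The two conversion rules above can be written as a pair of small functions. This is only a sketch of the arithmetic described in the text; the function names are hypothetical and are not part of scikit-rt.

```python
def index_to_slice_number(idx, axis, n_voxels):
    """Convert a 0-based array index to a 1-based slice number.

    n_voxels is given in (x, y, z) order.
    """
    if axis not in ("x", "y", "z"):
        raise ValueError("axis must be 'x', 'y', or 'z'")
    if axis in ("x", "y"):
        return idx + 1
    # z slice numbers run from the head (1) to the feet (n_voxels[2]),
    # opposite to the array index.
    return n_voxels[2] - idx


def slice_number_to_index(sl, axis, n_voxels):
    """Convert a 1-based slice number back to a 0-based array index."""
    if axis not in ("x", "y", "z"):
        raise ValueError("axis must be 'x', 'y', or 'z'")
    if axis in ("x", "y"):
        return sl - 1
    return n_voxels[2] - sl


n_voxels = (6, 4, 3)
for idx in range(n_voxels[2]):
    print(idx, index_to_slice_number(idx, "z", n_voxels))
```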
Loading from a file
An image can be loaded from a dicom, nifti, or numpy file via:
im = Image(filepath)
If the dicom file is part of a series, any other files in that series in the same directory will also be loaded. Alternatively, you can give the path to a directory containing multiple dicom files. The first dicom file in that directory alphabetically will be loaded along with any other files in its series.
Loading from an array
An image can also be loaded from a numpy array. By default, it will be taken to have origin (0, 0, 0) and voxel sizes (1, 1, 1) mm; otherwise, these can be set manually via:
im = Image(array, voxel_size=(1.5, 1.5, 3), origin=(-100, -100, 40))
where the origin/voxel size lists are in order (x, y, z).
The origin and voxel sizes can also be specified via an affine matrix, e.g.
import numpy as np
affine = np.array([
    [1.5, 0, 0, -100],
    [0, 1.5, 0, -100],
    [0, 0, 3, 40],
    [0, 0, 0, 1]
])
im = Image(array, affine=affine)
where the first row of the affine matrix contains the x voxel size and origin, the second row contains y, and the third row contains z.
Plotting
To plot a slice of the image, you need to specify the orientation (x-y, y-z, or x-z; default x-y) and either the slice number, array index, or position in mm (by default, the central slice in the chosen orientation will be plotted).
e.g.
im.plot('y-z', idx=5)
Writing out image data
Images can be written out with the Image.write(filename) function. The output filetype will be inferred from the filename.
Writing to dicom
If filename ends in .dcm or is a directory (i.e. has no extension), the image will be written in dicom format. Each x-y slice will be written to a separate file labelled by slice number, e.g. slice 1 (corresponding to [:, :, -1] in the image array) would be saved to 1.dcm.
The path to a dicom file from which to take the header can be specified via the header_source parameter. If no path is given but the input source for the Image was a dicom file, the header will be taken from the source. Otherwise (e.g. if the file was loaded from a nifti and no header_source is given), a brand new header with new UIDs will be created. In that case, you can set the following info for that header:
- patient_id
- modality
- root_uid (an ID unique to your institute, which will prefix the generated dicom UIDs so that they are globally unique; one can be obtained here: https://www.medicalconnections.co.uk/FreeUID/)
Writing to nifti
If filename ends in .nii or .nii.gz, the image will be written to nifti. The nifti will be in canonical format, i.e. in Right, Anterior, Superior configuration. (Note that this means the nifti you write out may not be the same as the one you read in.)
Writing to a numpy array
If filename ends in .npy, the image array will be written to a numpy binary file. To write a canonical nifti-style array instead of the dicom-style array, set nifti_array=True. If set_geometry is True (which it is by default), a text file containing the voxel sizes and origin will be written to the same directory as the .npy file.
2. Structures and structure sets
A structure/region of interest (ROI) can be represented by either a set of contour points or a binary mask. The ROI class allows a structure to be loaded from either of these sources and converted from one to the other.
The ROI class
An ROI object, which contains a single structure, can be created via:
from skrt import ROI
roi = ROI(source)
The source can be any of the following:
- A dicom RTStruct file containing one or more sets of contours;
- A dictionary of contours, where the keys are slice positions in mm and the values are lists of lists of contour points for each contour on that slice;
- A nifti file containing a binary mask;
- A numpy file containing a binary mask;
- A numpy array.
If the input object is a dicom file, the name of the ROI within that file must be given. In addition, if the input source is a dicom file or contour dictionary, in order to convert the contours to a mask, the user must provide either of the following arguments when creating the ROI:
- image: assign an Image object associated with this ROI. The created mask will have the same dimensions as this image.
- shape, origin, and voxel_sizes: these tell the ROI the dimensions of the pixel array to create when making the mask.
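To show why the shape, origin, and voxel sizes are needed, here is a minimal NumPy sketch of turning a contour dictionary into a binary mask. It is not scikit-rt's implementation: the function names are hypothetical, contour points are assumed to be (x, y) pairs in mm, shape is assumed to be in (x, y, z) order, and a voxel counts as inside if its centre lies within the polygon (even-odd rule).

```python
import numpy as np


def points_in_polygon(xs, ys, polygon):
    """Even-odd rule test of many points against one closed polygon."""
    inside = np.zeros(xs.shape, dtype=bool)
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if y1 == y2:
            continue  # a horizontal edge cannot cross a horizontal ray
        # The edge straddles the horizontal line through the point...
        straddles = (y1 > ys) != (y2 > ys)
        # ...and crosses that line to the right of the point.
        x_cross = x1 + (ys - y1) * (x2 - x1) / (y2 - y1)
        inside ^= straddles & (xs < x_cross)
    return inside


def contours_to_mask(contours, shape, origin, voxel_sizes):
    """Build a boolean mask indexed (row, column, slice) = (y, x, z)."""
    nx, ny, nz = shape
    x = origin[0] + voxel_sizes[0] * np.arange(nx)  # voxel-centre x positions
    y = origin[1] + voxel_sizes[1] * np.arange(ny)  # voxel-centre y positions
    xs, ys = np.meshgrid(x, y)  # both have shape (ny, nx)
    mask = np.zeros((ny, nx, nz), dtype=bool)
    for z_pos, slice_contours in contours.items():
        k = int(round((z_pos - origin[2]) / voxel_sizes[2]))
        if not 0 <= k < nz:
            continue  # contour lies outside the requested volume
        for polygon in slice_contours:
            mask[:, :, k] |= points_in_polygon(xs, ys, polygon)
    return mask


# One square contour on the slice at z = 0 mm.
contours = {0.0: [[(1.5, 1.5), (4.5, 1.5), (4.5, 4.5), (1.5, 4.5)]]}
mask = contours_to_mask(contours, shape=(6, 6, 2), origin=(0, 0, 0),
                        voxel_sizes=(1, 1, 1))
print(mask[:, :, 0].astype(int))
```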
Additional useful arguments are:
- color: set the colour of this ROI for plotting. If not specified, this will either be read from the dicom file (if given) or the ROI will be assigned a unique colour.
- name: set the name of this ROI. If the input is a dicom RTStruct file, this name will be used to select the correct ROI from the file. If not given, the ROI will be given a unique name ("ROI 1" etc).
- load: if True (default), the input ROI will be converted into a mask/contours automatically upon creation. This can be time consuming if you are creating many ROIs, so setting it to False can save time and allow ROIs to be processed later on demand.
An Image object associated with this ROI can be set at any time via:
roi.image = image
This image can optionally be plotted in the background when the ROI is plotted.
Plotting an ROI
ROIs can be plotted as contours or a mask or both, and with or without an associated image behind them. The simplest plot command is:
roi.plot()
This will plot a contour in the x-y plane.
Additional options are:
- include_image: boolean, set to True to plot the associated image (if one is assigned) in the background;
- view: specify the orientation (x-y, y-z, or x-z);
- sl, pos, or idx: specify the slice number, slice position in mm, or slice index;
- zoom: amount by which to zoom in (will automatically zoom in on the centre of the structure).
The following plot types are available, set via the plot_type argument:
- contour: plot a contour
- mask: plot a binary mask
- filled: plot a contour on top of a semi-transparent mask
- centroid: plot a contour with the centroid marked by a cross
- filled centroid: plot a contour and centroid on top of a semi-transparent mask
[Figure: example contour plot with include_image=True]
[Figure: example mask plot]
Writing out an ROI
An ROI can be written to a nifti or numpy file as a binary mask. This can either be done by specifying a filename with an appropriate extension, e.g.
roi.write('my_roi.nii')
would automatically write to a nifti file.
Alternatively, the ROI's name can be used to create the filename. This is done by just providing an extension, e.g. if an ROI were called "heart", the following command:
roi.write(ext='.nii')
would produce a file called heart.nii containing the ROI as a binary mask. An output directory can also be optionally provided, e.g.
roi.write(outdir='my_rois', ext='.npy')
would produce a binary numpy file at my_rois/heart.npy.
Getting geometric properties
The ROI class has various methods for obtaining geometric properties of the ROI:
- Centroid: get the 3D centroid coordinates (i.e. the centre of mass of the ROI) in mm via roi.get_centroid(), or get the 2D coordinates for a single slice via, for example, roi.get_centroid(view='x-y', sl=10) (the slice position in mm, pos, or the slice index, idx, can be given instead of the slice number, sl).
- Centre: get the 3D midpoint coordinates (i.e. the mean of the maximum extents in each direction, rather than the centre of mass) via roi.get_centre(). The 2D midpoint of a slice is obtained in a similar way to the centroid, e.g. roi.get_centre(view='x-y', sl=10).
- Volume: roi.get_volume(units), where units can either be mm or voxels (default mm).
- Area: get the area of a given slice of a structure by running e.g. roi.get_area(view='x-y', sl=10, units='mm'). To get the area of the central slice, you can simply run roi.get_area().
- Length: get structure length along a given axis by running roi.get_length(axis, units), where axis is x, y, or z and units is mm or voxels.
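To make the difference between these quantities concrete, the sketch below computes a centroid, midpoint, volume, and lengths from a binary mask using plain NumPy. It only illustrates the definitions given above; the function mask_geometry is hypothetical, and scikit-rt's own methods may compute these values differently.

```python
import numpy as np


def mask_geometry(mask, voxel_size, origin):
    """Geometric properties of a boolean mask indexed (row, column, slice)."""
    voxel_size = np.asarray(voxel_size, dtype=float)  # (x, y, z) in mm
    origin = np.asarray(origin, dtype=float)          # (x, y, z) in mm
    rows, cols, slices = np.nonzero(mask)
    # Reorder indices to (x, y, z) before converting to positions in mm.
    idx = np.stack([cols, rows, slices], axis=1)
    positions = origin + idx * voxel_size
    length_voxels = idx.max(axis=0) - idx.min(axis=0) + 1
    return {
        # Centre of mass: mean position of all voxels in the mask.
        "centroid": positions.mean(axis=0),
        # Midpoint: halfway between the extreme positions along each axis.
        "centre": (positions.min(axis=0) + positions.max(axis=0)) / 2,
        "volume_voxels": idx.shape[0],
        "volume_mm": idx.shape[0] * voxel_size.prod(),
        "length_voxels": length_voxels,
        "length_mm": length_voxels * voxel_size,
    }


mask = np.zeros((4, 6, 3), dtype=bool)
mask[1:3, 2:5, 0:2] = True  # rows 1-2 (y), columns 2-4 (x), slices 0-1 (z)
geometry = mask_geometry(mask, voxel_size=(1.5, 1.5, 3), origin=(-100, -100, 40))
for key, value in geometry.items():
    print(key, value)
```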
Structure Sets: the RtStruct class
A structure set is an object that contains multiple ROIs. It is represented by the RtStruct class.
Loading a structure set
A structure set is created via
from skrt import RtStruct
rtstruct = RtStruct(source)
The source can be:
- The path to a dicom RtStruct file containing multiple ROIs;
- The path to a directory containing one or more nifti or numpy files, each containing a binary ROI mask;
- A list of paths to nifti or numpy ROI mask files.
In addition, more ROIs can be added later via:
rtstruct.add_structs(source)
where source is any of the above source types.
Alternatively, single ROIs can be added one at a time via any of the valid ROI sources (see above), and with any of the ROI initialisation arguments. An empty RtStruct can be created and then populated, e.g.
rtstruct = RtStruct()
rtstruct.add_struct('heart.nii', color='red')
rtstruct.add_struct('some_structs.dcm', name='lung')
The RtStruct can also be associated with an Image object by specifying the image argument upon creation, or by running rtstruct.set_image(image). This image will be assigned to all ROIs in the structure set.
Filtering a structure set
Sometimes you may wish to load many ROIs (e.g. from a dicom RtStruct file) and then filter them by name. This is done via:
rtstruct.filter_structs(to_keep, to_remove)
where to_keep and to_remove are optional lists containing structure names, or wildcards with the * character. First, all of the ROIs belonging to rtstruct are checked and only kept if they match the names or wildcards in to_keep. The remaining ROIs are then removed if their names match the names or wildcards in to_remove.
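The keep-then-remove order can be sketched with the standard library's fnmatch module, operating on a plain list of names. This is an illustration only: filter_names is a hypothetical helper, and case-insensitive matching is an assumption of the sketch rather than documented behaviour.

```python
from fnmatch import fnmatchcase


def matches_any(name, patterns):
    """True if name matches any pattern (assumed case-insensitive here)."""
    return any(fnmatchcase(name.lower(), p.lower()) for p in patterns)


def filter_names(names, to_keep=None, to_remove=None):
    """Apply to_keep first, then to_remove, and return the surviving names."""
    kept = list(names)
    if to_keep is not None:
        kept = [n for n in kept if matches_any(n, to_keep)]
    if to_remove is not None:
        kept = [n for n in kept if not matches_any(n, to_remove)]
    return kept


names = ["heart", "lung_left", "lung_right", "Parotid_R"]
print(filter_names(names, to_keep=["lung*", "heart"], to_remove=["*right"]))
```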
To restore a structure set to its original state (i.e. reload it from its source), run rtstruct.reset().
Renaming ROIs
ROIs can be renamed by mapping from one or more possible original names to a single final name. In this way, multiple structure sets where the same ROI might have different names can be standardised to have the same ROI names.
For example, let's say you wish to rename the right parotid gland to right_parotid, but you know that it has a variety of names across different structure sets. You could do this with the following (assuming my_rtstructs is a list of RtStruct objects):
names_map = {
    'right_parotid': ['right*parotid', 'parotid*right', 'R parotid', 'parotid_R']
}
for rtstruct in my_rtstructs:
    rtstruct.rename_structs(names_map)
By default, only one ROI per structure set will be renamed; for example, if a structure set for some reason contained both right parotid and R parotid, only the first in the list (right parotid) would be renamed. This behaviour can be turned off by setting first_match_only=False; beware that this could lead to duplicate structure names.
You can also choose to discard any structures that aren't in your renaming map by setting keep_renamed_only=True.
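The renaming rules can be sketched on a plain list of names as follows. This is not scikit-rt's code: rename_names is a hypothetical stand-in, the patterns are tried in the order they appear in the map, and case-insensitive matching is an assumption.

```python
from fnmatch import fnmatchcase


def rename_names(names, names_map, first_match_only=True, keep_renamed_only=False):
    """Return a new list of names after applying names_map."""
    new_names = list(names)
    renamed = set()  # indices of entries that have already been renamed
    for new_name, patterns in names_map.items():
        found = False
        for pattern in patterns:
            for i, old_name in enumerate(names):
                if i in renamed:
                    continue
                if fnmatchcase(old_name.lower(), pattern.lower()):
                    new_names[i] = new_name
                    renamed.add(i)
                    found = True
                    if first_match_only:
                        break
            if found and first_match_only:
                break  # stop at the first pattern that matched anything
    if keep_renamed_only:
        return [n for i, n in enumerate(new_names) if i in renamed]
    return new_names


names_map = {
    "right_parotid": ["right*parotid", "parotid*right", "R parotid", "parotid_R"]
}
names = ["R parotid", "right parotid", "heart"]
print(rename_names(names, names_map))
print(rename_names(names, names_map, first_match_only=False))
print(rename_names(names, names_map, keep_renamed_only=True))
```

With first_match_only=False, the second call shows the duplicate names that the text warns about.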
Getting ROIs
Get a list of the ROI objects belonging to the structure set:
rtstruct.get_structs()
Get a list of names of the ROI objects:
rtstruct.get_struct_names()
Get a dictionary of ROI objects with their names as keys:
rtstruct.get_struct_dict()
Get an ROI object with a specific name:
rtstruct.get_struct(name)
Print the ROI names:
rtstruct.print_structs()
Copying a structure set
An RtStruct object can be copied to a new RtStruct, optionally with some structures filtered/renamed (you might want to do this if you want to preserve the original structure set, while making a filtered version too), via:
rtstruct_filtered = rtstruct.copy(names, to_keep, to_remove, keep_renamed_only)
Writing a structure set
The ROIs in a structure set can be written to a directory of nifti or numpy files, via:
rtstruct.write(outdir, ext)
where outdir is the output directory and ext is either .nii, .nii.gz, or .npy. Dicom writing will be supported in future.
Assigning RtStructs to an Image
Just as ROIs and structure sets can be associated with an image, the Image object can be associated with one or more RtStruct objects. This is done via:
from skrt import Image, RtStruct
image = Image("some_image.nii")
rtstruct = RtStruct("roi_directory")
image.add_structs(rtstruct)
Now, when the Image is plotted, the ROIs in its structure set(s) can be plotted on top. To plot all structure sets, run:
image.plot(structure_set='all')
Note that this could be slow if there are many structure sets containing many structures.
To plot just one structure set, you can also provide the index of the structure set in the list of structure sets belonging to the image, e.g. to plot the most recently added structure set:
image.plot(structure_set=-1)
To add a legend to the plot, set struct_legend=True.
The image's structure sets can be cleared at any time via
image.clear_structs()
3. Patients and studies
The Patient and Study classes allow multiple medical images and structure sets associated with a single patient to be read into one object.
Expected file structure
Patient and study file structure
The files for a patient should be sorted into a specific structure in order for the Patient class to be able to read them in correctly.
The top level directory represents the entire patient; this should be a directory whose name is the patient's ID.
The next one or two levels represent studies. A study is identified by a directory whose name is a timestamp, which is a string with format YYYYMMDD_hhmmss. These directories can either appear within the patient directory, or be nested in a further level of directories, for example if you wished to separate groups of studies.
A valid file structure could look like this:
mypatient1
--- 20200416_120350
--- 20200528_100845
This would represent a patient with ID mypatient1, with two studies, one taken on 16/04/2020 and one taken on 28/05/2020.
Another valid file structure could be:
mypatient1
--- planning
------ 20200416_120350
------ 20200528_100845
--- relapse
------ 20211020_093028
This patient would have three studies, two in the "planning" category and one in the "relapse" category.
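The directory convention can be checked with a short standard-library sketch that builds the second layout in a temporary directory and picks out the study directories by their timestamp names. The helpers parse_timestamp and find_studies are hypothetical and are not how the Patient class is implemented.

```python
import re
import tempfile
from datetime import datetime
from pathlib import Path

TIMESTAMP_PATTERN = re.compile(r"\d{8}_\d{6}")


def parse_timestamp(name):
    """Return a datetime if name has the form YYYYMMDD_hhmmss, else None."""
    if not TIMESTAMP_PATTERN.fullmatch(name):
        return None
    try:
        return datetime.strptime(name, "%Y%m%d_%H%M%S")
    except ValueError:
        return None  # right shape, but not a real date or time


def find_studies(patient_dir):
    """List (subdirectory, timestamp) pairs for studies one or two levels down."""
    studies = []
    for child in sorted(Path(patient_dir).iterdir()):
        if not child.is_dir():
            continue
        stamp = parse_timestamp(child.name)
        if stamp is not None:
            studies.append((None, stamp))  # study directly inside the patient
            continue
        for grandchild in sorted(child.iterdir()):
            stamp = parse_timestamp(grandchild.name)
            if grandchild.is_dir() and stamp is not None:
                studies.append((child.name, stamp))
    return studies


with tempfile.TemporaryDirectory() as tmp:
    patient = Path(tmp) / "mypatient1"
    layout = [
        ("planning", "20200416_120350"),
        ("planning", "20200528_100845"),
        ("relapse", "20211020_093028"),
    ]
    for subdir, stamp in layout:
        (patient / subdir / stamp).mkdir(parents=True)
    studies = find_studies(patient)

for subdir, stamp in studies:
    print(subdir, stamp.isoformat())
```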
Files within a study
Each study can contain images of various imaging modalities, and associated structure sets. Within a study directory, there can be three "special" directories, named RTSTRUCT, RTDOSE, and RTPLAN (currently only RTSTRUCT does anything; the others will be updated soon), containing structure sets, dose fields, and radiotherapy plans, respectively.
Any other directories within the study directory are taken to represent an imaging modality. The structure sets associated with this modality should be nested inside the RTSTRUCT directory, inside directories with the same name as the image directories. The images and structure sets themselves should be further nested inside a timestamp directory representing the time at which that image was taken.
For example, if a study contained both CT and MR images, as well as two structure sets associated with the CT image, the file structure should be as follows:
20200416_120350
--- CT
------ 20200416_120350
--------- 1.dcm
--------- 2.dcm ... etc
--- MR
------ 20200417_160329
--------- 1.dcm
--------- 2.dcm ... etc
--- RTSTRUCT
------ CT
--------- 20200416_120350
------------ RTSTRUCT_20200512_134729.dcm
------------ RTSTRUCT_20200517_162739.dcm
The Patient class
A Patient object is created by providing the path to the top-level patient directory:
from skrt import Patient
p = Patient('mypatient1')
A list of the patient's associated studies is stored in p.studies.
Additional properties can be accessed:
- Patient ID: p.id
- Patient sex: p.get_sex()
- Patient age: p.get_age()
- Patient birth date: p.get_birth_date()
Writing a patient tree
A patient's files can be written out in nifti or numpy format. By default, files will be written as compressed nifti (.nii.gz) files. This is done via the write method:
p.write('some_dir')
where some_dir is the directory in which the new patient folder will be created (if not given, it will be created in the current directory).
By default, all imaging modalities will be written. To ignore a specific modality, a list to_ignore can be provided, e.g. to ignore any MR images:
p.write('some_dir', to_ignore=['MR'])
To write out structure sets as well as images, the structure_set argument should be set. This can be one of:
- 'all': write all structure sets.
- The index of the structure set (e.g. to write only the newest structure set for each image, set structure_set=-1).
- A list of indices of structure sets to write (e.g. to write only the oldest and newest, set structure_set=[0, -1]).
By default, no structure sets will be written, as conversion of structures from contours to masks can be slow for large structures.
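The accepted forms of structure_set amount to ordinary Python list indexing over an image's structure sets, ordered from oldest to newest. A minimal sketch, using a hypothetical helper and strings in place of real structure sets:

```python
def select_structure_sets(structure_sets, structure_set=None):
    """Pick structure sets using None, 'all', an index, or a list of indices."""
    if structure_set is None:
        return []  # default: nothing is selected
    if isinstance(structure_set, str):
        if structure_set == "all":
            return list(structure_sets)
        raise ValueError(f"Unrecognised option: {structure_set!r}")
    if isinstance(structure_set, int):
        return [structure_sets[structure_set]]
    return [structure_sets[i] for i in structure_set]


# Stand-ins for an image's structure sets, ordered oldest to newest.
structure_sets = ["oldest", "middle", "newest"]
print(select_structure_sets(structure_sets, "all"))
print(select_structure_sets(structure_sets, -1))
print(select_structure_sets(structure_sets, [0, -1]))
```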
The Study class
A Study object stores images and structure sets. A list of studies can be extracted from a Patient object via the property Patient.studies. You can access the study's path via Study.path. If the study was nested inside a subdirectory, the name of that subdirectory is accessed via Study.subdir.
Images
For each imaging modality subdirectory inside the study, a new class property will be created to contain a list of images of that modality, called {modality}_scans, where the modality is taken from the subdirectory name (note that this is always converted to lower case). E.g. if there were directories called CT and MR, the Study object would have properties ct_scans and mr_scans containing lists of Image objects.
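One way such per-modality properties could be created is with setattr, as in the sketch below. The class StudySketch is purely illustrative and says nothing about how Study is actually implemented; strings stand in for Image objects.

```python
class StudySketch:
    """Toy stand-in for a study that exposes {modality}_scans attributes."""

    def __init__(self, images_by_directory):
        for dirname, images in images_by_directory.items():
            # Directory names are lower-cased to form the attribute name.
            setattr(self, f"{dirname.lower()}_scans", list(images))


study = StudySketch({"CT": ["ct_image_0", "ct_image_1"], "MR": ["mr_image_0"]})
print(study.ct_scans)
print(study.mr_scans)
```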
Structure sets
The study's structure sets can be accessed in two ways. Firstly, the structure sets associated with an image can be extracted from the structs property of the Image itself; this is a list of RtStruct objects. E.g. to get the newest structure set for the oldest CT image in the oldest study, you could run:
p = Patient('mypatient1')
s = p.studies[0]
structure_set = s.ct_scans[0].structs[-1]
In addition, structures associated with each imaging modality will be stored in a property of the Study object called {modality}_structs. E.g. to get the oldest CT-related structure set, you could run:
structure_set = s.ct_structs[0]
The RtStruct object also has an associated image property (structure_set.image), which can be used to find out which Image is associated with that structure set.
4. Synthetic images
The SyntheticImage class enables the creation of images containing simple geometric shapes.
Creating a synthetic image
To create an empty image, load the SyntheticImage class and specify the desired image shape in order (x, y, z), e.g.
from skrt.simulation import SyntheticImage
sim = SyntheticImage((250, 250, 50))
The following arguments can be used to adjust the image's properties:
- voxel_size: voxel sizes in mm in order (x, y, z); default (1, 1, 1).
- origin: position of the top-left voxel in mm; default (0, 0, 0).
- intensity: value of the background voxels of the image.
- noise_std: standard deviation of Gaussian noise to apply to the image. This noise will also be added on top of any shapes. (Can be changed later by altering the sim.noise_std property.)
Adding shapes
The SyntheticImage object has various methods for adding geometric shapes. Each shape has the following arguments:
- intensity: intensity value with which to fill the voxels of this shape.
- above: if True, this shape will be overlaid on top of all existing shapes; otherwise, it will be added below all other shapes.
The available shapes and their specific arguments are:
- Sphere: sim.add_sphere(radius, centre=None)
  - radius: radius of the sphere in mm.
  - centre: position of the centre of the sphere in mm (if None, the sphere will be placed in the centre of the image).
- Cuboid: sim.add_cuboid(side_length, centre=None)
  - side_length: side length in mm. Can either be a single value (to create a cube) or a list of the (x, y, z) side lengths.
  - centre: position of the centre of the cuboid in mm (if None, the cuboid will be placed in the centre of the image).
- Cube: sim.add_cube(side_length, centre=None)
  - Same as add_cuboid.
- Cylinder: sim.add_cylinder(radius, length, axis='z', centre=None)
  - radius: radius of the cylinder in mm.
  - length: length of the cylinder in mm.
  - axis: either 'x', 'y', or 'z'; direction along which the length of the cylinder should lie.
  - centre: position of the centre of the cylinder in mm (if None, the cylinder will be placed in the centre of the image).
- Grid: sim.add_grid(spacing, thickness=1, axis=None)
  - spacing: grid spacing in mm. Can either be a single value, or a list of (x, y, z) spacings.
  - thickness: gridline thickness in mm. Can either be a single value, or a list of (x, y, z) thicknesses.
  - axis: if None, gridlines will be created in all three directions; if set to 'x', 'y', or 'z', grids will only be created in the plane orthogonal to the chosen axis, such that a solid grid runs through the image along that axis.
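For readers curious how shapes like these can be rasterised, the following NumPy sketch paints a cuboid and then a sphere onto a uniform background and adds Gaussian noise. It is a simplified illustration, not the SyntheticImage implementation: all function names are hypothetical, and a voxel is taken to belong to a shape if its centre lies inside the shape.

```python
import numpy as np


def voxel_centres(shape, voxel_size=(1, 1, 1), origin=(0, 0, 0)):
    """Return x, y, z coordinate grids indexed (row, column, slice)."""
    axes = [origin[i] + voxel_size[i] * np.arange(shape[i]) for i in range(3)]
    # indexing="xy" gives arrays of shape (ny, nx, nz).
    return np.meshgrid(*axes, indexing="xy")


def sphere_mask(grids, radius, centre):
    """Voxels whose centres lie within radius of the sphere centre."""
    xs, ys, zs = grids
    dist_sq = (xs - centre[0]) ** 2 + (ys - centre[1]) ** 2 + (zs - centre[2]) ** 2
    return dist_sq <= radius ** 2


def cuboid_mask(grids, side_length, centre):
    """Voxels whose centres lie inside an axis-aligned cuboid."""
    sides = np.broadcast_to(side_length, (3,))  # a single value makes a cube
    xs, ys, zs = grids
    return ((np.abs(xs - centre[0]) <= sides[0] / 2)
            & (np.abs(ys - centre[1]) <= sides[1] / 2)
            & (np.abs(zs - centre[2]) <= sides[2] / 2))


shape = (20, 16, 10)                       # (x, y, z)
grids = voxel_centres(shape)
image = np.full(grids[0].shape, -1000.0)   # background intensity
# Shapes painted later overwrite earlier ones, so the sphere sits on top.
image[cuboid_mask(grids, (8, 6, 4), centre=(10, 8, 5))] = 50.0
image[sphere_mask(grids, 3, centre=(10, 8, 5))] = 200.0

# Gaussian noise is added last, on top of the background and both shapes.
rng = np.random.default_rng(seed=0)
noisy = image + rng.normal(0.0, 5.0, size=image.shape)
print(image.shape, noisy.mean())
```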
To remove all shapes from the image, run
sim.reset()
Plotting
The SyntheticImage class inherits from the Image class, and can thus be plotted in the same way by calling
sim.plot()
along with any of the arguments available to the Image plotting method.
Rotations and translations
Rotations and translations can be applied to the image using:
sim.translate(dx, dy, dz)
or
sim.rotate(pitch, yaw, roll)
Rotations and translations can be removed by running
sim.reset_transforms()
Getting the image array
To obtain a NumPy array containing the image data, run
array = sim.get_data()
This array will contain all of the added shapes, as well as having any rotations, translations, and noise applied.
Saving
The synthetic image can be written to an image file by running
sim.write(outname)
The output name outname can be:
- A path ending in .nii or .nii.gz: image will be written to a nifti file.
- A path ending in .npy: image will be written to a binary NumPy file.
- A path to a directory: image will be written to dicom files (one file per slice) inside the directory.
The write function can also take any of the arguments of the Image.write() function.
Adding structures
When shapes are added to the image, they can also be set as structures. This allows you to:
- Plot them as contours or masks on top of the image;
- Access an RtStruct object containing the structures;
- Write structures out separately as masks or as a dicom RtStruct file.
Single structures
To assign a shape as a structure, you can either give it a name upon creation, e.g.:
sim.add_sphere(50, name='my_sphere')
or set the is_struct argument to True (the structure will then be given an automatic name based on its shape type):
sim.add_sphere(50, is_struct=True)
When sim.plot() is called, its structures will automatically be plotted as contours. Some useful extra options to the plot function are:
- struct_plot_type: set plot type to any of the valid ROI plotting types (mask, contour, filled, centroid, filled centroid).
- centre_on_struct: name of structure on which the plotted image should be centred.
- struct_legend: set to True to draw a legend containing the structure names.
Grouped structures
Multiple shapes can be combined to create a single structure. To do this, set the group argument to some string when creating structures. Any shapes created with the same group name will be added to this group.
E.g. to create a single structure called "two_cubes" out of two cubes centred at different positions, run:
sim.add_cube(10, centre=(5, 5, 5), group='two_cubes')
sim.add_cube(10, centre=(7, 7, 5), group='two_cubes')
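Conceptually, a grouped structure is the union of the masks of its member shapes. The sketch below reproduces the two-cube example with plain NumPy on a 1 mm grid with origin (0, 0, 0); cube_mask and the groups dictionary are hypothetical and only illustrate the idea.

```python
import numpy as np


def cube_mask(shape, side_length, centre):
    """Boolean mask, indexed (row, column, slice), of a cube on a 1 mm grid."""
    nx, ny, nz = shape
    xs, ys, zs = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz),
                             indexing="xy")
    half = side_length / 2
    return ((np.abs(xs - centre[0]) <= half)
            & (np.abs(ys - centre[1]) <= half)
            & (np.abs(zs - centre[2]) <= half))


shape = (20, 20, 12)  # (x, y, z)
groups = {}
for centre in [(5, 5, 5), (7, 7, 5)]:
    mask = cube_mask(shape, 10, centre)
    # Shapes sharing a group name are merged with a logical OR.
    previous = groups.get("two_cubes", np.zeros_like(mask))
    groups["two_cubes"] = previous | mask

two_cubes = groups["two_cubes"]
print(two_cubes.sum(), "voxels in the combined structure")
```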
Getting structures
To get an RtStruct object containing all of the structures belonging to the image, run:
rtstruct = sim.get_rtstruct()
You can also access a single structure as an ROI object by running:
roi = sim.get_struct(struct_name)
Saving structures
When the SyntheticImage.write() function is called, the structures belonging to that image will also be written. If the provided outname is that of a nifti or NumPy file, the structures will be written to nifti or NumPy files, respectively, inside a directory with the same name as outname but with the extension removed.