This gear reports on the Dicom files data elements and optionally fixes or enhances the problematic ones, generating a new archive.
Dicom Fixer
Overview
Summary
This gear is responsible for reporting on the Dicom files data elements and optionally fixing/enhancing the problematic ones and generating a new archive.
Cite
License: MIT
Classification
Category: Converter
Gear Level:
- Project
- Subject
- Session
- Acquisition
- Analysis
Inputs
- dicom
  - Name: dicom
  - Type: file
  - Optional: False
  - Classification: DICOM file
  - Description: Input DICOM to be fixed
Config
- debug
  - Name: debug
  - Type: boolean
  - Description: Include debug statements in output.
  - Default: False
- dicom-standard
  - Name: dicom-standard
  - Type: string
  - Description: Specify which DICOM standard edition to use. `local` refers to the locally-saved edition for faster processing. `current` fetches the most up-to-date edition at runtime.
  - Default: "local"
- force_decompress
  - Name: force_decompress
  - Type: boolean
  - Description: Expert option: Force the standardize_transfer_syntax fix even if the file size may be too big for available memory. WARNING: Choosing this option may result in the gear failing due to being out of memory.
  - Default: False
- standardize_transfer_syntax
  - Name: standardize_transfer_syntax
  - Type: boolean
  - Description: Whether or not to change `TransferSyntaxUID` to ExplicitVRLittleEndian.
  - Default: True
- strict-validation
  - Name: strict-validation
  - Type: boolean
  - Description: Enforce strict DICOM validation if true; else, allow python-parsable values that may not meet the DICOM standard.
  - Default: True
- tag
  - Name: tag
  - Type: string
  - Description: The tag to be added to the input file upon run completion.
  - Default: "dicom-fixer"
- unique
  - Name: unique
  - Type: boolean
  - Description: Enforce DICOM uniqueness by SOPInstanceUID or file hash. Remove duplicate DICOMs.
  - Default: True
- zip-single-dicom
  - Name: zip-single-dicom
  - Type: string
  - Description: Output a single DICOM as zip (.dcm.zip) instead of a DICOM (.dcm) or match input.
  - Default: "match"
- convert-palette
  - Name: convert-palette
  - Type: boolean
  - Description: Convert palette color to RGB. Depending on the result of applying the palette color lookup tables, the pixel type may need to be changed to 16-bit unsigned int to encode the pixel data without loss. Unfortunately, some IODs do not allow 16-bit pixel representations, so for those modalities this will need to be turned off. This is known to affect:
    - Ultrasound Multiframe
  - Default: "true"
- new-uids-needed
  - Name: new-uids-needed
  - Type: boolean
  - Description: Create a new SeriesInstanceUID based on the acquisition label and a new StudyInstanceUID based on the session label.
  - Default: False
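As a quick reference, the defaults listed above can be collected into a plain Python dictionary. This is only a sketch: `DEFAULT_CONFIG` and `build_config` are hypothetical names and not part of the gear, and the `convert-palette` default is written as the boolean `True` on the assumption that the quoted "true" above is meant as a boolean.

```python
# Hypothetical helper: mirrors the documented defaults, not the gear's own code.
DEFAULT_CONFIG = {
    "debug": False,
    "dicom-standard": "local",
    "force_decompress": False,
    "standardize_transfer_syntax": True,
    "strict-validation": True,
    "tag": "dicom-fixer",
    "unique": True,
    "zip-single-dicom": "match",
    "convert-palette": True,  # assumption: documented as "true", treated as boolean
    "new-uids-needed": False,
}


def build_config(overrides: dict) -> dict:
    """Return the defaults with overrides applied, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULT_CONFIG)
    if unknown:
        raise KeyError(f"Unknown config option(s): {sorted(unknown)}")
    return {**DEFAULT_CONFIG, **overrides}


# Example: non-strict validation against the most recent DICOM Standard.
config = build_config({"strict-validation": False, "dicom-standard": "current"})
```

Rejecting unknown keys catches mistakes such as mixing up hyphenated and underscored option names, which both appear in this gear's configuration.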
Metadata
This gear identifies and fixes invalid DICOM tags of the input file. Additionally, dicom-fixer will add a QC result titled `fixed` to either the input file if no output is written, or the output file if there is an output. The state will be either `PASS` or `FAIL`. For a `PASS` result, any fixing events will be under the `events` key.
Prerequisites
Prerequisite Gear Runs
- Splitter
  - Level: Acquisition
  - Note: If multiple SeriesInstanceUIDs are present within the DICOM collection, Splitter must be run before Dicom-Fixer.
Prerequisite Files
No prerequisite files.
Prerequisite Metadata
No prerequisite metadata.
Usage
Description
This gear checks the input DICOM's tags and makes corrections as needed, whether configured to adhere to the DICOM Standard (when `strict-validation=True`) or configured to only check for Python parsability (when `strict-validation=False`).
This gear utilizes the `tracker` functionality of `RawDataElements` in fw-file. Namely, it houses a collection of default and custom fixers that are applied to `RawDataElements` on read through `pydicom`.
Dicom fixer will also decompress compressed TransferSyntaxes so that no issues are met on the platform downstream.
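The idea of running a chain of fixers over raw elements while tracking what changed can be sketched in plain Python. This is an illustration only and does not use the fw-file or pydicom APIs: `RawElement`, `fill_missing_vr`, `strip_null_bytes`, and `apply_fixers` are hypothetical names, and the one-entry VR dictionary is a toy stand-in.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class RawElement(NamedTuple):
    """Hypothetical stand-in for a raw data element (tag, VR, raw bytes)."""
    tag: int
    vr: Optional[str]
    value: bytes


Fixer = Callable[[RawElement], RawElement]


def fill_missing_vr(elem: RawElement) -> RawElement:
    """Replace VR=None with a VR from a (toy) dictionary."""
    known_vrs = {0x00100010: "PN"}  # PatientName
    if elem.vr is None and elem.tag in known_vrs:
        return elem._replace(vr=known_vrs[elem.tag])
    return elem


def strip_null_bytes(elem: RawElement) -> RawElement:
    """Remove null bytes from the raw value."""
    return elem._replace(value=elem.value.replace(b"\x00", b""))


def apply_fixers(elem: RawElement, fixers: List[Fixer]) -> Tuple[RawElement, List[str]]:
    """Apply each fixer in order and record an event whenever the element changes."""
    events = []
    for fixer in fixers:
        fixed = fixer(elem)
        if fixed != elem:
            events.append(f"{fixer.__name__}: {elem} -> {fixed}")
            elem = fixed
    return elem, events


raw = RawElement(tag=0x00100010, vr=None, value=b"Doe^John\x00")
fixed, events = apply_fixers(raw, [fill_missing_vr, strip_null_bytes])
```

Recording one event per change is what makes it possible to report the applied fixes later, for example under the `events` key of the QC result.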
Standardize transfer syntax
The `standardize_transfer_syntax` option supports the following Transfer Syntaxes:
Name | UID | Supported |
---|---|---|
Explicit VR Little Endian | 1.2.840.10008.1.2.1 | :white_check_mark: |
Implicit VR Little Endian | 1.2.840.10008.1.2 | :white_check_mark: |
Explicit VR Big Endian | 1.2.840.10008.1.2.2 | :white_check_mark: |
Deflated Explicit VR Little Endian | 1.2.840.10008.1.2.1.99 | :white_check_mark: |
RLE Lossless | 1.2.840.10008.1.2.5 | :white_check_mark: |
JPEG Baseline (Process 1) | 1.2.840.10008.1.2.4.50 | :white_check_mark: |
JPEG Extended (Process 2 and 4) | 1.2.840.10008.1.2.4.51 | :white_check_mark: |
JPEG Lossless (Process 14) | 1.2.840.10008.1.2.4.57 | :white_check_mark: |
JPEG Lossless (Process 14, SV1) | 1.2.840.10008.1.2.4.70 | :white_check_mark: |
JPEG LS Lossless | 1.2.840.10008.1.2.4.80 | :white_check_mark: |
JPEG LS Lossy | 1.2.840.10008.1.2.4.81 | :white_check_mark: |
JPEG2000 Lossless | 1.2.840.10008.1.2.4.90 | :white_check_mark: |
JPEG2000 | 1.2.840.10008.1.2.4.91 | :white_check_mark: |
JPEG2000 Multi-component Lossless | 1.2.840.10008.1.2.4.92 | :x: |
JPEG2000 Multi-component | 1.2.840.10008.1.2.4.93 | :x: |
All transfer syntax decompression is done with pydicom using either `numpy` and `GDCM`, `JPEG-LS`, `Pillow`, or `pylibjpeg`. For more information, see pydicom's documentation on supported transfer syntaxes.
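The support table above can be expressed as a simple lookup. The sketch below is not the gear's implementation: `SUPPORTED_UIDS`, `TARGET_UID`, and `needs_standardizing` are hypothetical names, and the UIDs are copied from the table.

```python
# UIDs marked as supported in the table above.
SUPPORTED_UIDS = {
    "1.2.840.10008.1.2.1",     # Explicit VR Little Endian
    "1.2.840.10008.1.2",       # Implicit VR Little Endian
    "1.2.840.10008.1.2.2",     # Explicit VR Big Endian
    "1.2.840.10008.1.2.1.99",  # Deflated Explicit VR Little Endian
    "1.2.840.10008.1.2.5",     # RLE Lossless
    "1.2.840.10008.1.2.4.50",  # JPEG Baseline (Process 1)
    "1.2.840.10008.1.2.4.51",  # JPEG Extended (Process 2 and 4)
    "1.2.840.10008.1.2.4.57",  # JPEG Lossless (Process 14)
    "1.2.840.10008.1.2.4.70",  # JPEG Lossless (Process 14, SV1)
    "1.2.840.10008.1.2.4.80",  # JPEG LS Lossless
    "1.2.840.10008.1.2.4.81",  # JPEG LS Lossy
    "1.2.840.10008.1.2.4.90",  # JPEG2000 Lossless
    "1.2.840.10008.1.2.4.91",  # JPEG2000
}

# The syntax that standardize_transfer_syntax converts to.
TARGET_UID = "1.2.840.10008.1.2.1"


def needs_standardizing(transfer_syntax_uid: str) -> bool:
    """True if the UID is supported and is not already Explicit VR Little Endian."""
    return transfer_syntax_uid in SUPPORTED_UIDS and transfer_syntax_uid != TARGET_UID
```

With this sketch, a JPEG2000 Multi-component file (`1.2.840.10008.1.2.4.92`) is left alone because its syntax is not in the supported set.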
Color space conversion
For color files, after decompression (usually jpeg) the color space will be converted to RGB for better downstream support. Currently color space conversion is supported for the following PhotometricInterpretations:
- YBR_FULL_422
- YBR_FULL
- PALETTE COLOR
For images with a Modality of `US` (ultrasound) or `IVUS` (intravascular ultrasound) and a PhotometricInterpretation of `PALETTE COLOR`, the resulting 16-bit RGB conversion is rescaled to 8-bit RGB to ensure the resulting DICOM files are valid.
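One way to rescale 16-bit samples to 8-bit is a linear mapping of the full 0..65535 range onto 0..255 with rounding. The sketch below assumes that approach; the gear's exact rescaling method is not described here, and `rescale_16_to_8` and `rescale_rgb_pixel` are hypothetical names.

```python
from typing import Tuple


def rescale_16_to_8(value: int) -> int:
    """Linearly map a 16-bit sample (0..65535) to 8 bits (0..255), rounding to nearest."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError(f"Not a 16-bit unsigned value: {value}")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (value * 255 + 32767) // 65535


def rescale_rgb_pixel(pixel: Tuple[int, int, int]) -> Tuple[int, int, int]:
    """Rescale each channel of a 16-bit RGB pixel to 8 bits."""
    r, g, b = pixel
    return (rescale_16_to_8(r), rescale_16_to_8(g), rescale_16_to_8(b))


print(rescale_rgb_pixel((0, 32768, 65535)))  # (0, 128, 255)
```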
File Specifications
Input
The input of this gear is a DICOM file.
Workflow
graph LR;
A[dicom]:::input --> C;
C[Upload] --> D[Acquisition];
D:::container --> E((Gear));
E:::gear --> F[Input replaced with fixed DICOM if fixes applied];
classDef container fill:#57d,color:#fff
classDef input fill:#7a9,color:#fff
classDef gear fill:#659,color:#fff
Description of workflow:
- Upload file to container
- Select file as input to gear
- Configure gear as needed
- If fixes are required and the gear passes, the gear replaces the file with an updated version and QC is marked as `PASS`. If fixes are required and the gear does not pass, QC is marked as `FAIL`. If no fixes are identified, the input file is retained and marked as `PASS`.
Fixers
The following fixes are applied:
General fixes from fw-file
- Fix VR for SpecificCharacterSet and surplus SequenceDelimitationItem tags.
- Replace VR=None with VR found in the public or private dictionaries.
- Replace VR='UN' with VR found in the public or private dictionaries.
- Replace invalid \ characters with _ in string VR values of VM=1.
- Crop text VRs which are too long.
- Fix an invalid UID. If the UID is semi-valid (e.g. composed of at least 5 nodes, including an invalid node starting with 0), generate a new UID using the semi-valid UID as the entropy source (deterministic); otherwise, generate a new UID.
- Fix date and time tags. Attempt to parse an invalid date and format it correctly.
- Fix AS strings. Ensure one of D,W,M,Y is at end of the string, and pad to 4 characters. If no time quantifier is present, assume Years.
- Fix number strings. Fix DS (Decimal String) and IS (Integer String) number strings by removing invalid characters and truncating floats in the IS VR.
- Fix invalid character. Attempt to remove non-printable characters from byte decoding.
- Fix invalid VR value. Try to fix an invalid value for the given VR.
- Fix LUT Descriptor tags.
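To make one of these fixes concrete, the AS (Age String) rule can be sketched as follows. This is not fw-file's implementation: `fix_age_string` is a hypothetical name, and leaving uninterpretable values untouched is an assumption of this sketch.

```python
import re


def fix_age_string(value: str) -> str:
    """Normalize an age string to the 4-character form nnnD, nnnW, nnnM or nnnY."""
    text = value.strip().upper()
    match = re.fullmatch(r"(\d+)\s*([DWMY])?", text)
    if match is None:
        return value  # assumption: leave values we cannot interpret untouched
    number = int(match.group(1))
    unit = match.group(2) or "Y"  # no time quantifier present: assume years
    return f"{number:03d}{unit}"  # zero-pad the number to 3 digits


print(fix_age_string("25"))    # 025Y
print(fix_age_string("3m"))    # 003M
print(fix_age_string("045Y"))  # 045Y
```

Numbers above 999 would still produce more than 4 characters; a fuller implementation would need to decide how to handle that case.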
Custom fixes
- Fix incorrect units. Correct MagneticFieldStrength from milli-Tesla to Tesla.
- Remove files in the archive that are not likely DICOM (a file is considered likely DICOM if it has at least 2 public dicom tags outside the file_meta (0000, 0002) group).
- Optionally, sets the TransferSyntaxUID to ExplicitVRLittleEndian.
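The unit correction for MagneticFieldStrength can be sketched as a simple heuristic. The threshold of 30 used below is an assumption chosen for illustration, not the gear's documented rule, and `fix_magnetic_field_strength` is a hypothetical name.

```python
def fix_magnetic_field_strength(value: float, threshold: float = 30.0) -> float:
    """Convert a value that looks like milli-Tesla to Tesla.

    Assumption: values above `threshold` cannot plausibly be Tesla,
    so they are interpreted as milli-Tesla and divided by 1000.
    """
    if value > threshold:
        return value / 1000.0
    return value


print(fix_magnetic_field_strength(3000))  # 3.0
print(fix_magnetic_field_strength(1.5))   # 1.5
```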
Output
File
DICOM-fixer will either output a fixed file that overwrites the input, or output nothing.
QC
DICOM-fixer will add a QC result titled `fixed` to either the input file if no output is written, or the output file if there is an output. The state will be either `PASS` or `FAIL`:
- `PASS` if either fixes were attempted and the write was successful, or if no fixes were attempted.
- `FAIL` if fixes were attempted but the write was not successful.

For a `PASS` result, any fixing events will be under the `events` key.
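The PASS/FAIL rule can be summarized in a few lines of Python. This is a sketch of the decision logic only: `qc_result` is a hypothetical name, and the dictionary layout is illustrative rather than the exact structure the gear writes to metadata.

```python
from typing import Dict, List


def qc_result(events: List[str], write_ok: bool) -> Dict[str, object]:
    """Build a QC entry named "fixed" following the PASS/FAIL rule above."""
    fixes_attempted = bool(events)
    # PASS: no fixes attempted, or fixes attempted and the write succeeded.
    state = "PASS" if (not fixes_attempted or write_ok) else "FAIL"
    result: Dict[str, object] = {"name": "fixed", "state": state}
    if state == "PASS":
        result["events"] = list(events)  # fixing events are reported on PASS
    return result


print(qc_result([], write_ok=False)["state"])                # PASS
print(qc_result(["Fixed VR"], write_ok=True)["state"])       # PASS
print(qc_result(["Fixed VR"], write_ok=False)["state"])      # FAIL
```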
Use Cases
Use Case 1: Invalid DICOM tags, strict validation
DICOM has invalid tag values and needs to adhere to DICOM Standard:
For DICOMs that must adhere to the DICOM Standard, the `strict-validation` config option should be set to `True`. `dicom-standard` can be set to `"local"` (default) for faster processing with the locally-saved DICOM Standard, or set to `"current"` to retrieve the most recent DICOM Standard at runtime.
Use Case 2: Invalid DICOM tags, non-strict validation
DICOM has invalid tag values, but information retention is preferred over strict adherence to the DICOM Standard:
For situations where information retention is preferred over strict DICOM Standard adherence, the `strict-validation` config option should be set to `False`. This tells the gear to keep tag values unless they are not Python parsable.
Logging
The gear logs events as it processes the input DICOM. If fixes are required, the gear log notes these as they are identified. The gear then logs whether or not output is written and the reason for writing output. The `.metadata.json` created by the gear is logged, including the parent file, gear configuration, QC results, and file tags. When `debug` is set to `True`, debugging-level statements are logged.
FAQ
Contributing
For more information about how to get started contributing to this gear, check out CONTRIBUTING.md.