Unified data-file interface

fw-file

A unified interface for reading medical file types. Parsed fields are exposed both as dict keys and as attributes, and any modifications can be saved to disk or to a buffer.

DICOM support - built on top of pydicom - is the primary goal of the library. fw-file also provides helpers for parsing DICOMs containing non-standard tags and utilities for organizing datasets and extracting metadata.

Additional file types supported:

  • NIfTI1 and NIfTI2 (.nii.gz)
  • Bruker ParaVision (subject/acqp/method)
  • GE MR RAW / PFile (P_NNNNN_.7)
  • Philips MR PAR/REC header (.par)
  • Siemens MR RAW (.dat)
  • Siemens MR Spectroscopy (.rda)
  • Siemens PET RAW (.ptd)

Installation

To install the package with all the optional dependencies:

pip install "fw-file[all]"

Alternatively, add as a poetry dependency to your project:

poetry add fw-file --extras all

Usage

Opening

from fw_file.dicom import DICOM
dcm = DICOM("dataset.dcm")  # also works with any readable file-like object

Fields

Attribute access on DICOMs works similarly to that in pydicom:

dcm.PatientAge == "060Y"
dcm.patientage == "060Y"   # attrs are case-insensitive
dcm.patient_age == "060Y"  # and snake_case compatible

Key access also returns values directly, rather than pydicom.DataElement objects:

dcm["PatientAge"] == "060Y"
dcm["patientage"] == "060Y"   # keys are case-insensitive too
dcm["patient_age"] == "060Y"  # and snake_case compatible
dcm["00101010"] == "060Y"
dcm["0010", "1010"] == "060Y"
dcm[0x00101010] == "060Y"
dcm[0x0010, 0x1010] == "060Y"
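
The key forms above all point at the same data element. As a rough illustration of how such lookups could be normalized (this is not fw-file's actual implementation; `KEYWORDS`, `FOLDED` and `resolve_tag` are hypothetical names and the dictionary is a two-entry stand-in), a minimal sketch might look like this:

```python
import re

# Tiny stand-in for a DICOM data dictionary: keyword -> (group, element).
KEYWORDS = {"PatientAge": (0x0010, 0x1010), "PatientID": (0x0010, 0x0020)}
# Index by lowercased keyword so "patientage" and "patient_age" both match.
FOLDED = {keyword.lower(): keyword for keyword in KEYWORDS}


def _to_int(value):
    """Interpret strings as hexadecimal, pass integers through unchanged."""
    return int(value, 16) if isinstance(value, str) else value


def resolve_tag(key):
    """Resolve a keyword-like or tag-like key to a (group, element) pair."""
    if isinstance(key, tuple):  # ("0010", "1010") or (0x0010, 0x1010)
        group, element = key
        return _to_int(group), _to_int(element)
    if isinstance(key, int):  # 0x00101010
        return key >> 16, key & 0xFFFF
    if re.fullmatch(r"[0-9A-Fa-f]{8}", key):  # "00101010"
        return resolve_tag(int(key, 16))
    folded = key.replace("_", "").lower()  # "patient_age" -> "patientage"
    return KEYWORDS[FOLDED[folded]]
```

Every spelling used in the examples above resolves to the same pair here, so a single lookup path can serve attribute and key access alike.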

Private tags can be accessed as keys when including the creator:

dcm["AGFA", "Zoom factor"] == 2
dcm["AGFA", "0019xx82"] == 2
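
The xx in a key like "0019xx82" stands for the private block that the creator reserved in that particular file. In DICOM, a creator element at (gggg,00bb) reserves elements (gggg,bb00) through (gggg,bbFF). The sketch below shows how such a key could be resolved to a concrete tag; `resolve_private_tag` and the `creators` mapping are hypothetical and only illustrate the block arithmetic, not how fw-file does it internally:

```python
def resolve_private_tag(creators, creator, pattern):
    """Resolve a "ggggxxee" pattern to a concrete (group, element) pair.

    creators maps (group, reservation element) -> private creator string,
    e.g. {(0x0019, 0x0010): "AGFA"} when (0019,0010) holds "AGFA".
    """
    group = int(pattern[:4], 16)    # "0019" -> 0x0019
    offset = int(pattern[-2:], 16)  # "82"   -> 0x82
    for (creator_group, block), name in creators.items():
        if creator_group == group and name == creator:
            # The reservation element's low byte becomes the high byte.
            return group, (block << 8) | offset
    raise KeyError(f"private creator {creator!r} not found in group {group:04x}")


# Hypothetical file in which AGFA reserved block 0x10 of group 0x0019.
creators = {(0x0019, 0x0010): "AGFA"}
tag = resolve_private_tag(creators, "AGFA", "0019xx82")  # (0x0019, 0x1082)
```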

Assignment and deletion work with attributes and keys alike:

dcm.PatientAge = "065Y"
del dcm["PatientAge"]

Metadata

Flywheel metadata can be extracted using the get_meta() method. To customize fields - e.g. to parse group/project info from a routing string - init files with a MetaExtractor instance:

from fw_file.dicom import DICOM
dcm = DICOM("dataset.dcm")
dcm.get_meta(patterns={"[fw://]{group}[/{project}]": "StudyComments"}) == {
    "group._id": "neuro",  # parsed from StudyComments="fw://neuro/Amnesia"
    "project.label": "Amnesia",
    "subject.label": "PatientID",
    "session.label": "StudyDescription",
    "acquisition.label": "SeriesDescription",
    # and much, much more...
}
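
To make the routing-string idea concrete, here is a minimal sketch of parsing a value such as "fw://neuro/Amnesia" into group and project fields. It assumes that square brackets in the pattern mark optional parts and uses a hand-written regular expression in place of the pattern syntax; `ROUTING_RE` and `parse_routing` are hypothetical names, not part of fw-file:

```python
import re

# Hand-written equivalent of "[fw://]{group}[/{project}]", assuming that
# square brackets denote optional parts of the pattern.
ROUTING_RE = re.compile(r"^(?:fw://)?(?P<group>[^/]+)(?:/(?P<project>[^/]+))?$")


def parse_routing(value):
    """Extract group/project info from a routing string, if it matches."""
    match = ROUTING_RE.match(value)
    if match is None:
        return {}
    fields = {"group._id": match["group"], "project.label": match["project"]}
    # Drop the optional parts that were not present in the value.
    return {key: val for key, val in fields.items() if val is not None}


meta = parse_routing("fw://neuro/Amnesia")
```

With this interpretation, a value holding only a group (for example "neuro") still yields the group field while leaving the project unset.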

Saving

import io
dcm.save()              # save to the original location
dcm.save("edited.dcm")  # save to a given filepath
dcm.save(io.BytesIO())  # save to any writable object

Collections and series

A common use case is handling multiple DICOM files together, where the tags of several files need to be inspected in tandem for QA/validation or modified for de-identification. DICOMCollection supports this and provides convenience methods for loading from a list of files, a directory or a zip archive.

from fw_file.dicom import DICOMCollection
coll_dcm = DICOMCollection("001.dcm", "002.dcm")  # from a list of files
coll_dir = DICOMCollection.from_dir(".")          # from a directory
coll_zip = DICOMCollection.from_zip("dicom.zip")  # from a zip archive
coll = DICOMCollection()  # or start from scratch
coll.append("001.dcm")    # and add files later

To interact with the underlying DICOMs:

# access individual instances through list indexes
coll[0].SOPInstanceUID == "1.2.3"
# get tag value of all instances as a list, allowing different values
coll.bulk_get("SOPInstanceUID") == ["1.2.3", "1.2.4"]
# get a unique tag value, raising when encountering multiple values
coll.get("SeriesInstanceUID") == "1.2"
coll.get("SOPInstanceUID")  # raises ValueError
# set a tag value uniformly on all instances
coll.set("PatientAge", "060Y")
# delete a tag across all instances
coll.delete("PatientID")
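
The difference between the bulk and the unique getter can be illustrated with plain dicts standing in for DICOM instances. This is only a sketch of the semantics described in the comments above; the standalone functions `bulk_get` and `get_unique` are hypothetical and do not reflect the library's code:

```python
def bulk_get(instances, key):
    """Return the value of key for every instance, duplicates included."""
    return [instance.get(key) for instance in instances]


def get_unique(instances, key):
    """Return the single value shared by all instances or raise ValueError."""
    # dict.fromkeys de-duplicates while preserving order (hashable values).
    distinct = list(dict.fromkeys(bulk_get(instances, key)))
    if len(distinct) > 1:
        raise ValueError(f"multiple values for {key}: {distinct}")
    return distinct[0] if distinct else None


instances = [
    {"SeriesInstanceUID": "1.2", "SOPInstanceUID": "1.2.3"},
    {"SeriesInstanceUID": "1.2", "SOPInstanceUID": "1.2.4"},
]
series_uid = get_unique(instances, "SeriesInstanceUID")
sop_uids = bulk_get(instances, "SOPInstanceUID")
```

Raising on conflicting values, rather than silently picking one, is what makes the unique getter useful for QA and validation.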

Finally, a DICOMCollection can be saved in place, exported to a directory or packed as a zip archive:

coll.save()
coll.to_dir("/tmp/dicom")
coll.to_zip("/tmp/dicom.zip")
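
For readers unfamiliar with zip export, the following sketch packs a few in-memory payloads into an archive using only the standard library. It is not how to_zip is implemented; `pack_zip` and the byte payloads are placeholders for illustration:

```python
import io
import zipfile


def pack_zip(files, target):
    """Write a {name: bytes} mapping into a ZIP archive.

    target may be a filepath or any writable binary file-like object.
    """
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, payload in sorted(files.items()):  # deterministic member order
            archive.writestr(name, payload)
    return target


# Placeholder payloads; real members would be the serialized DICOM files.
buffer = pack_zip({"002.dcm": b"second", "001.dcm": b"first"}, io.BytesIO())
```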

DICOMSeries is a subclass of DICOMCollection intended for files that belong to the same DICOM series. The instances normally share the same SeriesInstanceUID and are uploaded together (zipped) into a Flywheel acquisition. In addition to the collection methods, DICOMSeries can pack the instances into an appropriately named ZIP archive and extract Flywheel metadata from multiple files, validating the values and checking for discrepancies among the instances along the way.

from fw_file.dicom import DICOMSeries
series = DICOMSeries("001.dcm", "002.dcm")
filepath, metadata = series.to_upload()

Private dictionary

In addition to the private tags included in pydicom, fw-file ships with an extended dictionary that makes accessing additional private tags simpler.

The private dictionary can be further extended by creating a DCMTK-style data dict file and setting the DCMDICTPATH environment variable to its path.
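
As a sketch of what such a file might contain, the snippet below writes a single private entry and points DCMDICTPATH at it. It assumes the DCMTK dictionary layout of tab-separated tag, VR, name, VM and version columns. The entry itself (VR, name and VM) is made up for illustration, and whether the variable must be set before fw_file is imported is also an assumption:

```python
import os
import tempfile

# Hypothetical entry: the VR (DS), name and VM are illustrative only.
# Columns are tab-separated: tag, VR, name, VM, dictionary version.
ENTRY = '(0019,"AGFA",82)\tDS\tZoomFactor\t1\tPrivateTag\n'

dict_path = os.path.join(tempfile.mkdtemp(), "private.dic")
with open(dict_path, "w", encoding="ascii") as dict_file:
    dict_file.write(ENTRY)

# Assumption: set this before importing fw_file so the file is picked up.
os.environ["DCMDICTPATH"] = dict_path
```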

DataElement decoding

DICOMs are often saved with non-standard and/or corrupt data elements. To enable loading these datasets, fw-file provides fixes for some common problems:

  • Fix VM=1 strings that contain \ by replacing with _ (default: enabled)
  • Fix VR for known data elements encoded as explicit UN (default: enabled)
  • Extend/improve handling of data elements with a VR mismatch (default: disabled)

These fixes can also be enabled/disabled via environment variables:

FW_DCM_REPLACE_UN_WITH_KNOWN_VR=false
FW_DCM_FIX_VM1_STRINGS=false
FW_DCM_FIX_VR_MISMATCH=true
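
The sketch below illustrates how an on/off environment variable could gate one of the fixes, using the VM=1 string fix as the example. The helper names `env_flag` and `fix_vm1_string` and the set of accepted truthy spellings are assumptions, not fw-file's actual parsing rules:

```python
import os


def env_flag(name, default):
    """Read a boolean environment variable, falling back to default if unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Assumed truthy spellings; anything else counts as disabled.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def fix_vm1_string(value):
    """Replace backslashes, which would otherwise split a VM=1 value."""
    return value.replace("\\", "_")


value = "HEAD\\NECK"  # a single value that contains a literal backslash
if env_flag("FW_DCM_FIX_VM1_STRINGS", default=True):
    value = fix_vm1_string(value)
```

The backslash matters because DICOM uses it as the separator between multiple values, so an unescaped one in a VM=1 element would be read as two values.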

To track changes made to (raw) data elements, such as VR inferences, DICOMs can be instantiated with track=True:

dcm = DICOM("dataset.dcm", decode=True, track=True)
dcm.tracker.data_elements[0].events == ["Replace VR: UN -> CS"]
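
Conceptually, tracking amounts to keeping a per-element log of what was changed during decoding. The following sketch mimics the event string shown above with a hypothetical `TrackedElement` class; it is not the tracker that fw-file provides:

```python
from dataclasses import dataclass, field


@dataclass
class TrackedElement:
    """Hypothetical record of one raw data element and its change history."""

    tag: tuple
    vr: str
    events: list = field(default_factory=list)

    def replace_vr(self, new_vr):
        """Change the VR and log the change in a human-readable form."""
        self.events.append(f"Replace VR: {self.vr} -> {new_vr}")
        self.vr = new_vr


# Placeholder tag; the element was encoded as explicit UN in this example.
element = TrackedElement(tag=(0x0008, 0x0060), vr="UN")
element.replace_vr("CS")
```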

Development

Install the project using poetry and enable pre-commit:

poetry install --extras "all"
pre-commit install

License

MIT

