Unified data-file interface

fw-file

Unified interface for reading medical file types. Parsed fields are exposed both as dict keys and as attributes, and any modifications can be saved to disk or to a buffer.

DICOM support - built on top of pydicom - is the primary goal of the library. fw-file also provides helpers for parsing DICOMs containing non-standard tags and utilities for organizing datasets and extracting metadata.

Additional file types supported:

NIfTI1 and NIfTI2 (.nii.gz)
Bruker ParaVision (subject/acqp/method)
GE MR RAW / PFile (P_NNNNN_.7)
Philips MR PAR/REC header (.par)
Siemens MR RAW (.dat)
Siemens MR Spectroscopy (.rda)
Siemens PET RAW (.ptd)
PNG (.png)
JPEG/JPG (.jpeg/.jpg)
BrainVision EEG (.vhdr/.vmrk/.eeg)
EEGLAB EEG (.set/.fdt)
European Data Format EEG (.edf)
BioSemi Data Format EEG (.bdf)

Installation

To install the package with all the optional dependencies:

pip install "fw-file[all]"

Alternatively, add as a poetry dependency to your project:

poetry add fw-file --extras all

Usage

Opening

from fw_file.dicom import DICOM
dcm = DICOM("dataset.dcm")  # also works with any readable file-like object

Fields

Attribute access on DICOMs works similarly to that in pydicom:

dcm.PatientAge == "060Y"
dcm.patientage == "060Y"   # attrs are case-insensitive
dcm.patient_age == "060Y"  # and snake_case compatible

Key access also returns values rather than pydicom.DataElement objects:

dcm["PatientAge"] == "060Y"
dcm["patientage"] == "060Y"   # keys are case-insensitive too
dcm["patient_age"] == "060Y"  # and snake_case compatible
dcm["00101010"] == "060Y"
dcm["0010", "1010"] == "060Y"
dcm[0x00101010] == "060Y"
dcm[0x0010, 0x1010] == "060Y"
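
The key forms above all point at the same element. As a rough, standalone illustration of how such keys can be normalized, here is a minimal sketch using only the standard library. It is not fw-file's implementation; `parse_tag` and `normalize_keyword` are hypothetical names introduced for this example.

```python
from typing import Tuple, Union

TagPart = Union[int, str]
TagKey = Union[int, str, Tuple[TagPart, TagPart]]


def parse_tag(key: TagKey) -> Tuple[int, int]:
    """Turn a numeric tag key into a (group, element) pair of ints."""
    if isinstance(key, tuple):
        # ("0010", "1010") or (0x0010, 0x1010): hex strings become ints
        group, element = (int(p, 16) if isinstance(p, str) else p for p in key)
        return group, element
    if isinstance(key, str):
        # "00101010": eight hex digits, group first
        key = int(key, 16)
    # Upper 16 bits are the group, lower 16 bits the element
    return key >> 16, key & 0xFFFF


def normalize_keyword(name: str) -> str:
    """Fold case and drop underscores so snake_case matches the keyword."""
    return name.replace("_", "").lower()


# All four numeric spellings resolve to the same pair
keys = ["00101010", ("0010", "1010"), 0x00101010, (0x0010, 0x1010)]
print({parse_tag(k) for k in keys})
print(normalize_keyword("patient_age") == normalize_keyword("PatientAge"))
```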

Private tags can be accessed as keys when including the creator:

dcm["AGFA", "Zoom factor"] == 2
dcm["AGFA", "0019xx82"] == 2
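
The `xx` in a pattern such as `0019xx82` stands for the block that the private creator reserved in that dataset. In DICOM, a creator element at (gggg,00xx) reserves elements (gggg,xx00) through (gggg,xxFF). The sketch below shows that arithmetic only; `resolve_private_tag` is a hypothetical helper, and the block number 0x10 is an assumed value for illustration, not something fw-file requires.

```python
from typing import Tuple


def resolve_private_tag(pattern: str, block: int) -> Tuple[int, int]:
    """Fill the 'xx' placeholder of a private tag pattern with a block number.

    pattern: eight characters such as "0019xx82" (group, 'xx', element offset)
    block:   block reserved by the private creator element, 0x10..0xFF
    """
    if len(pattern) != 8 or pattern[4:6].lower() != "xx":
        raise ValueError(f"not a private tag pattern: {pattern!r}")
    if not 0x10 <= block <= 0xFF:
        raise ValueError(f"private block out of range: {block:#x}")
    group = int(pattern[:4], 16)
    offset = int(pattern[6:], 16)
    if group % 2 == 0:
        raise ValueError("private tags live in odd-numbered groups")
    # The block number becomes the high byte of the element number
    return group, (block << 8) | offset


group, element = resolve_private_tag("0019xx82", block=0x10)
print(f"({group:04x},{element:04x})")
```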

Assignment and deletion work with attributes and keys alike:

dcm.PatientAge = "065Y"
del dcm["PatientAge"]

Metadata

Flywheel metadata can be extracted using the get_meta() method:

from fw_file.dicom import DICOM
dcm = DICOM("dataset.dcm")
dcm.get_meta() == {
    "subject.label": "PatientID",
    "session.label": "StudyDescription",
    "session.uid": "1.2.3",  # StudyInstanceUID
    "acquisition.label": "SeriesDescription",
    "acquisition.uid": "4.5.6",  # SeriesInstanceUID
    # and much, much more...
}
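
The returned keys are flat, dotted paths. If a nested structure is more convenient downstream, the keys can be regrouped with a few lines of plain Python. This is a sketch under the assumption that every key is a dot-separated path as in the example above; `nest_meta` is a hypothetical helper, not part of fw-file.

```python
from typing import Any, Dict


def nest_meta(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Regroup {"session.uid": ...} style keys into nested dicts."""
    nested: Dict[str, Any] = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split(".")
        node = nested
        for part in parents:
            # Create intermediate levels on first use
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


meta = {
    "subject.label": "PatientID",
    "session.label": "StudyDescription",
    "session.uid": "1.2.3",
}
print(nest_meta(meta))
```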

Saving

import io

dcm.save()              # save to the original location
dcm.save("edited.dcm")  # save to a given filepath
dcm.save(io.BytesIO())  # save to any writable object

Collections and series

Handling multiple DICOM files together is a common use case: the tags of several files often need to be inspected in tandem for QA/validation, or modified together for de-identification. DICOMCollection supports this and provides convenience methods for loading from a list of files, a directory or a zip archive.

from fw_file.dicom import DICOMCollection
coll_dcm = DICOMCollection("001.dcm", "002.dcm")  # from a list of files
coll_dir = DICOMCollection.from_dir(".")          # from a directory
coll_zip = DICOMCollection.from_zip("dicom.zip")  # from a zip archive
coll = DICOMCollection()  # or start from scratch
coll.append("001.dcm")    # and add files later

To interact with the underlying DICOMs:

# access individual instances through list indexes
coll[0].SOPInstanceUID == "1.2.3"
# get tag value of all instances as a list, allowing different values
coll.bulk_get("SOPInstanceUID") == ["1.2.3", "1.2.4"]
# get a unique tag value, raising when encountering multiple values
coll.get("SeriesInstanceUID") == "1.2"
coll.get("SOPInstanceUID")  # raises ValueError
# set a tag value uniformly on all instances
coll.set("PatientAge", "060Y")
# delete a tag across all instances
coll.delete("PatientID")
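
The difference between collecting all values and requiring a single shared value can be sketched with plain dicts standing in for DICOM instances. This only mirrors the behavior described in the comments above; it is not how DICOMCollection is implemented, and `get_unique` is a hypothetical name.

```python
from typing import Any, Dict, List

Instance = Dict[str, Any]


def bulk_get(instances: List[Instance], key: str) -> List[Any]:
    """Return the value of `key` from every instance, duplicates included."""
    return [inst.get(key) for inst in instances]


def get_unique(instances: List[Instance], key: str) -> Any:
    """Return the single shared value of `key`, or raise if instances differ."""
    # dict.fromkeys de-duplicates while keeping first-seen order
    values = list(dict.fromkeys(bulk_get(instances, key)))
    if len(values) > 1:
        raise ValueError(f"multiple values for {key}: {values}")
    return values[0] if values else None


instances = [
    {"SeriesInstanceUID": "1.2", "SOPInstanceUID": "1.2.3"},
    {"SeriesInstanceUID": "1.2", "SOPInstanceUID": "1.2.4"},
]
print(bulk_get(instances, "SOPInstanceUID"))
print(get_unique(instances, "SeriesInstanceUID"))
```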

Finally, a DICOMCollection can be saved in place, exported to a directory or packed as a zip archive:

coll.save()
coll.to_dir("/tmp/dicom")
coll.to_zip("/tmp/dicom.zip")

DICOMSeries is a subclass of DICOMCollection intended for files that belong to the same DICOM series. The instances normally share the same SeriesInstanceUID and are uploaded together (zipped) into a Flywheel acquisition. In addition to the collection methods, DICOMSeries can pack the instances into an appropriately named ZIP archive and extract Flywheel metadata from multiple files, validating the values and checking for discrepancies among the instances along the way.

from fw_file.dicom import DICOMSeries
series = DICOMSeries("001.dcm", "002.dcm")
filepath, metadata = series.to_upload()

DICOM Standard Editions

Because the DICOM Standard is typically revised several times a year, fw-file lets you choose which edition to use via an environment variable. The default is "2023c", which uses the locally saved 2023c edition. The other options are "current" and any valid 5-character edition (e.g. "2022d"). Specifying "current" fetches the most recent edition at runtime.

FW_DCM_STANDARD_REV=current
FW_DCM_STANDARD_REV=2022d
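
For scripts that want to check the setting before importing the library, the accepted values can be validated up front. The sketch below assumes an edition is four digits followed by one lowercase letter, which matches the examples above but is an assumption about what counts as "valid"; `selected_edition` is a hypothetical helper.

```python
import os
import re
from typing import Mapping, Optional

DEFAULT_EDITION = "2023c"
# Assumed shape of an edition name: year plus a single letter, e.g. "2022d"
EDITION_RE = re.compile(r"\d{4}[a-z]")


def selected_edition(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the edition named by FW_DCM_STANDARD_REV, or the default."""
    environ = os.environ if environ is None else environ
    value = environ.get("FW_DCM_STANDARD_REV", DEFAULT_EDITION)
    if value == "current" or EDITION_RE.fullmatch(value):
        return value
    raise ValueError(f"unsupported FW_DCM_STANDARD_REV: {value!r}")


print(selected_edition({"FW_DCM_STANDARD_REV": "2022d"}))
print(selected_edition({}))
```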

Private dictionary

In addition to the private tags included in pydicom, fw-file ships with an extended dictionary that makes many more private tags easy to access.

The private dictionary can be further extended by creating a DCMTK-style data dict file and setting the DCMDICTPATH environment variable to its path.

`DataElement` decoding

DICOMs are often saved with non-standard and/or corrupt data elements. To enable loading these datasets, fw-file provides fixes for some common problems:

Fix VM=1 strings that contain \ by replacing with _ (default: enabled)
Fix VR for known data elements encoded as explicit UN (default: enabled)
Extend/improve handling of data elements with a VR mismatch (default: disabled)
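
The first fix matters because the backslash is the DICOM delimiter between multiple values, so a single-valued (VM=1) string that contains one would otherwise be read as several values. The following sketch shows the idea on plain strings; it is an illustration of the replacement described above, not fw-file's code, and both function names are hypothetical.

```python
from typing import List


def split_values(raw: str) -> List[str]:
    """Split a raw DICOM string on the multi-value delimiter."""
    return raw.split("\\")


def fix_vm1_string(raw: str) -> str:
    """Replace the delimiter so a VM=1 value stays a single value."""
    return raw.replace("\\", "_")


raw = "HEAD\\NECK"  # one backslash between HEAD and NECK
print(split_values(raw))                  # read as two values
print(split_values(fix_vm1_string(raw)))  # read as one value
```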

These fixes can also be enabled/disabled via environment variables:

FW_DCM_REPLACE_UN_WITH_KNOWN_VR=false
FW_DCM_FIX_VM1_STRINGS=false
FW_DCM_FIX_VR_MISMATCH=true
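
When a wrapper script needs to know which fixes are active, the same variables can be read with the defaults listed above. This is a sketch: the set of strings treated as "true" is an assumption, since the exact parsing rules are not stated here, and `env_flag` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Defaults as documented in the list of fixes above
DEFAULTS = {
    "FW_DCM_FIX_VM1_STRINGS": True,
    "FW_DCM_REPLACE_UN_WITH_KNOWN_VR": True,
    "FW_DCM_FIX_VR_MISMATCH": False,
}
# Assumption: these spellings count as enabled, anything else as disabled
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return the effective on/off state of one fix."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return DEFAULTS[name]
    return raw.strip().lower() in TRUTHY


env = {"FW_DCM_FIX_VM1_STRINGS": "false", "FW_DCM_FIX_VR_MISMATCH": "true"}
for name in DEFAULTS:
    print(name, env_flag(name, env))
```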

To extract as much information from a DICOM as possible, fw-file can be run in read-only mode (default: disabled). When enabled, invalid values are retained and the VR is set to OB. Because it is not safe to write the DICOM back in this state, saving is disabled. This mode is enabled via an environment variable:

FW_DCM_READ_ONLY=true

Additionally, the validation mode for reading and for writing can be set via environment variables. The default is 1 (WARN); the other options are 2 (RAISE) and 0 (IGNORE).

FW_DCM_READING_VALIDATION_MODE=1
FW_DCM_WRITING_VALIDATION_MODE=1
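
The three numeric modes map naturally onto an enum. The sketch below reads either variable and falls back to WARN when it is unset; `ValidationMode` and `validation_mode` are names made up for this example and are not taken from fw-file.

```python
import os
from enum import IntEnum
from typing import Mapping, Optional


class ValidationMode(IntEnum):
    IGNORE = 0
    WARN = 1
    RAISE = 2


def validation_mode(
    name: str, environ: Optional[Mapping[str, str]] = None
) -> ValidationMode:
    """Read a validation mode variable, defaulting to WARN when unset."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name, "1")
    # int() rejects non-numeric text; the enum rejects values outside 0..2
    return ValidationMode(int(raw))


env = {"FW_DCM_WRITING_VALIDATION_MODE": "2"}
print(validation_mode("FW_DCM_READING_VALIDATION_MODE", env).name)
print(validation_mode("FW_DCM_WRITING_VALIDATION_MODE", env).name)
```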

EEG

Multiple EEG filetypes are supported including BrainVision, EEGLAB, EDF, and BDF files. These files are parsed using the MNE-Python library.

BrainVision data must contain both the header file (.vhdr) and the marker file (.vmrk) in the same directory.

If EEGLAB data is made up of two files (.set and .fdt), these files must be in the same directory.

A zip archive can also be used to instantiate a fw-file BrainVision or EEGLAB object.

from fw_file.eeg import BrainVision, EEGLAB
bv = BrainVision.from_zip("brainvision.zip")
e = EEGLAB.from_zip("eeglab.zip")
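
Before handing an archive to the library, it can help to confirm that the required BrainVision parts are present. The following is a minimal sketch using only `zipfile`; it checks file extensions and nothing else, and `missing_brainvision_parts` is a hypothetical helper rather than an fw-file function.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import BinaryIO, Set, Union

# Header and marker files are both required for BrainVision data
REQUIRED = {".vhdr", ".vmrk"}


def missing_brainvision_parts(archive: Union[str, BinaryIO]) -> Set[str]:
    """Return the required extensions that are absent from the zip archive."""
    with zipfile.ZipFile(archive) as zf:
        present = {PurePosixPath(n).suffix.lower() for n in zf.namelist()}
    return REQUIRED - present


# Build a small in-memory archive that lacks the marker file
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("rec.vhdr", "header")
    zf.writestr("rec.eeg", b"\x00")
buffer.seek(0)
print(missing_brainvision_parts(buffer))
```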

Development

Install the project using poetry and enable pre-commit:

poetry install --extras "all"
pre-commit install

License

Release history

Version	Released
3.3.3	Apr 12, 2024 (this version)
3.3.2	Jan 12, 2024
3.3.1	Dec 1, 2023
3.3.0	Nov 7, 2023
3.2.2	Nov 6, 2023
3.2.1	Oct 30, 2023
3.2.0	Oct 29, 2023
3.1.0	Oct 23, 2023
3.0.1	Oct 17, 2023
3.0.0	Aug 1, 2023
2.4.1	Jul 28, 2023
2.4.0	Jul 25, 2023
2.3.0	Jun 29, 2023
2.2.0	Apr 7, 2023
2.1.1	Feb 16, 2023
2.1.0	Feb 15, 2023
2.0.0	Dec 12, 2022
1.4.2	Dec 6, 2022
1.4.1	Dec 5, 2022
1.4.0	Nov 18, 2022
1.3.6	Jul 26, 2022
1.3.5	Jun 28, 2022
1.3.4	Jun 16, 2022
1.3.3	Apr 21, 2022
1.3.2	Apr 21, 2022
1.3.1	Apr 1, 2022
1.3.0	Mar 31, 2022
1.2.0	Jan 18, 2022
1.1.2	Dec 22, 2021
1.1.1	Nov 26, 2021
1.1.0	Nov 19, 2021
1.0.3	Nov 4, 2021
1.0.2	Oct 22, 2021
1.0.1	Oct 8, 2021
1.0.0	Aug 10, 2021
0.7.5	Jul 26, 2021
0.7.4	Jul 22, 2021
0.7.3	Jul 19, 2021
0.7.2	Jun 1, 2021
0.7.1	Jun 1, 2021
0.7.0	May 31, 2021
0.6.3	May 26, 2021
0.6.2	May 19, 2021
0.6.1	May 18, 2021
0.6.0	May 18, 2021
0.5.0	Apr 27, 2021
0.4.4	Apr 15, 2021
0.4.3	Mar 31, 2021
0.4.2	Mar 24, 2021
0.4.1	Mar 22, 2021
0.4.0	Mar 17, 2021
0.3.1	Mar 4, 2021
0.3.0	Feb 22, 2021
0.2.0	Feb 17, 2021

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

fw_file-3.3.3-py3-none-any.whl (360.8 kB)

Uploaded Apr 12, 2024 Python 3

Hashes for fw_file-3.3.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ac1dd56825832c08dbcc4416bc68728661430032f589894083e3e1c109120c02`
MD5	`ea8d065d3ff6ec3b5aacc909867dbc2a`
BLAKE2b-256	`644d8482e25f067a96a96bb6344e9fbd1a72a6bd07481640ee161f657b40278a`
