Parse the DICOM Standard into a human-friendly JSON format.
DICOM Standard Parser
This program parses the web version of the DICOM Standard into human and machine-friendly JSON files. The purpose of these JSON files is twofold:
- To provide a standardized and machine-readable way to access DICOM Standard information for software applications
- To provide a logical model for the relationships between cross-referenced sections in the DICOM Standard
The finalized JSON output of this program is in the standard directory at the top level of this project.
Usage
The DICOM Standard Parser is useful for modeling and understanding properties of the various abstractions defined by the DICOM Standard (IODs, modules, attributes, etc.) as well as the relationships between them.
The raw HTML or XML represents the DICOM Standard as a document but the data isn't easily machine-readable. We process the data from the HTML format into organized JSON files which follow a set of formatting guidelines and contain natural keys to represent relationships between abstractions.
If you only need the data generated by the parser, you can pull the JSON files directly with curl, without installing the package or writing any Python.
To find or work with a smaller amount of data, e.g. a single IOD, the DICOM Standard Browser may be appropriate.
Installation
Install the latest release with pip install dicom-standard.
JSON Data Format
The generated JSON files conform to these formatting rules:
- JSON files representing objects are lists of dictionaries that each contain information relevant to the object.
- JSON files containing relational data between objects contain "foreign keys" to the relevant objects. These field names end with Id, e.g. ciodId and moduleId.
Occasionally, files may deviate from this format when there is a very compelling reason. For example, references.json should be a list of reference objects where the href link is the id for each object. However, since almost every use case for references.json will use the href as a lookup, it makes more sense for the file to be set up as an object containing href to HTML pairs.
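To make the trade-off concrete, here is a minimal sketch comparing the two layouts. The hrefs and HTML strings are invented placeholders, not entries from the real references.json, and the helper names are hypothetical.

```python
import json

# Hypothetical sample in the shape described above: one object mapping
# each href to its HTML. These hrefs and HTML snippets are made up.
references_text = """
{
  "http://example.invalid/part03.html#sect_A": "<p>Section A</p>",
  "http://example.invalid/part03.html#sect_B": "<p>Section B</p>"
}
"""
references = json.loads(references_text)


def lookup_reference(refs, href):
    """Return the HTML stored for href, or None if it is absent."""
    return refs.get(href)


# For comparison, the "regular" layout would be a list of objects whose
# id is the href. Looking one up needs a scan or a caller-built index.
as_list = [{"id": href, "html": html} for href, html in references.items()]


def lookup_reference_in_list(ref_list, href):
    """Linear scan over the list-of-objects layout."""
    for ref in ref_list:
        if ref["id"] == href:
            return ref["html"]
    return None
```

With the object layout, a lookup is a single dictionary access; with the list layout, every consumer would have to build the same index first.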
Applications that use the JSON files from this repository may need to re-organize data. A separate script must be written to join data from multiple tables into one file or prune out unnecessary fields.
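As a rough sketch of what such a script might look like, the example below joins a relational table to two object tables and keeps only a couple of fields. Only the ciodId and moduleId field names come from this README; the id, name, usage, and description fields, the sample rows, and the join_ciod_modules helper are assumptions made for illustration.

```python
import json

# Hypothetical rows in the shapes described above. Only `ciodId` and
# `moduleId` are named in this README; every other field is assumed.
ciods = [{"id": "ct-image", "name": "CT Image", "description": "<p>...</p>"}]
modules = [
    {"id": "patient", "name": "Patient", "description": "<p>...</p>"},
    {"id": "ct-image", "name": "CT Image", "description": "<p>...</p>"},
]
ciod_to_modules = [
    {"ciodId": "ct-image", "moduleId": "patient", "usage": "M"},
    {"ciodId": "ct-image", "moduleId": "ct-image", "usage": "M"},
]


def join_ciod_modules(ciods, modules, relations, keep=("id", "name")):
    """Nest each CIOD's modules under it, keeping only the `keep` fields."""
    modules_by_id = {module["id"]: module for module in modules}
    joined = []
    for ciod in ciods:
        entry = {key: ciod[key] for key in keep if key in ciod}
        entry["modules"] = []
        for relation in relations:
            module = modules_by_id.get(relation["moduleId"])
            if relation["ciodId"] == ciod["id"] and module is not None:
                entry["modules"].append(
                    {key: module[key] for key in keep if key in module}
                )
        joined.append(entry)
    return joined


result = join_ciod_modules(ciods, modules, ciod_to_modules)
print(json.dumps(result, indent=2))
```

Building the modules_by_id index once keeps the join linear in the number of relation rows per CIOD, and the keep tuple is where unneeded fields get pruned.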
JSON Data Guarantees
The JSON generated by this program adheres to the following four rules:
- New fields may be added
- Bugs or incorrect data will be fixed as the standard changes
- No fields are removed, maintaining backwards compatibility
- The shape and organization of the JSON files will remain the same
The JSON files can be viewed here.
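One practical consequence of these guarantees is that consumers should read the fields they need and ignore anything else, so that newly added fields do not break them. A minimal sketch, with hypothetical rows and field names:

```python
# Hypothetical rows: the second one carries a field added in a later release.
# The field names here are assumptions, not taken from the real files.
old_row = {"id": "00100010", "name": "Patient's Name"}
new_row = {"id": "00100010", "name": "Patient's Name", "someFutureField": 1}

# The fields this particular consumer relies on.
REQUIRED_FIELDS = ("id", "name")


def read_row(row):
    """Pick out only the fields this consumer relies on, ignoring extras."""
    missing = [field for field in REQUIRED_FIELDS if field not in row]
    if missing:
        raise KeyError(f"row is missing expected fields: {missing}")
    return {field: row[field] for field in REQUIRED_FIELDS}
```

Both rows yield the same result here, which is the behavior the "new fields may be added" and "no fields are removed" rules are meant to allow.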
Users
Please contact us if you use this software and would like your name or company listed here.
Current Status
This program currently parses the DICOM Standard sections related to Information Object Definitions, modules, and attributes, as well as cross-referenced sections in other parts of the standard. This translates to the following sections:
Completely processed:
- PS3.3
- PS3.4
- PS3.6
Processed for references:
- PS3.15
- PS3.16
- PS3.17
- PS3.18
Development Setup
The Python scripts used to generate the JSON files are designed to be as extensible as possible. If you want to run the code yourself or configure your own custom parsing stage, you'll need the following system-level dependencies:
- Python 3.7
- Make + Unix tools
You will probably also want to set up a "virtual environment" (e.g. using Conda, or Pyenv + Virtualenv) to install the project dependencies into. Once you are in your "virtual environment", you can run:
$ make
to install and compile everything. Add the -j flag to speed this process up significantly.
Updating the Standard
To download and parse the most up-to-date web version of the DICOM Standard, run the following commands:
$ make clean
$ make updatestandard
$ make
To download an older version of the DICOM Standard, run
$ make updatestandard VERSION=<version>
with the year and revision desired, e.g. 2018e or 2019c.
WARNING: Differences between previous versions and the current version may cause bugs when used with the current parser library. We recommend forking this repository if you need to use a specific version of the standard.
Using the Library
Parsing stages are indicated by prefixed names (i.e. extract_xxx.py or process_xxx.py) and use a variety of utility functions from parse_lib.py and other *_utils.py modules.
Design Philosophy
The overall data flow of this program takes the following form:
          extract                      (post)process
Raw HTML ---------> JSON intermediate ---------------> JSON final
During this process, the following invariants are maintained:
- Each step in the parsing process is classified as either an "extract" stage, or a "process" stage.
- Stages are Python scripts that take one or more files as inputs, and write their output to standard out.
- "Extract" stages take one or more HTML input files and print out JSON.
- "Process" stages take one or more JSON files as inputs and print out JSON.
In this way, raw HTML is not touched by any stage other than extract_*.py, and successive processing steps use increasingly refined JSON.
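The invariants above can be sketched as a small "process" stage. This is an illustration of the stage contract only, not one of the project's actual scripts: the function names and the transformation (dropping rows without an id) are hypothetical.

```python
import json
import sys


def load_json_files(paths):
    """Read each input path and return the parsed JSON documents in order."""
    documents = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            documents.append(json.load(f))
    return documents


def process(documents):
    """Hypothetical transformation: merge list inputs, drop rows lacking an id."""
    rows = [row for document in documents for row in document]
    return [row for row in rows if row.get("id")]


def main(argv, out=sys.stdout):
    """Inputs are JSON file paths from argv; the result goes to standard out."""
    documents = load_json_files(argv[1:])
    json.dump(process(documents), out, indent=2)
    out.write("\n")


if __name__ == "__main__":
    main(sys.argv)
```

Because the stage writes to standard out, its output can be redirected to a file by the caller, which is what lets a Makefile chain stages together.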
Parser Stages
A map of all extraction and processing pathways is shown below:
                                      +-------+             +----------+    +-------+     +-------+
                                      | PS3.3 |             |  Other   |    | PS3.4 |     | PS3.6 |
                                      +---+---+             |  DICOM   |    +---+---+     +---+---+
                                          |                 | Sections |        |             |
                                          |                 +-----+----+        |             |
                                          |                       |        +----v----+  +-----v------+
                   +-------------+--------+------+-------------+  |        | Extract |  |  Extract   |
                   |             |               |             |  |        |  SOPs   |  | Attributes |
               +---v-----+  +----v-----+  +------v------+  +---v--v---+    +----+----+  +-----+------+
               | Extract |  | Extract  |  |   Extract   |  | Extract  |         |             |
               | CIODs/  |  | CIODs/FG |  |  Modules/   |  | Sections |         |             |
               | Modules |  |  Macros  |  | Macro Attrs |  +--------+-+         v             v
               +----+----+  +----+-----+  +------+------+           |       sops.json  attributes.json
                    |            |               |                  |
      +-------------+            |               +---------------+  +-----------+
      |             |            |               |               |              |
+-----v-----+  +----v----+  +----v------+  +-----v------+  +-----v------+       |
|  Process  |  | Process |  |  Process  |  | Preprocess |  | Preprocess |       |
|   CIOD/   |  |  CIODs  |  |  CIOD/FG  |  |  Modules/  |  |  Macros/   |       |
|  Module   |  +----+----+  |   Macro   |  | Attributes |  | Attributes |       |
| Relations |       |       | Relations |  +-----+------+  +-----+------+       |
+-----+-----+       |       +----+------+        |               |              |
      |             v            |               +-------+       +-------+      |
      |        ciods.json        |               |       |       |       |      |
      v                          |          +----v----+  |  +----v----+  |      |
ciod_to_modules.json             |          | Process |  |  | Process |  |      |
                                 v          | Modules |  |  | Macros  |  |      |
             ciod_to_func_group_macros.json +----+----+  |  +----+----+  |      |
                                                 |       |       |       |      |
                                                 |       |       |       |      |
                                                 v       |       v       |      |
                                           modules.json  |  macros.json  |      |
                                                         |               |      |
                                                 +-------v---+   +-------v---+  |
                                                 |  Process  |   |  Process  |  |
                                                 |  Module   |   |   Macro   |  |
                                                 | Attribute |   | Attribute |  |
                                                 | Relations |   | Relations |  |
                                                 +-------+---+   +-------+---+  |
                                                         |               |      |
                                                       +-v---------------v------v-+
                                                       |       Postprocess        |
                                                       |      Add References      |
                                                       +-----+-------+------+-----+
                                                             |       |      |
                                                    +--------+       |      +--------+
                                                    |                v               |
                                                    |    macros_to_attributes.json   |
                                                    v                                v
                                       modules_to_attributes.json             references.json
To update the parser map, please use ASCIIFlow.
Testing
To run the full test suite, install and run tox.
To run a specific test, run tox -e <testenv>. Test environments include:
- flake8: check and enforce code style and format
- mypy: validate type hints
- pytest: run a set of unit and end-to-end tests
- build-dist: test building the backend into source and binary distributions
Contact
You can contact us directly through our website.
Reporting Issues and Bugs
If you find bugs or have suggestions for improvement, please open a GitHub issue or feel free to make a pull request.