A PDF to Dicom Converter
pdf2dcm
PDF to DICOM Converter
Convert PDFs into standards-compliant DICOM files for PACS, radiology, and healthcare interoperability workflows.
Features
- Convert PDFs to Encapsulated DICOM or RGB Secondary Capture DICOM
- Preserve patient/study metadata from template DICOMs
- Simple Python API built on pydicom
- Compatible with PACS workflows
SETUP
Python Package Setup
The Python package is available on PyPI and can be installed with pip:
pip install pdf2dcm
To check the setup, print the version number of the pdf2dcm package:
python -c "import pdf2dcm; print(pdf2dcm.__version__)"
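The same check can be done from a script without importing the package. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of pdf2dcm.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pdf2dcm")
    print(f"pdf2dcm {found}" if found else "pdf2dcm is not installed")
```

Returning None instead of raising makes the helper convenient in setup scripts that decide whether to run pip.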
Poppler Setup
Poppler is a PDF rendering library that pdf2dcm uses to create DICOM RGB Secondary Capture files. To check whether it is already installed, run pdftoppm -h in your terminal/cmd. If it is not, these are some of the recommended ways to install it:
Conda
conda install -c conda-forge poppler
Ubuntu
sudo apt-get install poppler-utils
macOS
brew install poppler
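If you prefer to verify the Poppler installation from Python before running a conversion, a sketch like the one below may help. It only looks for the pdftoppm executable on the PATH; the function names are hypothetical and not part of pdf2dcm.

```python
import shutil
from typing import Optional


def find_executable(name: str) -> Optional[str]:
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


def poppler_available() -> bool:
    """Report whether Poppler's pdftoppm tool can be found on PATH."""
    return find_executable("pdftoppm") is not None


if __name__ == "__main__":
    if poppler_available():
        print("Poppler found at", find_executable("pdftoppm"))
    else:
        print("Poppler not found; install it before creating RGB Secondary Capture files")
```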
PDF to Encapsulated DCM
Stores the original PDF directly inside a DICOM object. This is useful for:
- Radiology, pathology, or other structured clinical documents
- PACS archival workflows
Usage
from pdf2dcm import Pdf2EncapsDCM
converter = Pdf2EncapsDCM()
converted_dcm = converter.run(
    path_pdf='tests/test_data/test_file.pdf',
    path_template_dcm='tests/test_data/CT_small.dcm',
    suffix=".dcm",
)
print(converted_dcm)
# [ 'tests/test_data/test_file.dcm' ]
Parameters of converter.run:
- path_pdf (str): Path of the PDF to encapsulate.
- path_template_dcm (str, optional): Template DICOM used for metadata inheritance.
- suffix (str, optional): Suffix of the DICOM files. Defaults to ".dcm".
Returns:
List[Path]: List of paths of the stored encapsulated DICOM files.
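To confirm that a converted file really carries the PDF, you can read it back with pydicom and pull out the encapsulated bytes. The sketch below assumes the output uses the standard Encapsulated PDF Storage SOP class and stores the document in the EncapsulatedDocument element; neither is confirmed here for pdf2dcm, and the helper names are hypothetical.

```python
# SOP Class UID for Encapsulated PDF Storage in the DICOM standard.
ENCAPSULATED_PDF_STORAGE = "1.2.840.10008.5.1.4.1.1.104.1"


def extract_pdf_bytes(ds) -> bytes:
    """Return the PDF payload of a dataset-like object, validating it first."""
    if str(ds.SOPClassUID) != ENCAPSULATED_PDF_STORAGE:
        raise ValueError(f"Not an Encapsulated PDF object: {ds.SOPClassUID}")
    payload = bytes(ds.EncapsulatedDocument)
    if not payload.startswith(b"%PDF"):
        raise ValueError("Encapsulated document does not look like a PDF")
    return payload


def read_encapsulated_pdf(path_dcm) -> bytes:
    """Read a DICOM file from disk and return its encapsulated PDF bytes."""
    import pydicom  # imported lazily so the helper above works without it

    return extract_pdf_bytes(pydicom.dcmread(path_dcm))
```

For example, `read_encapsulated_pdf(converted_dcm[0])` would return the bytes that can be written back out as a .pdf file.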
PDF to RGB Secondary Capture DCM
Renders PDF pages as RGB images and stores them as Secondary Capture DICOM instances. Useful when:
- Encapsulated PDFs are unsupported
- Image-based viewing is preferred
- Legacy PACS compatibility is required
Usage
from pdf2dcm import Pdf2RgbSC
converter = Pdf2RgbSC()
converted_dcm = converter.run(
    path_pdf='tests/test_data/test_file.pdf',
    path_template_dcm='tests/test_data/CT_small.dcm',
    suffix=".dcm",
)
print(converted_dcm)
# [ 'tests/test_data/test_file_0.dcm', 'tests/test_data/test_file_1.dcm' ]
Parameters of converter.run:
- path_pdf (str): Path of the PDF to convert.
- path_template_dcm (str, optional): Template DICOM used for metadata inheritance.
- suffix (str, optional): Suffix of the DICOM files. Defaults to ".dcm".
Returns:
List[Path]: List of paths of the stored Secondary Capture DICOM files, one per PDF page.
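A quick sanity check on each generated page is to inspect its image attributes. The sketch below assumes the outputs use the standard Secondary Capture Image Storage SOP class with RGB pixel data, which is what the description above suggests but is not verified here; `describe_rgb_page` is a hypothetical helper that accepts any object exposing the usual DICOM keywords as attributes, such as a pydicom dataset.

```python
# SOP Class UID for Secondary Capture Image Storage in the DICOM standard.
SECONDARY_CAPTURE_IMAGE_STORAGE = "1.2.840.10008.5.1.4.1.1.7"


def describe_rgb_page(ds) -> dict:
    """Summarise the attributes that identify an RGB Secondary Capture image."""
    return {
        "is_secondary_capture": str(ds.SOPClassUID) == SECONDARY_CAPTURE_IMAGE_STORAGE,
        "is_rgb": ds.PhotometricInterpretation == "RGB" and int(ds.SamplesPerPixel) == 3,
        "size": (int(ds.Rows), int(ds.Columns)),
    }
```

With pydicom installed, `describe_rgb_page(pydicom.dcmread(path))` can be called for every path returned by converter.run().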
Notes
- Output DICOM filenames are derived from the input PDF filename.
- If no template is provided, no repersonalisation takes place.
- To produce DICOM files without a suffix, pass suffix="" to converter.run().
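The naming rule can be illustrated with a small sketch. The pattern used here (same directory, PDF stem plus suffix, and a zero-based `_<page>` index for RGB output) is inferred from the example outputs above and is an assumption, not a documented guarantee; both function names are hypothetical.

```python
from pathlib import Path
from typing import List


def expected_encapsulated_path(path_pdf: str, suffix: str = ".dcm") -> Path:
    """Predict the output path of an encapsulated conversion."""
    pdf = Path(path_pdf)
    return pdf.with_name(pdf.stem + suffix)


def expected_rgb_paths(path_pdf: str, n_pages: int, suffix: str = ".dcm") -> List[Path]:
    """Predict one output path per rendered page, numbered from zero."""
    pdf = Path(path_pdf)
    return [pdf.with_name(f"{pdf.stem}_{page}{suffix}") for page in range(n_pages)]


if __name__ == "__main__":
    print(expected_encapsulated_path("tests/test_data/test_file.pdf"))
    print(expected_rgb_paths("tests/test_data/test_file.pdf", 2, suffix=""))
```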
Metadata Inheritance
Metadata can optionally be copied from a template DICOM file to preserve patient and study context. Currently, the fields inherited by default are:
- PatientName
- PatientID
- PatientSex
- StudyInstanceUID
SeriesInstanceUID and SOPInstanceUID are no longer inherited, because copying them from the template violates the DICOM standard.
To choose which fields are repersonalised, pass repersonalisation_fields to Pdf2EncapsDCM() or Pdf2RgbSC().
Example:
fields = [
"PatientName",
"PatientID",
"PatientSex",
"StudyInstanceUID",
"AccessionNumber"
]
converter = Pdf2RgbSC(repersonalisation_fields=fields)
Note: this replaces the default fields rather than adding to them.
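Because a custom list replaces the defaults, the defaults have to be repeated whenever you only want to add a field. The sketch below shows that pattern together with a simplified picture of what field inheritance means. It is an illustration, not pdf2dcm's implementation: `DEFAULT_FIELDS` simply restates the list above, and `copy_fields` is a hypothetical helper that works on any objects exposing DICOM keywords as attributes.

```python
from typing import Iterable, List

# Restates the documented defaults; not imported from pdf2dcm.
DEFAULT_FIELDS: List[str] = ["PatientName", "PatientID", "PatientSex", "StudyInstanceUID"]


def copy_fields(template, target, fields: Iterable[str]) -> List[str]:
    """Copy the named attributes from template to target; return those copied."""
    copied = []
    for name in fields:
        # Skip fields the template does not carry instead of failing.
        if hasattr(template, name):
            setattr(target, name, getattr(template, name))
            copied.append(name)
    return copied


# Extend the defaults rather than replacing them.
extended_fields = DEFAULT_FIELDS + ["AccessionNumber"]
```

Passing `extended_fields` as repersonalisation_fields would then keep the four default fields and add AccessionNumber.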
File details
Details for the file pdf2dcm-0.6.0.tar.gz.
File metadata
- Download URL: pdf2dcm-0.6.0.tar.gz
- Upload date:
- Size: 7.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 39c2c3350159888404c8249fb5c8bd902e56a1ed029d10eb9f586015cc5e4242 |
| MD5 | 66931323ecc2b0e8fc6dcc7befc3fa87 |
| BLAKE2b-256 | 74ef203269f486b428e0a6cb1920b4d40a8b6006865da9162ffced2f55d2927d |
File details
Details for the file pdf2dcm-0.6.0-py3-none-any.whl.
File metadata
- Download URL: pdf2dcm-0.6.0-py3-none-any.whl
- Upload date:
- Size: 9.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ff27aefb2e26acc24ab1e01dab2a4705fccff7c4728bd08dccdced27f1d0a7e6 |
| MD5 | 05e348b27e7bdc371056b291926e8a0e |
| BLAKE2b-256 | e9fdc91b34a36910ca162236c1b07bda59068a21a01749477de0508158f19023 |