
The complete DICOM toolkit for Python — every transfer syntax built in (no plugins), NumPy/PyTorch native; DIMSE + DICOMweb, NIfTI/BIDS, RT dose/DVH, WSI, SEG/SR, radiomics in one wheel

Project description

pydcm

The complete DICOM toolkit for Python. A native reader/writer with a built-in decoder for every transfer syntax (no codec plugins), plus zero-copy NumPy / PyTorch pixel access — a compiled extension does the work in-process.

  • Decodes JPEG, JPEG-2000, JPEG-LS, RLE, JPEG-XL and HTJ2K out of the box, no plugins, and returns Hounsfield units and 3-D volumes.
  • One wheel replaces a whole stack of single-purpose tools: read / decode / write every transfer syntax; DIMSE (pydcm.dimse) and DICOMweb (pydcm.dicomweb) networking; NIfTI / BIDS / DWI (Volume.to_nifti); segmentations, parametric maps and structured reports; radiomics (pydcm.radiomics); whole-slide imaging (pydcm.wsi); RT dose & DVH (pydcm.rt); waveforms; FHIR / HL7 bridges. See the capability map.
  • The API is clean and Pythonic, and most existing Python DICOM code runs against it unchanged. SimpleITK images are accepted / returned where natural (radiomics, seg masks) without being a dependency.

Not a medical device. pydcm is not intended or cleared for clinical or diagnostic use. Decoded pixels and HU are for research/engineering only; validate any output for your intended use.

Install

pip install pydcm

A platform wheel ships the compiled extension, so there is no build step and no separate DICOM library to install. Release 0.1.2 publishes CPython 3.12+ (abi3) wheels for manylinux (glibc 2.28+) on x86-64 and ARM64, and for macOS 14.0+ on x86-64 and ARM64.

Use

import pydcm

arr = pydcm.decode("scan.dcm")             # ndarray [frames, rows, cols(, samples)]
frame = pydcm.decode("ct.dcm", frame=2)    # 1-based; 0 (default) = all frames
arr, meta = pydcm.decode("scan.dcm", with_meta=True)

Read & edit

dcmread returns a familiar Dataset. Attribute, item and sequence access, PersonName, file_meta, and a lazy pixel_array all behave the way Python DICOM users expect, so much so that import pydcm as pydicom is usually enough:

ds = pydcm.dcmread("scan.dcm")
ds.PatientName              # PersonName — .family_name / .given_name / .ideographic
ds.Rows, ds.PixelSpacing    # 64, [0.3125, 0.3125]   (MultiValue)
ds[0x0010, 0x0010].value    # element access by tag / keyword / (group, element)
ds.file_meta.TransferSyntaxUID
for elem in ds: ...         # iterate in tag order
px = ds.pixel_array         # NumPy (any transfer syntax, no plugins)
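
To make the PersonName accessors above concrete, here is a minimal standard-library sketch of how a raw DICOM PN string splits into components. It does not use pydcm and is not its implementation; parse_person_name and ParsedName are hypothetical names introduced only for illustration. It relies on the standard PN encoding: "=" separates the alphabetic, ideographic and phonetic groups, and "^" separates family, given, middle, prefix and suffix within a group.

```python
from typing import NamedTuple


class ParsedName(NamedTuple):
    """Hypothetical container mirroring the accessors listed above."""
    family_name: str
    given_name: str
    ideographic: str


def parse_person_name(raw: str) -> ParsedName:
    """Split a DICOM PN value into family, given and ideographic parts."""
    # "=" separates component groups: alphabetic, ideographic, phonetic.
    groups = raw.split("=")
    alphabetic = groups[0]
    ideographic = groups[1] if len(groups) > 1 else ""
    # "^" separates family, given, middle, prefix, suffix inside a group.
    # Pad so that a value with only a family name still unpacks cleanly.
    parts = alphabetic.split("^") + ["", ""]
    return ParsedName(parts[0], parts[1], ideographic)


name = parse_person_name("Yamada^Tarou=山田^太郎")
print(name.family_name, name.given_name)  # Yamada Tarou
```

The ideographic group is returned as the raw "山田^太郎" string here; a fuller parser would split it the same way as the alphabetic group.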

Editing preserves everything. save_as patches the original file bytes via the native editor, so the Transfer Syntax, PixelData (including compressed J2K / RLE), private tags and every untouched element survive byte-for-byte:

ds = pydcm.dcmread("ct.dcm")
ds.PatientName = "Anon^Patient"
del ds.PatientBirthDate
ds.save_as("ct_anon.dcm")   # pixels + TS intact; only the named tags change

Verified for near-total element-value fidelity over a large real-world corpus (the lone difference is a private sequence pydcm parses where others leave it opaque — pydcm exposes more). The keyword↔tag↔VR dictionary holds 17,699 entries — a strict superset of the standard dictionaries — so attribute names resolve broadly. Known limits: file_meta surfaces the three mandatory UIDs (not optional group-0002 elements); a from-scratch (no source file) save_as uses a metadata-only path that omits pixels.

HU / real-world values

rescale=False (default) returns the stored integers (lossless). rescale=True returns the modality-LUT output — real-world values, i.e. HU for CT — as float32 (per-frame rescale applied for Enhanced multi-frame):

hu = pydcm.decode("ct.dcm", rescale=True)         # float32 Hounsfield Units
ds = pydcm.DICOMDataset("ct_series/", rescale=True)
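
For readers unfamiliar with the modality LUT, the linear case is just value = stored * RescaleSlope + RescaleIntercept, using DICOM attributes (0028,1053) and (0028,1052). The sketch below shows that arithmetic in plain Python. It is an illustration of the formula, not pydcm's code path, and apply_modality_lut is a hypothetical helper name.

```python
def apply_modality_lut(stored, slope=1.0, intercept=0.0):
    """Map stored pixel integers to real-world values (linear rescale only)."""
    return [float(v) * slope + intercept for v in stored]


# Typical CT encoding: unsigned stored values with an intercept of -1024.
stored = [0, 1024, 2048]
hu = apply_modality_lut(stored, slope=1.0, intercept=-1024.0)
print(hu)  # [-1024.0, 0.0, 1024.0]
```

Because the result is floating point, the stored integers remain the lossless representation, which matches the rescale=False default described above.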

Geometry sidecar

with_meta=True also returns the geometry the engine parsed — no extra read:

_, m = pydcm.decode("ct.dcm", with_meta=True)
m["rescale_slope"], m["rescale_intercept"]
m["pixel_spacing"]                  # [row, col] mm
m["image_position_patient"]         # (0020,0032) [x, y, z]
m["image_orientation_patient"]      # (0020,0037) 6-vector
m["slice_thickness"], m["window_center"], m["window_width"]
m["modality"], m["series_instance_uid"], m["sop_instance_uid"]
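
As an illustration of what these fields are for, the sketch below maps a pixel index to patient coordinates using the standard DICOM image-plane relation (PS3.3 C.7.6.2.1.1). It takes plain lists shaped like the sidecar values above; pixel_to_patient is a hypothetical helper, and whether pydcm exposes an equivalent is not stated in this document.

```python
def pixel_to_patient(row, col, ipp, iop, pixel_spacing):
    """Return the (x, y, z) patient position in mm of the centre of pixel (row, col)."""
    row_dir = iop[0:3]   # first three cosines: direction of increasing column index
    col_dir = iop[3:6]   # last three cosines: direction of increasing row index
    row_spacing, col_spacing = pixel_spacing  # [between rows, between columns] in mm
    return tuple(
        ipp[k] + row_dir[k] * col_spacing * col + col_dir[k] * row_spacing * row
        for k in range(3)
    )


# Made-up axial slice geometry, shaped like the sidecar fields.
meta = {
    "pixel_spacing": [0.5, 0.25],
    "image_position_patient": [-100.0, -80.0, 30.0],
    "image_orientation_patient": [1, 0, 0, 0, 1, 0],
}
point = pixel_to_patient(
    10, 20,
    meta["image_position_patient"],
    meta["image_orientation_patient"],
    meta["pixel_spacing"],
)
print(point)  # (-95.0, -75.0, 30.0)
```

Note that the column index is scaled by the second Pixel Spacing value and the row index by the first, which is an easy pair to swap by mistake.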

A directory → PyTorch

DICOMDataset walks a directory and decodes one image per item. It is DataLoader-compatible without importing torch (torch stays optional):

from torch.utils.data import DataLoader

ds = pydcm.DICOMDataset("study_dir/", to_torch=True)   # finds .dcm + extension-less DICOM
for batch in DataLoader(ds, batch_size=8, num_workers=4, shuffle=True):
    ...   # [B, H, W] or [B, H, W, C]
  • One sample = one file. Single-frame files yield [H, W] / [H, W, C]; multi-frame files yield [frames, H, W(, C)].
  • transform=fn reshapes each sample (e.g. [C, H, W], windowing, scaling).
  • pattern="*.dcm" selects by name; the default also detects DICOM by the DICM preamble (catching extension-less clinical exports).
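
The extension-less detection mentioned in the last bullet can be approximated with the standard library alone. A DICOM Part 10 file starts with a 128-byte preamble followed by the four bytes DICM, so checking bytes 128 to 131 is enough for a first-pass filter. This is a sketch of that check, not pydcm's detector; looks_like_dicom is a hypothetical name.

```python
import os
import tempfile


def looks_like_dicom(path) -> bool:
    """Return True if the file has the Part 10 'DICM' prefix at byte offset 128."""
    with open(path, "rb") as f:
        header = f.read(132)
    return len(header) == 132 and header[128:132] == b"DICM"


# Demo on two throwaway files: one with a fake Part 10 header, one plain text.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "IM0001")   # extension-less, like many clinical exports
    bad = os.path.join(tmp, "notes.txt")
    with open(good, "wb") as f:
        f.write(b"\x00" * 128 + b"DICM" + b"\x02\x00")
    with open(bad, "wb") as f:
        f.write(b"not a dicom file")
    results = {name: looks_like_dicom(os.path.join(tmp, name))
               for name in sorted(os.listdir(tmp))}

print(results)  # {'IM0001': True, 'notes.txt': False}
```

Files written without the Part 10 header (raw data sets with no preamble) would not pass this check and need a different heuristic.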

A directory → one 3D volume

load_series assembles a directory of slices into a single spatially-ordered 3D HU volume (IOP clustering + IPP-projection sort, all in the native engine):

vol = pydcm.load_series("ct_series/")
vol.pixels          # ndarray [depth, rows, cols], float32 HU, sorted by position
vol.spacing         # (z, y, x) mm — slice spacing computed from IPP deltas
vol.affine          # 4×4 voxel→world
vol.series_instance_uid

The largest coherent volume in the directory is returned, so a stray localizer or second series does not corrupt the stack.
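
The IPP-projection sort can be sketched in a few lines: take the slice normal as the cross product of the two orientation vectors, project each Image Position (Patient) onto it, and sort by that scalar. The version below is a simplified, standard-library illustration and not the native engine: it assumes every slice already shares one orientation (so it skips the IOP clustering step), and sort_slices is a hypothetical name.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def sort_slices(positions, iop):
    """Sort IPP vectors along the slice normal; return (sorted positions, mean spacing)."""
    normal = cross(iop[0:3], iop[3:6])
    ordered = sorted(positions, key=lambda ipp: dot(ipp, normal))
    distances = [dot(ipp, normal) for ipp in ordered]
    deltas = [b - a for a, b in zip(distances, distances[1:])]
    spacing = sum(deltas) / len(deltas) if deltas else 0.0
    return ordered, spacing


# Three axial slices listed out of order, 2.5 mm apart.
iop = [1, 0, 0, 0, 1, 0]
positions = [[0.0, 0.0, 5.0], [0.0, 0.0, 0.0], [0.0, 0.0, 2.5]]
ordered, spacing = sort_slices(positions, iop)
print([p[2] for p in ordered], spacing)  # [0.0, 2.5, 5.0] 2.5
```

Sorting on the projection rather than on the raw z coordinate keeps the order correct for oblique and non-axial acquisitions.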

Beyond the core — capability map

Everything below ships in the same wheel, over the same native engine, with Python kept to thin marshalling — each area verified for correctness against reference data.

| Area | Import | What it does |
| --- | --- | --- |
| DIMSE networking | pydcm.dimse | SCU + full SCP, persistent associations |
| DICOMweb client | pydcm.dicomweb | QIDO / WADO / STOW / DELETE, streaming, TS negotiation |
| DICOM ↔ NIfTI | Volume.to_nifti / from_nifti | gantry-tilt-correct affine, vendor quirks handled; NIfTI → DICOM too |
| DWI / diffusion | load_dwi / save_dwi / diffusion_table | FSL .bval/.bvec (feeds FSL / MRtrix / dipy); Siemens CSA + mosaic, enhanced-MF, GE / Philips / UIH private |
| BIDS sidecar | bids_sidecar | BIDS .json (PhaseEncodingDirection, SliceTiming, EffectiveEchoSpacing…) |
| Preprocessing transforms | pydcm.transforms | resample / normalize / sliding-window; Tier 1 bit-exact (B-spline convention), Tier 2 ≤ 1 ULP (deep-learning convention) |
| Whole-slide imaging | pydcm.wsi | tile / region reads on the DICOM WSI pyramid + viewer tiles / total pixel matrix; bit-exact multi-vendor |
| RT dosimetry | pydcm.rt | read_rtdose / write_rtdose / dvhcalc; full ROI DVH coverage |
| Radiomics | pydcm.radiomics | 104 IBSI features / 7 classes, both aggregation conventions |
| SEG | write_seg / write_seg_fractional / read_seg | coded SEG, binary + fractional, SEG → labelmap |
| Parametric Map | write_paramap / read_paramap | author / read float parametric maps |
| Constructor-style object classes | pydcm.sc / seg / pm / ko / pr / ann / sr | SCImage, Segmentation, ParametricMap, KO, GSPS, the MeasurementReport content-tree classes, MicroscopyBulkSimpleAnnotations |
| SR / TID 1500 | write_sr / write_report / read_report / sr_validate | generic content trees + TID 1500 measurement reports + conformance checks |
| KO / PR / annotations | write_ko / write_pr / read_ann | Key Object Selection, Presentation State, Bulk Annotations |
| Encapsulated documents | write_encapsulated / read_encapsulated | PDF / CDA / STL / OBJ / MTL (PS3.3 A.104) |
| Waveforms (ECG / EEG) | pydcm.waveforms | 12-lead ECG / EEG read & write; arrays ready for analysis tools (MNE / neurokit2) |
| FHIR / HL7 | pydcm.fhir / pydcm.hl7 | DICOM → FHIR R4 ImagingStudy; HL7 v2.5 parse + ORU^R01 build |
| Agent / MCP | pydcm.mcp | in-process MCP server over live pydcm objects |
| File sets | pydcm.fileset | read a DICOMDIR / File-set, iterate instances |

DIMSE networking

import pydcm.dimse as pynetdicom          # drop-in module shape

ae = pynetdicom.AE(ae_title="PYDCM")
assoc = ae.associate("pacs.local", 11112, ae_title="ANY-SCP")
assoc.send_c_echo()
assoc.send_c_store(pydcm.dcmread("ct.dcm"))   # persistent: many ops, one association
assoc.release()

AE.start_server runs the SCP side — EVT_C_STORE / ECHO / FIND / GET / MOVE handlers plus the DIMSE-N set.

DICOMweb client

from pathlib import Path

from pydcm import dicomweb

base = "https://pacs.example.com"
studies = dicomweb.search_studies(base, matches={"PatientID": "42"})
study_uid = "1.2.3.4"                      # placeholder: substitute a real StudyInstanceUID
for part10 in dicomweb.iter_study(base, study_uid):
    ...                                    # streaming retrieve, bounded memory
dicomweb.store_instances(base, [Path("ct.dcm").read_bytes()])   # read_bytes closes the file
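
For context on what a study search looks like on the wire, QIDO-RS (DICOM PS3.18) defines it as a GET on {base}/studies with the matching attributes as query parameters. The sketch below only builds that URL with the standard library. Whether pydcm constructs its requests exactly this way is an assumption, and qido_studies_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def qido_studies_url(base_url, matches):
    """Build a QIDO-RS 'search for studies' URL from attribute/value matches."""
    url = base_url.rstrip("/") + "/studies"
    query = urlencode(matches)   # percent-encodes values as needed
    return f"{url}?{query}" if query else url


url = qido_studies_url("https://pacs.example.com", {"PatientID": "42"})
print(url)  # https://pacs.example.com/studies?PatientID=42
```

A real request would also send an Accept header such as application/dicom+json and handle paging through the limit and offset parameters.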

DICOM ↔ NIfTI, BIDS, DWI

vol = pydcm.load_series("ct_series/")
vol.to_nifti("ct.nii.gz")                  # validated affine (incl. gantry tilt)
meta = pydcm.bids_sidecar("ep2d_diff/")    # standard BIDS .json fields
dwi = pydcm.load_dwi("ep2d_diff/")         # 4D stack + b-values/b-vectors (all vendors)
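
The FSL gradient files mentioned in the capability map have a simple text layout: .bval holds one line with a b-value per volume, and .bvec holds three lines (x, y, z) with one column per volume. The sketch below formats both from in-memory lists. It is a format illustration only; it does not reorient vectors into image space, which a real converter has to do, and format_fsl_gradients is a hypothetical name rather than part of pydcm.

```python
def format_fsl_gradients(bvals, bvecs):
    """Return (bval_text, bvec_text) in FSL layout for N volumes."""
    bval_text = " ".join(f"{b:g}" for b in bvals) + "\n"
    # One row per axis, one column per volume.
    rows = [" ".join(f"{v[axis]:.6f}" for v in bvecs) for axis in range(3)]
    bvec_text = "\n".join(rows) + "\n"
    return bval_text, bvec_text


bvals = [0, 1000, 1000]
bvecs = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]   # one unit vector per volume
bval_text, bvec_text = format_fsl_gradients(bvals, bvecs)
print(bval_text, end="")   # 0 1000 1000
print(bvec_text, end="")
```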

Preprocessing transforms

from pydcm import transforms as T

out = T.resample_cubic(vol, out_shape)     # bit-exact B-spline order-3
seg = T.sliding_window_inference(vol.pixels, roi_size=(96, 96, 96), predictor=model)

Tier 1 ops are bit-exact for the classic B-spline convention; Tier 2 ops match the deep-learning (grid-sample) convention to ≤ 1 float32 ULP — same numbers in training and serving, no Python image stack required.
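
To check a "≤ 1 float32 ULP" claim yourself, one common approach is to compare the integer bit patterns of the two float32 values. The sketch below does that with the standard library; it is a generic measurement technique, not pydcm's test harness, and it ignores NaN handling.

```python
import struct


def float32_ordered_bits(x):
    """Map a value (rounded to float32) to an integer that preserves numeric order."""
    (bits,) = struct.unpack("<i", struct.pack("<f", x))
    # Negative floats have the sign bit set; mirror them below zero so that
    # adjacent floats always differ by exactly 1 and -0.0 equals +0.0.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFF)


def ulp_distance(a, b):
    """Number of representable float32 values between a and b."""
    return abs(float32_ordered_bits(a) - float32_ordered_bits(b))


one = 1.0
next_after_one = 1.0 + 2.0 ** -23   # the next float32 above 1.0
print(ulp_distance(one, next_after_one))  # 1
```

Applied element-wise to two result arrays, the maximum of this distance is the number to compare against the Tier 2 bound.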

Whole-slide imaging

from pydcm import wsi

slide = wsi.open_slide("wsi_dir/")         # DICOM WSI pyramid (one or many files)
region = slide.read_region((x, y), level=0, size=(512, 512))   # RGBA, level-0 coordinates
slide.associated_images["LABEL"]
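
The read_region call above takes its location in level-0 coordinates. Assuming the OpenSlide-style convention in which size is given in pixels of the requested level (the document does not state this for pydcm), the tiles that a region touches can be worked out as follows. tiles_for_region, downsample and tile_size are hypothetical names for this illustration.

```python
def tiles_for_region(location, size, downsample, tile_size):
    """List (tile_row, tile_col) indices covering a region at one pyramid level.

    location: (x, y) in level-0 pixels; size: (width, height) in level pixels.
    """
    x0 = int(location[0] // downsample)   # left edge in level pixels
    y0 = int(location[1] // downsample)   # top edge in level pixels
    x1 = x0 + size[0] - 1                 # inclusive right edge
    y1 = y0 + size[1] - 1                 # inclusive bottom edge
    cols = range(x0 // tile_size, x1 // tile_size + 1)
    rows = range(y0 // tile_size, y1 // tile_size + 1)
    return [(r, c) for r in rows for c in cols]


# A 512x512 read at a level downsampled 4x, stored as 256x256 tiles.
tiles = tiles_for_region((1000, 600), (512, 512), downsample=4, tile_size=256)
print(len(tiles), tiles[0], tiles[-1])  # 9 (0, 0) (2, 2)
```

A region that is not tile-aligned touches one extra row and column of tiles, which is why this 2x2-tile-sized read needs nine tiles.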

RT dosimetry

grid = pydcm.read_rtdose("rtdose.dcm")     # dose grid + scaling + grid geometry
dvh = pydcm.dvhcalc("rtstruct.dcm", "rtdose.dcm", roi_number)   # ROI-for-ROI DVH
pydcm.write_rtdose(dose, affine=grid.affine, output="out.dcm")  # conformance-clean
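
As background on what a DVH is, the cumulative form reports, for each dose level, the fraction of the ROI volume receiving at least that dose. The sketch below computes it from a flat list of per-voxel doses. It assumes the voxels were already selected by the ROI mask and all have equal volume, and it is not how dvhcalc is implemented; cumulative_dvh is a hypothetical name.

```python
def cumulative_dvh(doses_gy, bin_width_gy=1.0):
    """Return (dose_levels, volume_fractions) for a cumulative DVH."""
    if not doses_gy:
        return [], []
    n_bins = int(max(doses_gy) // bin_width_gy) + 1
    levels = [i * bin_width_gy for i in range(n_bins)]
    total = len(doses_gy)
    # Fraction of voxels receiving at least each dose level.
    fractions = [sum(1 for d in doses_gy if d >= level) / total for level in levels]
    return levels, fractions


levels, fractions = cumulative_dvh([1.0, 2.0, 2.0, 3.0], bin_width_gy=1.0)
print(levels)     # [0.0, 1.0, 2.0, 3.0]
print(fractions)  # [1.0, 1.0, 0.75, 0.25]
```

Production DVH code also has to rasterise the RTSTRUCT contours onto the dose grid and weight partially covered voxels, which this sketch leaves out.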

License

pydcm is licensed under Apache-2.0 (see LICENSE / NOTICE). The high-performance DICOM engine ships as a compiled binary inside the extension. Third-party components linked into the extension are listed in THIRD-PARTY-LICENSES — all permissive (BSD / MIT / Zlib / Apache / IJG) except FFmpeg, which is included under LGPL-2.1 (full text in LGPL-2.1.txt) for embedded-video DICOM decode, with a §6 relink offer.

pydcm is distributed as wheels only (no sdist): the engine ships as a compiled binary inside the extension, and parts of it are not open source.

Download files


Source Distributions

No source distribution files are available for this release.

Built Distributions


  • pydcm-0.1.2-cp312-abi3-manylinux_2_28_x86_64.whl (12.2 MB): CPython 3.12+, manylinux (glibc 2.28+), x86-64
  • pydcm-0.1.2-cp312-abi3-manylinux_2_28_aarch64.whl (13.2 MB): CPython 3.12+, manylinux (glibc 2.28+), ARM64
  • pydcm-0.1.2-cp312-abi3-macosx_14_0_x86_64.whl (8.7 MB): CPython 3.12+, macOS 14.0+, x86-64
  • pydcm-0.1.2-cp312-abi3-macosx_14_0_arm64.whl (7.8 MB): CPython 3.12+, macOS 14.0+, ARM64

File hashes

pydcm-0.1.2-cp312-abi3-manylinux_2_28_x86_64.whl
  • SHA256: 3a84caa2a85ceaeeb1061e69dd54d51ea1d1ba6dbcf4aa32029d8a5ca5af275c
  • MD5: de0810fd954063b1e6aac48bb1730185
  • BLAKE2b-256: 63491e1929e25fb7b997293abf62cdccfdf6da348d535afc81dec49f2c610f14

pydcm-0.1.2-cp312-abi3-manylinux_2_28_aarch64.whl
  • SHA256: 793309eaa3b82d051fa1914792385853b9441c668ceb3f057de660f159fc1ab0
  • MD5: 7f0dcb4eadb905d78599b84d6f3f9941
  • BLAKE2b-256: d50eab29deff11a8f271809a13926077b858ef244c8e100ab935275f8cbdf72f

pydcm-0.1.2-cp312-abi3-macosx_14_0_x86_64.whl
  • SHA256: afea912eb2f854ef284859bf2bbe9254b42f8809180fb2fc9c59f7d34a075f44
  • MD5: 75431971f03ed6fb8af90132c7aae7d8
  • BLAKE2b-256: f3be0048897a40a86943fd5050c9302b42e8928172c94ef937af8dcb30fee461

pydcm-0.1.2-cp312-abi3-macosx_14_0_arm64.whl
  • SHA256: 2079a67c831b2baaeb5785e3bef5c0b7814aa7463f92b5370240fe263813ee91
  • MD5: 54d7c7043e67eb544e3670ba6dc2b1d6
  • BLAKE2b-256: b08ac4d07676c2c2bbcef4e944cedeae5790af328dc626a19a395cd20b4156b8
