Open-source, self-hosted DICOM file-security scanner + Content Disarm & Reconstruction (CDR)

DicomLock

Open-source, self-hosted security for DICOM medical-image files. It scans a file for the ways it can be weaponized, then disarms it by rebuilding a clean, clinically identical copy.

A DICOM file is not just a picture. It is input for a parser, and hospital software has to read and decode all of it before anyone sees an image. That step is the attack surface. DicomLock checks the file for polyglot malware, parser-exploit constructions, and pixel data that routes through vulnerable image or video codecs, then rebuilds a clean version before the file reaches a PACS, viewer, or model. It runs inside your network, so no patient data leaves the building.

This is Content Disarm and Reconstruction (CDR) for DICOM, built to be open and auditable.

Install

pip install dicomlock

The core install pulls in the decoder backends (gdcm and pylibjpeg) so disarm works out of the box. Two optional extras:

pip install "dicomlock[server]"   # web UI and REST API
pip install "dicomlock[full]"     # PHI / de-identification audit and legacy forensics

Python 3.10 or newer.

Usage

dicomlock file.dcm                 # scan one file
dicomlock folder/                  # scan every .dcm in a folder
dicomlock folder/ --disarm         # scan, then disarm or quarantine each file
dicomlock file.dcm --deid          # add the PHI / de-identification audit

As a library:

from scanner.pipeline import run_security_scan, disarm_or_quarantine, is_dangerous

report = run_security_scan("file.dcm")
if is_dangerous(report):
    result = disarm_or_quarantine("file.dcm")   # {"action": "disarmed" | "quarantined", ...}
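
The three calls above can be wrapped into a folder-level loop. The sketch below is an illustration, not part of DicomLock: `triage_folder` is a hypothetical helper that takes the scan, danger-check, and disarm functions as arguments, and it assumes only what the snippet above shows (a path goes in, and the disarm result is a dict with an "action" key).

```python
from pathlib import Path
from typing import Callable


def triage_folder(
    folder: str,
    scan: Callable[[str], object],
    dangerous: Callable[[object], bool],
    disarm: Callable[[str], dict],
) -> dict[str, str]:
    """Scan every .dcm file directly inside `folder` and record the outcome."""
    outcomes: dict[str, str] = {}
    for path in sorted(Path(folder).glob("*.dcm")):
        report = scan(str(path))
        if dangerous(report):
            # Only the "action" key is documented; other keys are ignored here.
            outcomes[path.name] = disarm(str(path)).get("action", "unknown")
        else:
            outcomes[path.name] = "clean"
    return outcomes
```

With the library installed, a call might look like `triage_folder("incoming/", run_security_scan, is_dangerous, disarm_or_quarantine)`.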

Web UI and API:

python server.py    # http://localhost:8899

Uploads are scanned in a temp directory and deleted right after, so PHI is never persisted.
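
The scan-then-delete lifecycle can be sketched with the standard library. This is a minimal illustration under stated assumptions, not the server's actual code: `scan_upload` and the `upload.dcm` file name are hypothetical, and `scan` stands in for whatever scanning function is used.

```python
import tempfile
from pathlib import Path
from typing import Callable


def scan_upload(data: bytes, scan: Callable[[str], object]) -> object:
    """Write an upload to a private temp directory, scan it, then delete it."""
    with tempfile.TemporaryDirectory(prefix="upload-") as tmp:
        path = Path(tmp) / "upload.dcm"
        path.write_bytes(data)
        # The directory and the file are removed when this block exits,
        # including when `scan` raises.
        return scan(str(path))
```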

How it works

Scan. Deterministic, rule-based checks, no ML: preamble and polyglot signatures, length amplification, sequence-nesting depth, pixel-dimension and decompression bombs, private-tag payloads, codec-CVE exposure, and metadata integrity.
Disarm. For files that are dangerous but recoverable, it zeroes the preamble, transcodes compressed pixel data to a native (uncompressed) encoding so downstream software no longer has to run the vulnerable codec (the decode itself runs in a sandboxed subprocess), and filters private tags against a vendor allowlist. Lossless sources come out bit-exact. Lossy sources are decoded once and stored without recompression, so no further loss is introduced.
Quarantine. Anything it cannot safely rebuild, such as length bombs and files no backend can decode, is held back. It re-scans its own output, so it never emits a file that still fails a check.

The point of CDR is that it rebuilds from a validated canonical form instead of matching known signatures, so it neutralizes attacks it has never seen. That defense holds up when vulnerabilities turn up faster than anyone can patch them, which is the norm for systems that patch cycles reach slowly: legacy and embedded medical devices, and software locked behind FDA recertification.
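
The difference between rebuilding and signature matching can be shown with a toy model of the private-tag filter. Everything here is a simplification for illustration: elements are a plain dict keyed by (group, element), the creator string `ACME_CT_01` is made up, and real files reserve private blocks within a group, not whole groups.

```python
# Hypothetical allowlist: (private group, private creator string) pairs.
ALLOWED_PRIVATE_CREATORS = {
    (0x0009, "ACME_CT_01"),
}


def is_private(group: int) -> bool:
    """Private DICOM elements live in odd-numbered groups."""
    return group % 2 == 1


def rebuild(
    elements: dict[tuple[int, int], object],
    creators: dict[int, str],
) -> dict[tuple[int, int], object]:
    """Copy standard elements and allowlisted private ones into a new dict."""
    clean = {}
    for (group, element), value in elements.items():
        if not is_private(group):
            clean[(group, element)] = value
        elif (group, creators.get(group)) in ALLOWED_PRIVATE_CREATORS:
            clean[(group, element)] = value
        # Anything else is dropped without having to recognise what it is.
    return clean
```

The output is built up from what is known to be acceptable, so an unfamiliar payload in an unlisted private group disappears without any rule describing it.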

Why this is a real problem

The 128-byte preamble can hold an executable header, so one file can be a valid scan and working malware at the same time (CVE-2019-11687, extended to Linux devices by ELFDICOM). Attacker-controlled length fields turn a 140-byte file into a multi-gigabyte allocation request. Encapsulated pixel and video data decodes through libjpeg, OpenJPEG, CharLS, and FFmpeg-class libraries that carry long CVE histories. Live examples in clinical software include the Orthanc auth bypass (CVE-2025-0896, CVSS 9.8) and MicroDicom remote code execution (CVE-2025-5943).
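
The length-field problem is easy to see in code. The sketch below reads one data-element header and compares the declared value length with the bytes actually present. It assumes Explicit VR Little Endian encoding, and the function names are hypothetical; it is not DicomLock's check.

```python
import struct

# Explicit-VR value representations that use 2 reserved bytes followed by a
# 4-byte length field; all other VRs use a 2-byte length.
LONG_VRS = {
    b"OB", b"OD", b"OF", b"OL", b"OV", b"OW", b"SQ",
    b"SV", b"UC", b"UN", b"UR", b"UT", b"UV",
}
UNDEFINED_LENGTH = 0xFFFFFFFF


def declared_length(data: bytes, offset: int) -> tuple[int, int]:
    """Return (value_offset, declared_length) for the element at `offset`."""
    vr = data[offset + 4:offset + 6]
    if vr in LONG_VRS:
        (length,) = struct.unpack_from("<I", data, offset + 8)
        return offset + 12, length
    (length,) = struct.unpack_from("<H", data, offset + 6)
    return offset + 8, length


def is_length_bomb(data: bytes, offset: int) -> bool:
    """True if the element claims more bytes than the buffer actually holds."""
    value_offset, length = declared_length(data, offset)
    if length == UNDEFINED_LENGTH:
        return False  # delimiter-terminated value; needs a different check
    return value_offset + length > len(data)
```

A parser that allocates a buffer of the declared size before checking it against the file size is the one at risk. Comparing the two first costs a single addition.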

DicomLock works on the file. It does not break or weaken encryption and makes no claim to.

Results

All reproducible from the scripts in _attack_test/:

Zero false positives across 575 real clinical CT files, and zero on a separate mixed-compression corpus spanning 12 transfer syntaxes.
20 of 20 crafted attack fixtures flagged by the expected check.
pydicom, GDCM, and dcmtk accept the weaponized files without complaint; DicomLock flags every one.
Disarmed pixels are bit-exact against two independent decoders (GDCM and pylibjpeg) on every lossless sample.
The codec decode is sandboxed, so a crashing or hanging decoder is contained: the file is quarantined and the tool itself keeps running.
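
The containment in the last item can be sketched with a child process and a timeout. This shows process isolation only and is an assumption about the general approach, not DicomLock's sandbox, which may restrict the child further. `run_contained` is a hypothetical name, and `code` stands in for the decode step.

```python
import subprocess
import sys


def run_contained(code: str, timeout_s: float = 5.0) -> str:
    """Run `code` in a child Python process; report 'ok' or 'quarantine'."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The decoder hung; subprocess.run has already killed the child.
        return "quarantine"
    # A non-zero exit code (negative on POSIX when killed by a signal)
    # means the decoder failed or crashed.
    return "ok" if proc.returncode == 0 else "quarantine"
```

Because the decode happens in a separate process, a segfault in a native codec ends the child and shows up in the parent as a return code.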

The attack fixtures in this repo are inert. Polyglots carry only magic bytes, and payload tags carry a header plus zero padding. No working malware ships here.

Where it fits

Commercial DICOM CDR already exists (OPSWAT, Votiro), and there is academic prior art, so "nobody does this" is not the pitch. DicomLock's reason to exist is that it is open, self-hosted, auditable, and PACS-depth. The transcoding, the vendor allowlist, and the parser-bomb rejection are all readable in source and run inside your network. The natural first users are research-imaging and data-engineering teams that already ingest untrusted external DICOM.

It is a security and sanitization tool, not a medical device. It carries no diagnostic claim and no FDA clearance. See THREAT_MODEL.md for what it does and does not defend.

Documentation

THREAT_MODEL.md: attacks defended and explicit non-claims
ARCHITECTURE.md: module specs
CONTRIBUTING.md: how to help with the CVE map, vendor allowlist, and codecs
SECURITY.md: vulnerability disclosure

License

Apache-2.0, © 2026 Vijay Thakore. Provided as is, without warranty. DicomLock is a sanitization tool, not a medical device, and makes no diagnostic claim. Validate disarmed files in your own environment before any clinical use, and run on de-identified data where you can.

Release history

0.8.0

May 25, 2026

This version

0.7.0

May 25, 2026

Download files

Source Distribution

dicomlock-0.7.0.tar.gz (52.3 kB)

Uploaded May 25, 2026 Source

Built Distribution

dicomlock-0.7.0-py3-none-any.whl (59.6 kB)

Uploaded May 25, 2026 Python 3

File details

Details for the file dicomlock-0.7.0.tar.gz.

File metadata

Download URL: dicomlock-0.7.0.tar.gz
Upload date: May 25, 2026
Size: 52.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for dicomlock-0.7.0.tar.gz
Algorithm	Hash digest
SHA256	`a2bcb47a01fa04e3ecfa5fa7bfcb941c42aad05db99882eb712537a351d4526d`
MD5	`088c76e6da5c775fd29e6ad2aebda990`
BLAKE2b-256	`ab7868d9621bd43c2a69af913c6f9b19a393c519632a2ecd29207f419ee7c1dc`

File details

Details for the file dicomlock-0.7.0-py3-none-any.whl.

File metadata

Download URL: dicomlock-0.7.0-py3-none-any.whl
Upload date: May 25, 2026
Size: 59.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for dicomlock-0.7.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9d6358d352d231f6ca5e99fb935e0bfd24e4f14dfaf85df02b36e4f2c4378751`
MD5	`538ca3e3b2af4bda6e55b9de4610c7f4`
BLAKE2b-256	`6ec68bb487fdb207edeb4e2f0ccc45fbf18c308de08df2389407e6b9498c0803`
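
A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a minimal sketch; `sha256_matches` is a hypothetical helper name.

```python
import hashlib


def sha256_matches(path: str, expected_hex: str) -> bool:
    """Stream the file through SHA-256 and compare with the published digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

For the source distribution, the call would be `sha256_matches("dicomlock-0.7.0.tar.gz", "a2bcb47a01fa04e3ecfa5fa7bfcb941c42aad05db99882eb712537a351d4526d")`.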
