De-Identification of Medical Imaging Data: A Comprehensive Tool for Ensuring Patient Privacy
[!IMPORTANT]
The package is now available on PyPI: pip install mede
[!NOTE]
MEDE now supports the Enhanced DICOM format!
This repository contains MEDE, the tool described in "De-Identification of Medical Imaging Data: A Comprehensive Tool for Ensuring Patient Privacy". It lets users anonymize a wide variety of medical imaging types, including:
- Magnetic Resonance Imaging (MRI)
- Computed Tomography (CT)
- Ultrasound (US)
- Whole Slide Images (WSI)
- MRI raw data (twix)
This tool combines multiple anonymization steps, including metadata deidentification, defacing, and skull-stripping, while being faster than current state-of-the-art deidentification tools.
Getting Started
You can install the anonymization tool directly via pip or Docker.
Installation via pip
Our tool is available via pip. You can install it with the following command:
pip install mede
Additional Dependencies for Text Removal
If you want to use the text removal feature, you also need to install Google's Tesseract OCR engine. Follow the installation instructions for your operating system in the Tesseract documentation.
- On Ubuntu:
  sudo apt install tesseract-ocr
  sudo apt install libtesseract-dev
- On macOS (via Homebrew):
  brew install tesseract
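Before enabling text removal, it can help to confirm that the Tesseract binary is reachable. The following is a minimal sketch, assuming the executable is named `tesseract` and is discovered through `PATH`; how MEDE itself locates Tesseract is not described here, and the helper name `tesseract_available` is hypothetical.

```python
import shutil


def tesseract_available(executable: str = "tesseract") -> bool:
    """Return True if an executable with this name is found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("tesseract found; the -t/--text-removal option should be usable")
    else:
        print("tesseract not found; install it before using -t/--text-removal")
```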
Installation via Docker
Alternatively, this tool is distributed via Docker. The images are published as morrempe/mede on Docker Hub for the amd64 and arm64 platforms and can be run on Linux and macOS hosts.
Steps:
- Pull the Docker image:
  docker pull morrempe/mede:[tag] # Replace [tag] with either arm64 or amd64
- Run the Docker container with an attached volume. Your data will be mounted in the /data folder inside the container:
  docker run --rm -it -v [Path/to/your/data]:/data morrempe/mede:[tag]
- Run the script with the corresponding CLI parameters:
  mede-deidentify [your flags]
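The pull and run steps above can also be assembled programmatically. This is a rough sketch, assuming that the value of Python's `platform.machine()` maps onto the two documented tags as shown; the mapping and the helper names `image_tag` and `docker_run_command` are illustrative and not part of MEDE.

```python
from __future__ import annotations

import os
import platform
import shlex

IMAGE = "morrempe/mede"  # image name taken from the pull command above

# Assumed mapping from platform.machine() values to the two documented tags.
TAG_BY_MACHINE = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def image_tag(machine: str | None = None) -> str:
    """Pick the image tag for a CPU architecture (defaults to this machine)."""
    key = (machine or platform.machine()).lower()
    if key not in TAG_BY_MACHINE:
        raise ValueError(f"no documented image tag for architecture {key!r}")
    return TAG_BY_MACHINE[key]


def docker_run_command(data_dir: str, machine: str | None = None) -> list[str]:
    """Build the documented `docker run` call, mounting data_dir at /data."""
    host_path = os.path.abspath(data_dir)  # Docker bind mounts need absolute paths
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{host_path}:/data",
        f"{IMAGE}:{image_tag(machine)}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(docker_run_command("/path/to/your/data", machine="arm64")))
```

Building the command as a list rather than a single string avoids shell quoting problems when the data path contains spaces.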
Usage
De-Identification CLI
The mede-deidentify command-line interface (CLI) allows you to de-identify medical imaging data with various options. Below is the detailed usage guide:
mede-deidentify [-h] [-v | --verbose] [-t | --text-removal] [-i INPUT]
                [-o OUTPUT] [--gpu GPU] [-s | --skull_strip] [-de | --deface]
                [-tw | --twix] [-w | --wsi] [-r | --rename]
                [-p PROCESSES]
                [-d {basicProfile,cleanDescOpt,cleanGraphOpt,cleanStructContOpt,
                rtnDevIdOpt,rtnInstIdOpt,rtnLongFullDatesOpt,
                rtnLongModifDatesOpt,rtnPatCharsOpt,rtnSafePrivOpt,
                rtnUIDsOpt} ...]
Options
| Option | Description |
|---|---|
| -h, --help | Show the help message and exit. |
| -v, --verbose | Enable verbose output. |
| -t, --text-removal | Perform text removal. |
| -i INPUT, --input INPUT | Path to the input data. |
| -o OUTPUT, --output OUTPUT | Path to save the output data. |
| --gpu GPU | Specify the GPU device number (default: 0). |
| -s, --skull_strip | Perform skull stripping. |
| -de, --deface | Perform defacing. |
| -tw, --twix | Process MRI raw data (twix format) and anonymize metadata. |
| -w, --wsi | Process Whole Slide Images (WSI). |
| -r, --rename | Rename files during processing. |
| -p PROCESSES, --processes PROCESSES | Number of processes to use for multiprocessing. |
| -d, --deidentification-profile | Specify one or more DICOM deidentification profiles to apply (see below). |
De-Identification Profiles
The -d or --deidentification-profile option allows you to specify one or more DICOM deidentification profiles. Available profiles include:
- basicProfile
- cleanDescOpt
- cleanGraphOpt
- cleanStructContOpt
- rtnDevIdOpt
- rtnInstIdOpt
- rtnLongFullDatesOpt
- rtnLongModifDatesOpt
- rtnPatCharsOpt
- rtnSafePrivOpt
- rtnUIDsOpt
You can specify multiple profiles by separating them with spaces. For example:
mede-deidentify -d basicProfile cleanDescOpt
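Profile names are case-sensitive and easy to mistype, so a wrapper script may want to check them before calling the CLI. The sketch below assumes only the profile names listed above; the helper `profile_args` is hypothetical and not part of MEDE.

```python
from __future__ import annotations

# Profile names copied from the list of available profiles.
ALLOWED_PROFILES = (
    "basicProfile", "cleanDescOpt", "cleanGraphOpt", "cleanStructContOpt",
    "rtnDevIdOpt", "rtnInstIdOpt", "rtnLongFullDatesOpt",
    "rtnLongModifDatesOpt", "rtnPatCharsOpt", "rtnSafePrivOpt", "rtnUIDsOpt",
)


def profile_args(profiles: list[str]) -> list[str]:
    """Return the `-d ...` argument list, rejecting unknown profile names."""
    unknown = [name for name in profiles if name not in ALLOWED_PROFILES]
    if unknown:
        raise ValueError(f"unknown de-identification profile(s): {unknown}")
    if not profiles:
        return []  # no -d flag at all when no profile is requested
    return ["-d", *profiles]


if __name__ == "__main__":
    print(profile_args(["basicProfile", "cleanDescOpt"]))
```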
Example Usage
Here’s an example of how to use the CLI:
mede-deidentify -i /path/to/input -o /path/to/output -s -d basicProfile
This command will:
- Take input data from /path/to/input.
- Save the output to /path/to/output.
- Apply skull stripping.
- Use the basicProfile deidentification profile.
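The same invocation can be driven from a Python script, for example when processing several datasets in a loop. This is a minimal sketch that uses only the flags documented above. The helper names `build_command` and `run` are hypothetical, and the use of `check=True` assumes the CLI signals failure with a non-zero exit code, which is not confirmed here.

```python
from __future__ import annotations

import shlex
import shutil
import subprocess


def build_command(
    input_path: str,
    output_path: str,
    skull_strip: bool = False,
    deface: bool = False,
    profiles: tuple[str, ...] | list[str] = (),
) -> list[str]:
    """Assemble a mede-deidentify call from the documented flags."""
    cmd = ["mede-deidentify", "-i", input_path, "-o", output_path]
    if skull_strip:
        cmd.append("-s")
    if deface:
        cmd.append("-de")
    if profiles:
        cmd += ["-d", *profiles]
    return cmd


def run(cmd: list[str]) -> subprocess.CompletedProcess:
    """Run the command, failing early if the CLI is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH; try `pip install mede`")
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    example = build_command(
        "/path/to/input", "/path/to/output",
        skull_strip=True, profiles=["basicProfile"],
    )
    # Prints: mede-deidentify -i /path/to/input -o /path/to/output -s -d basicProfile
    print(shlex.join(example))
```

Calling `run(example)` would then execute the command; printing it first makes it easy to review exactly what will be run on sensitive data.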
Citation
If you use our tool in your work, please cite us with the following BibTeX entry.
@article{rempe2025identification,
title={De-identification of medical imaging data: a comprehensive tool for ensuring patient privacy},
author={Rempe, Moritz and Heine, Lukas and Seibold, Constantin and H{\"o}rst, Fabian and Kleesiek, Jens},
journal={European Radiology},
pages={1--10},
year={2025},
publisher={Springer}
}