MRI defacing pipeline with skull-stripping and affine registration from cai4cai

caideface

MRI defacing and text anonymisation toolkit from the cai4cai research group (Contextual Artificial Intelligence for Computer Assisted Interventions).

This package provides two complementary anonymisation capabilities:

  • Image defacing -- removes facial features from head MRI scans while preserving brain structures, as described in the paper "A Generalisable Head MRI Defacing Pipeline: Evaluation on 2,566 Meningioma Scans" (arXiv:2505.12999).
  • Text anonymisation -- detects personal names in medical reports using a trained spaCy NER model and replaces them with realistic fake names (Hiding in Plain Sight / HIPS technique).

Pipeline overview

Image defacing pipeline

The defacing pipeline consists of three steps:

  1. Reorientation -- Aligns NIfTI scans to LAS canonical orientation (MNI152 standard) using nibabel.
  2. Skull-stripping -- Extracts brain masks using HD-BET, then applies dynamic dilation to preserve peripheral brain structures.
  3. Registration & Defacing -- Registers each scan to the MNI152 template using BRAINSFit (affine), warps a face mask into the scan's space, and applies it to remove facial features.
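
To make Step 1 more concrete, the sketch below works out which axes have to be permuted or flipped to bring a scan with known axis codes into LAS. It is a pure-Python illustration of the idea only: the package itself uses nibabel for this, and `reorientation_plan` is a hypothetical helper name, not part of caideface. Axis codes follow the usual convention that each letter names the direction in which the voxel index increases.

```python
# Illustrative only: caideface delegates reorientation to nibabel.
OPPOSITE = {"L": "R", "R": "L", "A": "P", "P": "A", "S": "I", "I": "S"}


def reorientation_plan(src_codes, dst_codes="LAS"):
    """For each output axis, return (source_axis_index, needs_flip)."""
    plan = []
    for code in dst_codes:
        if code in src_codes:
            # Same direction already present: just move the axis.
            plan.append((src_codes.index(code), False))
        elif OPPOSITE[code] in src_codes:
            # Opposite direction present: move the axis and reverse it.
            plan.append((src_codes.index(OPPOSITE[code]), True))
        else:
            raise ValueError(f"source codes {src_codes!r} do not cover axis {code!r}")
    if len({axis for axis, _ in plan}) != len(dst_codes):
        raise ValueError(f"source codes {src_codes!r} reuse an anatomical axis")
    return plan


# A RAS scan only needs its first axis flipped to become LAS.
print(reorientation_plan("RAS"))
```

A scan that is already LAS yields a plan with no flips and no permutation, so reorienting it is a no-op.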

Text anonymisation (NER + HIPS)

The text anonymisation module uses a trained spaCy Named Entity Recognition (NER) model to identify personal names (PER entities) in .txt files and replaces them with realistic fake names generated by the Faker library. This "Hiding in Plain Sight" (HIPS) approach produces anonymised reports that remain naturally readable. Consistent name mapping ensures that the same real name is always replaced with the same fake name within a document.
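
The consistent-mapping idea can be sketched without spaCy or Faker. In the example below, the detected names and the fake name pool are passed in directly; in the real package they come from the NER model and from Faker. `replace_names` is a hypothetical helper written for illustration and is not the package's implementation.

```python
import random
import re


def replace_names(text, detected_names, fake_pool, seed=None):
    """Swap each detected name for a fake one, reusing the same fake per real name."""
    rng = random.Random(seed)
    available = list(fake_pool)
    rng.shuffle(available)

    mapping = {}
    for name in detected_names:
        if name not in mapping:
            if not available:
                raise ValueError("fake name pool exhausted")
            mapping[name] = available.pop()
    if not mapping:
        return text, mapping

    # Single pass over the text, longest names first, so that a replacement
    # is never itself rewritten by a later substitution.
    ordered = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))
    return pattern.sub(lambda m: mapping[m.group(0)], text), mapping


report = "Seen by Ann Lee. Ann Lee agreed with Bo Chan."
anonymised, mapping = replace_names(
    report, ["Ann Lee", "Bo Chan", "Ann Lee"], ["X One", "Y Two", "Z Three"], seed=0
)
print(anonymised)
print(mapping)
```

Because the mapping is built before any text is rewritten, both mentions of the same person receive the same replacement, which keeps the report internally coherent.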

All required models and data are bundled with the package, so no additional downloads are needed.

Requirements

Python

  • Python >= 3.9

External tools (not pip-installable)

Tool                         Used in   Install
BRAINSFit & BRAINSResample   Step 3    Bundled with 3D Slicer

Note: Step 1 (reorientation) no longer requires FSL -- it uses nibabel's orientation tools to reorient scans to LAS (equivalent to fslreorient2std).

Finding BRAINSFit and BRAINSResample

These executables are included with 3D Slicer. Common locations:

  • macOS: /Applications/Slicer.app/Contents/lib/Slicer-5.8/cli-modules/BRAINSFit
  • Linux: /path/to/Slicer/lib/Slicer-5.8/cli-modules/BRAINSFit

Replace 5.8 with your installed Slicer version if different. To verify that the executables exist and run (macOS paths shown; substitute your Slicer directory on Linux):

# Check they exist
ls /Applications/Slicer.app/Contents/lib/Slicer-5.8/cli-modules/BRAINSFit
ls /Applications/Slicer.app/Contents/lib/Slicer-5.8/cli-modules/BRAINSResample

# Check they run (should print usage/help info)
/Applications/Slicer.app/Contents/lib/Slicer-5.8/cli-modules/BRAINSFit --help
/Applications/Slicer.app/Contents/lib/Slicer-5.8/cli-modules/BRAINSResample --help

You can also build them from source via BRAINSTools.
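
If you would rather locate the executables from Python, a minimal sketch follows. It assumes the `lib/Slicer-<version>/cli-modules/` layout shown in the paths above and falls back to it only when the tool is not already on `PATH`. `find_cli_module` is a hypothetical helper, not a caideface function, and the search roots are yours to supply.

```python
import os
import shutil
from pathlib import Path


def find_cli_module(name, search_roots=()):
    """Return the path to an executable called `name`, or None if not found."""
    on_path = shutil.which(name)
    if on_path:
        return Path(on_path)
    for root in search_roots:
        # Matches e.g. <root>/lib/Slicer-5.8/cli-modules/BRAINSFit
        pattern = f"lib/Slicer-*/cli-modules/{name}"
        for candidate in sorted(Path(root).glob(pattern)):
            if candidate.is_file() and os.access(candidate, os.X_OK):
                return candidate
    return None


# Example roots taken from the locations listed above; adjust for your system.
roots = ["/Applications/Slicer.app/Contents", "/path/to/Slicer"]
for tool in ("BRAINSFit", "BRAINSResample"):
    print(tool, "->", find_cli_module(tool, roots))
```

The returned paths can be passed to the `--brainsfit` and `--brainsresample` flags.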

Installation

We recommend using a conda environment:

conda create -n caideface python=3.10 -y
conda activate caideface
pip install caideface

Or install from GitHub:

pip install "caideface @ git+https://github.com/cai4cai/defacing_pipeline.git#subdirectory=caideface"

Or install from source:

git clone https://github.com/cai4cai/defacing_pipeline.git
cd defacing_pipeline/caideface
pip install -e .

Note: caideface requires numpy<2 (pip enforces this automatically at install time) because some dependencies (HD-BET / nnU-Net) are not yet compatible with NumPy 2.x.

Usage

CLI -- Full defacing pipeline

Run all three steps in one command:

caideface run ./input_nifti ./output \
  --brainsfit /path/to/BRAINSFit \
  --brainsresample /path/to/BRAINSResample

This creates three subdirectories under ./output:

  • reoriented/ -- Step 1 outputs
  • hdbet/ -- Step 2 outputs (skull-stripped, masks, dilated)
  • defaced/ -- Step 3 outputs (final defaced scans)

Options

Flag            Default         Description
--device        auto-detected   Device for HD-BET (cpu or cuda)
--no-tta        TTA enabled     Disable HD-BET test-time augmentation (faster but less accurate)
--dilation-mm   14.0            Brain mask dilation in mm
--background    0               Fill value for defaced regions (0 for MRI, -1024 for CT)
--template      bundled         Custom MNI152 skull-stripped template
--face-mask     bundled         Custom face mask in MNI152 space
--steps         all             Steps to run, comma-separated: reorient, skull_strip, deface
-v              off             Verbose/debug logging
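
When driving the CLI from a script, it can help to assemble the argument list in one place. The sketch below uses only the flags documented above; `build_run_command` is a hypothetical wrapper rather than part of the package, and it assumes each flag is accepted by `caideface run` as listed.

```python
import shlex


def build_run_command(input_dir, output_dir, brainsfit, brainsresample,
                      device=None, tta=True, dilation_mm=None, steps=None):
    """Build the argument list for `caideface run` from documented flags."""
    cmd = [
        "caideface", "run", str(input_dir), str(output_dir),
        "--brainsfit", str(brainsfit),
        "--brainsresample", str(brainsresample),
    ]
    if device is not None:
        cmd += ["--device", device]
    if not tta:
        cmd.append("--no-tta")
    if dilation_mm is not None:
        cmd += ["--dilation-mm", str(dilation_mm)]
    if steps:
        cmd += ["--steps", ",".join(steps)]
    return cmd


cmd = build_run_command(
    "./input_nifti", "./output",
    "/path/to/BRAINSFit", "/path/to/BRAINSResample",
    device="cpu", tta=False, steps=["reorient", "skull_strip"],
)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems with paths that contain spaces.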

CLI -- Individual defacing steps

Run each step separately for more control:

# Step 1: Reorientation
caideface reorient ./raw_nifti ./reoriented

# Step 2: Skull-stripping
caideface skull-strip ./reoriented ./hdbet --device cpu

# Step 3: Registration & Defacing
caideface deface ./reoriented ./hdbet ./defaced \
  --brainsfit /path/to/BRAINSFit \
  --brainsresample /path/to/BRAINSResample

CLI -- Text anonymisation

Single file

caideface anonymize-single ./reports/report_1.txt ./anonymized/report_1.txt

Batch (all .txt files in a directory)

caideface anonymize ./reports ./anonymized_reports

Options

Both commands accept the same options:

Flag        Default   Description
--model     bundled   Path to a custom spaCy NER model directory
--n-names   50        Size of the fake name pool
--seed      none      Random seed for reproducible output
-v          off       Verbose/debug logging

Example

Input (reports/report_1550.txt):

Reported by Danielle Smith and William Stuart on 03/10/2014

Output (anonymized_reports/report_1550.txt):

Reported by Ryan Munoz and Holly Wood on 03/10/2014

The batch command saves an anonymization_log.csv alongside the output files with a summary of replacements per file.

Python API -- Text anonymisation

Single file

from caideface.anonymize import load_ner_model, generate_fake_names, anonymize_single

# Load model and generate fake name pool (do this once)
nlp = load_ner_model()                        # uses bundled model
fake_names = generate_fake_names(n=50, seed=42)

# Anonymise a single report
result = anonymize_single(
    input_file="reports/report_1.txt",
    output_file="anonymized/report_1.txt",
    nlp=nlp,
    fake_names=fake_names,
)
print(result["replacements"])   # number of names replaced
print(result["names_found"])    # list of original names detected
print(result["name_mapping"])   # {original_name: fake_name} mapping
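
As a sanity check after anonymisation, you may want to confirm that none of the original names survived in the output text. The sketch below assumes that `name_mapping` is a plain dict of `{original_name: fake_name}` strings, as the comment above describes. `find_leaked_names` is a hypothetical helper, not part of caideface.

```python
import re


def find_leaked_names(anonymised_text, name_mapping):
    """Return original names that still appear in the anonymised text."""
    fake_names = set(name_mapping.values())
    leaked = []
    for original in name_mapping:
        if original in fake_names:
            # A fake name can coincide with a real one by chance; skip it.
            continue
        pattern = r"\b" + re.escape(original) + r"\b"
        if re.search(pattern, anonymised_text, flags=re.IGNORECASE):
            leaked.append(original)
    return leaked


name_mapping = {"Danielle Smith": "Ryan Munoz", "William Stuart": "Holly Wood"}
clean = "Reported by Ryan Munoz and Holly Wood on 03/10/2014"
print(find_leaked_names(clean, name_mapping))
```

This only catches names the NER model detected in the first place; names the model missed will not appear in the mapping and need a separate review.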

Batch processing

from caideface import anonymize_batch

# Anonymise all .txt files in a directory
log_df = anonymize_batch(
    input_dir="reports/",
    output_dir="anonymized_reports/",
    seed=42,
)
print(log_df)  # DataFrame with file, replacements, names_found per file

All available imports

from caideface import (
    DefacePipeline,           # Full image defacing pipeline
    reorient_batch,           # Step 1
    skull_strip_batch,        # Step 2
    deface_batch,             # Step 3
    anonymize_batch,          # Text anonymisation (batch)
    anonymize_single,         # Text anonymisation (single file)
    default_ner_model_path,   # Path to bundled NER model
)

Output structure

Image defacing

output/
├── reoriented/
│   ├── reorientation_log.csv
│   └── <subject>/<scan>.nii.gz
├── hdbet/
│   ├── hd_bet_log.csv
│   └── <subject>/
│       ├── hd_bet_<scan>.nii.gz           # Skull-stripped
│       ├── hd_bet_mask_<scan>.nii.gz      # Dilated brain mask
│       └── hd_bet_dilated_<scan>.nii.gz   # Dilated skull-stripped
└── defaced/
    ├── not_defaced_scans.csv              # Only if failures occurred
    └── <subject>/
        └── hd_bet_dilated_<scan>_masked.nii.gz  # Final defaced scan
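
Given this layout, a short script can report which reoriented scans have no defaced counterpart yet. The sketch assumes that `<scan>` is the file name without the `.nii.gz` suffix and that the naming is exactly as in the tree above. Both function names are hypothetical.

```python
from pathlib import Path


def expected_defaced_path(output_dir, subject, scan):
    """Path where the final defaced scan is expected, per the tree above."""
    return Path(output_dir) / "defaced" / subject / f"hd_bet_dilated_{scan}_masked.nii.gz"


def missing_defaced_scans(output_dir):
    """List reoriented scans that have no defaced output yet."""
    missing = []
    for nifti in sorted((Path(output_dir) / "reoriented").glob("*/*.nii.gz")):
        scan = nifti.name[: -len(".nii.gz")]  # assumed meaning of <scan>
        target = expected_defaced_path(output_dir, nifti.parent.name, scan)
        if not target.exists():
            missing.append(nifti)
    return missing


for path in missing_defaced_scans("./output"):
    print("not defaced:", path)
```

Cross-checking the result against not_defaced_scans.csv, when that file exists, is a quick way to spot scans that failed silently.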

Text anonymisation

anonymized_reports/
├── anonymization_log.csv                  # Replacements per file
├── report_1.txt                           # Anonymised report
├── report_2.txt
└── ...

Existing transforms

If you have pre-computed registration transforms (e.g. from 3D Slicer), place a file named Transform_to_template.txt in the same directory as the dilated skull-stripped scan. The pipeline will use it instead of running BRAINSFit. Both plain 4x4 text matrices and ITK/Slicer transform formats are supported.
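
Before dropping a plain-text matrix into place, it can be worth validating it. The sketch below parses a whitespace- or comma-separated 4x4 matrix and checks that the last row is `0 0 0 1`. It is an assumption about what a "plain 4x4 text matrix" looks like; the pipeline's own parser may accept other layouts, and ITK/Slicer transform files are not handled here. `parse_plain_affine` is a hypothetical name.

```python
def parse_plain_affine(text):
    """Parse a 4x4 affine matrix from plain text into a list of rows."""
    rows = [
        [float(value) for value in line.replace(",", " ").split()]
        for line in text.splitlines()
        if line.strip()
    ]
    if len(rows) != 4 or any(len(row) != 4 for row in rows):
        raise ValueError("expected 4 rows of 4 numbers")
    if rows[3] != [0.0, 0.0, 0.0, 1.0]:
        raise ValueError("last row of an affine matrix should be 0 0 0 1")
    return rows


example = "1 0 0 2.5\n0 1 0 -3\n0 0 1 0\n0 0 0 1\n"
for row in parse_plain_affine(example):
    print(row)
```

To check a real file, pass `Path("Transform_to_template.txt").read_text()` to the function.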

Citation

If you use this tool, please cite:

@article{caideface2025,
  title={A Generalisable Head MRI Defacing Pipeline: Evaluation on 2,566 Meningioma Scans},
  year={2025},
  url={https://arxiv.org/abs/2505.12999}
}

If you use HD-BET (skull-stripping, Step 2), please also cite:

@article{Isensee2019,
  author={Isensee, F. and Schell, M. and Tursunova, I. and Brugnara, G. and Bonekamp, D. and Neuberger, U. and Wick, A. and Schlemmer, H. P. and Heiland, S. and Wick, W. and Bendszus, M. and Maier-Hein, K. H. and Kickingereder, P.},
  title={Automated brain extraction of multi-sequence MRI using artificial neural networks},
  journal={Human Brain Mapping},
  year={2019},
  pages={1--13},
  doi={10.1002/hbm.24750}
}

If you use the text anonymisation (NER + HIPS), please also cite:

@article{garcia2025ner,
  title={Evaluation of Named Entity Recognition for Automated Extraction of Present Tumor Size and Personal Names from Radiology Reports Using Spacy},
  author={Garcia-Foncillas Macias, Lorena and Barfoot, Theodore and Vercauteren, Tom and Shapey, Jonathan},
  journal={Journal of Neurological Surgery Part B: Skull Base},
  volume={86},
  number={S 01},
  year={2025},
  doi={10.1055/s-0045-1803715}
}

License

This project is licensed under the Apache License 2.0 -- see the LICENSE file for details.
