Production-tested WSI anonymizer for pathology slide files
PathSafe
Remove patient information from pathology slide files: safely, automatically, and verifiably.
When pathology scanners create digital slide files, they often embed hidden patient data inside the file: accession numbers, scan dates, operator names, and even photographs of the slide label. This information is invisible when viewing the slide image, but anyone with the right tools can extract it. PathSafe finds and removes all of this hidden data so your slides are safe to share for research or education.
PathSafe works with all major scanner brands, can process thousands of files at once, and can double-check its own work to make sure nothing was missed.
v1.1.0 released — File renaming/serialization (auto-sequential, CSV mapping, custom patterns), auto-worker detection, update checker with toast notifications, DICOM lazy loading, dynamic PDF report columns, type annotations across all modules, ARM64 Linux support, and hardened error handling.
v1.0.5 — Type annotations, ruff linter integration, hardened error handling, multi-size Windows icon, ARM64 AppImage, custom PHI pattern examples.
v1.0.4 — Faster default runs (verify and checksum now opt-in), real-time per-phase progress in the GUI, I/O concurrency control for large batches, and a new SHA-256 checksum option.
What PathSafe Does
| Feature | Details |
|---|---|
| Finds hidden patient data | Accession numbers, dates, and names can be buried inside files in places you can't see when viewing the slide |
| Works with all major scanners | Hamamatsu (NDPI), Aperio (SVS), 3DHISTECH (MRXS), Roche/Ventana (BIF), Leica (SCN), DICOM, and other TIFF-based files |
| Erases label photos | Many scanners take a photo of the physical slide label (which may show patient names) and hide it inside the file. PathSafe erases these photos |
| Keeps your originals safe | PathSafe creates cleaned copies in a separate folder, so your original files are never touched |
| Double-checks everything | Use Scan or Verify on your output folder to confirm all patient data was removed |
| Creates compliance reports | Generates PDF reports and certificates documenting exactly what was found, what was removed, and proof that each file is clean |
| Easy to use | A visual interface guides you through four simple steps with no typing commands required |
| Handles large batches | Process hundreds or thousands of slides at once, with parallel processing to speed things up |
Installation
Option 1: Installer (recommended)
Download the installer for your platform. It creates a desktop shortcut and adds PathSafe to your Start Menu (Windows) or Applications folder (macOS).
| Platform | Installer |
|---|---|
| Windows | PathSafe-Setup.exe |
| macOS | pathsafe-gui-macos.dmg |
| Linux (x64) | pathsafe-gui-linux.AppImage |
| Linux (ARM64) | pathsafe-gui-linux-arm64.AppImage |
General note: Code-signing certificates for Windows and macOS are in the process of being added. Until then, your OS may show a first-run warning.
Windows note: Windows may show a "Windows protected your PC" warning the first time. Click "More info" then "Run anyway".
macOS note: After downloading, drag PathSafe.app into your Applications folder, then run the following in Terminal:
sudo xattr -rd com.apple.quarantine "/Applications/PathSafe.app"
open "/Applications/PathSafe.app"
Linux note: The first time, you may need to right-click the downloaded file, open Properties, and enable "Open/execute as program", or run chmod +x on the file.
Option 2: Standalone executable (no installation needed)
A single portable file you can run from anywhere, including USB drives.
| Platform | Standalone |
|---|---|
| Windows | pathsafe-gui-windows.exe |
| macOS | pathsafe-gui-macos.dmg |
| Linux (x64) | pathsafe-gui-linux.AppImage |
| Linux (ARM64) | pathsafe-gui-linux-arm64.AppImage |
Alternative: Install with Python
For users who have Python installed (version 3.9 or newer):
pip install pathsafe[gui]
Then launch with pathsafe gui or use pathsafe for the command line.
Uninstallation (Windows, macOS, Linux)
Use the steps below based on how PathSafe was installed.
Windows
- Installed with PathSafe-Setup.exe: Open Settings > Apps > Installed apps, find PathSafe, then click Uninstall.
- Standalone pathsafe-gui-windows.exe: Delete the .exe file and any shortcuts you created.
- Optional settings cleanup: Remove HKEY_CURRENT_USER\Software\PathSafe\PathSafe from the registry.
macOS
- Installed app (PathSafe.app): Move /Applications/PathSafe.app to Trash, then empty Trash.
- Delete any downloaded .dmg file if no longer needed.
- Optional settings cleanup:
rm -f ~/Library/Preferences/com.PathSafe.PathSafe.plist
Linux
- AppImage install: Delete the pathsafe-gui-linux.AppImage file.
- If you created desktop integration entries manually, delete those launcher/icon files from ~/.local/share/applications and ~/.local/share/icons if present.
- Optional settings cleanup:
rm -f ~/.config/PathSafe/PathSafe.conf
Python install (any platform)
If PathSafe was installed with pip:
python -m pip uninstall pathsafe
If it was installed inside a virtual environment, activate that environment first, or remove the full environment directory.
How to Use PathSafe (GUI)
Most users should use the graphical interface. For step-by-step instructions with screenshots, see the full instructions guide.
Launch PathSafe and follow four steps:
Step 1: Select your files
Browse for files or a folder, or simply drag and drop them onto the window. You can select multiple files at once by holding Ctrl or Shift while clicking.
Step 2: Scan
Click Scan for PHI. PathSafe checks your files and shows you what patient data it found. Nothing is changed at this point. This is just a preview. A PDF report is saved automatically.
Step 3: Choose where to save
Pick the output folder where your cleaned copies will go. A default location is already filled in for you.
Step 4: Anonymize
Click Anonymize. PathSafe copies your files to the output folder and removes all patient data from the copies. Your original files are never modified.
A summary popup tells you exactly what happened after each step.
Other things you can do in the GUI
| Feature | How |
|---|---|
| Switch between dark and light theme | Use the View menu (your choice is remembered) |
| Drag and drop files | Drop files or folders directly onto the window |
| Select multiple files at once | Hold Ctrl or Shift when browsing |
| Use keyboard shortcuts | Ctrl+O (open files), Ctrl+Shift+O (open folder), Ctrl+S (scan), Ctrl+R (anonymize), Ctrl+E (verify), Ctrl+I (file info), Ctrl+T (convert), Ctrl+L (save log), Esc (stop) |
| Speed up large batches | Increase the Workers slider (try 2-4) |
| Preview without changing anything | Check the "Dry run" box |
| Generate SHA-256 checksums | Check the "SHA-256 checksum" box for audit-trail hashes |
| Add your institution name | Fill in the Institution field (it appears on PDF reports and is remembered) |
| Save your results | Use Save Log or Export JSON in the Actions menu |
| Convert file formats | Use the Convert tab to change between NDPI, SVS, TIFF, PNG, and JPEG |
| Right-click a slide file | On Linux, right-click any slide file and choose "Open with PathSafe" |
How to Use PathSafe (Command Line)
Install the command-line tool from PyPI:
pip install pathsafe
For users comfortable with a terminal, three commands are all you need:
# 1. Scan your files (nothing is changed)
pathsafe scan /path/to/slides/ --verbose
# 2. Anonymize (copies to a new folder, originals safe)
pathsafe anonymize /path/to/slides/ --output /path/to/clean/
# 3. Verify the results
pathsafe verify /path/to/clean/
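For unattended runs, the three steps above can be chained from a script. The following is a minimal sketch, not part of PathSafe itself: the helper names build_pipeline and run_pipeline are hypothetical, only flags shown in this README are used, and it assumes the CLI reports failure through a non-zero exit code, which is not confirmed here.

```python
import subprocess
from pathlib import Path


def build_pipeline(source: Path, output: Path) -> list[list[str]]:
    """Return the scan, anonymize, and verify commands in the order they should run."""
    return [
        ["pathsafe", "scan", str(source), "--verbose"],
        ["pathsafe", "anonymize", str(source), "--output", str(output)],
        ["pathsafe", "verify", str(output)],
    ]


def run_pipeline(source: Path, output: Path) -> None:
    """Run each step in turn and stop at the first one that fails."""
    for command in build_pipeline(source, output):
        # check=True raises CalledProcessError on a non-zero exit code
        # (assumption: PathSafe uses exit codes to signal failure).
        subprocess.run(command, check=True)


# Example (requires pathsafe on PATH):
# run_pipeline(Path("/path/to/slides"), Path("/path/to/clean"))
```

Passing each command as a list avoids shell quoting problems with folder names that contain spaces.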
Generate compliance documentation
pathsafe anonymize /path/to/slides/ --output /path/to/clean/ \
--certificate certificate.json --institution "My Hospital"
Generate a scan report
pathsafe scan /path/to/slides/ --report scan_report.pdf --institution "My Hospital"
Export scan results as JSON (for integration with other tools)
pathsafe scan /path/to/slides/ --json-out results.json
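The layout of results.json is not described in this README, so the sketch below makes no assumptions about field names. It only loads the file and reports its top-level shape, which is a reasonable first step before wiring the export into another tool. The function name summarize_json is hypothetical.

```python
import json
from pathlib import Path


def summarize_json(path: Path) -> dict:
    """Describe the top-level structure of a JSON file without assuming a schema."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if isinstance(data, dict):
        # JSON object keys are always strings, so sorting is safe.
        return {"type": "object", "keys": sorted(data)}
    if isinstance(data, list):
        return {"type": "array", "length": len(data)}
    return {"type": type(data).__name__}


# Example:
# print(summarize_json(Path("results.json")))
```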
Anonymize files in place (modifies originals, so make sure you have backups!)
pathsafe anonymize /path/to/slides/ --in-place
Convert between file formats
pathsafe convert slide.ndpi -o slide.tiff # Convert to pyramidal TIFF
pathsafe convert slide.ndpi -o slide.tiff --anonymize # Convert and anonymize
pathsafe convert slide.ndpi -o slide.png -t png # Convert to PNG
pathsafe convert slide.ndpi -o slide.jpg -t jpeg --quality 85 # Convert to JPEG
pathsafe convert slide.ndpi -o label.png --extract label # Extract label image
pathsafe convert /slides/ -o /converted/ -t tiff -w 4 # Batch convert with 4 workers
Full list of command line options
All commands
pathsafe scan PATH Check files for patient data (read-only)
pathsafe anonymize PATH Remove patient data from files
pathsafe verify PATH Confirm anonymization was successful
pathsafe convert PATH Convert WSI files between formats
pathsafe info FILE Show metadata for a single file
pathsafe gui Launch the graphical interface
Scan options
| Option | What it does |
|---|---|
| --verbose / -v | Show detailed output with finding locations |
| --format FORMAT | Only scan files of a specific format (ndpi, svs, mrxs, bif, scn, dicom, tiff) |
| --workers N / -w N | Scan N files in parallel (faster for large batches) |
| --report FILE | Generate a PDF scan report |
| --json-out FILE | Export scan results as machine-readable JSON |
| --institution NAME | Institution name for PDF report headers |
| --log FILE | Save all output to a log file |
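The --workers option processes several files at the same time. How PathSafe schedules that work internally is not described here. As a rough illustration of the general idea only, the sketch below fans a per-file task out over a thread pool. The function process_in_parallel and the task callable are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable


def process_in_parallel(
    paths: Iterable[Path], task: Callable[[Path], object], workers: int = 4
) -> dict:
    """Apply task to every path using a pool of worker threads."""
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        results = list(pool.map(task, path_list))
    return dict(zip(path_list, results))


# Example with a trivial task that reports each file's extension:
# process_in_parallel([Path("a.svs"), Path("b.ndpi")], lambda p: p.suffix, workers=2)
```

More workers help mostly when the disk can keep up, which is consistent with the suggestion elsewhere in this README to start with 2-4.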
Anonymize options
| Option | What it does |
|---|---|
| --output DIR / -o | Save cleaned copies to this directory (originals untouched) |
| --in-place | Modify files directly instead of copying (requires explicit opt-in) |
| --dry-run | Show what would be done without making any changes |
| --verbose / -v | Show detailed output |
| --workers N / -w N | Process N files in parallel (faster for large batches) |
| --certificate FILE / -c | Generate a compliance certificate (JSON + PDF) |
| --institution NAME / -i | Institution name for PDF certificate headers |
| --format FORMAT | Only process files of a specific format |
| --verify-integrity | Verify image tile data integrity via SHA-256 checksums (off by default) |
| --checksum | Compute SHA-256 checksum of each output file (off by default) |
| --no-reset-timestamps | Keep original file timestamps (reset by default) |
| --log FILE | Save all output to a log file |
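Timestamps are reset by default because a file's modification time can reveal when a slide was scanned. The value PathSafe writes is not stated here. The sketch below shows the general technique with the standard library, using a hypothetical fixed value (2000-01-01 UTC) chosen purely for illustration.

```python
import os
from pathlib import Path

# Hypothetical placeholder: 2000-01-01T00:00:00Z as seconds since the Unix epoch.
FIXED_EPOCH = 946684800


def reset_timestamps(path: Path, epoch_seconds: int = FIXED_EPOCH) -> None:
    """Set the access and modification times of path to a fixed value."""
    # os.utime takes a (atime, mtime) tuple in seconds.
    os.utime(path, (epoch_seconds, epoch_seconds))
```

Note that os.utime changes only access and modification times. Creation time, where the filesystem records it, is not affected by this call.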
Verify options
| Option | What it does |
|---|---|
| --verbose / -v | Show detailed output |
| --format FORMAT | Only verify files of a specific format |
Convert options
| Option | What it does |
|---|---|
| --output FILE/DIR / -o | Output file or directory (required) |
| --target-format / -t | Target format: tiff (default), png, or jpeg |
| --tile-size N | Tile size for pyramidal TIFF in pixels (default: 256) |
| --quality N | JPEG quality 1-100 (default: 90) |
| --anonymize / -a | Also anonymize the converted output |
| --extract TYPE | Extract a label, macro, or thumbnail image (single file only) |
| --reset-timestamps | Reset file timestamps on output files |
| --workers N / -w N | Number of parallel workers for batch conversion |
| --format FORMAT | Only convert files of a specific format (batch mode) |
| --verbose / -v | Show detailed output |
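To get a feel for what --tile-size means for a large slide, the arithmetic below estimates the tile grid at each pyramid level. It assumes every level halves the previous one until the image fits in a single tile, which is a common convention but is not confirmed for PathSafe's converter. The function names are hypothetical.

```python
def tile_grid(width: int, height: int, tile_size: int = 256) -> tuple:
    """Return (columns, rows) of tiles needed to cover an image."""
    # Integer ceiling division: partial tiles at the edges still count.
    return -(-width // tile_size), -(-height // tile_size)


def pyramid_levels(width: int, height: int, tile_size: int = 256) -> list:
    """List (width, height, columns, rows) per level, halving each time."""
    levels = []
    while True:
        levels.append((width, height, *tile_grid(width, height, tile_size)))
        if width <= tile_size and height <= tile_size:
            break
        width = max(1, width // 2)
        height = max(1, height // 2)
    return levels


# Example: a 1000 x 600 image with 256-pixel tiles
# pyramid_levels(1000, 600)
# -> [(1000, 600, 4, 3), (500, 300, 2, 2), (250, 150, 1, 1)]
```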
Compliance Certificate
When you anonymize files (through the GUI or with --certificate on the command line), PathSafe generates a report documenting everything it did. This report serves as proof that your files were properly de-identified.
What's in the certificate:
- Which version of PathSafe was used
- When the batch was processed
- For each file: what patient data was found, what was removed, and whether the file passed verification
- A unique fingerprint (hash) of each cleaned file, so you can later prove the file hasn't been modified
- A glossary explaining each type of finding
Why this matters: This documentation can be used for regulatory reviews, research ethics submissions, or institutional audit trails. Keep the certificate with your anonymized files.
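The per-file fingerprint lets you check later that a cleaned file is unchanged. The sketch below recomputes a hash and compares it with a recorded value. It assumes the fingerprint is a SHA-256 hex digest (the algorithm named for the checksum option) and leaves reading the value out of the certificate to you, since the certificate's JSON layout is not documented here. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so multi-gigabyte slides never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_hash(path: Path, recorded_hex: str) -> bool:
    """Return True if the file's current hash equals the recorded one."""
    return sha256_of_file(path) == recorded_hex.strip().lower()
```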
What Patient Data Does PathSafe Remove?
PathSafe removes these types of hidden information from your slide files:
- Accession numbers and case IDs: The primary patient/case identifiers embedded in file metadata
- Patient names, IDs, and demographics: Found in DICOM files and some scanner formats
- Scan dates and times: Can be cross-referenced with hospital records to identify patients
- Operator and physician names: Who scanned or ordered the slide
- Label and macro images: Embedded photographs of the physical slide label, which may show patient names, barcodes, or handwritten notes
- Scanner and institution information: Serial numbers, software versions, and location data that could identify where a slide came from
- Hidden metadata: Technical data (EXIF, GPS coordinates, color profiles) that standard viewers don't show but can still be extracted
- Filenames: PathSafe detects patient data in filenames and warns you (filenames must be renamed manually)
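As an illustration of filename screening, the sketch below flags names that look like they contain an accession number or a date. The two patterns are hypothetical examples, not PathSafe's own rules, and real accession formats vary between institutions.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration only.
FILENAME_PATTERNS = {
    "accession-like": re.compile(r"[A-Z]{1,3}\d{2}-\d{3,6}"),
    "iso-date": re.compile(r"(?:19|20)\d{2}-\d{2}-\d{2}"),
}


def flag_filename(path) -> list:
    """Return the names of all patterns found in the file's stem."""
    stem = Path(path).stem
    return [name for name, pattern in FILENAME_PATTERNS.items() if pattern.search(stem)]


# flag_filename("S23-12345_HE.svs")      -> ["accession-like"]
# flag_filename("scan_2021-03-04.tiff")  -> ["iso-date"]
# flag_filename("slide_0001.ndpi")       -> []
```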
PathSafe scans every layer of the file, not just the surface. It also does a final sweep of the raw file data to catch anything that might have been missed.
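A raw-byte sweep of this kind can be sketched with byte-level regular expressions. This is an illustration of the technique under stated assumptions, not PathSafe's implementation: the patterns are hypothetical, and compressed image data can match by chance, so hits need review.

```python
import re
from pathlib import Path

# Hypothetical byte patterns for illustration only.
RAW_PATTERNS = {
    "accession-like": re.compile(rb"[A-Z]{1,3}\d{2}-\d{3,6}"),
    "date": re.compile(rb"(?:19|20)\d{2}[:/-]\d{2}[:/-]\d{2}"),
}


def sweep_bytes(data: bytes) -> list:
    """Return (pattern name, byte offset, matched bytes) for every hit."""
    hits = []
    for name, pattern in RAW_PATTERNS.items():
        for match in pattern.finditer(data):
            hits.append((name, match.start(), match.group()))
    return hits


def sweep_file(path) -> list:
    """Sweep a whole file; reads it into memory, so suited to small files only."""
    return sweep_bytes(Path(path).read_bytes())
```

For multi-gigabyte slides, reading in overlapping chunks or memory-mapping the file would be needed instead of read_bytes.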
For a detailed technical breakdown of exactly which fields are cleaned in each format, see the compliance documentation.
Supported Scanner Formats
| Scanner brand | File type | Fully supported |
|---|---|---|
| Hamamatsu | .ndpi | Yes |
| Aperio | .svs | Yes |
| 3DHISTECH / MIRAX | .mrxs | Yes |
| Roche / Ventana | .bif | Yes |
| Leica | .scn | Yes |
| DICOM WSI | .dcm, .dicom | Yes |
| Other TIFF-based (Philips, QPTIFF, Trestle, OME-TIFF, etc.) | .tif, .tiff | Yes (generic handler) |
How PathSafe Compares
PathSafe implements Level IV anonymization as defined by Bisson et al. (2023), which covers filename detection, label/macro image destruction, and complete metadata removal.
| Capability | PathSafe | anonymize-slide | EMPAIA wsi-anon |
|---|---|---|---|
| Detects patient data in filenames | Yes | No | No |
| Erases label/macro photos | Yes | Yes | Yes |
| Removes all metadata | Yes | No | Partial |
| Scans every layer of the file | Yes | No | Unknown |
| Scans hidden sub-directories (EXIF, GPS) | Yes | No | No |
| Re-checks files after cleaning | Yes | No | No |
| Verifies image integrity | Yes | No | No |
| Generates compliance reports (PDF + JSON) | Yes | No | No |
| Number of formats supported | 7 | 3 | Multiple |
| Graphical interface | Yes | No | No |
Security
- No internet connection: PathSafe works entirely offline. No data ever leaves your computer.
- No code execution from files: PathSafe reads and overwrites bytes but never runs anything found inside slide files.
- Open source: The entire codebase is available for review.
Independent Verification
A standalone verification script is included in the tools/ directory. This script uses no PathSafe code and independently verifies that anonymized files contain no remaining patient data. It parses the TIFF binary structure from scratch, checks every IFD for label and macro images, scans all string and binary metadata tags, looks for EXIF and GPS sub-IFDs, and runs regex patterns against the raw file bytes.
python tools/independent_scanner.py /path/to/anonymized/file.svs
No dependencies required beyond Python 3.9+.
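To show what parsing the TIFF structure from scratch involves, here is a minimal sketch of an IFD walker using only the standard library. It is not the code in tools/independent_scanner.py. It handles classic TIFF only (no BigTIFF and no vendor extensions such as NDPI's 64-bit offsets), decodes only ASCII tags, and the function name walk_ifds is hypothetical.

```python
import struct

ASCII_TYPE = 2  # TIFF field type for NUL-terminated ASCII strings


def walk_ifds(data: bytes) -> list:
    """Return one {tag: text or None} dict per IFD in a classic TIFF."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic, offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic 42) is handled in this sketch")
    ifds, seen = [], set()
    while offset and offset not in seen:  # 'seen' guards against offset loops
        seen.add(offset)
        (count,) = struct.unpack_from(order + "H", data, offset)
        entries = {}
        for index in range(count):
            # Each entry is 12 bytes: tag, type, value count, value-or-offset.
            tag, kind, n, raw = struct.unpack_from(
                order + "HHI4s", data, offset + 2 + 12 * index
            )
            if kind == ASCII_TYPE:
                if n <= 4:
                    value = raw[:n]  # short values are stored inline
                else:
                    (pos,) = struct.unpack(order + "I", raw)
                    value = data[pos:pos + n]
                entries[tag] = value.rstrip(b"\x00").decode("latin-1")
            else:
                entries[tag] = None  # non-text tags are listed but not decoded
        ifds.append(entries)
        # The 4 bytes after the last entry point to the next IFD (0 = end).
        (offset,) = struct.unpack_from(order + "I", data, offset + 2 + 12 * count)
    return ifds
```

Tag 270 (ImageDescription) is a common place for scanner metadata, so printing the text tags of every IFD is a quick manual check on a cleaned file.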
Dependencies
If you use the downloadable installers (.exe, .dmg, .AppImage), you do not need to install Python dependencies manually.
For PyPI installs, PathSafe is lightweight. The base install (CLI) only requires click (command-line framework) and fpdf2 (PDF generation). All file reading uses Python's built-in standard library.
- CLI only: pip install pathsafe
- GUI: pip install pathsafe[gui]
Optional packages add extra features:
| Package | What it adds | Install with |
|---|---|---|
| PySide6 | Graphical interface | pip install pathsafe[gui] |
| pydicom | DICOM WSI support | pip install pathsafe[dicom] |
| openslide-python | Enhanced format detection | pip install pathsafe[openslide] |
| tifffile + numpy | Format conversion | pip install pathsafe[convert] |
Further Reading
- Instructions Guide: Step-by-step walkthrough of every feature
- User Guide: Detailed usage for CLI and GUI
- Compliance Documentation: Full technical breakdown of every field cleaned in every format
- Developer Guide: Architecture, testing, and how to add new formats
License
Apache 2.0
Maintained by DrSoma, MD.