Medical Imaging Topological Data Analysis - Extract TDA features from medical images for machine learning
Med-TDA: Medical Imaging Topological Data Analysis
Med-TDA is a Python library for extracting Topological Data Analysis (TDA) features from medical images. It provides a complete pipeline from image preprocessing to persistence barcode computation and feature vectorization, designed specifically for medical imaging applications in machine learning and radiomics research.
Installation
pip install medtda
Requirements: Python ≥ 3.8
Core Dependencies: NumPy ≥1.21, SciPy ≥1.7, GUDHI ≥3.5, cripser ≥0.0.32, SimpleITK ≥2.1, Pillow ≥9.0, scikit-image ≥0.19, scikit-learn ≥1.0, pandas ≥1.3, matplotlib ≥3.5, seaborn ≥0.11, PyYAML ≥6.0, tqdm ≥4.60
Supported Image Formats
- 2D Images: PNG, JPG, JPEG, TIFF, TIF, BMP
- 3D/4D Medical Images: NIfTI (.nii, .nii.gz), NRRD (.nrrd), MetaImage (.mha, .mhd)
- Mask Support: Single-label and multi-label segmentation masks in any supported format
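As a rough illustration of how the extension lists above could be used to route a file before loading, here is a minimal sketch. The helper `classify_image_path` and the two constant names are hypothetical and are not part of the Med-TDA API; only the extensions themselves come from the list above.

```python
from pathlib import Path

# Extension groups copied from the list above; the helper itself is hypothetical.
IMAGE_2D = {".png", ".jpg", ".jpeg", ".tiff", ".tif", ".bmp"}
IMAGE_VOLUME = {".nii", ".nii.gz", ".nrrd", ".mha", ".mhd"}


def classify_image_path(path):
    """Return '2d', 'volume', or None when the extension is not in the lists."""
    name = Path(path).name.lower()
    # Try the longest extensions first so 'scan.nii.gz' is not read as '.gz'.
    for ext in sorted(IMAGE_2D | IMAGE_VOLUME, key=len, reverse=True):
        if name.endswith(ext):
            return "2d" if ext in IMAGE_2D else "volume"
    return None


print(classify_image_path("data/ct_scan.nii.gz"))
print(classify_image_path("slide.PNG"))
```

Matching on the full file name rather than `Path.suffix` is a deliberate choice here, since `Path("scan.nii.gz").suffix` yields only `.gz`.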
Feature Extraction
The FeatureExtractor class provides an end-to-end solution that performs preprocessing, persistent homology computation, and feature vectorization in one step. This is the recommended way to extract TDA features.
from medtda import FeatureExtractor
# Initialize with desired settings
extractor = FeatureExtractor(
    normalize=True,
    normalize_method='minmax',
    vectorization_method='persistence_stats'
)
# Extract features from image and mask
features = extractor.execute(
    image='path/to/image.nii.gz',
    mask='path/to/mask.nii.gz'
)
# features is a dictionary of TDA feature vectors ready for ML
Available Vectorization Methods:
- persistence_stats: Statistical summaries (mean, std, min, max, percentiles)
- betti_curve: Betti number curves over filtration values
- persistence_image: 2D histogram representation of persistence diagrams
- persistence_landscape: Persistence landscape functions
- entropy_summary: Entropy-based statistical features
- persistence_silhouette: Silhouette representation
- persistence_lifespan: Lifespan distribution features
- persistence_tropical_coordinates: Tropical algebra coordinates
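To give a feel for what statistical and entropy-based vectorizations compute, the following dependency-free sketch derives a few summaries from a list of (birth, death) pairs. It is an illustration under assumptions, not the library's implementation: the function names are hypothetical, infinite intervals are simply skipped, and the exact statistics Med-TDA reports may differ.

```python
import math
import statistics


def lifespans(barcode):
    """Lifespan (death - birth) of every finite interval in a barcode."""
    return [d - b for b, d in barcode if math.isfinite(d)]


def persistence_stats(barcode):
    """Basic summary statistics of the finite lifespans."""
    lives = lifespans(barcode)
    if not lives:
        return {"count": 0, "mean": 0.0, "std": 0.0, "min": 0.0, "max": 0.0}
    return {
        "count": len(lives),
        "mean": statistics.fmean(lives),
        "std": statistics.pstdev(lives),  # population standard deviation
        "min": min(lives),
        "max": max(lives),
    }


def persistence_entropy(barcode):
    """Shannon entropy of the lifespans after normalizing them to sum to 1."""
    lives = [life for life in lifespans(barcode) if life > 0]
    total = sum(lives)
    if total == 0:
        return 0.0
    return -sum((life / total) * math.log(life / total) for life in lives)


# Toy barcode: two finite intervals and one essential (infinite) interval.
h0 = [(0.0, 1.0), (0.0, 3.0), (0.5, math.inf)]
print(persistence_stats(h0))
print(persistence_entropy(h0))
```

Entropy is highest when all lifespans are equal and drops as a few long-lived features dominate, which is why it is often used as a one-number summary of a diagram.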
You can use multiple vectorization methods simultaneously:
extractor = FeatureExtractor(
    normalize=True,
    vectorization_method=['persistence_stats', 'betti_curve', 'entropy_summary']
)
features = extractor.execute(image, mask)  # Returns combined features from all methods
Image Preprocessing
The Preprocessor class handles medical image preprocessing independently. Use this when you need standalone preprocessing or want to inspect preprocessed images before computing persistence.
Available Operations:
- Resampling: Resample 3D/4D images to target voxel spacing (e.g., isotropic resolution)
- Windowing: Apply intensity windowing (center/width) for CT images
- Normalization: Normalize intensity values (minmax, z-score, or robust scaling)
- Masking: Apply binary or multi-label masks with configurable background values
- ROI Cropping: Automatically crop to region of interest with padding to reduce computation
from medtda import Preprocessor
preprocessor = Preprocessor(
    spacing=(1.0, 1.0, 1.0),  # Resample to 1mm isotropic
    normalize=True,
    normalize_method='minmax',
    crop_to_roi=True
)
preprocessed_image, metadata = preprocessor.preprocess(
    image='ct_scan.nii.gz',
    mask='roi_mask.nii.gz'
)
The metadata dictionary contains information about applied transformations, original and final image ranges, shapes, and cropping details.
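The windowing and min-max normalization operations listed above can be sketched in a few lines. This is a plain-Python illustration on a flat list of intensities; the function names are hypothetical, and the library presumably operates on whole image arrays (where `numpy.clip` would play the role of the clipping step).

```python
def apply_window(values, center, width):
    """Clip intensities to the window [center - width/2, center + width/2]."""
    low = center - width / 2.0
    high = center + width / 2.0
    return [min(max(v, low), high) for v in values]


def minmax_normalize(values):
    """Rescale intensities linearly to [0, 1]; constant input maps to 0."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    span = high - low
    return [(v - low) / span for v in values]


# Illustrative Hounsfield-unit samples with a window of center 40, width 400.
hu = [-1000.0, -160.0, 40.0, 240.0, 1500.0]
windowed = apply_window(hu, center=40.0, width=400.0)
scaled = minmax_normalize(windowed)
print(windowed)  # [-160.0, -160.0, 40.0, 240.0, 240.0]
print(scaled)    # [0.0, 0.0, 0.5, 1.0, 1.0]
```

Windowing before normalization matters for CT: without it, a few extreme values (air, metal) set the min and max, and the tissue range of interest is compressed into a narrow band of the normalized scale.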
Barcode Computation
The BarcodeExtractor class computes raw persistence barcodes (birth-death pairs) from medical images without vectorization. Use this when you need barcodes for custom analysis or visualization.
Persistent Homology Parameters:
- Filtration Type: sublevel (default) or superlevel filtration
- Construction: T (default, dual cubical complex with voxels as top-dimensional cells) or V (cubical complex with voxels as vertices)
- Max Dimension: Maximum homology dimension to compute (auto-detected from image dimensionality)
from medtda import BarcodeExtractor
extractor = BarcodeExtractor(
    normalize=True,
    filtration_type='sublevel',
    max_dimension=2  # Compute H0, H1, H2
)
barcodes = extractor.execute(
    image='image.nii.gz',
    mask='mask.nii.gz'
)
# barcodes is a dict: {'H0': array, 'H1': array, 'H2': array}
# Each array has shape (n_features, 2) for (birth, death) pairs
Homology Dimensions:
- H0: Connected components (distinct regions of the image)
- H1: Loops and tunnels (1-dimensional holes)
- H2: Voids and cavities (2-dimensional holes, 3D and 4D images)
- H3: 3-dimensional voids (4D images only)
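To make H0 and the sublevel/superlevel distinction concrete, here is a minimal sketch of 0-dimensional persistence for a 1D signal, using a union-find structure and the elder rule. It is a teaching example only: Med-TDA computes cubical persistence on 2D to 4D images through its dependencies, and the function names below are hypothetical. Samples are treated as vertices joined to their immediate neighbors.

```python
import math


def sublevel_h0(signal):
    """0-dimensional sublevel-set persistence of a 1D signal (elder rule)."""
    n = len(signal)
    if n == 0:
        return []
    parent = [None] * n   # None marks samples not yet in the filtration
    birth = {}            # sample index -> value at which it entered

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    # Add samples in order of increasing value.
    for i in sorted(range(n), key=lambda k: signal[k]):
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the component born later dies at this value.
                young, old = (a, b) if birth[a] > birth[b] else (b, a)
                if birth[young] < signal[i]:  # skip zero-length intervals
                    pairs.append((birth[young], signal[i]))
                parent[young] = old
    # The component of the global minimum never dies.
    pairs.append((min(signal), math.inf))
    return sorted(pairs)


def superlevel_h0(signal):
    """Superlevel persistence as sublevel persistence of the negated signal."""
    return [(-b, -d) for b, d in sublevel_h0([-v for v in signal])]


signal = [1.0, 3.0, 0.0, 2.0, 0.5]
print(sublevel_h0(signal))    # [(0.0, inf), (0.5, 2.0), (1.0, 3.0)]
```

In the sublevel case each local minimum starts a component and the interval ends when that component merges into an older one; in the superlevel case the same happens for local maxima, with births at high values and deaths at lower ones.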
Visualization
Med-TDA provides eight plotting functions for visualizing persistence barcodes:
from medtda.plotting import plot_persistence_diagram, plot_barcode, plot_betti_curve
# Visualize persistence diagram
plot_persistence_diagram(barcodes)
# Visualize barcode representation
plot_barcode(barcodes)
# Visualize Betti curves
plot_betti_curve(barcodes)
Available Plot Types:
- plot_persistence_diagram: Birth-death diagram with diagonal
- plot_barcode: Horizontal bars showing feature lifespans
- plot_betti_curve: Betti number evolution across filtration values
- plot_landscape: Persistence landscape functions
- plot_entropy_summary: Entropy summary curves
- plot_lifespan: Lifespan distribution curves
- plot_silhouette: Persistence silhouette visualization
- plot_tropical_coordinates: Tropical coordinate bar charts
All plots support multiple homology dimensions, custom color palettes, and seaborn styling.
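For readers who want to see what a Betti curve plot is built from, the following sketch counts how many intervals are alive at each filtration value. The function name and the half-open convention (birth <= t < death) are assumptions made for this illustration and may not match the library's exact choices; it also assumes sublevel barcodes, where birth is less than death.

```python
import math


def betti_curve(barcode, thresholds):
    """Number of intervals alive at each threshold (birth <= t < death)."""
    return [sum(1 for b, d in barcode if b <= t < d) for t in thresholds]


# Toy H0 barcode with one essential interval.
h0 = [(0.0, math.inf), (0.5, 2.0), (1.0, 3.0)]
thresholds = [0.0, 0.5, 1.0, 2.0, 3.0]
print(betti_curve(h0, thresholds))  # [1, 2, 3, 2, 1]
```

Sampling the curve on a fixed grid of thresholds gives every image a vector of the same length, which is what makes it usable as a machine learning feature.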
Example Workflow
from medtda import FeatureExtractor
# 1. Initialize feature extractor with preprocessing and vectorization settings
extractor = FeatureExtractor(
    spacing=(1.0, 1.0, 1.0),                   # Resample to 1mm isotropic
    normalize=True,                            # Apply normalization
    normalize_method='minmax',                 # Use min-max normalization
    crop_to_roi=True,                          # Crop to ROI for efficiency
    filtration_type='sublevel',                # Sublevel filtration
    max_dimension=2,                           # Compute H0, H1, H2
    vectorization_method='persistence_stats',  # Statistical features
    return_barcodes=True                       # Also return raw barcodes
)
# 2. Extract features (all preprocessing, PH computation, and vectorization in one call)
features, barcodes = extractor.execute(
    image='medical_image.nii.gz',
    mask='segmentation_mask.nii.gz'
)
# 3. Use features for machine learning
print(features.keys())  # Dictionary of feature vectors
# Example output: ['PersStats_H0_mean', 'PersStats_H0_std', ...]
# 4. Optional: Visualize barcodes
from medtda.plotting import plot_persistence_diagram
plot_persistence_diagram(barcodes)
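Step 3 usually means stacking the per-image dictionaries into a table with one row per case. The sketch below shows one way to do that with a fixed column order. It assumes each dictionary maps feature names to scalar values, which may not hold for every vectorization method; the helper name is hypothetical and the numbers are placeholders, not real Med-TDA output.

```python
import math


def to_feature_matrix(feature_dicts):
    """Stack per-image feature dicts into rows with a shared column order."""
    # Sorted union of keys so every row uses the same column layout.
    columns = sorted(set().union(*feature_dicts))
    rows = [
        [float(d.get(name, math.nan)) for name in columns]  # NaN if missing
        for d in feature_dicts
    ]
    return columns, rows


# Placeholder values for two cases (illustrative only).
cases = [
    {"PersStats_H0_mean": 0.42, "PersStats_H0_std": 0.10},
    {"PersStats_H0_std": 0.08, "PersStats_H0_mean": 0.37},
]
columns, X = to_feature_matrix(cases)
print(columns)
print(X)
```

The resulting `X` can be passed to `numpy.asarray` or wrapped in a `pandas.DataFrame` with `columns` as headers before fitting a model.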
License
This project is licensed under the MIT License - see the LICENSE file for details.
Citation
If you use Med-TDA in your research, please cite:
@software{medtda2026,
  title = {Med-TDA: Medical Imaging Topological Data Analysis},
  author = {Med-TDA Contributors},
  year = {2026},
  url = {https://github.com/dashtiali/medtda}
}