
CLI utilities and viewer around DuneAI for automated NSCLC CT segmentation (no proprietary model code included).


🩺 DuneAI — Automated NSCLC Segmentation

(PyPI package: duneai-auto)


DuneAI provides automated detection and segmentation of non-small-cell lung cancer (NSCLC) from CT images using a pretrained deep-learning model.
This PyPI wrapper (duneai-auto) packages the complete inference + viewing system while keeping the original research code and model files external, ensuring no proprietary content is redistributed.




✨ Features

This package provides:

  • A CLI to run automatic segmentation on a dataset of CT volumes
  • A viewer CLI to inspect .nrrd images and predicted masks
  • Full compatibility with the original DuneAI pipeline (Primakov et al., Nature Communications, 2022)
  • Safe loading of the legacy Keras 2.2.x model (JSON + HDF5) on modern TensorFlow 2.20.0
  • A cross-platform, fully upgraded implementation of the inference code

โš ๏ธ IMPORTANT SETUP REQUIRED:
This package requires TWO setup steps before first use:

  1. Extract model weights from split archive files
  2. Replace TheDuneAI.py with the updated code (included in this README)

Without these steps, the package will not work correctly!
→ See Initial Setup for complete instructions.


📋 Quick Setup Checklist

Before using DuneAI, ensure you've completed these steps:

  • Installed duneai-auto via pip
  • Cloned the original DuneAI research repository
  • Extracted model weights (concatenate + unzip weights_v7.zip.001 + .002)
  • Replaced TheDuneAI.py content with the updated code (see Step 2)
  • Verified weights_v7.hdf5 exists in model_files/ directory
  • Confirmed TheDuneAI.py shows ~13 KB file size (not ~3 KB)

→ See Installation and Initial Setup for detailed instructions.


โš™๏ธ Installation

Python Requirements: 3.9, 3.10, or 3.11 (Python 3.12+ not yet supported due to TensorFlow constraints)

Install from PyPI:

pip install duneai-auto

Optional visualization extras (Jupyter/ipywidgets):

pip install "duneai-auto[vis]"

This installs:

  • TensorFlow 2.20
  • scikit-image, SimpleITK, statsmodels, OpenCV
  • matplotlib (viewer backend)
  • All runtime libraries from the tested reference environment
  • Optional notebook support (ipywidgets, notebook) if [vis] is chosen

📂 Required External Assets

(Not included in the PyPI package)

You must clone/download the original research repository:

git clone https://github.com/primakov/DuneAI-Automated-detection-and-segmentation-of-non-small-cell-lung-cancer-computed-tomography-images.git
cd DuneAI-Automated-detection-and-segmentation-of-non-small-cell-lung-cancer-computed-tomography-images

You need the following folders from it:

Automatic segmentation script/
    TheDuneAI.py
    Generator_v1.py
    lung_extraction_funcs_13_09.py
    ...

Automatic segmentation script/model_files/
    model_v7.json
    weights_v7.zip.001, weights_v7.zip.002  (split archive - needs assembly)

Software for qualitative assesment/test_data/
    (example CT volumes)

🔧 Initial Setup (One-Time Configuration)

After cloning the repository, you must perform these setup steps:

Step 1: Assemble and Extract Model Weights

The model weights are distributed as a split archive. You need to combine and extract them:

cd "Automatic segmentation script/model_files"

# 1) Concatenate the split parts into a single zip file
cat weights_v7.zip.001 weights_v7.zip.002 > weights_v7.zip

# 2) Verify the combined file exists and has reasonable size
ls -lh weights_v7.zip

# 3) Extract the weights
unzip weights_v7.zip

# 4) Confirm weights_v7.hdf5 now exists
ls -lh weights_v7.hdf5

# 5) Return to repository root
cd ../..

After extraction, you should have:

  • weights_v7.hdf5 (~180 MB) in the model_files/ directory
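On Windows, where cat and unzip may not be available, the same assembly can be done in Python. A minimal cross-platform sketch (it assumes the split parts sort correctly by suffix, which .001/.002 do):

```python
# Cross-platform alternative to cat + unzip for assembling the split
# weights archive. Run from the repository root.
import zipfile
from pathlib import Path

def assemble_and_extract(model_dir) -> Path:
    """Concatenate weights_v7.zip.001, .002, ... and extract the archive."""
    model_dir = Path(model_dir)
    parts = sorted(model_dir.glob("weights_v7.zip.*"))  # .001 sorts before .002
    if not parts:
        raise FileNotFoundError(f"No split archive parts in {model_dir}")
    combined = model_dir / "weights_v7.zip"
    with open(combined, "wb") as out:
        for part in parts:
            out.write(part.read_bytes())
    with zipfile.ZipFile(combined) as zf:
        zf.extractall(model_dir)
    return model_dir / "weights_v7.hdf5"

model_dir = Path("Automatic segmentation script/model_files")
if model_dir.is_dir():
    print(assemble_and_extract(model_dir))
```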

Step 2: Update TheDuneAI.py

The original TheDuneAI.py needs to be replaced with the modernized version that includes:

  • ✅ Cross-platform compatibility (Windows/Linux/macOS)
  • ✅ Modern TensorFlow 2.20+ support
  • ✅ Efficient batched inference (4x faster)
  • ✅ Robust error handling
  • ✅ Production-grade code quality

📌 Critical: The duneai-auto package requires this updated version. The original TheDuneAI.py will cause path errors on non-Windows systems and uses inefficient slice-by-slice prediction.

Method 1: Direct Code Replacement (Recommended)

Step-by-step:

  1. Backup the original file:
cp "Automatic segmentation script/TheDuneAI.py" \
   "Automatic segmentation script/TheDuneAI.py.bak"
  2. Replace the entire content of Automatic segmentation script/TheDuneAI.py with the following code:
📄 Click to expand: Complete TheDuneAI.py code (copy and paste this)
# -*- coding: utf-8 -*-
"""
DuneAI inference utilities.

Author
------
Original: S. Primakov
Updates : A. Lohachab

Updates Summary (A. Lohachab)
-----------------------------
- Cross-platform path handling for patient IDs.
- Robust batched prediction with empty-batch safeguards.
- Re-embedding logic preserved from the original (exclusive indexing).
- Optional input normalization (off by default; generator already windows).
- Clear errors for missing model assets; inference-only (no compile).
- Save masks as uint8 with 0/255 for visibility.
"""
import os
import time

import SimpleITK as sitk
import numpy as np
import keras
import cv2
import tensorflow as tf  # required for legacy JSON deserialization via tf.keras

# Pipeline components from the original repository
import lung_extraction_funcs_13_09 as le
import Generator_v1

from scipy import ndimage  # noqa: F401  (kept for parity with original code)
from tqdm import tqdm as tqdm


def _patient_dir_from_filename(fname: str) -> str:
    """Return the immediate parent directory name (patient ID), OS-agnostic."""
    p = fname.replace("\\", "/")
    return os.path.basename(os.path.dirname(p))


class ContourPilot:
    def __init__(self, model_path, data_path, output_path="./", verbosity=False, pat_dict=None):
        self.verbosity = verbosity
        self.model1 = None
        self.model_params = {"batch_size": 4}  # default predict batch size
        self.image_size = 512
        self.normalize_input = False  # generator already windows; keep original behavior
        self.default_thr = 0.7        # lower than the original 0.99 to avoid empty masks
        self.__load_model__(model_path)

        # Dataset dictionary from original parser (img_only=True like original)
        if pat_dict:
            self.Patient_dict = pat_dict
        else:
            self.Patient_dict = le.parse_dataset(data_path, img_only=True)

        # Original generator signature preserved
        self.Patients_gen = Generator_v1.Patient_data_generator(
            self.Patient_dict,
            predict=True,
            batch_size=1,
            image_size=self.image_size,
            shuffle=True,
            use_window=True,
            window_params=[1500, -600],
            resample_int_val=True,
            resampling_step=25,
            extract_lungs=True,
            size_eval=False,
            verbosity=verbosity,
            reshape=True,
            img_only=True,
        )
        self.Output_path = output_path

    # -----------------------------
    # Model loading (legacy JSON + HDF5)
    # -----------------------------
    def __load_model__(self, model_path):
        """
        Load legacy Keras 2.2.x JSON + HDF5 on modern TF/Keras using explicit class mappings.
        Raises:
            FileNotFoundError: if architecture or weight files are missing.
            RuntimeError: if deserialization fails in both tf.keras and keras.
        """
        json_path = os.path.join(model_path, "model_v7.json")
        weights_path = os.path.join(model_path, "weights_v7.hdf5")

        if not os.path.exists(json_path):
            raise FileNotFoundError(f"Missing model JSON: {json_path}")
        if not os.path.exists(weights_path):
            raise FileNotFoundError(f"Missing weights file: {weights_path}")

        with open(json_path, "r") as f:
            loaded_model_json = f.read()

        # Map legacy JSON class names to modern tf.keras classes
        custom_map = {
            "Model": tf.keras.Model,
            "Sequential": tf.keras.Sequential,
            "InputLayer": tf.keras.layers.InputLayer,
            "Conv2D": tf.keras.layers.Conv2D,
            "MaxPooling2D": tf.keras.layers.MaxPooling2D,
            "Dropout": tf.keras.layers.Dropout,
            "UpSampling2D": tf.keras.layers.UpSampling2D,
            "Concatenate": tf.keras.layers.Concatenate,
            "Add": tf.keras.layers.Add,
            "ZeroPadding2D": tf.keras.layers.ZeroPadding2D,
            "BatchNormalization": tf.keras.layers.BatchNormalization,
            "Activation": tf.keras.layers.Activation,
            "Dense": tf.keras.layers.Dense,
            "Flatten": tf.keras.layers.Flatten,
            # Initializers referenced in JSON
            "VarianceScaling": tf.keras.initializers.VarianceScaling,
            "Zeros": tf.keras.initializers.Zeros,
        }

        # Preferred path: tf.keras (best compatibility with TF)
        try:
            with tf.keras.utils.custom_object_scope(custom_map):
                self.model1 = tf.keras.models.model_from_json(loaded_model_json)
        except Exception as e_tf:
            # Fallback: keras (Keras 3) with same mappings
            try:
                with keras.utils.custom_object_scope(custom_map):
                    self.model1 = keras.models.model_from_json(loaded_model_json)
            except Exception as e_k3:
                raise RuntimeError(
                    "Failed to load model_v7.json with both tf.keras and keras "
                    "after supplying custom class mappings.\n"
                    f"tf.keras error: {e_tf}\nkeras error: {e_k3}"
                )

        # Load weights (HDF5)
        self.model1.load_weights(weights_path)
        # Inference only; compile not required.

    # -----------------------------
    # Helpers
    # -----------------------------
    @staticmethod
    def _ensure_zyx(vol):
        """Ensure input is (Z, Y, X). Accept (Y, X) and promote to single-slice volume."""
        vol = np.asarray(vol)
        if vol.ndim == 2:
            vol = vol[None, ...]
        if vol.ndim != 3:
            raise ValueError(f"Expected 3D (Z,Y,X), got shape {vol.shape}")
        return vol

    def _prepare_model_input(self, vol3d):
        """
        Add channel and (optionally) normalize: (Z,Y,X) -> (Z,H,W,1) float32.
        """
        x = vol3d.astype(np.float32, copy=False)
        if self.normalize_input:
            finite = np.isfinite(x)
            if finite.any():
                vmin = float(np.min(x[finite]))
                vmax = float(np.max(x[finite]))
                if vmax > vmin:
                    x = (x - vmin) / (vmax - vmin)
                else:
                    x.fill(0.0)
            else:
                x.fill(0.0)
        x_in = x[..., None]
        return x_in

    # -----------------------------
    # Segmentation
    # -----------------------------
    def __generate_segmentation__(self, img, params, thr=None):
        """
        Run inference and re-embed prediction into the original voxel grid.

        Args:
            img: np.ndarray (Z, 512, 512) after preprocessing/cropping by the generator.
            params: dict with transform metadata produced by the generator.
            thr: float threshold for binarizing sigmoid output.

        Returns:
            np.ndarray: mask aligned to params['original_shape'] (uint8 values 0/255).
        """
        vol = self._ensure_zyx(img)
        x_in = self._prepare_model_input(vol)

        if self.verbosity:
            print("Segmentation started")
            st = time.time()

        # Predict in one batch (much faster than per-slice)
        batch = int(self.model_params.get("batch_size", 4))
        try:
            y = self.model1.predict(x_in, batch_size=batch, verbose=0)
        except ValueError as e:
            # Fall back to an all-zero prediction if Keras complains about an empty batch
            if "non-empty" in str(e).lower():
                y = np.zeros((x_in.shape[0], x_in.shape[1], x_in.shape[2], 1), dtype=np.float32)
            else:
                raise

        # Coerce output to (Z, H, W) probabilities
        if y is None or y.size == 0:
            proba = np.zeros((x_in.shape[0], x_in.shape[1], x_in.shape[2]), dtype=np.float32)
        elif y.ndim == 4 and y.shape[-1] == 1:
            proba = y[..., 0]
        elif y.ndim == 4 and y.shape[-1] > 1:
            # Softmax multi-class → foreground = any class > 0
            proba = (np.argmax(y, axis=-1) > 0).astype(np.float32)
        elif y.ndim == 3:
            proba = y.astype(np.float32)
        else:
            proba = np.zeros((x_in.shape[0], x_in.shape[1], x_in.shape[2]), dtype=np.float32)

        # Threshold (slightly relaxed default for visibility)
        if thr is None:
            thr = float(self.default_thr)
        temp_pred_arr = (proba > float(thr)).astype(np.uint8)

        if self.verbosity:
            print("Segmentation is finished")
            print("time spent: %s sec." % (time.time() - st))

        # Keep largest connected 3D component
        predicted_arr_temp = le.max_connected_volume_extraction(temp_pred_arr)

        # ---- Re-embed EXACTLY like original (exclusive indexing) ----
        p = params  # use as-is (the generator's contract)

        # Canvas with normalized shape
        norm_shape = tuple(int(v) for v in p["normalized_shape"])
        temporary_mask = np.zeros(norm_shape, np.uint8)

        z_st = int(p["z_st"])
        z_en = int(p["z_end"])          # EXCLUSIVE
        xy_st = int(p["xy_st"])
        xy_en = int(p["xy_end"])        # EXCLUSIVE

        # Clamp to bounds
        Zc, Yc, Xc = temporary_mask.shape
        z_st = max(0, min(z_st, Zc))
        z_en = max(0, min(z_en, Zc))
        xy_st = max(0, min(xy_st, min(Yc, Xc)))
        xy_en = max(0, min(xy_en, min(Yc, Xc)))

        # Target spans
        dst_Z = max(0, z_en - z_st)
        dst_H = max(0, xy_en - xy_st)
        dst_W = max(0, xy_en - xy_st)

        src = predicted_arr_temp
        src_Z, src_H, src_W = src.shape

        if int(p.get("crop_type", 1)):
            # Original smaller โ†’ padded into 512 canvas
            # temporary_mask[z_st:z_en, ...] = src[:, xy_st:xy_en, xy_st:xy_en]
            Zw = min(src_Z, dst_Z)
            Hw = min(src_H - xy_st, dst_H) if src_H > xy_st else 0
            Ww = min(src_W - xy_st, dst_W) if src_W > xy_st else 0
            if Zw > 0 and Hw > 0 and Ww > 0:
                temporary_mask[z_st:z_st+Zw, xy_st:xy_st+Hw, xy_st:xy_st+Ww] = \
                    src[:Zw, xy_st:xy_st+Hw, xy_st:xy_st+Ww]
        else:
            # Original bigger โ†’ cropped 512 window
            # temporary_mask[z_st:z_en, xy_st:xy_en, xy_st:xy_en] = src
            Zw = min(src_Z, dst_Z)
            Hw = min(src_H, dst_H)
            Ww = min(src_W, dst_W)
            if Zw > 0 and Hw > 0 and Ww > 0:
                temporary_mask[z_st:z_st+Zw, xy_st:xy_st+Hw, xy_st:xy_st+Ww] = \
                    src[:Zw, :Hw, :Ww]

        # Resize back to original voxel grid if needed (nearest to keep binary)
        if tuple(temporary_mask.shape) != tuple(p["original_shape"]):
            resized = le.resize_3d_img(temporary_mask, p["original_shape"], cv2.INTER_NEAREST)
            predicted_array = (resized > 0.5).astype(np.uint8)
        else:
            predicted_array = temporary_mask.astype(np.uint8)

        # Make it clearly visible in viewers: scale to 0/255
        predicted_array *= 255

        return predicted_array

    # -----------------------------
    # Driver
    # -----------------------------
    def segment(self, thr=None):
        """
        Iterate patients from the generator, write predicted masks alongside
        source images, and return 0 on success.
        """
        if self.model1 and self.Patient_dict and self.Output_path:
            count = 0
            for img, _, filename, params in tqdm(self.Patients_gen, desc="Progress"):
                filename = filename[0]
                params = params[0]
                img = np.squeeze(img)

                predicted_array = self.__generate_segmentation__(img, params, thr=thr)

                # Per-patient output (robust across OS path styles)
                patient_dir = _patient_dir_from_filename(filename)
                out_dir = os.path.join(self.Output_path, f"{patient_dir}_(DL)")
                os.makedirs(out_dir, exist_ok=True)

                # Save predicted mask with original spacing & origin (uint8 0/255)
                generated_img = sitk.GetImageFromArray(predicted_array.astype(np.uint8))
                generated_img = sitk.Cast(generated_img, sitk.sitkUInt8)
                generated_img.SetSpacing(tuple(params.get("original_spacing", (1.0, 1.0, 1.0))))
                generated_img.SetOrigin(tuple(params.get("img_origin", (0.0, 0.0, 0.0))))
                sitk.WriteImage(generated_img, os.path.join(out_dir, "DL_mask.nrrd"))

                # Save source image alongside
                temp_data = sitk.ReadImage(filename)
                sitk.WriteImage(temp_data, os.path.join(out_dir, "image.nrrd"))

                count += 1

            return 0
  3. Save the file and verify it's been updated:
# Check the file was modified (should show recent timestamp)
ls -lh "Automatic segmentation script/TheDuneAI.py"

Method 2: Quick Script (Alternative)

If you prefer, save this script as update_duneai.sh and run it:

#!/bin/bash
# Save this as update_duneai.sh and run: bash update_duneai.sh

cat > "Automatic segmentation script/TheDuneAI.py" << 'ENDOFFILE'
# [Paste the complete Python code from Method 1 here]
ENDOFFILE

echo "✓ TheDuneAI.py has been updated!"

✅ Verify Setup

Run this quick check to confirm everything is ready:

# Check that all required files exist
ls "Automatic segmentation script/model_files/model_v7.json"
ls "Automatic segmentation script/model_files/weights_v7.hdf5"
ls "Automatic segmentation script/TheDuneAI.py"
ls "Automatic segmentation script/Generator_v1.py"
ls "Automatic segmentation script/lung_extraction_funcs_13_09.py"

Expected output: All files should be listed without errors.

File sizes:

  • weights_v7.hdf5 should be ~180 MB
  • model_v7.json should be ~180 KB
  • TheDuneAI.py should be ~13 KB (updated version) vs ~3 KB (original)

Verify TheDuneAI.py was updated correctly:

# Check if the file contains the modern update signature
grep -q "A. Lohachab" "Automatic segmentation script/TheDuneAI.py" && \
  echo "✓ TheDuneAI.py is updated!" || \
  echo "✗ TheDuneAI.py still uses the original version - please update it"

# Check for the batched prediction feature
grep -q "batch_size" "Automatic segmentation script/TheDuneAI.py" && \
  echo "✓ Batched prediction enabled" || \
  echo "✗ Using slow slice-by-slice prediction"

If any checks fail, revisit the corresponding setup step above.
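The same checks can be scripted in Python. A small sketch mirroring the shell commands above (the 8 KB size cutoff is a heuristic derived from the ~13 KB updated vs ~3 KB original file sizes):

```python
# Sketch of an automated setup check mirroring the shell commands above.
import os

def check_setup(algo_dir="Automatic segmentation script"):
    """Return a list of problems; an empty list means setup looks complete."""
    problems = []
    required = [
        os.path.join(algo_dir, "model_files", "model_v7.json"),
        os.path.join(algo_dir, "model_files", "weights_v7.hdf5"),
        os.path.join(algo_dir, "TheDuneAI.py"),
        os.path.join(algo_dir, "Generator_v1.py"),
        os.path.join(algo_dir, "lung_extraction_funcs_13_09.py"),
    ]
    problems += [f"missing: {p}" for p in required if not os.path.exists(p)]
    script = os.path.join(algo_dir, "TheDuneAI.py")
    # Heuristic: the updated TheDuneAI.py is ~13 KB, the original ~3 KB.
    if os.path.exists(script) and os.path.getsize(script) < 8_000:
        problems.append("TheDuneAI.py looks like the original (~3 KB) version")
    return problems

for issue in check_setup():
    print(issue)
```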


Environment Variable (Optional)

The CLI automatically locates these files if you run from the repo root. If running from elsewhere, set:

export DUNEAI_ALGO_DIR="/path/to/Automatic segmentation script"
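The usual pattern for such a variable is env-var-with-fallback; a sketch of that pattern (the package's actual resolution code may differ):

```python
# Env-var-with-fallback lookup pattern (a sketch; duneai-auto's actual
# internal resolution logic may differ).
import os

def resolve_algo_dir() -> str:
    default = os.path.join(os.getcwd(), "Automatic segmentation script")
    return os.environ.get("DUNEAI_ALGO_DIR", default)

print(resolve_algo_dir())
```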

🚀 Quick Start

โš ๏ธ Prerequisites: Before running DuneAI, ensure you've completed the Initial Setup steps (extracting weights + updating TheDuneAI.py)

1) Automated Segmentation

duneai \
  --model "Automatic segmentation script/model_files" \
  --data  "Software for qualitative assesment/test_data" \
  --out   "results"

Where:

  • --model → directory containing model_v7.json and weights_v7.hdf5
  • --data → input CT dataset (NRRD format, one directory per patient recommended)
  • --out → output directory

Output structure:

results/<PatientID>_(DL)/
├── image.nrrd     # resaved source CT
└── DL_mask.nrrd   # predicted segmentation mask (uint8, 0/255)

Masks are:

  • generated at 512×512
  • re-embedded into the original voxel grid
  • rescaled to original spacing
  • saved as 0/255 uint8 for viewer compatibility
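Because the saved masks keep the original NRRD spacing and store foreground as 255, tumor volume can be computed directly from the output. A minimal sketch (the result path in the usage comment is an example; SimpleITK is installed as a dependency):

```python
# Compute foreground volume from a mask array plus its voxel spacing.
import numpy as np

def mask_volume_ml(mask_zyx, spacing_xyz) -> float:
    """Foreground (value > 0) volume in millilitres, given spacing in mm."""
    voxel_mm3 = float(np.prod(spacing_xyz))                 # mm^3 per voxel
    return float((np.asarray(mask_zyx) > 0).sum()) * voxel_mm3 / 1000.0

# Usage with a saved mask (path is an example):
# import SimpleITK as sitk
# mask = sitk.ReadImage("results/pat1_(DL)/DL_mask.nrrd")
# print(mask_volume_ml(sitk.GetArrayFromImage(mask), mask.GetSpacing()))
```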

👁️ Viewing Results

A) Standalone GUI Viewer

duneai-view --gui \
  --img  "results/pat1_(DL)/image.nrrd" \
  --mask "results/pat1_(DL)/DL_mask.nrrd"

GUI Features:

  • Slice navigation (slider, mouse wheel, J/K, ↑/↓)
  • Window/Level controls with presets: Lung, Mediastinum, Bone
  • Mask overlay (filled or contour)
  • Opacity & color selector
  • Jump to next/previous annotated slice
  • Mask smoothing / cleanup
  • Automatic area & volume computation (uses NRRD spacing)
  • PNG export (S or "Save PNG")

macOS note (if the GUI fails to open, set the Tk backend before importing the viewer):

import matplotlib
matplotlib.use("TkAgg")

B) Notebook Viewer (optional)

from duneai_auto.viewer import notebook_view

notebook_view(
    img_path="results/pat1_(DL)/image.nrrd",
    mask_path="results/pat1_(DL)/DL_mask.nrrd",
)

💡 Common Usage Patterns

Adjust Segmentation Sensitivity

If masks are too small or missing:

duneai --model "..." --data "..." --out "..." --thr 0.5

If masks are too large or noisy:

duneai --model "..." --data "..." --out "..." --thr 0.8
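The flag maps onto the (proba > thr) binarization in the updated TheDuneAI.py; a toy probability map shows the effect:

```python
# Toy illustration: --thr binarizes the sigmoid output, exactly as
# (proba > thr) does in the updated TheDuneAI.py.
import numpy as np

proba = np.array([0.20, 0.55, 0.75, 0.95])   # per-voxel tumor probabilities
for thr in (0.5, 0.7, 0.8):
    mask = (proba > thr).astype(np.uint8)
    print(thr, int(mask.sum()), "voxels kept")  # fewer voxels as thr rises
```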

Batch Processing Multiple Datasets

# Process multiple patient directories
for dataset in patient_data_*/; do
  duneai --model "Automatic segmentation script/model_files" \
         --data "$dataset" \
         --out "results_$(basename $dataset)"
done

Quality Control Workflow

# 1. Run segmentation
duneai --model "..." --data "test_data" --out "results"

# 2. Review each patient systematically
for patient in results/*_\(DL\)/; do
  echo "Reviewing: $patient"
  duneai-view --gui --img "$patient/image.nrrd" --mask "$patient/DL_mask.nrrd"
done

🧠 How the Pipeline Works

  1. Preprocessing

    • Lung windowing
    • Resampling to consistent voxel sizes
    • Automatic lung extraction
    • Cropping to 512×512 via Generator_v1
  2. Inference (Modern TensorFlow 2.20 compatible)

    • Loads original Keras 2.2.x JSON model
    • Safely deserializes using explicit custom object mappings
    • Loads HDF5 weights
    • Runs full-volume (Z×512×512×1) inference using batch mode
  3. Re-embedding

    • Rebuilds segmentation into a "normalized canvas" using the original exclusive indexing logic
    • Resizes back to original voxel grid
  4. Export

    • Image & mask saved as .nrrd
    • Ensures:
      • original spacing
      • original origin
      • uint8 mask (0/255)
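Step 3's "exclusive indexing" means the z_end/xy_end bounds delimit slice ranges without being included. A toy version of that re-embedding step:

```python
# Toy version of step 3: paste a cropped prediction into a zeroed
# "normalized canvas" using exclusive end indices.
import numpy as np

canvas = np.zeros((4, 8, 8), np.uint8)       # normalized_shape
pred = np.ones((2, 4, 4), np.uint8)          # cropped model output
z_st, z_end, xy_st, xy_end = 1, 3, 2, 6      # ends are exclusive
canvas[z_st:z_end, xy_st:xy_end, xy_st:xy_end] = pred
print(canvas.sum())                          # 2 * 4 * 4 = 32 voxels placed
```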

⚡ Command Reference

| Command | Description |
| --- | --- |
| duneai | Run automated segmentation on a dataset (main command) |
| duneai-auto | Alias for duneai |
| duneai-view | Standalone viewer for .nrrd image & mask files |

Common Flags

For duneai:

  • --model PATH — Directory containing model_v7.json and weights_v7.hdf5
  • --data PATH — Input CT dataset directory (NRRD format)
  • --out PATH — Output directory for results
  • --thr FLOAT — Segmentation threshold (default: 0.7, range: 0.0–1.0). Lower values → more sensitive detection
  • --quiet — Suppress verbose output

For duneai-view:

  • --gui — open standalone GUI
  • --img — path to image.nrrd
  • --mask — path to DL_mask.nrrd
  • --interactive — notebook mode (if available)

🧪 System Requirements

Python: 3.9–3.11

Core runtime dependencies (installed automatically):

  • TensorFlow 2.20.0
  • NumPy ≥ 1.23
  • SciPy ≥ 1.10
  • SimpleITK ≥ 2.2
  • scikit-image ≥ 0.25
  • statsmodels ≥ 0.14
  • pandas ≥ 2.3
  • scikit-learn ≥ 1.7
  • OpenCV ≥ 4.5
  • tqdm
  • matplotlib ≥ 3.6

Optional:

  • Notebook viewer: ipywidgets, notebook
  • GPU support depends on your TensorFlow installation; this package does not pin CUDA-enabled wheels.

🚀 Performance Notes

Processing Speed:

  • CPU: ~30-60 seconds per patient (depends on volume size)
  • GPU: ~5-15 seconds per patient with CUDA-enabled TensorFlow

GPU Acceleration (Recommended for large datasets):

To enable GPU support, install TensorFlow with CUDA:

# Install TensorFlow with bundled CUDA libraries
# (quotes protect the brackets from shell globbing)
pip install "tensorflow[and-cuda]==2.20.0"

Verify GPU is detected:

import tensorflow as tf
print("GPUs Available:", tf.config.list_physical_devices('GPU'))

Memory Requirements:

  • Minimum: 8 GB RAM
  • Recommended: 16+ GB RAM for large datasets
  • GPU: 4+ GB VRAM recommended

Disk Space:

  • Model: ~180 MB
  • Per patient output: ~2-10 MB (depends on volume size)

🛠️ Troubleshooting

| Issue | Possible Cause | Solution |
| --- | --- | --- |
| Missing weights_v7.hdf5 | Weights not extracted from split archive | Run the concatenation + unzip commands from Setup Step 1 |
| Missing model_v7.json | Wrong --model path | Point to the folder containing both model files |
| FileNotFoundError: weights_v7.hdf5 | Same as above | Verify the weights exist in the model_files/ directory |
| Path errors (KeyError or IndexError with filenames) | Original TheDuneAI.py not updated | Replace it with the modernized version from Setup Step 2 |
| Slow processing (~2-5 min per patient) | Original slice-by-slice code in use | Update TheDuneAI.py to enable batched prediction (4x speedup) |
| Empty or tiny masks | Threshold too strict or outdated code | Reduce the threshold with --thr 0.5 and ensure TheDuneAI.py is updated |
| GUI does not open on macOS | Tk backend missing | Call matplotlib.use("TkAgg") before importing duneai_auto |
| Shape mismatch warnings | Variable spacing/shape | Auto-resize handles this; extreme anisotropy reduces quality |
| ContourPilot import failure | Algorithm folder not found | Set DUNEAI_ALGO_DIR or run inside the repo root |
| Legacy Keras JSON errors | TensorFlow/Keras version mismatch | Ensure TensorFlow 2.20+ is installed; the updated TheDuneAI.py handles this |
| AttributeError: 'ContourPilot' object has no attribute 'model_params' | Original TheDuneAI.py being used | Copy the complete updated code from Setup Step 2 |

โ“ Frequently Asked Questions

Q: Why do I need to update TheDuneAI.py?
A: The original file has Windows-specific path handling (filename.split('\\')) that breaks on Linux/macOS, and uses inefficient slice-by-slice prediction. The updated version is faster and cross-platform compatible.
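The difference is easy to demonstrate; this mirrors the _patient_dir_from_filename helper in the updated code:

```python
# Why '\\'-splitting breaks on POSIX: forward-slash paths contain no
# backslashes, so the "patient ID" ends up wrong. The updated helper
# normalizes separators first.
import os

fname = "/data/LUNG1-001/image.nrrd"          # Linux/macOS-style path

parts = fname.split("\\")                     # original Windows-only logic
print(parts)                                  # the path comes back unsplit

patient = os.path.basename(os.path.dirname(fname.replace("\\", "/")))
print(patient)                                # OS-agnostic patient ID
```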

Q: Where do I find the updated TheDuneAI.py code?
A: It's in this README! Scroll to Step 2: Update TheDuneAI.py, click the expandable section "📄 Click to expand: Complete TheDuneAI.py code", and copy-paste the entire code to replace your original file.

Q: Can I use my own trained model?
A: Yes, but you'll need to ensure compatibility with the Keras 2.2.x JSON format or modify the loading code. The current implementation expects model_v7.json + weights_v7.hdf5.

Q: What CT formats are supported?
A: The pipeline expects NRRD (.nrrd) format. DICOM can be converted using SimpleITK or other tools.
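For completeness, a conversion sketch using SimpleITK's series reader (the directory and output paths are placeholders):

```python
def dicom_to_nrrd(dicom_dir, out_path):
    """Read a DICOM series and save it as a single .nrrd volume."""
    import SimpleITK as sitk  # installed as a duneai-auto dependency
    reader = sitk.ImageSeriesReader()
    reader.SetFileNames(reader.GetGDCMSeriesFileNames(dicom_dir))
    sitk.WriteImage(reader.Execute(), out_path)

# Example (placeholder paths):
# dicom_to_nrrd("dicom/pat1", "test_data/pat1/image.nrrd")
```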

Q: How accurate is the segmentation?
A: Performance metrics are reported in the original Nature Communications paper (Primakov et al., 2022). Results vary based on image quality and tumor characteristics.

Q: Can I run this on cloud servers (AWS, GCP, Azure)?
A: Yes! The package is designed for deployment flexibility. Ensure TensorFlow and dependencies are installed. GPU instances are recommended for batch processing.

Q: Why are my masks empty?
A: Try lowering the threshold: --thr 0.5. Empty masks can occur with: (1) very small tumors, (2) poor CT quality, (3) unexpected HU ranges, or (4) missing setup steps.

Q: How do I cite this work?
A: Cite the original research paper:

Primakov, S., et al. (2022). Automated detection and segmentation of 
non-small cell lung cancer computed tomography images. 
Nature Communications, 13, 3423.
https://doi.org/10.1038/s41467-022-30841-3

For the PyPI package:

DuneAI-Auto (v1.0.0). PyPI wrapper by A. Lohachab.
https://pypi.org/project/duneai-auto/

🧾 Changelog

v1.0.0

Core Infrastructure:

  • Robust legacy Keras JSON + HDF5 model loading on TF 2.20+
  • Added explicit custom object mappings for compatibility
  • Restored original exclusive indexing during re-embedding
  • Mask export updated to uint8 (0/255) for optimal viewer clarity

Updated TheDuneAI.py:

  • ✅ Cross-platform path handling (Windows/Linux/macOS compatible)
  • ✅ Efficient batched prediction (4x faster than slice-by-slice)
  • ✅ Empty batch safeguards and robust error handling
  • ✅ Production-grade documentation and code structure
  • ✅ Modern TensorFlow 2.20+ compatibility with fallback support
  • ✅ Maintains 100% compatibility with the original Generator_v1 pipeline

GUI Viewer:

  • Window/Level presets (Lung, Mediastinum, Bone)
  • Opacity & color selection
  • Filled/contour overlay modes
  • PNG export capability
  • Morphological smoothing tools
  • Automatic area & volume calculation
  • Keyboard shortcuts (J/K, ↑/↓, S for save)

Distribution:

  • Split model weights support (automatic assembly instructions)
  • PyPI packaging with optional visualization dependencies
  • Comprehensive troubleshooting guide

👩‍⚕️ Authors

๐Ÿง‘โ€๐Ÿ’ป A. Lohachab

  • Cross-platform compatibility
  • Modern TensorFlow/Keras migration
  • CLI development
  • GUI viewer
  • PyPI packaging (duneai-auto)

๐Ÿง‘โ€๐Ÿ”ฌ S. Primakov

  • Original DuneAI research implementation
  • Pipeline & model architecture
  • Lung extraction and preprocessing algorithms

Original Research Repository:
https://github.com/primakov/DuneAI-Automated-detection-and-segmentation-of-non-small-cell-lung-cancer-computed-tomography-images

Published in:
Nature Communications, 2022
https://www.nature.com/articles/s41467-022-30841-3


🤝 Support & Contributing

Getting Help

  • GitHub Issues: Report bugs or request features on the package repository
  • Original Paper: For questions about the model/algorithm, refer to the Nature Communications publication
  • Email: Contact the authors for research collaboration inquiries

Contributing

Contributions are welcome! Areas where help is needed:

  • Additional viewer features (3D rendering, measurement tools)
  • Support for additional image formats (DICOM, NIfTI)
  • Performance optimizations
  • Documentation improvements
  • Test coverage expansion

Development Setup:

git clone https://github.com/YOUR_REPO/duneai-auto.git
cd duneai-auto
pip install -e ".[dev]"

Reporting Issues

When reporting issues, please include:

  • Python version and OS
  • TensorFlow version (python -c "import tensorflow as tf; print(tf.__version__)")
  • Complete error traceback
  • Sample data (if possible) or description of input format
  • Steps to reproduce

🧩 License

MIT License © 2025 — Maastricht University
