
Dataset validation and preprocessing toolkit for neurology brain imaging (NIfTI)

NeuroTK

Dataset validation and quality assurance for neurology brain imaging.

NeuroTK inspects NIfTI datasets for geometry, spacing, orientation, and annotation issues, then reports them as structured JSON before they break your pipeline.

Features

  • Validate — scan image and label directories for spacing inconsistencies, orientation mismatches, missing annotations, and corrupt files
  • Preprocess — deterministic orientation normalization and voxel resampling with full audit trail
  • Infer — run MONAI bundle inference (local or Hugging Face) with automatic Dice scoring
  • Analyze — lesion volume quantification, cohort stratification, and histogram generation
  • Report — machine-readable JSON + human-readable HTML reports
  • Web UI — browser-based interface for upload-and-validate workflows

Installation

pip install neurotk                  # core (validation, preprocessing, analysis)
pip install "neurotk[inference]"     # + MONAI inference support (quoted so shells such as zsh do not expand the brackets)

Requires Python >= 3.8. No GPU needed for validation and preprocessing.

Quick Start

Validate a dataset:

neurotk validate --images data/imagesTr --labels data/labelsTr --out report.json
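
To make the orientation check more concrete, here is a minimal sketch of how a three-letter orientation code such as RAS can be derived from a NIfTI affine. The function name axis_codes is hypothetical and this is not NeuroTK's code; it is a simplified version of the usual "dominant axis per column" approach and assumes the affine has no degenerate or tied columns.

```python
from typing import Sequence


def axis_codes(affine: Sequence[Sequence[float]]) -> str:
    """Return an orientation code (e.g. "RAS") for a 4x4 or 3x3 affine.

    Hypothetical helper for illustration only. For each voxel axis
    (matrix column), pick the world axis with the largest absolute
    component and use its sign to choose the letter.
    """
    labels = (("L", "R"), ("P", "A"), ("I", "S"))  # (negative, positive)
    codes = []
    for col in range(3):
        column = [affine[row][col] for row in range(3)]
        world_axis = max(range(3), key=lambda r: abs(column[r]))
        negative, positive = labels[world_axis]
        codes.append(positive if column[world_axis] > 0 else negative)
    return "".join(codes)


identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
flipped = [[-0.5, 0, 0, 0], [0, -0.5, 0, 0], [0, 0, 3.0, 0], [0, 0, 0, 1]]
print(axis_codes(identity))  # RAS
print(axis_codes(flipped))   # LPS
```

A dataset that mixes codes like these is the kind of orientation mismatch a validation pass is meant to surface.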

Standardize spacing and orientation:

neurotk preprocess --images data/imagesTr --out preprocessed/ --spacing 1.0 1.0 1.0 --orientation RAS
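
To see what a target spacing implies for array size, the sketch below estimates the resampled grid shape from the original shape and spacing. It assumes the physical field of view is preserved and rounds to the nearest voxel. resampled_shape is a hypothetical helper, not part of NeuroTK, and NeuroTK's own rounding may differ.

```python
from typing import Sequence, Tuple


def resampled_shape(
    shape: Sequence[int],
    spacing: Sequence[float],
    target_spacing: Sequence[float],
) -> Tuple[int, ...]:
    """Estimate the voxel grid size after resampling to target_spacing.

    Hypothetical helper for illustration. Keeps the physical extent
    (shape * spacing) roughly constant along each axis.
    """
    if not (len(shape) == len(spacing) == len(target_spacing)):
        raise ValueError("shape and spacings must have the same length")
    return tuple(
        max(1, round(n * s / t))
        for n, s, t in zip(shape, spacing, target_spacing)
    )


# A 512 x 512 x 40 volume at 0.5 x 0.5 x 5.0 mm, resampled to 1 mm isotropic
print(resampled_shape((512, 512, 40), (0.5, 0.5, 5.0), (1.0, 1.0, 1.0)))
# (256, 256, 200)
```

Thick-slice scans grow substantially along the slice axis when resampled to isotropic spacing, which is worth checking before preprocessing a large cohort.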

Run segmentation inference:

neurotk infer --input data/imagesTr --output-dir predictions/ --device cuda

Measure lesion volumes:

neurotk lesion-volume --preds predictions/ --output volumes.csv --histogram hist.png
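
For reference, lesion volume is conventionally the number of foreground voxels multiplied by the volume of one voxel. The sketch below shows that arithmetic on a flattened binary mask. lesion_volume_ml is a hypothetical helper, and the choice of millilitres as the unit is an assumption, not necessarily what neurotk lesion-volume writes to its CSV.

```python
from typing import Iterable, Sequence


def lesion_volume_ml(mask: Iterable[int], spacing_mm: Sequence[float]) -> float:
    """Volume of the nonzero voxels of a flattened mask, in millilitres.

    Hypothetical helper for illustration only.
    """
    voxel_mm3 = 1.0
    for s in spacing_mm:
        voxel_mm3 *= s  # volume of a single voxel in cubic millimetres
    n_voxels = sum(1 for v in mask if v != 0)
    return n_voxels * voxel_mm3 / 1000.0  # 1 mL = 1000 mm^3


# Four lesion voxels at 1.0 x 1.0 x 5.0 mm spacing: 4 * 5 mm^3 = 20 mm^3
mask = [0, 1, 1, 0, 1, 0, 0, 1]
print(lesion_volume_ml(mask, (1.0, 1.0, 5.0)))  # 0.02
```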

CLI Commands

Command                    Description
neurotk validate           Check dataset geometry, spacing, orientation, annotations
neurotk preprocess         Normalize orientation and resample voxel spacing
neurotk infer              Run MONAI bundle inference (local or Hugging Face)
neurotk dice               Compute Dice scores between predictions and labels
neurotk lesion-volume      Quantify lesion burden per case with cohort statistics
neurotk cohort-stats       Classify cases by lesion volume ranges
neurotk make-normal-csv    Generate normal-CT flags from label volumes

Run neurotk <command> --help for full option details, or see the CLI Reference.
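
As background for the dice command, the Dice score between a prediction and a label mask is 2|A ∩ B| / (|A| + |B|). The sketch below computes it on flattened binary masks. dice_score is a hypothetical helper, and returning 1.0 when both masks are empty is one common convention assumed here; NeuroTK may handle that case differently.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], label: Sequence[int]) -> float:
    """Dice coefficient between two flattened binary masks.

    Hypothetical helper for illustration only.
    """
    if len(pred) != len(label):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, label) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in label if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# One overlapping voxel, two foreground voxels in each mask: 2 * 1 / 4
print(dice_score([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```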

Output Format

NeuroTK emits structured JSON reports with dataset-level summaries and per-file diagnostics:

{
  "run_mode": "validate",
  "summary": {
    "scope": "original_inputs",
    "num_images": 100,
    "files_with_issues": 7,
    "orientation_modal": "RAS",
    "spacing_min": [1.0, 1.0, 1.0],
    "spacing_max": [1.0, 1.0, 1.0]
  },
  "files": {
    "case_001.nii.gz": {
      "shape": [256, 256, 128],
      "spacing": [1.0, 1.0, 1.0],
      "orientation": "RAS",
      "issues": ["label_missing"]
    }
  }
}

Add --html report.html to generate a shareable visual report.
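
Because the report is plain JSON, it can be post-processed with the standard library. The sketch below groups file names by issue type, using only the keys visible in the example report above. files_by_issue is a hypothetical helper, and real reports may contain additional fields.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Trimmed copy of the example report shown above
report_text = """
{
  "run_mode": "validate",
  "summary": {"num_images": 100, "files_with_issues": 7},
  "files": {
    "case_001.nii.gz": {
      "shape": [256, 256, 128],
      "spacing": [1.0, 1.0, 1.0],
      "orientation": "RAS",
      "issues": ["label_missing"]
    }
  }
}
"""


def files_by_issue(report: dict) -> Dict[str, List[str]]:
    """Map each issue code to the list of files that reported it."""
    grouped = defaultdict(list)
    for name, info in report.get("files", {}).items():
        for issue in info.get("issues", []):
            grouped[issue].append(name)
    return dict(grouped)


report = json.loads(report_text)
print(files_by_issue(report))  # {'label_missing': ['case_001.nii.gz']}
```

To use it on a real run, replace report_text with the contents of the file passed to --out.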

Python API

All CLI commands are importable for programmatic use:

from neurotk.validate import validate_dataset

report = validate_dataset(images_dir="data/images", labels_dir="data/labels")
print(f"Files with issues: {report['summary']['files_with_issues']}")
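
One way to use the returned report is as a gate in CI. The sketch below turns the files_with_issues count into a process exit code. exit_code_for and its max_files_with_issues parameter are hypothetical, and the inline dictionary stands in for the value returned by validate_dataset, assuming it has the same layout as the JSON report.

```python
def exit_code_for(report: dict, max_files_with_issues: int = 0) -> int:
    """Return 0 if the report is within tolerance, 1 otherwise.

    Hypothetical helper for illustration only.
    """
    n_bad = report["summary"]["files_with_issues"]
    return 0 if n_bad <= max_files_with_issues else 1


# Stand-in for: report = validate_dataset(images_dir=..., labels_dir=...)
report = {"summary": {"num_images": 100, "files_with_issues": 7}}

print(exit_code_for(report))                            # 1
print(exit_code_for(report, max_files_with_issues=10))  # 0
```

In a script, passing the result to sys.exit() makes the job fail when too many files have issues.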

Web UI

Launch the browser-based interface:

pip install neurotk
python -m neurotk.web.app

Or via Docker:

docker build -t neurotk .
docker run -p 8000:8000 neurotk

Sample Data

The sample_data/ directory contains two synthetic NIfTI image-label pairs for testing:

sample_data/
  images/   CASE_001.nii.gz, CASE_002.nii.gz
  labels/   CASE_001.nii.gz, CASE_002.nii.gz

To validate the sample data:

neurotk validate --images sample_data/images --labels sample_data/labels --out report.json

Citation

@software{neurotk,
  title  = {NeuroTK: Dataset Validation for Neurology Brain Imaging},
  author = {Sakshi Rathi},
  year   = {2026},
  doi    = {10.5281/zenodo.18252017},
  url    = {https://github.com/SakshiRa/neurotk}
}

License

Apache 2.0
