A Scalable Modular Framework for Multimodal AI in Oncology

HoneyBee

Documentation & Examples | Paper

Publication

HoneyBee has been officially published in npj Digital Medicine!

Tripathi, A., Waqas, A., Schabath, M.B. et al. HONeYBEE: enabling scalable multimodal AI in oncology through foundation model-driven embeddings. npj Digit. Med. 8, 622 (2025). https://doi.org/10.1038/s41746-025-02003-4

Overview

HoneyBee is a multimodal AI framework for oncology research and clinical applications. It integrates and processes four kinds of medical data (clinical text, radiology images, pathology slides, and molecular data) through a unified, modular architecture. It is built to scale and to be extended, so researchers can develop AI models for cancer diagnosis, prognosis, and treatment planning.

Warning (alpha release): This framework is currently in alpha. APIs may change, and some features are still under development.

Key Features

  • Multimodal data support: clinical text, radiology (DICOM/NIFTI), pathology (WSI), and molecular data
  • 3-layer modular architecture: clean separation between loaders, processors, and embedding models
  • Clinical NLP pipeline: OCR, cancer entity extraction, temporal parsing, and medical ontology mapping
  • Whole Slide Image processing: tissue detection, patch extraction, stain normalization, and quality filtering
  • State-of-the-art embedding models: GatorTron, BioBERT, PubMedBERT, UNI, REMEDIS, RadImageNet, and more
  • Cross-modal integration: unified patient-level representations from multiple data modalities
  • Survival analysis: Cox PH, Random Survival Forest, and DeepSurv
  • Similar patient retrieval: find patients with matching clinical profiles
  • Interactive visualization: t-SNE dashboards for embedding exploration
  • GPU-accelerated: CuCIM backend for WSI processing with OpenSlide fallback
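
The "CuCIM backend with OpenSlide fallback" item describes a common pattern: prefer a GPU-accelerated reader when it is installed, otherwise fall back to a CPU one. The sketch below illustrates that pattern only and is not HoneyBee's own selection logic. The function name `pick_wsi_backend` is hypothetical, and it assumes the readers are importable as `cucim` and `openslide` (the module name provided by the openslide-python package).

```python
from importlib.util import find_spec


def pick_wsi_backend(preferred=("cucim", "openslide")):
    """Return the name of the first importable module in `preferred`, or None.

    Hypothetical helper: it only checks that a module can be found,
    it does not import it or verify that a GPU is actually available.
    """
    for name in preferred:
        if find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    backend = pick_wsi_backend()
    if backend is None:
        print("No WSI backend found; install cucim or openslide-python.")
    else:
        print(f"Using WSI backend: {backend}")
```

Checking with `find_spec` avoids importing a heavy library just to test for its presence; the actual import can then happen once, where the slide is opened.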

Quick Start

System Dependencies

# Ubuntu/Debian
sudo apt-get install -y openslide-tools tesseract-ocr

# macOS
brew install openslide tesseract

Installation

pip install honeybee-ml
python -c "import nltk; nltk.download('punkt'); nltk.download('punkt_tab')"
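
To confirm that the install succeeded, one option is to ask Python's packaging metadata for the installed version. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the only HoneyBee-specific value it relies on is the distribution name `honeybee-ml` from the command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="honeybee-ml"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("honeybee-ml is not installed in this environment.")
    else:
        print(f"honeybee-ml {found} is installed.")
```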

Optional Extras

| Extra | Command | Includes |
|-----------|-----------------------------------|------------------------------------------------|
| Clinical | pip install honeybee-ml[clinical] | NLP, OCR, and text processing dependencies |
| Pathology | pip install honeybee-ml[pathology] | WSI loading and image processing |
| Molecular | pip install honeybee-ml[molecular] | Genomics and expression data support |
| All | pip install honeybee-ml[all] | Everything above |

Research Applications

HoneyBee has been successfully applied to:

  • Cancer Subtype Classification: Automated identification of cancer subtypes from multimodal data
  • Survival Prediction: Risk stratification and outcome prediction for treatment planning
  • Similar Patient Retrieval: Finding patients with similar clinical profiles for precision medicine
  • Biomarker Discovery: Identifying multimodal patterns associated with treatment response
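
As a rough illustration of similar patient retrieval over patient-level embeddings, the sketch below fuses per-modality vectors by concatenation and ranks a cohort by cosine similarity. It does not use HoneyBee's API: the function names, the patient IDs, and the tiny two-dimensional vectors are all made up for the example, and concatenation is only one of several possible fusion choices.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(modal_embeddings):
    """Concatenate per-modality vectors after L2-normalizing each one,
    so that no single modality dominates because of its scale."""
    fused = []
    for vec in modal_embeddings:
        fused.extend(l2_normalize(vec))
    return fused


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, cohort, k=3):
    """Return the k (patient_id, similarity) pairs most similar to `query`."""
    scored = [(pid, cosine(query, vec)) for pid, vec in cohort.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy cohort: each patient has a "clinical" and a "pathology" vector (made-up values).
cohort = {
    "P1": fuse([[1.0, 0.0], [0.0, 1.0]]),
    "P2": fuse([[0.0, 1.0], [1.0, 0.0]]),
    "P3": fuse([[1.0, 0.1], [0.0, 1.0]]),
}
query = fuse([[1.0, 0.0], [0.0, 1.0]])

for patient_id, score in top_k_similar(query, cohort, k=2):
    print(patient_id, round(score, 3))
```

With real foundation-model embeddings the vectors would have hundreds or thousands of dimensions, and a vector index would usually replace the linear scan, but the ranking idea is the same.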

License

See the LICENSE file for details.

Citation

If you use HoneyBee in your research, please cite our paper:

Tripathi, A., Waqas, A., Schabath, M.B. et al. HONeYBEE: enabling scalable multimodal AI in
oncology through foundation model-driven embeddings. npj Digit. Med. 8, 622 (2025).
https://doi.org/10.1038/s41746-025-02003-4
