Foundation models for digital pathology.

HistoEncoder

Description

The HistoEncoder CLI lets users extract and cluster features from histological slide images. The histoencoder Python package also exposes functions for working with the encoder models; these are described in the API docs.

Why?

The models provided in this package produce similar features for tile images with similar histological patterns. This means that when we cluster the tile images based on their features, each cluster contains tile images with similar histological patterns.
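As a rough illustration of what "similar features" means, the sketch below compares tiny made-up feature vectors with cosine similarity. The vectors, the tile names, and the choice of cosine similarity are assumptions for illustration only; this is not how the package itself compares or clusters features.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(name, features):
    """Return the name of the tile whose features are closest to `name`."""
    query = features[name]
    others = (key for key in features if key != name)
    return max(others, key=lambda key: cosine_similarity(query, features[key]))


# Hypothetical 3-dimensional features; real feature vectors would
# typically have many more dimensions.
tile_features = {
    "tile_a": [0.9, 0.1, 0.0],
    "tile_b": [0.8, 0.2, 0.1],
    "tile_c": [0.0, 0.1, 0.9],
}

print(most_similar("tile_a", tile_features))
```

Here tile_a and tile_b point in nearly the same direction, so they would be expected to land in the same cluster, while tile_c would not.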

Visualising the clusters therefore lets us annotate whole datasets automatically. Additionally, calculating cluster percentages for a given patient gives the distribution of histological patterns for that patient. This information can then be combined with other data modalities.
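The per-patient cluster percentages mentioned above can be sketched in a few lines. The function name and the list of labels below are hypothetical; in practice the labels would come from the cluster assignments of all tiles belonging to one patient.

```python
from collections import Counter


def cluster_percentages(cluster_labels):
    """Map each cluster label to the percentage of tiles assigned to it."""
    counts = Counter(cluster_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in sorted(counts.items())}


# Hypothetical cluster labels for ten tiles from a single patient.
patient_tiles = [0, 0, 1, 2, 0, 1, 0, 2, 0, 0]
percentages = cluster_percentages(patient_tiles)
print(percentages)
```

The resulting dictionary is a fixed-length summary of a patient once the set of clusters is fixed, which is what makes it easy to join with other tabular data.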

Installation

pip install histoencoder

Usage

  1. Cut histological slide images into small tile images with HistoPrep.
     HistoPrep --input './slide_images/*.tiff' --output ./tile_images --width 512 --overlap 0.5 --max-background 0.5
  2. Extract features for each tile image.
     HistoEncoder extract --input ./tile_images --model-name prostate-small
  3. Cluster extracted features.
     HistoEncoder cluster --input ./tile_images

Now tile_images contains a directory for each slide with the following contents.

tile_images
└── slide_image
    ├── clusters.parquet # Clusters for each tile image.
    ├── features.parquet # Extracted features for each tile.
    ├── metadata.parquet # Everything else is generated by HistoPrep.
    ├── properties.json
    ├── thumbnail.jpeg
    ├── thumbnail_tiles.jpeg
    ├── thumbnail_tissue.jpeg
    └── tiles  [52473 entries exceeds filelimit, not opening dir]
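Given this layout, a small helper can check which slide directories already contain both output files. This is only a sketch: the function name is hypothetical, and it relies solely on the file names shown in the tree above.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = ("features.parquet", "clusters.parquet")


def find_processed_slides(tile_root):
    """Return slide directories under tile_root that contain both output files."""
    root = Path(tile_root)
    return sorted(
        d
        for d in root.iterdir()
        if d.is_dir() and all((d / name).is_file() for name in EXPECTED_FILES)
    )


if __name__ == "__main__":
    output_dir = Path("./tile_images")
    if output_dir.is_dir():
        for slide_dir in find_processed_slides(output_dir):
            print(slide_dir.name)
```

The parquet files themselves can be opened with any parquet reader, for example pandas.read_parquet. Their column names are not listed here, so inspect them before relying on any particular column.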

Citation

If you use HistoEncoder models or pipelines in your publication, please cite the GitHub repository.

@misc{histoencoder,
  author = {Pohjonen, Joona},
  title = {HistoEncoder: Foundation models for digital pathology},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {https://github.com/jopo666/HistoEncoder},
}
