Foundation models for digital pathology.
HistoEncoder
Description
The HistoEncoder CLI lets users extract and cluster useful features from histological slide images. The histoencoder Python package also exposes functions for using the encoder models, which are described in the API docs.
Why?
The models provided in this package produce similar features for tile images with similar histological patterns. This means that when we cluster the tile images based on their features, each cluster contains tile images with similar histological patterns.
Thus, visualising the clusters lets us annotate whole datasets one cluster at a time instead of tile by tile. Additionally, calculating cluster percentages for a given patient gives the distribution of histological patterns for that patient. This information could then be combined with other data modalities.
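The cluster percentages mentioned above can be sketched in a few lines of plain Python. This is only an illustration: the patient identifiers, cluster labels and the `cluster_percentages` helper are hypothetical and not part of the histoencoder package, and real cluster assignments would come from the clustering step described under Usage.

```python
from collections import Counter, defaultdict

# Hypothetical (patient_id, cluster_label) pairs, one per tile image.
tile_clusters = [
    ("patient_a", 0), ("patient_a", 0), ("patient_a", 1), ("patient_a", 2),
    ("patient_b", 1), ("patient_b", 1),
]


def cluster_percentages(pairs):
    """Return {patient: {cluster: percentage of that patient's tiles}}."""
    counts = defaultdict(Counter)
    for patient, cluster in pairs:
        counts[patient][cluster] += 1
    return {
        patient: {
            cluster: 100.0 * n / sum(counter.values())
            for cluster, n in sorted(counter.items())
        }
        for patient, counter in counts.items()
    }


percentages = cluster_percentages(tile_clusters)
print(percentages["patient_a"])  # {0: 50.0, 1: 25.0, 2: 25.0}
```

Each patient ends up with a small vector of percentages, which is the kind of per-patient summary that could be joined with other data modalities.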
Installation
pip install histoencoder
Usage
- Cut histological slide images into small tile images with HistoPrep.
HistoPrep --input './slide_images/*.tiff' --output ./tile_images --width 512 --overlap 0.5 --max-background 0.5
- Extract features for each tile image.
HistoEncoder extract --input ./tile_images --model-name prostate-small
- Cluster extracted features.
HistoEncoder cluster --input ./tile_images
Now tile_images contains a directory for each slide with the following contents.
tile_images
└── slide_image
    ├── clusters.parquet       # Clusters for each tile image.
    ├── features.parquet       # Extracted features for each tile.
    ├── metadata.parquet       # Everything else is generated by HistoPrep.
    ├── properties.json
    ├── thumbnail.jpeg
    ├── thumbnail_tiles.jpeg
    ├── thumbnail_tissue.jpeg
    └── tiles [52473 entries exceeds filelimit, not opening dir]
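After running the three steps on many slides, it can be handy to check which slide directories already have all of their outputs. The following is a minimal sketch using only the standard library; the `summarise_outputs` helper is hypothetical and not part of HistoEncoder or HistoPrep, and it relies only on the file names shown in the listing above.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = ("metadata.parquet", "features.parquet", "clusters.parquet")


def summarise_outputs(root):
    """Map each slide directory under root to the expected files it is missing."""
    missing = {}
    for slide_dir in sorted(Path(root).iterdir()):
        if not slide_dir.is_dir():
            continue
        missing[slide_dir.name] = [
            name for name in EXPECTED_FILES if not (slide_dir / name).exists()
        ]
    return missing


if __name__ == "__main__":
    # Output directory used in the commands above.
    root = Path("./tile_images")
    if root.is_dir():
        for slide, absent in summarise_outputs(root).items():
            print(slide, "complete" if not absent else f"missing: {absent}")
```

The parquet files themselves can be loaded with any parquet-capable reader, for example `pandas.read_parquet`. Their column names are not documented here, so inspect them before relying on any particular layout.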
Citation
If you use HistoEncoder models or pipelines in your publication, please cite the GitHub repository.
@misc{histoencoder,
  author = {Pohjonen, Joona},
  title = {HistoEncoder: Foundation models for digital pathology},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {https://github.com/jopo666/HistoEncoder},
}