Preprocessing module for large histological images.
HistoPrep
Preprocessing large medical images for machine learning made easy!
Description
This module allows you to easily cut and preprocess large histological slides.
- Cut tiles from large slide images.
- Dearray TMA spots (and cut tiles from individual spots).
- Preprocess extracted tiles automatically.
Installation
pip install histoprep
Cutting slide into tiles
HistoPrep can easily be used to prepare histological slide images for machine learning tasks.
You can either use HistoPrep as a Python module...
import histoprep
# Cutting tiles is super easy!
reader = histoprep.SlideReader('/path/to/slide')
metadata = reader.save_tiles(
    '/path/to/output_folder',
    coordinates=reader.get_tile_coordinates(
        width=512,
        overlap=0.1,
        max_background=0.96
    ),
)
or as an executable from your command line!
$ HistoPrep input_dir output_dir width {optional arguments}
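To make the `width` and `overlap` arguments more concrete, here is a minimal sketch of how a regular tile grid with overlap can be laid out. It is not HistoPrep's implementation: the function `tile_coordinates` and its parameters are hypothetical, it assumes `overlap` is the fraction of the tile width shared by neighbouring tiles, and it leaves out background filtering (`max_background`), which needs the actual image data.

```python
from typing import List, Tuple


def tile_coordinates(
    slide_width: int,
    slide_height: int,
    tile_width: int,
    overlap: float = 0.0,
) -> List[Tuple[int, int, int, int]]:
    """Return (x, y, w, h) tuples for square tiles on a regular grid.

    Hypothetical helper for illustration only, not part of HistoPrep.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Neighbouring tiles share `overlap` of their width, so the step
    # between tile origins is shorter than the tile itself.
    stride = max(1, int(tile_width * (1.0 - overlap)))
    coordinates = []
    for y in range(0, slide_height, stride):
        for x in range(0, slide_width, stride):
            # Tiles at the right and bottom edges may extend past the slide.
            coordinates.append((x, y, tile_width, tile_width))
    return coordinates


coords = tile_coordinates(1024, 1024, tile_width=512, overlap=0.1)
```

With a 10% overlap the step between tiles drops from 512 to 460 pixels, so the same slide yields more tiles than a non-overlapping grid would.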
Preprocessing
After the tiles have been saved, preprocessing is just simple outlier detection on the preprocessing metrics saved in tile_metadata.csv!
from histoprep import OutlierDetector
from histoprep.helpers import combine_metadata
# Let's combine all metadata from the cut slides
metadata = combine_metadata("/path/to/output_folder", "tile_metadata.csv")
metadata["outlier"] = False
# Then mark any outlying values!
metadata.loc[metadata['sharpness_max'] < 5, "outlier"] = True # blurry
metadata.loc[metadata['black_pixels'] > 0.05, "outlier"] = True # data loss
metadata.loc[metadata['saturation_mean'] > 230, "outlier"] = True # unusual blue artifacts
# This can also be done automatically!
detector = OutlierDetector(metadata, num_clusters=10)
# Plot clusters from most likely outlier to least likely outlier
detector.plot_clusters()
# After visual inspection we can discard some clusters as outliers.
metadata.loc[detector.clusters < 2, "outlier"] = True
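The manual threshold rules above can also be tried out without any slides. The sketch below applies the same three cut-offs to a few hand-made rows using only the standard library. The metric names come from the example above, but the values are synthetic and `mark_outliers` is a hypothetical helper, not part of HistoPrep.

```python
from typing import Dict, List


def mark_outliers(rows: List[Dict[str, float]]) -> List[bool]:
    """Flag a tile as an outlier if it breaks any of the three thresholds.

    Hypothetical helper for illustration only, not part of HistoPrep.
    """
    flags = []
    for row in rows:
        blurry = row["sharpness_max"] < 5
        data_loss = row["black_pixels"] > 0.05
        oversaturated = row["saturation_mean"] > 230
        flags.append(blurry or data_loss or oversaturated)
    return flags


# Synthetic metric values, chosen so each rule triggers exactly once.
rows = [
    {"sharpness_max": 12.0, "black_pixels": 0.00, "saturation_mean": 120.0},
    {"sharpness_max": 3.0, "black_pixels": 0.00, "saturation_mean": 110.0},
    {"sharpness_max": 15.0, "black_pixels": 0.20, "saturation_mean": 100.0},
    {"sharpness_max": 14.0, "black_pixels": 0.01, "saturation_mean": 240.0},
]
flags = mark_outliers(rows)
```

Only the first row passes all three checks. Suitable thresholds depend on the staining and the scanner, so treat the numbers as starting points to verify visually.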
Examples
Examples can be found in the docs.
What's coming?
HistoPrep is under constant development. If there is a feature you would like to see added, just submit an issue and we'll start working on it!
Citation
If you use HistoPrep in a publication, please cite the GitHub repository.
@misc{histoprep2021,
author = {Pohjonen, J. and Ariotta, V.},
title = {HistoPrep: Preprocessing large medical images for machine learning made easy!},
year = {2022},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/jopo666/HistoPrep}},
}