Preprocessing module for large histological images.

Development Status
- 3 - Alpha
Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3
Topic
- Scientific/Engineering :: Bio-Informatics

Project description

HistoPrep

Preprocessing large medical images for machine learning made easy!


Description

This module allows you to easily cut and preprocess large histological slides:

- Cut tiles from large slide images.
- Dearray TMA spots (and cut tiles from individual spots).
- Preprocess extracted tiles automatically.

Installation

pip install histoprep

Cutting slide into tiles

HistoPrep can be used easily to prepare histological slide images for machine learning tasks.

You can either use HistoPrep as a Python module...

import histoprep

# Cutting tiles is super easy!
reader = histoprep.SlideReader('/path/to/slide')
metadata = reader.save_tiles(
    '/path/to/output_folder',
    coordinates=reader.get_tile_coordinates(
        width=512, 
        overlap=0.1, 
        max_background=0.96
    ),
)

or as an executable from your command line!

jopo666@MacBookM1$ HistoPrep input_dir output_dir width {optional arguments}
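
To make the `width` and `overlap` arguments above more concrete, here is a minimal sketch of how a regular tile grid with fractional overlap could be laid out. This is not HistoPrep's implementation: `tile_coordinates` is a hypothetical helper written for illustration, it assumes square tiles and a simple fixed stride, and it ignores background filtering (`max_background`) entirely.

```python
from typing import List, Tuple


def tile_coordinates(
    slide_width: int,
    slide_height: int,
    width: int,
    overlap: float = 0.0,
) -> List[Tuple[int, int, int, int]]:
    """Return (x, y, w, h) boxes for square tiles on a regular grid.

    Hypothetical helper for illustration only; not part of HistoPrep.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Neighbouring tiles share roughly `overlap * width` pixels, so the step
    # between tile origins is the remaining fraction of the tile width.
    stride = max(1, int(width * (1.0 - overlap)))
    # Tiles in the last row/column may extend past the slide boundary;
    # a real reader would have to pad or clip them.
    return [
        (x, y, width, width)
        for y in range(0, slide_height, stride)
        for x in range(0, slide_width, stride)
    ]


coords = tile_coordinates(slide_width=2048, slide_height=1024, width=512, overlap=0.1)
print(len(coords), coords[:2])
```

With `overlap=0.1` and `width=512` the stride in this sketch is 460 pixels, so a larger overlap yields more tiles for the same slide.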

Preprocessing

After the tiles have been saved, preprocessing comes down to simple outlier detection on the preprocessing metrics saved in tile_metadata.csv!

from histoprep import OutlierDetector
from histoprep.helpers import combine_metadata

# Let's combine all metadata from the cut slides
metadata = combine_metadata("/path/to/output_folder", "tile_metadata.csv")
metadata["outlier"] = False
# Then mark any outlying values!
metadata.loc[metadata["sharpness_max"] < 5, "outlier"] = True      # blurry
metadata.loc[metadata["black_pixels"] > 0.05, "outlier"] = True    # data loss
metadata.loc[metadata["saturation_mean"] > 230, "outlier"] = True  # oversaturated blue artifacts

# This can also be done automatically!
detector = OutlierDetector(metadata, num_clusters=10)
# Plot clusters from most likely outlier to least likely outlier
detector.plot_clusters()
# After visual inspection we can discard some clusters as outliers.
metadata.loc[detector.clusters < 2, "outlier"] = True
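
The threshold rules above can also be applied without pandas, for example when streaming through a very large metadata file. The sketch below uses only the standard library. The column names and cut-offs are taken from the example above; `is_outlier` and `flag_outliers` are hypothetical helpers, and the sample rows are made-up values for demonstration only.

```python
import csv
import io
from typing import Dict, Iterable, Iterator

# Thresholds copied from the rules above; tune them for your own data.
MIN_SHARPNESS = 5
MAX_BLACK_PIXELS = 0.05
MAX_SATURATION = 230


def is_outlier(row: Dict[str, str]) -> bool:
    """Return True if any single metric crosses its threshold."""
    return (
        float(row["sharpness_max"]) < MIN_SHARPNESS          # blurry
        or float(row["black_pixels"]) > MAX_BLACK_PIXELS     # data loss
        or float(row["saturation_mean"]) > MAX_SATURATION    # oversaturated
    )


def flag_outliers(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield each CSV row as a dict with an added boolean "outlier" field."""
    for row in csv.DictReader(lines):
        yield {**row, "outlier": is_outlier(row)}


# Made-up rows standing in for an opened tile_metadata.csv file.
sample = io.StringIO(
    "sharpness_max,black_pixels,saturation_mean\n"
    "12.0,0.00,180\n"
    "3.5,0.00,150\n"
    "20.0,0.10,120\n"
)
flags = [row["outlier"] for row in flag_outliers(sample)]
print(flags)
```

Because `flag_outliers` accepts any iterable of lines, an open file handle can be passed in place of the `io.StringIO` sample.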

Examples

Examples can be found in the docs.

What's coming?

HistoPrep is under constant development. If there is a feature you would like added, just submit an issue and we'll start working on it!

Citation

If you use HistoPrep in a publication, please cite the GitHub repository.

@misc{histoprep2021,
  author = {Pohjonen, J. and Ariotta, V.},
  title = {HistoPrep: Preprocessing large medical images for machine learning made easy!},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/jopo666/HistoPrep}},
}

Release history

Version	Released	Notes
2.0.5	Jun 16, 2023	
2.0.4	Jun 7, 2023	
2.0.3	May 3, 2023	
2.0.2	Apr 19, 2023	
2.0.1	Apr 13, 2023	
1.0.8	Aug 3, 2022	
1.0.7	Jun 6, 2022	
1.0.6	Jun 6, 2022	
1.0.5	Jun 3, 2022	
1.0.4	Jun 3, 2022	this version
1.0.3	Jun 3, 2022	
1.0.2	Jun 2, 2022	
1.0.1	Jun 2, 2022	
1.0.0	Jun 2, 2022	
0.0.2.12.dev1	Oct 13, 2021	pre-release
0.0.2.11	Sep 29, 2021	
0.0.2.10	Sep 29, 2021	
0.0.2.9	Sep 29, 2021	
0.0.2.8	Jun 1, 2021	
0.0.2.7	May 25, 2021	
0.0.2.6	May 24, 2021	
0.0.2.5	Apr 30, 2021	
0.0.2.4	Apr 21, 2021	
0.0.2.3	Apr 21, 2021	
0.0.2.2	Apr 20, 2021	
0.0.2.1	Apr 20, 2021	
0.0.2.0	Apr 18, 2021	
0.0.1.9	Mar 26, 2021	
0.0.1.9.dev1	Apr 6, 2021	pre-release
0.0.1.9.dev0	Apr 6, 2021	pre-release
0.0.1.8	Mar 24, 2021	
0.0.1.7	Mar 23, 2021	
0.0.1.6	Mar 1, 2021	
0.0.1.5	Feb 26, 2021	
0.0.1.5.dev1	Feb 26, 2021	pre-release
0.0.1.5.dev0	Feb 26, 2021	pre-release
0.0.1.4	Feb 24, 2021	
0.0.1.3	Feb 24, 2021	
0.0.1.2	Feb 23, 2021	
0.0.1.1	Feb 13, 2021	
0.0.1	Feb 12, 2021	
0.0.1.dev13	Feb 12, 2021	pre-release
0.0.1.dev12	Feb 12, 2021	pre-release
0.0.1.dev11	Feb 12, 2021	pre-release
0.0.1.dev10	Feb 12, 2021	pre-release
0.0.1.dev9	Feb 12, 2021	pre-release
0.0.1.dev8	Feb 12, 2021	pre-release
0.0.1.dev7	Feb 12, 2021	pre-release
0.0.1.dev6	Feb 10, 2021	pre-release
0.0.1.dev5	Feb 9, 2021	pre-release
0.0.1.dev4	Feb 5, 2021	pre-release
0.0.1.dev3	Feb 5, 2021	pre-release, yanked (reason: typo)
0.0.1.dev2	Feb 5, 2021	pre-release, yanked (reason: missing dependencies)
0.0.1.dev1	Feb 5, 2021	pre-release, yanked (reason: missing dependencies)
0.0.1.dev0	Feb 5, 2021	pre-release, yanked (reason: missing dependencies)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

histoprep-1.0.4.tar.gz (45.5 kB)

Uploaded Jun 3, 2022 Source

Built Distribution

histoprep-1.0.4-py3-none-any.whl (49.5 kB)

Uploaded Jun 3, 2022 Python 3

Hashes for histoprep-1.0.4.tar.gz
Algorithm	Hash digest
SHA256	`426cf0869bf3dcc86bf189f19a17946c4bcbc505be82b50aa35a13094f92a8b1`
MD5	`2f6ecb0997006dd336cbeb48c84f08cf`
BLAKE2b-256	`17e6a6f98fd615af04cda373ef6f824f00cc87b5ea178978c60cb0bf4ca54f3d`

Hashes for histoprep-1.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b797feaeac5a45968b744ce4bc9a6a0a78c88a33ce86c5982bf327cf4218bccb`
MD5	`69443a607f192d8778b42b3b9e7cce30`
BLAKE2b-256	`5244d05f608e1494540afabefc384e1832650562cbfedcd5abbc38d785b26061`
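
To check a downloaded file against the SHA256 digests listed above, something like the following standard-library sketch should work. `sha256_of` and `matches` are hypothetical helper names, and the commented-out call assumes the archive was saved in the current directory under its original filename.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Digest of the source distribution, copied from the table above.
EXPECTED_SDIST_SHA256 = "426cf0869bf3dcc86bf189f19a17946c4bcbc505be82b50aa35a13094f92a8b1"

# Example call (assumes the file exists locally):
# matches(Path("histoprep-1.0.4.tar.gz"), EXPECTED_SDIST_SHA256)
```

Alternatively, pip can enforce these digests itself through its hash-checking mode (`--require-hashes` with a requirements file).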