Read and process histological slide images with Python!
Description
HistoSlice makes it easy to prepare your histological slide images for deep
learning models. You can cut large slide images into smaller tiles and then
preprocess those tiles (remove tiles with poor-quality tissue, finger marks, etc.).
[!NOTE] This project was forked from HistoPrep, and further modified for additional features and improvements.
Installation
uv add histoslice
# or
pip install histoslice
Usage
A typical workflow for training deep learning models with histological images is the following:
- Cut each slide image into smaller tile images.
- Preprocess the tile images by removing tiles with bad tissue or staining artifacts.
histoslice --input './train_images/*.tiff' --output ./tiles --width 512 --overlap 0.5 --max-background 0.5 --metrics --thumbnail
Or you can use the HistoSlice Python API to do the same thing!
from histoslice import SlideReader
# Read slide image.
reader = SlideReader("./slides/slide_with_ink.jpeg")
# Detect tissue.
threshold, tissue_mask = reader.get_tissue_mask(level=-1)
# Extract overlapping tile coordinates with less than 50% background.
tile_coordinates = reader.get_tile_coordinates(
    tissue_mask, width=512, overlap=0.5, max_background=0.5
)
# Save tile images with image metrics for preprocessing.
tile_metadata = reader.save_regions(
    "./train_tiles/",
    tile_coordinates,
    threshold=threshold,
    save_metrics=True,
    save_thumbnail=True
)
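To make the `width`, `overlap` and `max_background` arguments concrete, here is a minimal pure-Python sketch of the idea behind tile selection. It is not HistoSlice's implementation: the helper names (`tile_origins`, `background_fraction`, `select_tiles`) are hypothetical, the mask is a plain list of rows with 1 for tissue and 0 for background, partial tiles at the edges are simply dropped, and a tile is kept when its background fraction is at most `max_background`. The library may treat edges and the boundary case differently.

```python
from typing import List, Sequence, Tuple

Tile = Tuple[int, int, int, int]  # (x, y, width, height)


def tile_origins(length: int, width: int, overlap: float) -> List[int]:
    """Start offsets of full tiles along one axis (partial edge tiles are dropped)."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    if length < width:
        return []
    # With overlap=0.5 and width=512, consecutive tiles start 256 pixels apart.
    stride = max(1, round(width * (1 - overlap)))
    return list(range(0, length - width + 1, stride))


def background_fraction(mask: Sequence[Sequence[int]], tile: Tile) -> float:
    """Fraction of pixels inside the tile that are background (mask value 0)."""
    x, y, w, h = tile
    tissue = sum(sum(row[x:x + w]) for row in mask[y:y + h])
    return 1 - tissue / (w * h)


def select_tiles(
    mask: Sequence[Sequence[int]], width: int, overlap: float, max_background: float
) -> List[Tile]:
    """Keep square tiles whose background fraction is at most max_background."""
    mask_height = len(mask)
    mask_width = len(mask[0]) if mask else 0
    tiles = []
    for y in tile_origins(mask_height, width, overlap):
        for x in tile_origins(mask_width, width, overlap):
            tile = (x, y, width, width)
            if background_fraction(mask, tile) <= max_background:
                tiles.append(tile)
    return tiles


# Toy 4x8 mask: left half tissue, right half background.
toy_mask = [[1, 1, 1, 1, 0, 0, 0, 0] for _ in range(4)]
print(select_tiles(toy_mask, width=4, overlap=0.5, max_background=0.5))
# [(0, 0, 4, 4), (2, 0, 4, 4)]
```

In the toy example the third candidate tile, starting at x=4, is pure background and is therefore discarded.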
Let's take a look at the output and visualise the thumbnails.
train_tiles
└── slide_with_ink
    ├── metadata.parquet       # tile metadata
    ├── properties.json        # tile properties
    ├── thumbnail.jpeg         # thumbnail image
    ├── thumbnail_tiles.jpeg   # thumbnail with tiles
    ├── thumbnail_tissue.jpeg  # thumbnail of the tissue mask
    └── tiles  [390 entries exceeds filelimit, not opening dir]
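When many slides are processed, a quick sanity check of the output directory can be handy. The sketch below uses only the standard library and assumes nothing beyond the layout shown above (one sub-directory per slide containing `metadata.parquet` and a `tiles` directory). The function name `summarize_output` is hypothetical and not part of HistoSlice.

```python
from pathlib import Path
from typing import Dict


def summarize_output(root: str) -> Dict[str, dict]:
    """Count tile files and check for metadata.parquet in each slide directory."""
    summary = {}
    for slide_dir in sorted(Path(root).iterdir()):
        if not slide_dir.is_dir():
            continue
        tiles_dir = slide_dir / "tiles"
        # A missing tiles directory is reported as zero tiles rather than an error.
        num_tiles = (
            sum(1 for path in tiles_dir.iterdir() if path.is_file())
            if tiles_dir.is_dir()
            else 0
        )
        summary[slide_dir.name] = {
            "num_tiles": num_tiles,
            "has_metadata": (slide_dir / "metadata.parquet").is_file(),
        }
    return summary
```

Calling `summarize_output("./train_tiles")` returns one entry per slide, which makes slides with no saved tiles or missing metadata easy to spot.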
As the thumbnails show, histological slide images often contain areas that we would not want to include in our training data. Removing them might seem like a daunting task, but let's try it out!
from histoslice.utils import OutlierDetector
# Let's wrap the tile metadata with a helper class.
detector = OutlierDetector(tile_metadata)
# Cluster tiles based on image metrics.
clusters = detector.cluster_kmeans(num_clusters=4, random_state=666)
# Visualise first cluster.
reader.get_annotated_thumbnail(
    image=reader.read_level(-1), coordinates=detector.coordinates[clusters == 0]
)
Now we can mark tiles in cluster 0 as outliers!
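Once a cluster has been judged to contain artifacts, excluding its tiles comes down to filtering coordinates by cluster label. The sketch below shows that step on plain Python lists. `split_by_cluster` is a hypothetical helper, not a HistoSlice function, and `OutlierDetector` may well offer its own way of recording outliers.

```python
from typing import Iterable, List, Sequence, Tuple

Tile = Tuple[int, int, int, int]  # (x, y, width, height)


def split_by_cluster(
    coordinates: Sequence[Tile],
    clusters: Sequence[int],
    outlier_clusters: Iterable[int],
) -> Tuple[List[Tile], List[Tile]]:
    """Split tiles into (kept, dropped) according to their cluster label."""
    if len(coordinates) != len(clusters):
        raise ValueError("coordinates and clusters must have the same length")
    outlier_set = set(outlier_clusters)
    kept, dropped = [], []
    for tile, label in zip(coordinates, clusters):
        (dropped if label in outlier_set else kept).append(tile)
    return kept, dropped


# Toy example: three tiles, two of which fall into cluster 0.
tiles = [(0, 0, 512, 512), (256, 0, 512, 512), (512, 0, 512, 512)]
labels = [0, 1, 0]
kept, dropped = split_by_cluster(tiles, labels, outlier_clusters=[0])
print(kept)  # [(256, 0, 512, 512)]
```

It is worth inspecting the annotated thumbnail for every cluster before dropping one, since k-means labels carry no meaning on their own and cluster 0 is not guaranteed to be the artifact cluster.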