Optimized slide tiling library for histopathology
hs2p
hs2p is a Python package for efficient slide tiling and tile sampling at any requested spacing, whether or not that spacing is natively present in the whole-slide image. It is designed for computational pathology workflows that need reproducible coordinates.
We support two main workflows:
- a Python API for library-style integration
- a CLI for batch preprocessing from a CSV and YAML config
Installation
pip install hs2p
If a mask is not provided, hs2p can segment tissue directly from the slide; if you want to precompute tissue masks, a standalone script is available.
Workflows
Tiling
Tiling computes a reproducible grid of tile coordinates for each slide and saves them as named artifacts with extraction metadata, ready for downstream use.
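As a rough illustration of what a reproducible tile grid means, the sketch below enumerates top-left tile corners over a fixed-size image. It is not hs2p's implementation: `grid_coordinates` and its parameters are hypothetical names, and it ignores spacing, tissue masks, and thresholds, all of which hs2p handles.

```python
from typing import List, Tuple


def grid_coordinates(
    width: int, height: int, tile_size: int, overlap: float = 0.0
) -> List[Tuple[int, int]]:
    """Return top-left (x, y) corners of tiles that fit fully inside the image.

    Hypothetical helper for illustration only; not part of the hs2p API.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # The stride shrinks as overlap grows; keep it at least one pixel.
    step = max(1, int(round(tile_size * (1.0 - overlap))))
    coords = []
    # x varies in the outer loop, so tiles are emitted by x, then by y.
    for x in range(0, width - tile_size + 1, step):
        for y in range(0, height - tile_size + 1, step):
            coords.append((x, y))
    return coords


coords = grid_coordinates(width=1000, height=500, tile_size=224, overlap=0.0)
print(len(coords), coords[:3])
```

Because the grid depends only on the image size, tile size, and overlap, the same inputs always yield the same coordinates in the same order.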
Sampling
Sampling filters or partitions tile coordinates by annotation coverage so you can keep only tiles relevant to a tissue class or label.
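The idea of filtering by annotation coverage can be sketched as follows. This is a simplified stand-in, not hs2p's code: the mask is a plain list of rows of integer labels, `label_coverage` and `filter_tiles` are hypothetical helpers, and tiles are assumed to lie fully inside the mask.

```python
from typing import List, Sequence, Tuple


def label_coverage(
    mask: Sequence[Sequence[int]], x: int, y: int, tile_size: int, label: int
) -> float:
    """Fraction of pixels in the tile at (x, y) that carry `label`."""
    rows = mask[y : y + tile_size]
    hits = sum(list(row[x : x + tile_size]).count(label) for row in rows)
    # Assumes the tile lies fully inside the mask.
    return hits / float(tile_size * tile_size)


def filter_tiles(
    mask: Sequence[Sequence[int]],
    coords: Sequence[Tuple[int, int]],
    tile_size: int,
    label: int,
    min_coverage: float,
) -> List[Tuple[int, int]]:
    """Keep only tiles whose coverage for `label` reaches `min_coverage`."""
    return [
        (x, y)
        for x, y in coords
        if label_coverage(mask, x, y, tile_size, label) >= min_coverage
    ]


# Toy 4x4 annotation mask: 0 = background, 1 = class of interest.
mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
coords = [(0, 0), (0, 2), (2, 0), (2, 2)]
kept = filter_tiles(mask, coords, tile_size=2, label=1, min_coverage=0.5)
print(kept)
```

Lowering `min_coverage` to 0.25 would also keep the tile at (0, 2), which contains a single labelled pixel out of four.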
Python API
hs2p supports pre-extracted tissue masks. If you don't have such tissue masks, you can either:
- use our standalone tissue segmentation script (Recommended)
- tune the SegmentationConfig parameters and let hs2p segment tissue on the fly
Minimal tiling example:
from pathlib import Path

from hs2p import (
    FilterConfig,
    SegmentationConfig,
    TilingConfig,
    WholeSlide,
    overlay_mask_on_slide,
    save_tiling_result,
    tile_slide,
    write_tiling_preview,
)

result = tile_slide(
    WholeSlide(
        sample_id="slide-1",
        image_path=Path("/data/wsi/slide-1.tif"),
        mask_path=Path("/data/mask/slide-1.tif"),
    ),
    tiling=TilingConfig(
        backend="openslide",
        target_spacing_um=0.5,
        target_tile_size_px=224,
        tolerance=0.07,
        overlap=0.0,
        tissue_threshold=0.1,
    ),
    segmentation=SegmentationConfig(downsample=64),
    filtering=FilterConfig(ref_tile_size=224, a_t=4, a_h=2),
    num_workers=1,
)

artifacts = save_tiling_result(result, output_dir=Path("output"))

tiling_preview_path = write_tiling_preview(
    result=result,
    output_dir=Path("output"),
    downsample=32,
)

mask_overlay = overlay_mask_on_slide(
    wsi_path=result.image_path,
    annotation_mask_path=Path("/data/mask/slide-1.tif"),
    downsample=32,
    backend=result.backend,
)
mask_overlay.save("output/visualization/mask/slide-1.jpg")

print(artifacts.tiles_npz_path)
print(artifacts.tiles_meta_path)
print(tiling_preview_path)
result is a TilingResult for one slide. It gives downstream pipelines the tile coordinates plus the metadata needed to relate those coordinates back to the slide pyramid and persist them as reusable named artifacts.
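To make the link between a requested spacing and the slide pyramid more concrete, here is one possible way to pick a pyramid level and a resize factor. It is a sketch under assumptions, not hs2p's algorithm: `choose_level` is a hypothetical helper, and `tolerance` is treated here as a relative difference between native and target spacing, which may not match how hs2p interprets it.

```python
from typing import List, Tuple


def choose_level(
    level_spacings: List[float], target_spacing: float, tolerance: float
) -> Tuple[int, float]:
    """Return (level index, resize factor) for a target spacing in um/px.

    A resize factor of 1.0 means tiles can be read from that level as is.
    Hypothetical helper; the tolerance semantics are an assumption.
    """
    # A native level within tolerance can be used without resampling.
    for level, spacing in enumerate(level_spacings):
        if abs(spacing - target_spacing) / target_spacing <= tolerance:
            return level, 1.0
    # Otherwise prefer the coarsest level that is still finer than the
    # target, so tiles are downsampled rather than upsampled.
    finer = [i for i, s in enumerate(level_spacings) if s < target_spacing]
    if finer:
        level = max(finer, key=lambda i: level_spacings[i])
    else:
        # No finer level exists: fall back to the finest one available.
        level = min(range(len(level_spacings)), key=lambda i: level_spacings[i])
    return level, target_spacing / level_spacings[level]


# The slide has no native level near 0.5 um/px, so read from 0.24 and resize.
level, factor = choose_level([0.24, 0.97, 3.9], target_spacing=0.5, tolerance=0.07)
read_size = int(round(224 * factor))  # pixels to read at that level per tile
print(level, round(factor, 3), read_size)
```

Storing the chosen level and resize factor next to the coordinates is what lets a downstream reader fetch the same pixels again later.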
More API details: docs/api.md
CLI
Both CLI entrypoints use the same input CSV schema:
sample_id,image_path,mask_path
slide-1,/data/wsi/slide-1.tif,/data/mask/slide-1.tif
slide-2,/data/wsi/slide-2.tif,
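The schema above can be read with the standard library alone. The sketch below shows one way to do that, treating an empty mask_path as "no mask provided"; `SlideRow` and `read_slide_rows` are hypothetical names and not part of hs2p.

```python
import csv
import io
from pathlib import Path
from typing import List, NamedTuple, Optional, TextIO


class SlideRow(NamedTuple):
    sample_id: str
    image_path: Path
    mask_path: Optional[Path]  # None when the CSV cell is empty


def read_slide_rows(handle: TextIO) -> List[SlideRow]:
    """Parse rows that follow the sample_id,image_path,mask_path schema."""
    rows = []
    for record in csv.DictReader(handle):
        mask = (record.get("mask_path") or "").strip()
        rows.append(
            SlideRow(
                sample_id=record["sample_id"].strip(),
                image_path=Path(record["image_path"].strip()),
                mask_path=Path(mask) if mask else None,
            )
        )
    return rows


example = io.StringIO(
    "sample_id,image_path,mask_path\n"
    "slide-1,/data/wsi/slide-1.tif,/data/mask/slide-1.tif\n"
    "slide-2,/data/wsi/slide-2.tif,\n"
)
rows = read_slide_rows(example)
for row in rows:
    print(row.sample_id, row.mask_path)
```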
For a first run, start from hs2p/configs/default.yaml and edit only the essentials:
- csv
- output_dir
- tiling.backend
- tiling.params.target_spacing_um
- tiling.params.target_tile_size_px
Run tiling:
python -m hs2p.tiling --config-file /path/to/config.yaml
Run sampling:
python -m hs2p.sampling --config-file /path/to/config.yaml
For sampling, add tiling.sampling_params.pixel_mapping and tiling.sampling_params.tissue_percentage for the annotations you want to keep.
More CLI details: docs/cli.md
Outputs
hs2p writes explicit named artifacts rather than anonymous coordinate dumps.
- Tiling writes coordinates/{sample_id}.tiles.npz and coordinates/{sample_id}.tiles.meta.json
- Sampling writes the same pair under coordinates/<annotation>/
- Batch runs also write process_list.csv
- Saved coordinate arrays use a deterministic column-major order: numeric x first, then numeric y within each shared x
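The column-major ordering described above can be reproduced with a plain sort. The snippet is only an illustration of the ordering rule; `column_major` is a hypothetical helper and does not read hs2p artifacts.

```python
from typing import Iterable, List, Tuple


def column_major(coords: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Sort (x, y) pairs by numeric x first, then by numeric y within each x."""
    return sorted(coords, key=lambda xy: (xy[0], xy[1]))


coords = [(224, 0), (0, 224), (1000, 0), (0, 0), (224, 224)]
ordered = column_major(coords)
print(ordered)

# Sorting x as text would instead place 1000 before 224, which is why the
# ordering is specified as numeric.
as_text = sorted(coords, key=lambda xy: (str(xy[0]), str(xy[1])))
print(as_text)
```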
Artifact field reference: docs/artifacts.md
Docker
If you prefer running hs2p in a container, a published Docker image is available:
docker pull waticlems/hs2p:latest
docker run --rm -it -v /path/to/your/data:/data waticlems/hs2p:latest
Documentation
Download files
File details
Details for the file hs2p-2.0.0.tar.gz.
File metadata
- Download URL: hs2p-2.0.0.tar.gz
- Upload date:
- Size: 62.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | eccc62fed5f10839f30c90e894872d3a0dc8ee4e65457c4c0b9edc38e4b4d76c |
| MD5 | b2da0a82c9b289b32ca1c07fcf40ceb0 |
| BLAKE2b-256 | 5692d6dcd0b9b465baacf6829bafbceb1decf424742eeaf27f0abaffe4c1b793 |
File details
Details for the file hs2p-2.0.0-py3-none-any.whl.
File metadata
- Download URL: hs2p-2.0.0-py3-none-any.whl
- Upload date:
- Size: 45.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c705e679b1fdff5abea875fb423043995eaea0e94421801a8e2021ee546199e1 |
| MD5 | faf1a60bb0e85f4f37180886ce3acf4f |
| BLAKE2b-256 | 34935ba5216ffd089a8f83ba103b6bb4a5fe67c3f17ff905f6d6f5ce40b616a0 |