Optimized slide tiling library for histopathology

hs2p

Requires Python 3.10 or later.

hs2p is a Python package for fast, scalable whole-slide tiling and annotation-aware sampling. You can request tiles at any spacing, whether or not that spacing is natively present in the image pyramid. It is designed for computational pathology workflows that need reproducible coordinates, explicit artifacts, and backend-independent physical semantics.
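To illustrate what requesting a non-native spacing can involve, the sketch below picks a pyramid level to read from and the resize factor needed to reach the target spacing. It is not hs2p's implementation: the helper name `pick_level`, the selection rule (use the coarsest level whose spacing is at or below the target, within a relative tolerance), and the example spacings are assumptions made for illustration. Levels are assumed to be listed from finest to coarsest.

```python
from typing import Sequence


def pick_level(
    level_spacings_um: Sequence[float],
    target_spacing_um: float,
    tolerance: float = 0.07,
) -> tuple[int, float]:
    """Return (level, resize_factor) for a requested spacing.

    Hypothetical helper, not part of hs2p. ``resize_factor`` is the ratio
    target/native: a tile of N px at the target spacing covers
    N * resize_factor px at the chosen level.
    """
    best_level = 0
    for level, spacing in enumerate(level_spacings_um):
        # Accept levels at or finer than the target, allowing a small
        # relative overshoot so that e.g. 0.503 um counts as 0.5 um.
        if spacing <= target_spacing_um * (1.0 + tolerance):
            best_level = level
    native = level_spacings_um[best_level]
    if abs(native - target_spacing_um) <= tolerance * target_spacing_um:
        return best_level, 1.0  # close enough: read tiles as-is
    return best_level, target_spacing_um / native


# Target present in the pyramid: read level 1 directly.
print(pick_level([0.25, 0.5, 1.0, 2.0], 0.5))
# Target absent: read the finer level 0 and resample by the returned factor.
print(pick_level([0.25, 1.0, 4.0], 0.5))
```

Reading from a finer level and downsampling, rather than upsampling a coarser one, is the design choice assumed here because it avoids inventing detail that the slide does not contain.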

We support two main workflows:

  • a Python API for library-style integration
  • a CLI for batch preprocessing

Demo

Try hs2p interactively with hs2p-demo on HuggingFace Spaces.
You can adjust tiling parameters, inspect the resulting grid and mask previews, and upload your own pyramidal WSI (up to 1 GB).

Installation

Base install:

pip install hs2p

Optional backend extras:

pip install "hs2p[openslide]"
pip install "hs2p[asap]"
pip install "hs2p[vips]"
pip install "hs2p[cucim]"
pip install "hs2p[all]"

The supported backend set is:

  • auto
  • cucim
  • vips
  • openslide
  • asap

With auto, backends are tried in order of preference: cucim, vips, openslide, asap.
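A preference order like this can be resolved by probing which backend modules are importable. The following is a minimal sketch under stated assumptions, not hs2p's own resolution logic: `resolve_backend` is a hypothetical helper, and the module name mapped to each backend is an assumption.

```python
from importlib.util import find_spec
from typing import Callable

# Preference order stated above for backend="auto".
AUTO_ORDER = ("cucim", "vips", "openslide", "asap")

# Assumed importable module behind each backend name (not taken from hs2p).
BACKEND_MODULES = {
    "cucim": "cucim",
    "vips": "pyvips",
    "openslide": "openslide",
    "asap": "multiresolutionimageinterface",
}


def _is_installed(module: str) -> bool:
    """True if the top-level module can be found without importing it."""
    return find_spec(module) is not None


def resolve_backend(
    requested: str = "auto",
    is_installed: Callable[[str], bool] = _is_installed,
) -> str:
    """Return a concrete backend name for ``requested``."""
    if requested != "auto":
        if requested not in BACKEND_MODULES:
            raise ValueError(f"unknown backend: {requested!r}")
        return requested
    # Walk the preference order and stop at the first installed backend.
    for name in AUTO_ORDER:
        if is_installed(BACKEND_MODULES[name]):
            return name
    raise RuntimeError("no slide-reading backend is installed")
```

Passing the availability check as a parameter keeps the function easy to test without installing any backend.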

Workflows

Tiling

Tiling computes a reproducible grid of tile coordinates for each slide and saves them as explicit named artifacts. When a precomputed tissue mask is not provided, hs2p segments tissue on the fly. If you want to create those masks ahead of time, a standalone script is available.
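The idea of a tile grid filtered by tissue content can be sketched with NumPy. This is an illustration only, not the hs2p algorithm: the function names are hypothetical, the mask is assumed to be a boolean array at the same resolution as the coordinates, partial tiles at the border are dropped, and a tile is kept when its tissue fraction is greater than or equal to the threshold.

```python
import numpy as np


def tile_grid(width, height, tile_size, overlap=0.0):
    """Top-left (x, y) of every tile that fits fully inside the region."""
    stride = max(1, int(round(tile_size * (1.0 - overlap))))
    xs = np.arange(0, width - tile_size + 1, stride)
    ys = np.arange(0, height - tile_size + 1, stride)
    # indexing="ij" makes x the slow axis, so rows are ordered by x, then y.
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    return np.stack([gx.ravel(), gy.ravel()], axis=1)


def keep_tissue_tiles(coords, mask, tile_size, tissue_threshold=0.1):
    """Keep tiles whose fraction of True mask pixels meets the threshold."""
    fractions = np.array(
        [mask[y:y + tile_size, x:x + tile_size].mean() for x, y in coords]
    )
    keep = fractions >= tissue_threshold
    return coords[keep], fractions[keep]


# Toy 4x4 mask with tissue only in the top-left 2x2 corner.
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
coords = tile_grid(width=4, height=4, tile_size=2)
kept, fractions = keep_tissue_tiles(coords, mask, tile_size=2)
print(kept.tolist(), fractions.tolist())
```

Because the grid depends only on the slide dimensions, tile size, and overlap, rerunning it with the same inputs yields the same coordinates.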

Sampling

Sampling filters or partitions tile coordinates by annotation coverage so you can keep only tiles relevant to a label or tissue class.
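As a rough sketch of coverage-based sampling, the snippet below measures how much of each tile is covered by each annotation label and partitions the coordinates accordingly. The function names, the label-to-id mapping, and the `min_coverage` parameter are hypothetical and chosen for illustration; hs2p's own sampling options may differ.

```python
import numpy as np


def coverage_by_label(coords, annotation, tile_size, labels):
    """Per-tile coverage fraction for each label.

    ``annotation`` is a 2-D integer array of label ids and ``labels`` maps
    a label name to its id. Returns {name: array of per-tile fractions}.
    """
    out = {}
    for name, value in labels.items():
        out[name] = np.array(
            [
                (annotation[y:y + tile_size, x:x + tile_size] == value).mean()
                for x, y in coords
            ]
        )
    return out


def partition_tiles(coords, coverage, min_coverage):
    """One coordinate array per label, keeping tiles at or above the threshold."""
    return {
        name: coords[fractions >= min_coverage]
        for name, fractions in coverage.items()
    }


# Toy 4x4 annotation: left half background (0), right half tumor (1).
annotation = np.zeros((4, 4), dtype=int)
annotation[:, 2:] = 1
tile_coords = np.array([[0, 0], [0, 2], [2, 0], [2, 2]])
coverage = coverage_by_label(
    tile_coords, annotation, tile_size=2, labels={"background": 0, "tumor": 1}
)
parts = partition_tiles(tile_coords, coverage, min_coverage=0.5)
print({name: xy.tolist() for name, xy in parts.items()})
```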

Python API

Minimal tiling example:

from pathlib import Path

from hs2p import (
    SlideSpec,
    TilingConfig,
    tile_slide,
    save_tiling_result,
    write_tiling_preview,
)

result = tile_slide(
    SlideSpec(
        sample_id="slide-1",
        image_path=Path("/data/wsi/slide-1.tif"),
        mask_path=Path("/data/mask/slide-1-tissue-mask.tif"), # optional
    ),
    tiling=TilingConfig(
        backend="openslide",
        target_spacing_um=0.5,
        target_tile_size_px=224,
        tolerance=0.07,
        overlap=0.0,
        tissue_threshold=0.1,
    ),
)

# save tiling results to disk
artifacts = save_tiling_result(result, output_dir=Path("output"))

print(artifacts.coordinates_npz_path)   # output/tiles/slide-1.coordinates.npz
print(artifacts.coordinates_meta_path)  # output/tiles/slide-1.coordinates.meta.json

# preview tile grid
tiling_preview_path = write_tiling_preview(
    result=result,
    output_dir=Path("output"),
    downsample=32,
)
print(tiling_preview_path)  # output/preview/tiling/slide-1.jpg

result is a canonical hs2p.preprocessing.TilingResult. Downstream code should use its structured fields such as:

  • coordinates
  • tissue_fractions
  • tile_index
  • requested_*
  • effective_*
  • min_tissue_fraction

More API details: docs/api.md

CLI

The CLI is intended for fast batch processing of multiple slides with the same config.
Both entrypoints read the same public mask_path column, and the command determines whether that path is treated as a tissue mask or an annotation mask:

Tiling CSV (mask_path is optional and refers to a tissue mask):

sample_id,image_path,mask_path
slide-1,/data/wsi/slide-1.tif,/data/mask/slide-1-tissue-mask.tif
slide-2,/data/wsi/slide-2.tif,
...

Sampling CSV (mask_path is required and refers to an annotation mask):

sample_id,image_path,mask_path
slide-1,/data/wsi/slide-1.tif,/data/mask/slide-1-annotations.tif
slide-2,/data/wsi/slide-2.tif,/data/mask/slide-2-annotations.tif
...
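Either CSV can be generated with the standard library. The helper below is a hypothetical convenience, not part of hs2p; it writes the three columns shown above and, when `require_mask` is set, rejects rows without a mask, mirroring the sampling requirement.

```python
import csv

FIELDS = ("sample_id", "image_path", "mask_path")


def write_slide_csv(rows, csv_path, require_mask=False):
    """Write (sample_id, image_path, mask_path or None) rows to ``csv_path``.

    Set ``require_mask=True`` for a sampling CSV, where every slide needs
    an annotation mask. A missing mask is written as an empty field.
    """
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(FIELDS)
        for sample_id, image_path, mask_path in rows:
            if require_mask and not mask_path:
                raise ValueError(f"{sample_id}: a mask_path is required")
            writer.writerow([sample_id, image_path, mask_path or ""])
```

For example, `write_slide_csv([("slide-2", "/data/wsi/slide-2.tif", None)], "tiling.csv")` produces a row with an empty mask_path, which is valid for tiling but would raise with `require_mask=True`.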

Run tiling:

python -m hs2p.cli.tiling --config-file /path/to/config.yaml

Run sampling:

python -m hs2p.cli.sampling --config-file /path/to/config.yaml

For a first run, start from hs2p/configs/default.yaml and edit only the essentials:

  • csv
  • output_dir
  • tiling.backend
  • tiling.params.target_spacing_um
  • tiling.params.target_tile_size_px
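The dotted names above suggest a nested config layout. The sketch below shows how such keys map onto nested dictionaries; it assumes the YAML file mirrors this nesting, and the values (including the CSV path) are placeholders borrowed from the Python example rather than defaults from hs2p/configs/default.yaml.

```python
def set_dotted(config: dict, dotted_key: str, value) -> None:
    """Set ``config["a"]["b"]["c"] = value`` for dotted_key "a.b.c"."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        # Create intermediate sections on demand.
        node = node.setdefault(key, {})
    node[leaf] = value


# Placeholder values for illustration only.
essentials = {
    "csv": "/path/to/slides.csv",
    "output_dir": "output",
    "tiling.backend": "openslide",
    "tiling.params.target_spacing_um": 0.5,
    "tiling.params.target_tile_size_px": 224,
}

config = {}
for dotted_key, value in essentials.items():
    set_dotted(config, dotted_key, value)

print(config)
```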

More CLI details: docs/cli.md

Outputs

hs2p writes explicit named artifacts rather than anonymous coordinate dumps.

  • Tiling writes tiles/{sample_id}.coordinates.npz and tiles/{sample_id}.coordinates.meta.json
  • Sampling writes the same pair under tiles/{annotation}/
  • Batch runs also write process_list.csv
  • Saved coordinate arrays use a deterministic order: numeric x first, then numeric y within each shared x
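The ordering rule in the last bullet (x first, then y within each shared x, compared as numbers rather than strings) can be reproduced with `np.lexsort`. This is a sketch of the rule as stated, assuming an (N, 2) array of (x, y) pairs; `sort_coordinates` is a hypothetical name, and the layout of the saved arrays is not taken from hs2p.

```python
import numpy as np


def sort_coordinates(coords):
    """Sort (x, y) rows by x, then by y within each shared x."""
    coords = np.asarray(coords)
    # np.lexsort uses the LAST key as the primary key: x first, y breaks ties.
    order = np.lexsort((coords[:, 1], coords[:, 0]))
    return coords[order]


unsorted = [[1000, 0], [224, 224], [0, 448], [224, 0], [0, 0]]
print(sort_coordinates(unsorted).tolist())
```

Numeric comparison matters here: a string sort would place x=1000 before x=224.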

Artifact field reference: docs/artifacts.md

Docker

If you prefer running hs2p in a container, a published Docker image is available:

docker pull waticlems/hs2p:latest
docker run --rm -it -v /path/to/your/data:/data waticlems/hs2p:latest
