Optimized slide tiling library for histopathology

Histopathology Slide Pre-processing Pipeline

HS2P is an open-source project largely based on the tissue segmentation and patching code from CLAM.

🛠️ Installation

System requirements: Linux-based OS (e.g., Ubuntu 22.04) with Python 3.11+ and Docker installed.

We recommend running the script inside a container using the latest hs2p image from Docker Hub:

docker pull waticlems/hs2p:latest
docker run --rm -it \
    -v /path/to/your/data:/data \
    waticlems/hs2p:latest

Replace /path/to/your/data with your local data directory.

Alternatively, you can install hs2p via pip:

pip install hs2p

Slide tiling

  1. Create a .csv file containing paths to the desired slides. Optionally, provide paths to pre-computed tissue masks in the 'mask_path' column:

    wsi_path,mask_path
    /path/to/slide1.tif,/path/to/mask1.tif
    /path/to/slide2.tif,/path/to/mask2.tif
    ...
    
  2. Create a configuration file

    A good starting point is to look at the default configuration file under hs2p/configs/default.yaml where parameters are documented.

  3. Kick off slide tiling

    python3 -m hs2p.tiling --config-file </path/to/config.yaml>
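
For large cohorts, the slide list from step 1 can be generated rather than written by hand. The following is a minimal sketch using only the Python standard library. The helper name `write_slide_csv`, the `*.tif` glob pattern, and the convention that a mask shares its slide's file name are assumptions made for illustration, not part of hs2p.

```python
import csv
from pathlib import Path


def write_slide_csv(slide_dir, csv_path, mask_dir=None, pattern="*.tif"):
    """Write a wsi_path[,mask_path] CSV for every slide matching `pattern`.

    Returns the number of slide rows written.
    """
    slide_dir = Path(slide_dir)
    header = ["wsi_path"] + (["mask_path"] if mask_dir is not None else [])
    rows = []
    for slide in sorted(slide_dir.glob(pattern)):
        row = [str(slide)]
        if mask_dir is not None:
            # Assumption: the mask has the same file name as the slide.
            mask = Path(mask_dir) / slide.name
            if not mask.exists():
                # Skip slides without a mask instead of writing a broken path.
                continue
            row.append(str(mask))
        rows.append(row)
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

Calling `write_slide_csv("/path/to/slides", "slides.csv")` produces a file with only the `wsi_path` column; passing `mask_dir` adds the `mask_path` column and keeps only slides whose mask exists.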
    

Tile sampling

  1. Create a .csv file containing paths to the desired slides and their associated annotation masks:

    wsi_path,mask_path
    /path/to/slide1.tif,/path/to/mask1.tif
    /path/to/slide2.tif,/path/to/mask2.tif
    ...
    
  2. Create a configuration file

    A good starting point is to look at the default configuration file under hs2p/configs/default.yaml where parameters are documented.

  3. Kick off tile sampling

    python3 -m hs2p.sampling --config-file </path/to/config.yaml>
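
When tiling or sampling is one stage of a larger Python workflow, the commands shown in this README can be launched with `subprocess`. The sketch below only wraps the documented `python3 -m hs2p.tiling` and `python3 -m hs2p.sampling` invocations; the helper names `build_command` and `run_step` are hypothetical. It uses `sys.executable` so the command runs in the same environment where hs2p is installed.

```python
import subprocess
import sys


def build_command(module, config_path):
    """Return the argv list for `python -m <module> --config-file <config_path>`."""
    if module not in ("hs2p.tiling", "hs2p.sampling"):
        raise ValueError(f"unexpected module: {module}")
    return [sys.executable, "-m", module, "--config-file", str(config_path)]


def run_step(module, config_path):
    """Run one step; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_command(module, config_path), check=True)
```

For example, `run_step("hs2p.sampling", "/path/to/config.yaml")` is equivalent to the command in step 3.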
    

Output structure

Both tiling.py and sampling.py produce a similar output structure in the specified output directory.

Coordinates

The coordinates/ folder contains a .npy file for each successfully processed slide.
This file stores a numpy array of shape (num_tiles, 7) containing the following information for each tile:

  1. x: x-coordinate of the tile at level 0
  2. y: y-coordinate of the tile at level 0
  3. tile_level: pyramid level at which the tile was extracted
  4. tile_size_resized: size of the tile at the extraction level, which may differ from the requested tile size if the target spacing was not available
  5. resize_factor: ratio between tile_size_resized and the requested tile size, useful for resizing when loading the tile
  6. tile_size_lv0: tile size scaled to the slide's level 0
  7. target_spacing: spacing at which the user requested the tile (in microns per pixel)
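
To make the seven columns easier to work with, each row can be unpacked into a named record. This is a minimal sketch, assuming the file holds a plain 2-D numeric array in the column order listed above. The names `TileRecord`, `parse_tile_row`, and `load_tiles` are illustrative and not part of hs2p, and the integer and float casts are assumptions about which columns hold whole numbers.

```python
from typing import NamedTuple


class TileRecord(NamedTuple):
    x: int                  # x-coordinate at level 0
    y: int                  # y-coordinate at level 0
    tile_level: int         # pyramid level the tile was extracted from
    tile_size_resized: int  # tile size at the extraction level
    resize_factor: float    # tile_size_resized relative to the requested size
    tile_size_lv0: int      # tile size scaled to level 0
    target_spacing: float   # requested spacing, in microns per pixel


def parse_tile_row(row):
    """Convert one 7-value row into a TileRecord."""
    if len(row) != 7:
        raise ValueError(f"expected 7 values per tile, got {len(row)}")
    x, y, level, size_resized, factor, size_lv0, spacing = row
    return TileRecord(
        int(x), int(y), int(level), int(size_resized),
        float(factor), int(size_lv0), float(spacing),
    )


def load_tiles(npy_path):
    """Load a coordinates .npy file and return a list of TileRecord."""
    import numpy as np  # imported here so parse_tile_row works without numpy

    coords = np.load(npy_path)
    if coords.ndim != 2 or coords.shape[1] != 7:
        raise ValueError(f"expected shape (num_tiles, 7), got {coords.shape}")
    return [parse_tile_row(row) for row in coords]
```

Because `x` and `y` are expressed at level 0, a record can be passed to a reader that expects level-0 locations, such as OpenSlide's `read_region((x, y), tile_level, (tile_size_resized, tile_size_resized))`.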

Visualization (optional)

If visualize is set to true, a visualization/ folder is created containing low-resolution images to verify the results:

  • mask/: visualizations of the provided tissue (or annotation) mask
  • tiling/ (for tiling.py) or sampling/ (for sampling.py): visualizations of the extracted or sampled tiles overlaid on the slide. For sampling.py, this includes subfolders for each category defined in the sampling parameters (e.g., tumor, stroma, etc.)

These visualizations are useful for double-checking that the tiling or sampling process ran as expected.

Process summary

  • process_list.csv: a summary file listing each processed slide, indicating whether processing was successful or failed. If a failure occurred, the traceback is provided to help diagnose the issue.
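
After a large run, failures can be pulled out of process_list.csv programmatically. The column names are not documented in this README, so the sketch below takes them as arguments; check the header of your own file before use. The helper name `failed_slides` is hypothetical.

```python
import csv


def failed_slides(csv_path, slide_col, status_col, failed_value, traceback_col):
    """Return (slide, traceback) pairs for rows whose status equals `failed_value`.

    All column names are supplied by the caller because they are assumptions
    here, not documented hs2p names.
    """
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        missing = {slide_col, status_col, traceback_col} - set(reader.fieldnames or [])
        if missing:
            raise KeyError(f"columns not found in {csv_path}: {sorted(missing)}")
        return [
            (row[slide_col], row[traceback_col])
            for row in reader
            if row[status_col] == failed_value
        ]
```

The returned slide paths can be written to a new .csv to re-run only the slides that failed.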
