Embedding of whole slide images with Foundation Models

slide2vec

slide2vec is a Python package for efficient encoding of whole-slide images using publicly available foundation models. It builds on hs2p for fast preprocessing and exposes a small API centered on Model, Pipeline, and ExecutionOptions.

Installation

pip install slide2vec

Python API

from slide2vec import Model, PreprocessingConfig

model = Model.from_pretrained("virchow2", level="region")
preprocessing = PreprocessingConfig(
    target_spacing_um=0.5,
    target_tile_size_px=224,
    tissue_threshold=0.1,
)
embedded = model.embed_slide(
    "/path/to/slide.svs",
    preprocessing=preprocessing,
)

tile_embeddings = embedded.tile_embeddings
coordinates = embedded.coordinates

By default, ExecutionOptions() uses all available GPUs. Set ExecutionOptions(num_gpus=4) when you want to cap the sharding explicitly.

Use Pipeline(...) for manifest-driven batch processing when you want artifacts written to disk instead of only in-memory outputs:

from slide2vec import ExecutionOptions, Pipeline

# Reuses `model` and `preprocessing` from the previous example.
pipeline = Pipeline(
    model=model,
    preprocessing=preprocessing,
    execution=ExecutionOptions(output_dir="outputs/demo"),
)
result = pipeline.run(manifest_path="/path/to/slides.csv")

Input Manifest

Manifest-driven runs use the schema below. mask_path and spacing_at_level_0 are optional.

sample_id,image_path,mask_path,spacing_at_level_0
slide-1,/path/to/slide-1.svs,/path/to/mask-1.png,0.25
slide-2,/path/to/slide-2.svs,,
...

Use spacing_at_level_0 to override the level-0 spacing when the value stored in the slide file is missing or incorrect.
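A manifest in this schema can be generated with Python's standard csv module. The sketch below is illustrative only: write_manifest and MANIFEST_COLUMNS are hypothetical names, not part of slide2vec, and treating sample_id and image_path as required is an inference from the statement that the other two columns are optional.

```python
import csv
import io

# Column order taken from the manifest schema shown above.
MANIFEST_COLUMNS = ["sample_id", "image_path", "mask_path", "spacing_at_level_0"]


def write_manifest(rows, stream):
    """Write manifest rows to a text stream (hypothetical helper, not a slide2vec API)."""
    writer = csv.DictWriter(stream, fieldnames=MANIFEST_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Assumption: only mask_path and spacing_at_level_0 may be omitted.
        if not row.get("sample_id") or not row.get("image_path"):
            raise ValueError("sample_id and image_path are required")
        # Missing or None optional fields become empty cells.
        writer.writerow({col: row.get(col) or "" for col in MANIFEST_COLUMNS})


rows = [
    {
        "sample_id": "slide-1",
        "image_path": "/path/to/slide-1.svs",
        "mask_path": "/path/to/mask-1.png",
        "spacing_at_level_0": 0.25,
    },
    # No mask and no spacing override for this slide.
    {"sample_id": "slide-2", "image_path": "/path/to/slide-2.svs"},
]

buffer = io.StringIO()
write_manifest(rows, buffer)
manifest_text = buffer.getvalue()
print(manifest_text)
```

To write to disk instead, pass a file opened with open(path, "w", newline="") in place of the StringIO buffer.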

Outputs

The package writes explicit artifact directories:

  • tile_embeddings/<sample_id>.pt or .npz
  • tile_embeddings/<sample_id>.meta.json
  • slide_embeddings/<sample_id>.pt or .npz
  • slide_embeddings/<sample_id>.meta.json
  • optional slide_latents/<sample_id>.pt or .npz

.pt remains the default format. .npz is available through ExecutionOptions(output_format="npz").
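For downstream scripts it can help to compute where a sample's artifacts are expected to land. The sketch below builds those paths with pathlib. It assumes the artifact directories sit directly under the configured output_dir, which the text above does not state explicitly, and expected_artifacts is a hypothetical helper rather than a slide2vec function.

```python
from pathlib import Path


def expected_artifacts(output_dir, sample_id, output_format="pt"):
    """Return the artifact paths listed above for one sample (hypothetical helper)."""
    if output_format not in ("pt", "npz"):
        raise ValueError("output_format must be 'pt' or 'npz'")
    # Assumption: artifact directories are direct children of output_dir.
    root = Path(output_dir)
    return {
        "tile_embeddings": root / "tile_embeddings" / f"{sample_id}.{output_format}",
        "tile_metadata": root / "tile_embeddings" / f"{sample_id}.meta.json",
        "slide_embeddings": root / "slide_embeddings" / f"{sample_id}.{output_format}",
        "slide_metadata": root / "slide_embeddings" / f"{sample_id}.meta.json",
        # Optional artifact; it may not exist for every run.
        "slide_latents": root / "slide_latents" / f"{sample_id}.{output_format}",
    }


paths = expected_artifacts("outputs/demo", "slide-1", output_format="npz")
for name, path in paths.items():
    print(f"{name}: {path.as_posix()}")
```

Checking path.exists() on each entry after a run is a simple way to see which artifacts were actually produced.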

Supported Models

slide2vec currently ships preset configs for 10 tile-level models and 3 slide-level models.
For the full catalog and preset names, see docs/models.md.

CLI

The CLI is a thin wrapper over the package API.
Bundled configs live under slide2vec/configs/preprocessing/ and slide2vec/configs/models/.

python -m slide2vec --config-file /path/to/config.yaml

By default, manifest-driven CLI runs use all available GPUs. Set speed.num_gpus=4 when you want to cap the sharding explicitly.

New to the CLI or doing batch runs to disk? Start with docs/cli.md for the config-driven workflow, overrides, and common run patterns.

Docker

Docker remains available when you prefer a containerized runtime:

docker pull waticlems/slide2vec:latest
docker run --rm -it \
    -v /path/to/your/data:/data \
    -e HF_TOKEN=<your-huggingface-api-token> \
    waticlems/slide2vec:latest

Documentation

Download files

Source Distribution

slide2vec-3.0.0.tar.gz (75.7 kB)

Uploaded Source

Built Distribution

slide2vec-3.0.0-py3-none-any.whl (82.2 kB)

Uploaded Python 3

File details

Details for the file slide2vec-3.0.0.tar.gz.

File metadata

  • Download URL: slide2vec-3.0.0.tar.gz
  • Upload date:
  • Size: 75.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for slide2vec-3.0.0.tar.gz
Algorithm Hash digest
SHA256 daf894e6dbd6895a8c0e51374e920674e01a5c6d28f002e76a794feae3f21914
MD5 ef978f1330f2649fe7fc59c6b5a9625b
BLAKE2b-256 5521c5327456f3039621a1127249ca19bc2f027ee509fa56bfe70d2a20ba847f
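
A downloaded file can be checked against the SHA256 digest listed above using Python's standard hashlib module. This is a minimal sketch: sha256_of_file and matches_published_digest are hypothetical helper names, not part of slide2vec or pip.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

Calling matches_published_digest with the path to the downloaded slide2vec-3.0.0.tar.gz and the SHA256 value from the table should return True for an intact download.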

File details

Details for the file slide2vec-3.0.0-py3-none-any.whl.

File metadata

  • Download URL: slide2vec-3.0.0-py3-none-any.whl
  • Upload date:
  • Size: 82.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for slide2vec-3.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f2255341bd3a71409e804110a360118b979ea3573f9a55c841956a6b68df5ace
MD5 b306a3c40a4eec5ed1ece47e6020873a
BLAKE2b-256 9fe7122f7ece483db06410a98c4dc37f289d303d0921639755a6bd9e68d5f309
