...
Project description
HistoSeg is a Python toolkit for spatial transcriptomics segmentation / geometry extraction.
The current focus is Pattern1 isoline (0.5) contour generation from cell clusters (e.g., 10x Xenium GraphClust output):
- Pick a set of “target clusters” (Pattern1)
- Fit a KNN regressor to estimate P(target) over space
- Smooth the probability field
- Extract a contour (isoline) at level = 0.5
- Save contour vertices and a quick preview plot
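The core idea behind these steps can be sketched in a few lines. The example below is an illustration only, not HistoSeg's implementation: it swaps in a brute-force NumPy nearest-neighbour average for the KNN regressor, uses made-up toy coordinates, and skips the smoothing and contour-tracing steps. The function name `knn_probability` is hypothetical.

```python
import numpy as np

def knn_probability(train_xy, train_label, query_xy, k):
    """Mean 0/1 label of the k nearest training points (brute-force KNN regression)."""
    # Pairwise squared distances, shape (n_query, n_train).
    d2 = ((query_xy[:, None, :] - train_xy[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k smallest distances per query point (order within k is irrelevant).
    nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]
    return train_label[nearest].mean(axis=1)

rng = np.random.default_rng(0)
# Toy data (assumption): target cells cluster around (0, 0), other cells around (4, 0).
target = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(200, 2))
other = rng.normal(loc=(4.0, 0.0), scale=0.5, size=(200, 2))
xy = np.vstack([target, other])
label = np.concatenate([np.ones(200), np.zeros(200)])

# Evaluate the estimated P(target) on a small mesh grid.
gx, gy = np.meshgrid(np.linspace(-2, 6, 41), np.linspace(-2, 2, 21))
grid = np.column_stack([gx.ravel(), gy.ravel()])
prob = knn_probability(xy, label, grid, k=15).reshape(gx.shape)

# The 0.5 isoline is the boundary of this region.
inside = prob >= 0.5
```

Because the labels are 0 or 1, the KNN average is simply the fraction of target cells among the k nearest neighbours, which is why a level of 0.5 marks the place where target and non-target cells are equally represented.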
Quick links
- Documentation: https://histoseg.readthedocs.io/en/latest/
- Source code: https://github.com/hutaobo/HistoSeg
- Issue tracker: https://github.com/hutaobo/HistoSeg/issues
⚠️ License note
This project is distributed under the PolyForm Noncommercial 1.0.0 license. Academic and other noncommercial use is permitted. Any commercial use requires a separate commercial license from SPATHO AB. See LICENSE for the full terms.
Installation
Install from PyPI (recommended)
pip install -U histoseg
Install from source (for development)
git clone https://github.com/hutaobo/HistoSeg.git
cd HistoSeg
pip install -U pip
pip install -e .
Dependencies
The Pattern1 isoline workflow uses:
- numpy, pandas
- scipy
- scikit-learn
- matplotlib
- a Parquet engine (pyarrow is recommended)
If you run into missing imports, install them explicitly:
pip install -U numpy pandas pyarrow scipy scikit-learn matplotlib
Optional:
- Hugging Face downloader:
pip install -U huggingface_hub
Tutorial: Pattern1 isoline (0.5)
What you need (inputs)
The isoline workflow expects the following files:
- clusters.csv
  - Typically from GraphClust: analysis/clustering/gene_expression_graphclust/clusters.csv
  - Must contain columns: Barcode, Cluster
- cells.parquet
  - A cell-level table with spatial coordinates (x/y-like columns)
  - Must contain at least:
    - coordinate columns (e.g. x/y or x_centroid/y_centroid)
    - an id column that can be aligned with clusters.csv:Barcode (the code tries several common column names)
- tissue_boundary.csv (optional but recommended if you enable synthetic background)
  - Must contain columns x, y or X, Y
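Before running the pipeline, it can help to check that the CSV inputs carry the required columns. The sketch below uses only the standard library; the helper names `missing_columns` and `read_header` are hypothetical and are not part of HistoSeg, and the inline CSV text stands in for real files.

```python
import csv
import io

def missing_columns(header, required_any):
    """Return the groups of alternatives for which no column is present.

    `required_any` is a list of tuples; each tuple lists the acceptable
    spellings for one required column, e.g. ("x", "X").
    """
    present = set(header)
    return [alts for alts in required_any if not present.intersection(alts)]

def read_header(text_stream):
    """Read only the first CSV row (the header)."""
    return next(csv.reader(text_stream))

# Stand-in content; with real data, pass open(path, newline="") instead.
clusters_header = read_header(io.StringIO("Barcode,Cluster\naaa-1,10\n"))
boundary_header = read_header(io.StringIO("X,Y\n0.0,1.0\n"))

print(missing_columns(clusters_header, [("Barcode",), ("Cluster",)]))  # → []
print(missing_columns(boundary_header, [("x", "X"), ("y", "Y")]))      # → []
print(missing_columns(["Barcode"], [("Barcode",), ("Cluster",)]))      # → [('Cluster',)]
```

An empty list means every required column was found under one of its accepted names.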
What you get (outputs)
By default, the pipeline writes into out_dir:
- params.json: all parameters + inferred join columns
- pattern1_isoline_<level>_<i>.npy: contour vertices (Nx2 arrays)
- pattern1_isoline_<level>.png: quick preview plot
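The saved contours are plain NumPy arrays, so they can be reloaded for downstream use. The sketch below is one possible way to do that, assuming only the `pattern1_isoline_*.npy` naming described above; the helpers `load_contours` and `polygon_area` are hypothetical, and the demo file name is a stand-in because the exact formatting of `<level>` is not specified here.

```python
import glob
import os
import tempfile

import numpy as np

def load_contours(out_dir):
    """Load every contour array saved as pattern1_isoline_*.npy in out_dir."""
    paths = sorted(glob.glob(os.path.join(out_dir, "pattern1_isoline_*.npy")))
    return [np.load(p) for p in paths]

def polygon_area(vertices):
    """Unsigned area of a closed polygon given as an (N, 2) array (shoelace formula)."""
    x, y = vertices[:, 0], vertices[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# Demo with a stand-in file; the file name below is illustrative only.
with tempfile.TemporaryDirectory() as out_dir:
    square = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]])
    np.save(os.path.join(out_dir, "pattern1_isoline_demo_0.npy"), square)
    contours = load_contours(out_dir)
    areas = [polygon_area(c) for c in contours]
```

The area is in squared coordinate units, so its physical meaning depends on the units of the coordinate columns in cells.parquet.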
Quickstart
One-liner (from a Hugging Face dataset repo)
This follows the example notebook in examples/contour_generation_pattern1_from_hf.ipynb.
# pip install -U histoseg
# pip install -U huggingface_hub pandas pyarrow numpy scipy scikit-learn matplotlib
from histoseg import run_pattern1_isoline_from_hf
PATTERN1 = (10, 23, 19, 27, 14, 20, 25, 26)
result = run_pattern1_isoline_from_hf(
    repo_id="hutaobo/output-XETG00082_C105",
    revision="main",  # or a commit hash for strict reproducibility
    out_dir="outputs/pattern1_isoline0p5_from_graphclust",
    pattern1_clusters=PATTERN1,
    # Defaults are intentionally exposed for tuning:
    grid_n=1200,
    knn_k=30,
    smooth_sigma=5.0,
    min_cells_inside=10,
)
print("Outputs folder:", result.out_dir)
print("Preview image:", result.preview_png)
print("Contours:", len(result.contours))
Run on local files
from histoseg import Pattern1IsolineConfig, run_pattern1_isoline
PATTERN1 = (10, 23, 19, 27, 14, 20, 25, 26)
cfg = Pattern1IsolineConfig(
    clusters_csv="/path/to/analysis/clustering/gene_expression_graphclust/clusters.csv",
    cells_parquet="/path/to/cells.parquet",
    tissue_boundary_csv="/path/to/tissue_boundary.csv",
    out_dir="outputs/pattern1_isoline0p5",
    pattern1_clusters=PATTERN1,
    # Optional tuning:
    grid_n=1200,
    knn_k=30,
    smooth_sigma=5.0,
    min_cells_inside=10,
)
result = run_pattern1_isoline(cfg)
print(result)
How it works (workflow overview)
flowchart TD
A["clusters.csv<br/>Barcode/Cluster"] --> C["Align barcodes<br/>with cells.parquet"]
B["cells.parquet<br/>x/y + id-like column"] --> C
C --> D["Select target clusters<br/>(Pattern1)"]
D --> E["Sample background points<br/>(other cells)"]
F["tissue_boundary.csv"] --> G["Generate synthetic background<br/>(optional)"]
G --> E
D --> H["KNN regression<br/>predict P(target)"]
E --> H
H --> I["Predict on mesh grid"]
I --> J["Gaussian smoothing"]
J --> K["Mask by tissue<br/>(nearest-cell threshold)"]
K --> L["Extract isoline<br/>level = 0.5"]
L --> M["Filter loops<br/>min_cells_inside"]
M --> N["Save params.json<br/>+ contours .npy<br/>+ preview .png"]
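The loop-filtering step near the end of the flowchart can be illustrated with a point-in-polygon count. This is a sketch under assumptions, not the library's code: it uses a plain even-odd ray-casting test, counts every supplied cell (whether HistoSeg counts all cells or only target cells is not stated here), and the names `point_in_polygon` and `filter_loops` are hypothetical.

```python
def point_in_polygon(px, py, polygon):
    """Even-odd ray-casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            # x-coordinate where the edge crosses that line.
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def filter_loops(loops, cells, min_cells_inside):
    """Keep only the loops that enclose at least `min_cells_inside` cells."""
    kept = []
    for loop in loops:
        count = sum(point_in_polygon(x, y, loop) for x, y in cells)
        if count >= min_cells_inside:
            kept.append(loop)
    return kept

big = [(0, 0), (10, 0), (10, 10), (0, 10)]
tiny = [(20, 20), (21, 20), (21, 21), (20, 21)]
cells = [(1, 1), (5, 5), (9, 2), (20.5, 20.5)]

print(len(filter_loops([big, tiny], cells, min_cells_inside=3)))  # → 1
print(len(filter_loops([big, tiny], cells, min_cells_inside=1)))  # → 2
```

The second call shows why lowering `min_cells_inside` can recover small contours that would otherwise be discarded.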
Troubleshooting & tuning
If no contour is found, try:
- Decrease min_cells_inside (e.g. 10 → 3)
- Increase smooth_sigma (e.g. 5 → 8)
- Increase knn_k (e.g. 30 → 50)
- Reduce grid_n to speed up (note: grid_n=1200 can be heavy)
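To get a feel for why a large `grid_n` is heavy, the arithmetic below estimates the number of grid points and the memory for one float64 field. It assumes the mesh is a square `grid_n` x `grid_n` grid, which is not confirmed here; `grid_cost` is a hypothetical helper.

```python
def grid_cost(grid_n, bytes_per_value=8):
    """Grid points and bytes for one float64 field on a grid_n x grid_n mesh."""
    points = grid_n * grid_n
    return points, points * bytes_per_value

for n in (1200, 600, 300):
    points, nbytes = grid_cost(n)
    print(f"grid_n={n}: {points:,} points, {nbytes / 1e6:.2f} MB per field")
# grid_n=1200: 1,440,000 points, 11.52 MB per field
# grid_n=600: 360,000 points, 2.88 MB per field
# grid_n=300: 90,000 points, 0.72 MB per field
```

Under that assumption, halving `grid_n` cuts the number of KNN predictions by a factor of four, at the price of a coarser contour.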
API reference (high-level)
Pattern1 isoline
- Pattern1IsolineConfig
  Dataclass holding all parameters and input paths.
- run_pattern1_isoline(cfg) -> Pattern1IsolineResult
  Runs the full pipeline on local files.
- run_pattern1_isoline_from_hf(repo_id, revision="main", ...) -> Pattern1IsolineResult
  Convenience wrapper that downloads required files from a Hugging Face dataset repo and then runs the pipeline.
Hugging Face I/O helpers
- download_xenium_outs(repo_id, revision="main", clusters_relpath=..., cache_dir=None)
  Downloads cells.parquet, tissue_boundary.csv, and the specified clusters.csv from a dataset repo.
SFPlot utilities (legacy / optional)
This repository contains a small subset of SFPlot-style utilities and re-exports:
- compute_cophenetic_distances_from_df(df, ...)
- plot_cophenetic_heatmap(matrix, ...)
GUI (experimental)
A GUI entry point is configured as:
histoseg-gui
Notes:
- The current GUI code path is still in flux and may require extra dependencies (e.g., Pillow) and/or an external sfplot installation.
- For production workflows, prefer the Python API shown above.
Contributing
Issues and pull requests are welcome.
When reporting a bug, please include:
- OS + Python version
- histoseg version
- Minimal reproducible code (or a small input subset)
- Expected vs. actual behavior
License
This project is distributed under the PolyForm Noncommercial 1.0.0 license.
Noncommercial use (including academic research) is permitted.
Any commercial use requires a separate commercial license from SPATHO AB.
See LICENSE for details.