Identify spatial cell triplets within a distance threshold
triplet-finder
This tool identifies spatial triplets of cells based on a user-defined distance criterion. It was originally designed as a preprocessing step for identifying immune cell triads, but may also be useful for other spatial analysis tasks.
Definition of Distance Criteria and Triplets
The tool supports two distance modes:
- effective: pairwise distances are computed as the centroid-to-centroid distance minus the sum of the two cells’ radii, where each radius is defined as the average of the minimum and maximum cell diameter columns.
- centroid: pairwise distances are computed directly as centroid-to-centroid distance, without accounting for cell size.
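The two modes can be illustrated with a minimal sketch. The helper names `radius_from_calipers` and `pairwise_distance` are hypothetical and not part of the package. The radius follows the wording above literally, so treat any scaling as an assumption to check against the source.

```python
import math

def radius_from_calipers(min_diameter, max_diameter):
    # Literal reading of the definition above: the average of the two
    # diameter columns. Assumption: the package may scale this differently
    # (for example by halving it); verify against the source code.
    return (min_diameter + max_diameter) / 2.0

def pairwise_distance(a, b, mode="effective"):
    """a and b are dicts with keys x, y and, for effective mode, radius."""
    centroid = math.hypot(a["x"] - b["x"], a["y"] - b["y"])
    if mode == "centroid":
        return centroid
    if mode == "effective":
        # Subtract the sum of both radii from the centroid distance.
        return centroid - (a["radius"] + b["radius"])
    raise ValueError(f"unknown distance mode: {mode!r}")

cell_a = {"x": 0.0, "y": 0.0, "radius": radius_from_calipers(4.0, 6.0)}
cell_b = {"x": 30.0, "y": 40.0, "radius": radius_from_calipers(8.0, 12.0)}

centroid_d = pairwise_distance(cell_a, cell_b, mode="centroid")    # 50.0
effective_d = pairwise_distance(cell_a, cell_b, mode="effective")  # 50.0 - (5.0 + 10.0) = 35.0
```

Under this definition the effective distance becomes negative when the two idealized cell outlines overlap, so such pairs always fall below a positive threshold.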
Two cells are considered capable of interaction if their pairwise distance under the selected mode is less than a user-defined threshold, representing the assumed spatial range over which cells can communicate or influence one another.
A triplet is defined as a fully connected set of three cells (a 3-clique) in which all three pairwise distances satisfy this criterion within a single image or field of view.
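As a rough illustration of the 3-clique definition, the brute-force sketch below checks every combination of three cells in one image using centroid distances. The function name `find_triplets_bruteforce` is hypothetical, and this is not the package's implementation. It is only meant to make the definition concrete for small inputs.

```python
import math
from itertools import combinations

def find_triplets_bruteforce(cells, threshold):
    """Return index triples (i, j, k), i < j < k, that form a 3-clique.

    cells: list of (x, y) centroids from a single image (centroid mode).
    """
    def close(i, j):
        (xi, yi), (xj, yj) = cells[i], cells[j]
        return math.hypot(xi - xj, yi - yj) < threshold

    return [
        (i, j, k)
        for i, j, k in combinations(range(len(cells)), 3)
        # All three pairwise distances must satisfy the criterion.
        if close(i, j) and close(i, k) and close(j, k)
    ]

cells = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0), (40.0, 40.0), (20.0, 0.0)]
triplets = find_triplets_bruteforce(cells, threshold=15.0)
```

Here cell 4 is within range of cell 1 only, so it does not join any triplet. Cells 0, 1, and 2 are mutually within range and form the single triplet.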
The tool is designed to be:
- memory-safe by default (streaming mode)
- usable as both a Python library and a command-line tool
- suitable for large microscopy / spatial biology datasets
Installation
From source
Clone the repository and install in editable mode:
pip install -e .
This will install:
- the Python package: triplet_finder
- the CLI command: triplet-finder
Dependencies
Python >= 3.9 is required.
Core dependencies:
- pandas
- numpy
- scipy
These are installed automatically via pip.
Command-Line Usage
After installation, the CLI is available as:
triplet-finder --help
Basic usage (streaming, low memory)
triplet-finder \
    --input cells.csv \
    --output-dir per_image_triplets/
This:
- processes each image independently
- writes one CSV per image if --output-dir is provided
- does not keep all triplets in memory by default
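The streaming behaviour described above can be sketched as follows. This is an assumption-laden illustration, not the tool's code: the function `stream_triplets_per_image`, the `find_triplets` callable, the output file naming, and the output column names are all hypothetical.

```python
import csv
import os
import tempfile
from collections import defaultdict
from itertools import combinations

def stream_triplets_per_image(rows, image_col, find_triplets, output_dir):
    """Write one CSV per image and return only per-image triplet counts.

    rows: iterable of dicts (for example from csv.DictReader).
    find_triplets: hypothetical callable that takes one image's rows and
    returns a list of (i, j, k) row-index triples.
    """
    by_image = defaultdict(list)
    for row in rows:
        by_image[row[image_col]].append(row)

    os.makedirs(output_dir, exist_ok=True)
    counts = {}
    for image, image_rows in by_image.items():
        # Only the current image's triplets exist at this point; they are
        # written to disk and then dropped before the next image starts.
        triplets = find_triplets(image_rows)
        path = os.path.join(output_dir, f"{image}_triplets.csv")
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["cell_1", "cell_2", "cell_3"])
            writer.writerows(triplets)
        counts[image] = len(triplets)
    return counts

# Stand-in triplet finder: treats every combination of three rows as a triplet.
def all_combinations(image_rows):
    return list(combinations(range(len(image_rows)), 3))

rows = [{"image_sliced": "img_a", "cell": n} for n in range(3)]
rows += [{"image_sliced": "img_b", "cell": n} for n in range(2)]

output_dir = tempfile.mkdtemp()
counts = stream_triplets_per_image(rows, "image_sliced", all_combinations, output_dir)
```

The design point is that peak memory scales with the largest single image rather than with the whole dataset.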
Choose the distance mode
Effective distance mode
triplet-finder \
    --input cells.csv \
    --distance-mode effective \
    --output-dir per_image_triplets/
Centroid distance mode
triplet-finder \
    --input cells.csv \
    --distance-mode centroid \
    --output-dir per_image_triplets/
Write a combined output CSV
triplet-finder \
    --input cells.csv \
    --output-dir per_image_triplets/ \
    --combined-output \
    --output all_triplets.csv
This:
- writes per-image CSVs
- also holds all triplets in memory
- writes a combined CSV
For large datasets, this may require substantial memory.
Excluding cells by metadata
triplet-finder \
    --input cells.csv \
    --exclude-col tentative_cell_type \
    --exclude-values Unknown \
    --output-dir per_image_triplets/
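Conceptually, the exclusion drops every cell whose value in the chosen column matches one of the listed values. The sketch below assumes exact, case-sensitive string matching. The tool's actual matching rules (for example for missing values) are not documented here, and `exclude_cells` is a hypothetical helper name.

```python
def exclude_cells(rows, exclude_col, exclude_values):
    """Keep only rows whose exclude_col value is not in exclude_values."""
    excluded = set(exclude_values)
    return [row for row in rows if row[exclude_col] not in excluded]

rows = [
    {"cell_id": 1, "tentative_cell_type": "T cell"},
    {"cell_id": 2, "tentative_cell_type": "Unknown"},
    {"cell_id": 3, "tentative_cell_type": "B cell"},
]
kept = exclude_cells(rows, "tentative_cell_type", ["Unknown"])
kept_ids = [row["cell_id"] for row in kept]
```

With a pandas DataFrame, the same filter can be written as `df[~df[exclude_col].isin(exclude_values)]`.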
Summary / dry-run mode
Prints a summary of the dataset and parameters without computing triplets:
triplet-finder \
    --input cells.csv \
    --exclude-col tentative_cell_type \
    --exclude-values Unknown \
    --distance-mode effective \
    --summary
Example output:
Summary
-------
Images (total): 132
Cells (total): 45892
Cells after filtering: 37211
Images with ≥3 cells: 129
Parameters
----------
Image column: image_sliced
X coordinate column: centroid_x_um
Y coordinate column: centroid_y_um
Distance mode: effective
Distance threshold: 15.0
Min cell diameter column: cell_min_caliper
Max cell diameter column: cell_max_caliper
In centroid mode, the minimum and maximum cell diameter columns are not used.
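The counts in the summary can be reproduced with a short sketch like the one below. The `summarize` function and its dictionary keys are hypothetical. It assumes that "Images with ≥3 cells" is counted after filtering, which the example output does not state explicitly.

```python
from collections import Counter

def summarize(rows, image_col, exclude_col=None, exclude_values=()):
    """Compute dataset counts similar to the summary shown above."""
    excluded = set(exclude_values)
    kept = [
        row for row in rows
        if exclude_col is None or row[exclude_col] not in excluded
    ]
    # Assumption: the ">= 3 cells" count uses the filtered cells.
    per_image = Counter(row[image_col] for row in kept)
    return {
        "images_total": len({row[image_col] for row in rows}),
        "cells_total": len(rows),
        "cells_after_filtering": len(kept),
        "images_with_3_plus_cells": sum(1 for n in per_image.values() if n >= 3),
    }

rows = [
    {"image_sliced": "img_a", "tentative_cell_type": "T cell"},
    {"image_sliced": "img_a", "tentative_cell_type": "T cell"},
    {"image_sliced": "img_a", "tentative_cell_type": "B cell"},
    {"image_sliced": "img_a", "tentative_cell_type": "Unknown"},
    {"image_sliced": "img_b", "tentative_cell_type": "T cell"},
    {"image_sliced": "img_b", "tentative_cell_type": "Unknown"},
    {"image_sliced": "img_b", "tentative_cell_type": "Unknown"},
    {"image_sliced": "img_c", "tentative_cell_type": "T cell"},
]
summary = summarize(rows, "image_sliced", "tentative_cell_type", ["Unknown"])
```

An image needs at least three remaining cells to contain any triplet, which is why this count is a useful sanity check before a full run.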
Metadata headers
By default, per-image CSV outputs include metadata headers such as tool version, parameters, and command invocation.
To disable metadata headers:
triplet-finder --no-metadata ...
Python API Usage
The core functionality is available as a Python function.
Import
from triplet_finder import find_triplets_with_details
Example: streaming mode with effective distance
import pandas as pd
from triplet_finder import find_triplets_with_details

df = pd.read_csv("cells.csv")

find_triplets_with_details(
    input_data=df,
    image_col="image_sliced",
    x_col="centroid_x_um",
    y_col="centroid_y_um",
    min_cell_diameter_column="cell_min_caliper",
    max_cell_diameter_column="cell_max_caliper",
    distance_mode="effective",
    threshold=15.0,
    output_dir="per_image_triplets",
    return_triplets=False,
)
This writes per-image CSVs and returns None.
Example: centroid mode
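# df is the DataFrame loaded in the previous example.
# The cell diameter columns are omitted because centroid mode does not use them.
triplets = find_triplets_with_details(
    input_data=df,
    image_col="image_sliced",
    x_col="centroid_x_um",
    y_col="centroid_y_um",
    distance_mode="centroid",
    threshold=15.0,
    return_triplets=True,
)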
Example: returning all triplets with effective distance
triplets = find_triplets_with_details(
    input_data=df,
    image_col="image_sliced",
    x_col="centroid_x_um",
    y_col="centroid_y_um",
    min_cell_diameter_column="cell_min_caliper",
    max_cell_diameter_column="cell_max_caliper",
    distance_mode="effective",
    threshold=15.0,
    return_triplets=True,
)

print(triplets.head())
Metadata in Python usage
metadata = {
    "experiment": "Immune_Triads_Preprocessing",
    "threshold": 15.0,
    "distance_mode": "effective",
    "notes": "Unknown cells excluded",
}

find_triplets_with_details(
    input_data=df,
    image_col="image_sliced",
    x_col="centroid_x_um",
    y_col="centroid_y_um",
    min_cell_diameter_column="cell_min_caliper",
    max_cell_diameter_column="cell_max_caliper",
    distance_mode="effective",
    threshold=15.0,
    output_dir="per_image_triplets",
    metadata=metadata,
)
Design Notes
- Triplets are computed independently per image
- KD-tree acceleration is used for spatial queries
- Default behavior avoids storing all triplets in memory
- Per-image outputs can be parallelized downstream
- Suitable for use from Python, R via reticulate, or shell pipelines
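To show why a spatial index helps, the sketch below finds candidate pairs without comparing every cell to every other cell, then builds triplets from those pairs. It deliberately swaps the KD-tree for a simple uniform grid so that it runs with the standard library only. It is not the package's implementation, and the function names are hypothetical. With SciPy, `scipy.spatial.cKDTree(points).query_pairs(r)` returns a comparable set of index pairs.

```python
import math
from collections import defaultdict

def neighbor_pairs_grid(points, threshold):
    """Return the set of index pairs (i, j), i < j, closer than threshold."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / threshold), math.floor(y / threshold))].append(idx)

    pairs = set()
    for (gx, gy), members in grid.items():
        # Any partner within `threshold` lies in this bucket or one of
        # its eight neighbouring buckets.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((gx + dx, gy + dy), ()):
                    for i in members:
                        if i < j and math.dist(points[i], points[j]) < threshold:
                            pairs.add((i, j))
    return pairs

def triplets_from_pairs(pairs):
    """Return triples (i, j, k), i < j < k, whose three edges are all in pairs."""
    higher = defaultdict(set)  # higher[i] holds neighbours with index > i
    for i, j in pairs:
        higher[i].add(j)
    triplets = []
    for i, j in sorted(pairs):
        # A shared higher-index neighbour k closes the triangle i-j-k.
        for k in sorted(higher[i] & higher[j]):
            triplets.append((i, j, k))
    return triplets

points = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0), (40.0, 40.0), (20.0, 0.0)]
pairs = neighbor_pairs_grid(points, threshold=15.0)
triplets = triplets_from_pairs(pairs)
```

In effective mode the search radius for candidate pairs would need to be enlarged, since two cells can satisfy the criterion at a centroid distance of up to the threshold plus both radii. Candidates would then be re-checked with the effective distance.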
License
MIT License
Disclaimer
This software was developed for internal research use and is provided “as is.” No guarantees are made regarding suitability for any specific purpose. Users are responsible for validating the software’s assumptions, correctness, and performance on their own data before applying it to analyses or publications.