Skip to main content

Identify spatial cell triplets within a distance threshold

Project description

triplet-finder

This tool identifies spatial triplets of cells based on a user-defined distance criterion. It was originally designed as a preprocessing step for identifying immune cell triads, but may also be useful for other spatial analysis tasks.

Definition of Distance Criteria and Triplets

The tool supports two distance modes:

  • effective: pairwise distances are computed as the centroid-to-centroid distance minus the sum of the two cells’ radii, where each radius is defined as the average of the minimum and maximum cell diameter columns.
  • centroid: pairwise distances are computed directly as centroid-to-centroid distance, without accounting for cell size.

Two cells are considered capable of interaction if their pairwise distance under the selected mode is less than a user-defined threshold, representing the assumed spatial range over which cells can communicate or influence one another.

A triplet is defined as a fully connected set of three cells (a 3-clique) in which all three pairwise distances satisfy this criterion within a single image or field of view.

The tool is designed to be:

  • memory-safe by default (streaming mode)
  • usable as both a Python library and a command-line tool
  • suitable for large microscopy / spatial biology datasets

Installation

From source

Clone the repository and install in editable mode:

pip install -e .

This will install:

  • the Python package: triplet_finder
  • the CLI command: triplet-finder

Dependencies

Python >= 3.9 is required.

Core dependencies:

  • pandas
  • numpy
  • scipy

These are installed automatically via pip.


Command-Line Usage

After installation, the CLI is available as:

triplet-finder --help

Basic usage (streaming, low memory)

triplet-finder \
  --input cells.csv \
  --output-dir per_image_triplets/

This:

  • processes each image independently
  • writes one CSV per image if --output-dir is provided
  • does not keep all triplets in memory by default

Choose the distance mode

Effective distance mode

triplet-finder \
  --input cells.csv \
  --distance-mode effective \
  --output-dir per_image_triplets/

Centroid distance mode

triplet-finder \
  --input cells.csv \
  --distance-mode centroid \
  --output-dir per_image_triplets/

Write a combined output CSV

triplet-finder \
  --input cells.csv \
  --output-dir per_image_triplets/ \
  --combined-output \
  --output all_triplets.csv

This:

  • writes per-image CSVs
  • also returns all triplets in memory
  • writes a combined CSV

For large datasets, this may require substantial memory.

Excluding cells by metadata

triplet-finder \
  --input cells.csv \
  --exclude-col tentative_cell_type \
  --exclude-values Unknown \
  --output-dir per_image_triplets/

Summary / dry-run mode

Prints a summary of the dataset and parameters without computing triplets:

triplet-finder \
  --input cells.csv \
  --exclude-col tentative_cell_type \
  --exclude-values Unknown \
  --distance-mode effective \
  --summary

Example output:

Summary
-------
Images (total): 132
Cells (total): 45892
Cells after filtering: 37211
Images with ≥3 cells: 129

Parameters
----------
Image column: image_sliced
X coordinate column: centroid_x_um
Y coordinate column: centroid_y_um
Distance mode: effective
Distance threshold: 15.0
Min cell diameter column: cell_min_caliper
Max cell diameter column: cell_max_caliper

In centroid mode, the minimum and maximum cell diameter columns are not used.

Metadata headers

By default, per-image CSV outputs include metadata headers such as tool version, parameters, and command invocation.

To disable metadata headers:

triplet-finder --no-metadata ...

Python API Usage

The core functionality is available as a Python function.

Import

from triplet_finder import find_triplets_with_details

Example: streaming mode with effective distance

import pandas as pd
from triplet_finder import find_triplets_with_details

df = pd.read_csv("cells.csv")

find_triplets_with_details(
    input_data=df,
    image_col="image_sliced",
    x_col="centroid_x_um",
    y_col="centroid_y_um",
    min_cell_diameter_column="cell_min_caliper",
    max_cell_diameter_column="cell_max_caliper",
    distance_mode="effective",
    threshold=15.0,
    output_dir="per_image_triplets",
    return_triplets=False,
)

This writes per-image CSVs and returns None.

Example: centroid mode

triplets = find_triplets_with_details(
    input_data=df,
    image_col="image_sliced",
    x_col="centroid_x_um",
    y_col="centroid_y_um",
    distance_mode="centroid",
    threshold=15.0,
    return_triplets=True,
)

Example: returning all triplets with effective distance

triplets = find_triplets_with_details(
    input_data=df,
    image_col="image_sliced",
    x_col="centroid_x_um",
    y_col="centroid_y_um",
    min_cell_diameter_column="cell_min_caliper",
    max_cell_diameter_column="cell_max_caliper",
    distance_mode="effective",
    threshold=15.0,
    return_triplets=True,
)

print(triplets.head())

Metadata in Python usage

metadata = {
    "experiment": "Immune_Triads_Preprocessing",
    "threshold": 15.0,
    "distance_mode": "effective",
    "notes": "Unknown cells excluded",
}

find_triplets_with_details(
    input_data=df,
    image_col="image_sliced",
    x_col="centroid_x_um",
    y_col="centroid_y_um",
    min_cell_diameter_column="cell_min_caliper",
    max_cell_diameter_column="cell_max_caliper",
    distance_mode="effective",
    threshold=15.0,
    output_dir="per_image_triplets",
    metadata=metadata,
)

Design Notes

  • Triplets are computed independently per image
  • KD-tree acceleration is used for spatial queries
  • Default behavior avoids storing all triplets in memory
  • Per-image outputs can be parallelized downstream
  • Suitable for use from Python, R via reticulate, or shell pipelines

License

MIT License


Disclaimer

This software was developed for internal research use and is provided “as is.” No guarantees are made regarding suitability for any specific purpose. Users are responsible for validating the software’s assumptions, correctness, and performance on their own data before applying it to analyses or publications.


The main changes are:
- replaced the old single-definition paragraph with a two-mode definition
- added `--distance-mode` to CLI usage
- updated the summary example to include `Distance mode`
- clarified that caliper columns are only used in `effective` mode
- kept the Python import as `triplet_finder`
- cleaned up the disclaimer wording and CSV capitalization

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

triplet_finder-0.2.0.tar.gz (10.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

triplet_finder-0.2.0-py3-none-any.whl (9.1 kB view details)

Uploaded Python 3

File details

Details for the file triplet_finder-0.2.0.tar.gz.

File metadata

  • Download URL: triplet_finder-0.2.0.tar.gz
  • Upload date:
  • Size: 10.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for triplet_finder-0.2.0.tar.gz
Algorithm Hash digest
SHA256 4f53d2553f3552b2e91115fea28eee6cbbb050feec4fedc9436f9d273492328a
MD5 9a6f151b550d10dca493eceaa9f21c31
BLAKE2b-256 45135380ecccc5b8ce5a527bab373e4c980b8b7f3abf1836e9f5f30e0ddb9cb7

See more details on using hashes here.

Provenance

The following attestation bundles were made for triplet_finder-0.2.0.tar.gz:

Publisher: python-publish.yml on OmicsUtils/tripletfinder

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file triplet_finder-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: triplet_finder-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 9.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for triplet_finder-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e9a6ed40bb9dba8298832cadd192abdb92e5a81627789ae4fedb18adb904e1ee
MD5 d1274bbc46a8fd91b234a4fc555f194f
BLAKE2b-256 7350ae4bdbeebd32bfee9b3c1153096f6f53ccdd5e60981303384f69e61f6438

See more details on using hashes here.

Provenance

The following attestation bundles were made for triplet_finder-0.2.0-py3-none-any.whl:

Publisher: python-publish.yml on OmicsUtils/tripletfinder

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page