The SPORE clustering algorithm.

SPORE

SPORE (Skeleton Propagation Over Recalibrating Expansions) is a graph-based clustering algorithm for nonlinear clusters under heterogeneous density and weak boundary contrast.

The Algorithm

SPORE builds a reusable k-nearest-neighbor graph, then runs two main phases:

  1. Expansion: clusters are seeded from dense regions and expanded with breadth-first search over the k-NN graph. Candidate neighbors are accepted only when their distances are consistent with the growing cluster's evolving distance statistics. This lets each cluster adapt to its own local density scale while still following nonconvex shapes.

  2. Small-Cluster Reassignment (SCR): clusters below min_cluster_size are treated as fragments. Fragment points are reassigned to established clusters using local k-NN majority voting, with candidate neighbors filtered by cluster size and density compatibility. Any fragments still unresolved after SCR can be labeled as noise or left unchanged, depending on post_reassignment_policy.
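The two phases above can be illustrated with a small pure-Python toy. This is only a sketch under stated assumptions, not the library's implementation: the helper names (knn_1d, expand_clusters, reassign_small_clusters) are hypothetical, the acceptance rule "edge length <= mean + z * std of the lengths accepted so far" is one plausible reading of "consistent with the cluster's evolving distance statistics", mean k-NN distance is used as a stand-in density proxy for seeding, and retention_rate and the SCR size/density compatibility filters are left out entirely.

```python
from collections import Counter, deque
from statistics import mean, pstdev


def knn_1d(points, k):
    """Brute-force k-NN lists for 1-D points (illustration only)."""
    neighbors, dists = [], []
    for i, p in enumerate(points):
        ranked = sorted((abs(p - q), j) for j, q in enumerate(points) if j != i)[:k]
        neighbors.append([j for _, j in ranked])
        dists.append([d for d, _ in ranked])
    return neighbors, dists


def expand_clusters(neighbors, dists, z=2.0):
    """Toy expansion: BFS over the k-NN graph, accepting a neighbor only if its
    edge length is within mean + z * std of the lengths the cluster has seen."""
    labels = [-1] * len(neighbors)
    # Assumption: seed from "dense" points first, i.e. smallest mean k-NN distance.
    order = sorted(range(len(neighbors)), key=lambda i: mean(dists[i]))
    next_label = 0
    for seed in order:
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        accepted = list(dists[seed])  # initial distance statistics of the cluster
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            limit = mean(accepted) + z * pstdev(accepted)
            for j, d in zip(neighbors[i], dists[i]):
                if labels[j] == -1 and d <= limit:
                    labels[j] = next_label
                    accepted.append(d)  # statistics evolve as the cluster grows
                    queue.append(j)
        next_label += 1
    return labels


def reassign_small_clusters(labels, neighbors, min_cluster_size, max_rounds=10):
    """Toy SCR: fragment points adopt the majority label among their neighbors
    that belong to established clusters, repeated for a bounded number of rounds."""
    labels = list(labels)
    sizes = Counter(labels)
    established = {c for c, s in sizes.items() if s >= min_cluster_size}
    for _ in range(max_rounds):
        updates = {}
        for i, lab in enumerate(labels):
            if lab in established:
                continue
            votes = Counter(labels[j] for j in neighbors[i] if labels[j] in established)
            if votes:
                updates[i] = votes.most_common(1)[0][0]
        if not updates:
            break
        for i, lab in updates.items():
            labels[i] = lab
    # Assumed policy for this sketch: anything still unresolved becomes noise (-1).
    return [lab if lab in established else -1 for lab in labels]


# Two tight 1-D groups plus one straggler at 5.5.
points = [0.0, 1.0, 2.0, 3.0, 5.5, 100.0, 101.0, 102.0, 103.0]
nbrs, dsts = knn_1d(points, k=3)
raw = expand_clusters(nbrs, dsts, z=2.0)
final = reassign_small_clusters(raw, nbrs, min_cluster_size=2)
print(raw)
print(final)
```

In this toy the straggler's edge is too long for the first group's distance statistics, so expansion leaves it as a singleton fragment, and the reassignment step then attaches it to the group that holds the majority of its neighbors.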

Installation

pip install spore-clustering

Quick Start

from spore_clustering import SPORE

# X is your data; by scikit-learn convention, an array of shape (n_samples, n_features).
labels = SPORE().fit_predict(X)

Key Parameters

Parameter                 Description
z                         Z-score threshold controlling how aggressively clusters expand
z_percentile              Percentile-based alternative to z; ignored if z is provided
retention_rate            Fraction of neighbors that must pass the expansion filter for traversal to continue
min_cluster_size          Minimum established-cluster size; ints are absolute counts, floats are interpreted as N ** min_cluster_size
max_z                     Maximum z-score allowed for candidate receiving-cluster neighbors during SCR
max_z_percentile          Percentile-based alternative to max_z; ignored if max_z is provided
max_scr_rounds            Maximum number of SCR propagation rounds
post_reassignment_policy  Whether remaining unresolved small clusters become noise or are left unchanged

See the full API reference for all parameters.
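As a worked example of the min_cluster_size rule in the table, the sketch below resolves an int or float setting to a size threshold. The function name resolve_min_cluster_size is hypothetical, N is assumed to be the number of samples, and whether the library rounds the float result is not stated here, so the value is returned unrounded.

```python
def resolve_min_cluster_size(min_cluster_size, n_samples):
    """Turn the min_cluster_size setting into a size threshold.

    Ints are taken as absolute counts; floats are used as an exponent on the
    number of samples (N ** min_cluster_size). Rounding is left to the caller
    because this sketch does not know how the library rounds.
    """
    if isinstance(min_cluster_size, int):
        return min_cluster_size
    return n_samples ** min_cluster_size


absolute = resolve_min_cluster_size(25, n_samples=10_000)
scaled = resolve_min_cluster_size(0.5, n_samples=10_000)
print(absolute, scaled)
```

With 10,000 samples, an exponent of 0.5 corresponds to a threshold of about 100 points, so a float setting grows sublinearly with the dataset while an int stays fixed.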

Reusing a Precomputed Neighbor Index

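from spore_clustering import SPORE

# k, neighbors, distances, and scale are assumed to have been computed beforehand;
# they are not defined in this snippet.
dindex = SPORE.DataIndex(
    connectivity=k,
    neighbors=neighbors,
    dists=distances,
    dataset_scale=scale,
)

# Reuse the precomputed index instead of rebuilding the k-NN graph.
labels = SPORE(dindex=dindex, retention_rate=0.25).fit_predict(X)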
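For readers who want to see what a neighbor table might look like before handing it over, here is a minimal brute-force sketch using only the standard library. The function brute_force_knn is hypothetical, and the layout that SPORE.DataIndex actually expects (array type, whether a point counts as its own neighbor, how dataset_scale is defined) is not specified in this section, so check the API reference before passing these values in. The sketch also costs O(N^2 d), so it is for illustration only and not a substitute for an efficient k-NN backend.

```python
import math


def brute_force_knn(X, k):
    """Return (neighbors, distances): for each row of X, the indices of its k
    nearest other rows and the matching Euclidean distances, nearest first."""
    neighbors, distances = [], []
    for i, p in enumerate(X):
        # Rank every other point by distance; ties are broken by index.
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(X) if j != i)[:k]
        neighbors.append([j for _, j in ranked])
        distances.append([d for d, _ in ranked])
    return neighbors, distances


X = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
neighbors, distances = brute_force_knn(X, k=2)
print(neighbors)
print(distances)
```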

Time Complexity

With an efficient k-NN backend and default neighbor scaling, where k ~ O(log N):

Phase                     Complexity
k-NN graph construction   O(N d log N)
Expansion                 O(N log N)
SCR                       O(N log N)

In the worst case, with a bounded number of SCR rounds, the clustering phases after neighbor construction scale as O(N log N). Including approximate k-NN construction, the practical overall complexity is O(N d log N).
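One way to see where the O(N log N) figure for a graph traversal comes from: a breadth-first pass dequeues each point once and scans its k neighbor entries once, so the number of edge inspections is N * k, which is O(N log N) when k grows like log N. The sketch below counts those inspections on a synthetic graph. The random neighbor lists and the choice of log base 2 are assumptions made purely for illustration, and the per-edge work is assumed to be constant.

```python
import math
import random
from collections import deque


def count_edge_inspections(neighbors):
    """Run BFS from every unvisited node and count neighbor-list entries read."""
    seen = [False] * len(neighbors)
    inspections = 0
    for start in range(len(neighbors)):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in neighbors[i]:
                inspections += 1  # one unit of work per directed k-NN edge
                if not seen[j]:
                    seen[j] = True
                    queue.append(j)
    return inspections


n = 1024
k = round(math.log2(n))  # k = 10 for this n
rng = random.Random(0)
# Random lists stand in for a real k-NN graph; only the counts matter here.
graph = [rng.sample(range(n), k) for _ in range(n)]
total = count_edge_inspections(graph)
print(total, n * k)
```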

Scikit-learn Compatibility

SPORE follows standard scikit-learn estimator conventions: fit, fit_predict, get_params, and set_params.
