The SPORE clustering algorithm.

SPORE

Skeleton Propagation Over Recalibrating Expansions — a graph-based clustering algorithm for arbitrary-shape, arbitrary-scale clusters.

How it works

SPORE builds clusters in three stages:

  1. k-NN graph construction — a global nearest-neighbor graph is built (exact or approximate), with neighbor counts scaling as ~O(log N) by default.
  2. Variance-aware BFS expansion — clusters are seeded from densest points outward. Edges are accepted only if their distance is statistically consistent with the cluster's evolving internal distance distribution — mean and variance updated incrementally as expansion proceeds. This prevents bridging across low-density gaps while preserving irregular shapes.
  3. Reassignment — clusters below min_cluster_size are merged into nearby larger ones using a composite score weighing proximity, relative size, density, and angular isotropy, or labeled as noise.
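The second stage can be illustrated with a small, self-contained sketch. This is not the library's implementation: it uses a brute-force k-NN search, a plain mean-plus-z-sigma acceptance rule, and Welford's running mean/variance update, and it leaves out retention_rate and the reassignment stage entirely. The names knn_graph, variance_aware_bfs, z_max, and min_edges are hypothetical and chosen for illustration only (z_max plays a role loosely similar to expansion).

```python
import math
from collections import deque


def knn_graph(points, k):
    """Brute-force k-NN: for each point, a sorted list of (distance, index)."""
    graph = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph.append(ranked[:k])
    return graph


def variance_aware_bfs(points, k, z_max=3.0, min_edges=3):
    """Grow clusters by BFS, accepting an edge only if its length is within
    z_max standard deviations of the cluster's running mean edge length.
    The first min_edges edges (min_edges >= 2) are accepted unconditionally
    so that the statistics have something to start from."""
    graph = knn_graph(points, k)
    # Seed order: densest first, using total k-NN distance as an inverse density proxy.
    order = sorted(range(len(points)), key=lambda i: sum(d for d, _ in graph[i]))
    labels = [-1] * len(points)
    next_label = 0
    for seed in order:
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        n, mean, m2 = 0, 0.0, 0.0  # Welford statistics over accepted edge lengths
        queue = deque([seed])
        while queue:
            u = queue.popleft()
            for d, v in graph[u]:
                if labels[v] != -1:
                    continue
                if n >= min_edges:
                    std = math.sqrt(m2 / (n - 1))
                    if d > mean + z_max * std:
                        continue  # edge is too long for this cluster: do not bridge
                labels[v] = next_label
                queue.append(v)
                # Incremental mean/variance update with the newly accepted edge.
                n += 1
                delta = d - mean
                mean += delta / n
                m2 += delta * (d - mean)
        next_label += 1
    return labels


# Two 1-D groups separated by a wide gap; k=4 forces one cross-gap edge per point.
pts = [(0.0,), (1.0,), (2.2,), (3.1,), (20.0,), (21.3,), (22.2,), (23.6,)]
print(variance_aware_bfs(pts, k=4))  # [0, 0, 0, 0, 1, 1, 1, 1]
```

With the filter effectively disabled (z_max=float("inf")), the same data collapses into a single cluster, because every point's fourth neighbor lies across the gap.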

Installation

pip install spore-clustering

Quick start

from spore_clustering import SPORE

# X is the data to cluster, e.g. an array of shape (n_samples, n_features)
labels = SPORE().fit_predict(X)

Key parameters

Parameter                Description
expansion                Z-score threshold controlling how aggressively clusters grow
neighborhood_percentile  Bounded alternative to expansion; typical values: 25, 50, 75, 93.75
retention_rate           Fraction of neighbors that must pass the variance filter for expansion to continue
min_cluster_size         Minimum cluster size (int) or exponent for N-relative scaling (float)

See the full API reference for all parameters.
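The dual meaning of min_cluster_size is easiest to see with numbers. The sketch below assumes that a float value e is turned into a threshold of roughly N**e; the exact formula SPORE uses is not stated here, so treat resolve_min_cluster_size as a hypothetical helper and check the API reference for the real rule.

```python
def resolve_min_cluster_size(min_cluster_size, n_samples):
    """Hypothetical helper: turn min_cluster_size into an absolute point count.

    int   -> used as-is (absolute minimum size)
    float -> assumed to be an exponent, giving about n_samples ** exponent
    """
    if isinstance(min_cluster_size, bool):
        raise TypeError("min_cluster_size must be an int or a float")
    if isinstance(min_cluster_size, int):
        return min_cluster_size
    # Assumption: N-relative scaling means N raised to the given exponent.
    return max(1, round(n_samples ** min_cluster_size))


print(resolve_min_cluster_size(25, 10_000))   # absolute threshold
print(resolve_min_cluster_size(0.5, 10_000))  # about sqrt(N) under the assumption above
```

Under this reading, an exponent keeps the threshold proportionate as the dataset grows, while an int pins it regardless of N.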

Reusing a precomputed neighbor index

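from spore_clustering import SPORE

# k, neighbors, distances, and scale are assumed to come from an earlier
# neighbor search over the same X; they are not defined in this snippet.
dindex = SPORE.DataIndex(
    connectivity=k,
    neighbors=neighbors,
    dists=distances,
    dataset_scale=scale,
)

labels = SPORE(dindex=dindex, retention_rate=0.25).fit_predict(X)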

Complexity

With approximate k-NN and default neighbor scaling (k ~ log N):

Phase              Complexity
k-NN construction  O(N d log N)
BFS expansion      O(N log N)
Reassignment       O(R d log N), R ≤ N
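The BFS bound follows from the graph size: with k ~ log N neighbors per point there are about N·k edges, and a breadth-first pass looks at each one a constant number of times. The snippet below just does that arithmetic. The base-2 logarithm and the rounding up are assumptions made for illustration; SPORE's actual default for k may differ, and knn_edge_budget is a hypothetical name.

```python
import math


def knn_edge_budget(n_samples):
    """Return (k, directed edge count) for k = ceil(log2(N)), an assumed scaling."""
    k = max(1, math.ceil(math.log2(n_samples)))
    return k, n_samples * k


for n in (1_000, 1_000_000):
    k, edges = knn_edge_budget(n)
    print(f"N={n:>9,}  k={k:>2}  edges={edges:,}")
```

Going from a thousand to a million points multiplies N by 1000 but only doubles k, which is why the expansion phase stays close to linear in practice.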

scikit-learn compatibility

SPORE follows standard scikit-learn estimator conventions: fit, fit_predict, get_params, set_params.
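For readers unfamiliar with those conventions, here is a toy estimator that follows the same four-method contract. It is a hypothetical stand-in written in plain Python, unrelated to how SPORE clusters data: ThresholdClusterer simply splits sorted 1-D values wherever consecutive values are more than gap apart.

```python
class ThresholdClusterer:
    """Toy 1-D clusterer (hypothetical; not part of spore_clustering)."""

    def __init__(self, gap=1.0):
        # Convention: __init__ only stores hyperparameters, with no validation or work.
        self.gap = gap

    def get_params(self, deep=True):
        return {"gap": self.gap}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self  # returning self allows chaining

    def fit(self, X, y=None):
        order = sorted(range(len(X)), key=lambda i: X[i])
        labels = [0] * len(X)
        current = 0
        for prev, cur in zip(order, order[1:]):
            if X[cur] - X[prev] > self.gap:
                current += 1  # start a new cluster after a wide gap
            labels[cur] = current
        self.labels_ = labels  # learned attributes end with an underscore
        return self

    def fit_predict(self, X, y=None):
        return self.fit(X).labels_


X = [0.0, 0.5, 5.0, 5.4]
model = ThresholdClusterer(gap=1.0)
print(model.fit_predict(X))                       # [0, 0, 1, 1]
print(model.set_params(gap=10.0).fit_predict(X))  # [0, 0, 0, 0]
```

Because SPORE exposes the same methods, the calling pattern shown at the bottom (construct, optionally set_params, then fit_predict) should carry over, with SPORE's own parameter names in place of gap.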
