The SPORE clustering algorithm.

SPORE

Skeleton Propagation Over Recalibrating Expansions — a graph-based clustering algorithm for arbitrary-shape, arbitrary-scale clusters.

How it works

SPORE builds clusters in three stages:

  1. k-NN graph construction: a global nearest-neighbor graph is built (exact or approximate). By default the number of neighbors per point, k, scales as O(log N).
  2. Variance-aware BFS expansion: clusters are seeded at the densest points and grown outward. An edge is accepted only if its distance is statistically consistent with the cluster's evolving internal distance distribution, whose mean and variance are updated incrementally as expansion proceeds. This prevents bridging across low-density gaps while preserving irregular shapes.
  3. Reassignment: clusters smaller than min_cluster_size are either merged into nearby larger ones, using a composite score that weighs proximity, relative size, density, and angular isotropy, or labeled as noise.
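
The first two stages can be illustrated with a toy sketch. It is not the package's implementation: the 1-D brute-force `knn_graph`, the density proxy (mean k-NN distance), the `warmup` count, and the acceptance rule `d <= mean + z_max * std` are all assumptions made for illustration. Reassignment and `retention_rate` are left out.

```python
import math
from collections import deque

def knn_graph(points, k):
    """Brute-force k-NN on 1-D points: one sorted list of (distance, index) per point."""
    graph = []
    for i, p in enumerate(points):
        ranked = sorted((abs(p - q), j) for j, q in enumerate(points) if j != i)
        graph.append(ranked[:k])
    return graph

def expand_clusters(graph, z_max=2.0, warmup=3):
    """Toy variance-aware BFS. warmup must be >= 2 so the sample variance is defined."""
    n = len(graph)
    labels = [-1] * n
    # Assumed density proxy: a smaller mean k-NN distance means a denser point.
    order = sorted(range(n), key=lambda i: sum(d for d, _ in graph[i]) / len(graph[i]))
    cluster = 0
    for seed in order:
        if labels[seed] != -1:
            continue
        labels[seed] = cluster
        count, mean, m2 = 0, 0.0, 0.0  # Welford accumulators over accepted edge lengths
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            for d, j in graph[i]:
                if labels[j] != -1:
                    continue
                if count >= warmup:
                    std = math.sqrt(m2 / (count - 1))
                    if d > mean + z_max * std:
                        continue  # edge is too long for this cluster: do not bridge
                labels[j] = cluster
                queue.append(j)
                # Incremental (Welford) update of the mean and squared deviations.
                count += 1
                delta = d - mean
                mean += delta / count
                m2 += delta * (d - mean)
        cluster += 1
    return labels

# Two groups on a line separated by a gap; k=4 forces one cross-gap edge per point.
points = [0.0, 0.9, 2.1, 3.0, 9.0, 10.1, 10.9, 12.0]
labels = expand_clusters(knn_graph(points, k=4))
```

With the default threshold the cross-gap edges are rejected and the two groups stay separate. With a very large `z_max` the filter is effectively disabled and the first cluster absorbs every point.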

Installation

pip install spore-clustering

Quick start

from spore_clustering import SPORE

# X: feature matrix with one row per sample (assumed shape: n_samples x n_features)
labels = SPORE().fit_predict(X)

Key parameters

Parameter                 Description
expansion                 Z-score threshold controlling how aggressively clusters grow
neighborhood_percentile   Bounded alternative to expansion; typical values: 25, 50, 75, 93.75
retention_rate            Fraction of neighbors that must pass the variance filter for expansion to continue
min_cluster_size          Minimum cluster size (int) or exponent for N-relative scaling (float)
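
One plausible reading of the two forms of min_cluster_size is sketched below. The helper `resolve_min_cluster_size` is hypothetical, and the rule it applies (an int is an absolute count, a float is an exponent applied to N and rounded) is an assumption based only on the description above. The API reference is the authority on the actual behavior.

```python
def resolve_min_cluster_size(min_cluster_size, n_samples):
    """Hypothetical helper: turn the parameter into an absolute point count."""
    if isinstance(min_cluster_size, int):
        # Integer: taken as the minimum number of points directly.
        return min_cluster_size
    # Float: assumed to be an exponent, so the threshold grows as N ** exponent.
    return max(1, round(n_samples ** min_cluster_size))

absolute = resolve_min_cluster_size(25, 10_000)    # fixed threshold of 25 points
relative = resolve_min_cluster_size(0.5, 10_000)   # about sqrt(N) points
```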

See the full API reference for all parameters.

Reusing a precomputed neighbor index

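# k, neighbors, distances, and scale are placeholders for values kept from an earlier k-NN computation
dindex = SPORE.DataIndex(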
    connectivity=k,
    neighbors=neighbors,
    dists=distances,
    dataset_scale=scale,
)

labels = SPORE(dindex=dindex, retention_rate=0.25).fit_predict(X)
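
The layout that DataIndex expects for neighbors and dists is not specified here. The sketch below assumes the common convention of one row per point, holding the indices of its k nearest neighbors and the matching distances in ascending order. `brute_force_knn` is a hypothetical helper, and `dataset_scale` is not computed because its definition is not given.

```python
import math

def brute_force_knn(points, k):
    """Exact k-NN by exhaustive search; returns (neighbors, dists), one row per point."""
    neighbors, dists = [], []
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance and keep the k closest.
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)[:k]
        neighbors.append([j for _, j in ranked])
        dists.append([d for d, _ in ranked])
    return neighbors, dists

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
neighbors, dists = brute_force_knn(points, k=2)
```

Exhaustive search costs O(N^2 d), so it only suits small datasets. For larger ones an approximate nearest-neighbor library would normally produce the same two arrays.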

Complexity

With approximate k-NN and default neighbor scaling (k ~ log N):

Phase               Complexity
k-NN construction   O(N d log N)
BFS expansion       O(N log N)
Reassignment        O(R d log N), R ≪ N
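
The O(N log N) bound for BFS expansion follows from the graph size: each of the N points carries k ~ log N outgoing edges, and each edge is examined a bounded number of times. The snippet below works through the arithmetic. The base-2 logarithm and the ceiling are assumptions, since the package's exact default for k is not stated here.

```python
import math

def knn_edge_count(n):
    """Return (k, directed edge count) for a k-NN graph with k = ceil(log2(n))."""
    k = max(1, math.ceil(math.log2(n)))
    return k, n * k

for n in (1_000, 100_000, 10_000_000):
    k, edges = knn_edge_count(n)
    print(f"N={n:>10,}  k={k:>2}  edges={edges:,}")
```

Under these assumptions, growing N by a factor of 10,000 raises k only from 10 to 24, which is why the edge count stays close to linear in N.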

scikit-learn compatibility

SPORE follows standard scikit-learn estimator conventions: fit, fit_predict, get_params, set_params.

Download files

Source Distribution

spore_clustering-0.1.0b1.tar.gz (202.1 kB)

Uploaded Source

Built Distribution

spore_clustering-0.1.0b1-py3-none-any.whl (19.1 kB)

Uploaded Python 3

File details

Details for the file spore_clustering-0.1.0b1.tar.gz.

File metadata

  • Download URL: spore_clustering-0.1.0b1.tar.gz
  • Size: 202.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for spore_clustering-0.1.0b1.tar.gz
Algorithm     Hash digest
SHA256        bc0e858f30bcfe596fe5bdc57151668eb64dc01e1f68169b952318d290fae91f
MD5           a9c004ea33f33d7834eb567db6673cf5
BLAKE2b-256   547d4ce4b3bf2b2d8d64de571a0d198c51ff384bdee07891c3db13f89ea40502

File details

Details for the file spore_clustering-0.1.0b1-py3-none-any.whl.

File hashes

Hashes for spore_clustering-0.1.0b1-py3-none-any.whl
Algorithm     Hash digest
SHA256        1801cc520f39a5220a3729a2f515a80acd234d5d1b1488b0414d4fa089d4ddf4
MD5           2613611757c0f58be3981cb30a4986a5
BLAKE2b-256   e89390be9d2138fed3343c88be95492f7db8186fe5f6436a8d66932522281111
