The SPORE clustering algorithm.

SPORE

Skeleton Propagation Over Recalibrating Expansions — a graph-based clustering algorithm for arbitrary-shape, arbitrary-scale clusters.

How it works

SPORE builds clusters in three stages:

  1. k-NN graph construction: a global nearest-neighbor graph is built (exact or approximate), with neighbor counts scaling as ~O(log N) by default.
  2. Variance-aware BFS expansion: clusters are seeded from the densest points outward. Nearby points are accepted only if their distance is statistically consistent with the cluster's particular distance distribution, the mean and variance of which are updated as neighbors are accepted. This allows for density- and shape-adaptive cluster identification.
  3. Reassignment: clusters below min_cluster_size are merged into nearby larger ones using a composite score weighing proximity, relative size, density, and angular isotropy, or labeled as noise.
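The first two stages can be illustrated with a small pure-Python sketch. This is not the package's implementation: the brute-force neighbor search, the log base, the density proxy (mean k-NN distance), the acceptance rule `d <= mean + z * std`, and all function names are assumptions made for illustration. The retention-rate check and the reassignment stage are left out.

```python
import math
from collections import deque


def knn_graph(points, k):
    """Brute-force exact k-NN: for each point, its k nearest (distance, index) pairs."""
    graph = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph.append(dists[:k])
    return graph


def _push(stats, d):
    """Welford update of [count, mean, sum of squared deviations] with distance d."""
    stats[0] += 1
    delta = d - stats[1]
    stats[1] += delta / stats[0]
    stats[2] += delta * (d - stats[1])


def expand_clusters(points, z_threshold=2.0):
    """Label points by BFS expansion with a running distance distribution per cluster."""
    n = len(points)
    if n < 2:
        return [0] * n
    # Neighbor count grows logarithmically with N (log base 2 is an assumption).
    k = min(n - 1, max(2, math.ceil(math.log2(n))))
    graph = knn_graph(points, k)

    # Density proxy (assumption): smaller mean k-NN distance means denser.
    mean_dist = [sum(d for d, _ in nbrs) / len(nbrs) for nbrs in graph]
    seeds = sorted(range(n), key=lambda i: mean_dist[i])

    labels = [-1] * n
    next_label = 0
    for seed in seeds:  # densest points first
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        # Initialise the cluster's distance statistics from the seed's neighborhood.
        stats = [0, 0.0, 0.0]
        for d, _ in graph[seed]:
            _push(stats, d)
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            for d, j in graph[i]:
                if labels[j] != -1:
                    continue
                std = math.sqrt(max(stats[2], 0.0) / stats[0])
                # Accept the neighbor only if its distance is consistent with
                # the distances this cluster has accepted so far.
                if d <= stats[1] + z_threshold * std:
                    labels[j] = next_label
                    queue.append(j)
                    _push(stats, d)  # recalibrate mean and variance
        next_label += 1
    return labels
```

Because each cluster keeps its own running mean and variance, the same threshold can recover a tight cluster and a much sparser one in the same dataset, which is the density-adaptive behavior described in stage 2.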

Installation

pip install spore-clustering

Quick start

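import numpy as np
from spore_clustering import SPORE

# X: array of shape (n_samples, n_features); random placeholder data here
X = np.random.default_rng(0).normal(size=(500, 2))
labels = SPORE().fit_predict(X)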

Key parameters

Parameter                 Description
expansion                 Z-score threshold controlling how aggressively clusters grow
neighborhood_percentile   Bounded alternative to expansion; typical values: 25, 50, 75, 93.75
retention_rate            Fraction of neighbors that must pass the variance filter for expansion to continue
min_cluster_size          Minimum cluster size (int) or exponent for N-relative scaling (float)

See the full API reference for all parameters.
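One possible reading of the two forms of min_cluster_size is sketched below. The helper name `resolve_min_cluster_size` is hypothetical, and the rule that a float exponent `e` maps to roughly `N ** e` points is an assumption based on the table's wording, not confirmed package behavior.

```python
def resolve_min_cluster_size(min_cluster_size, n_samples):
    """Hypothetical helper: turn the parameter into an absolute point count."""
    if isinstance(min_cluster_size, bool):
        raise TypeError("min_cluster_size must be an int or a float")
    if isinstance(min_cluster_size, int):
        # An int is taken literally as the minimum number of points.
        return min_cluster_size
    # Assumption: a float is an exponent applied to N, e.g. 0.5 -> sqrt(N).
    return max(1, round(n_samples ** min_cluster_size))
```

Under this reading, an exponent of 0.5 on a dataset of 10,000 points would correspond to a minimum cluster size of 100, while an int such as 10 stays fixed regardless of N.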

Reusing a precomputed neighbor index

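# k, neighbors, distances, and scale are placeholders for values
# you have already computed for X.
dindex = SPORE.DataIndex(
    connectivity=k,
    neighbors=neighbors,
    dists=distances,
    dataset_scale=scale,
)

labels = SPORE(dindex=dindex, retention_rate=0.25).fit_predict(X)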

Complexity

With approximate k-NN and default neighbor scaling (k ~ log N):

Phase              Complexity
k-NN construction  O(Nd log N)
BFS expansion      O(N log N)
Reassignment       O(Rd log N), R ≪ N
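A rough accounting of how these bounds combine, assuming each point is expanded at most once during BFS and examines its k ~ log N neighbors when it is expanded (an assumption about the traversal, not a statement taken from the package):

```latex
\begin{aligned}
T_{\text{BFS}} &= \underbrace{N}_{\text{points expanded once}} \times \underbrace{k}_{\text{neighbors examined}} = O(N \log N), \\
T_{\text{total}} &= O(N d \log N) + O(N \log N) + O(R d \log N) = O(N d \log N) \quad \text{for } R \le N .
\end{aligned}
```

Under these assumptions, k-NN construction dominates the total cost.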

scikit-learn compatibility

SPORE follows standard scikit-learn estimator conventions: fit, fit_predict, get_params, set_params.
