The SPORE clustering algorithm.

Project description

SPORE

Skeleton Propagation Over Recalibrating Expansions — a graph-based clustering algorithm for arbitrary-shape, arbitrary-scale clusters.

How it works

SPORE builds clusters in three stages:

  1. k-NN graph construction: a global nearest-neighbor graph is built (exact or approximate), with neighbor counts scaling as ~O(log N) by default.
  2. Variance-aware BFS expansion: clusters are seeded from the densest points outward. Nearby points are accepted only if their distance is statistically consistent with the cluster's particular distance distribution, the mean and variance of which are updated as neighbors are accepted. This allows for density- and shape-adaptive cluster identification.
  3. Reassignment: clusters below min_cluster_size are either merged into nearby larger ones, using a composite score that weighs proximity, relative size, density, and angular isotropy, or labeled as noise.
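
The first two stages can be illustrated with a small pure-Python sketch. This is not SPORE's implementation: the brute-force k-NN search, the mean k-NN distance used as a density proxy, the base-2 logarithm for k, the unconditional acceptance of the seed's own neighbors, and the names RunningStats, knn_graph, and expand_clusters are all assumptions made for illustration. It only shows the general idea of accepting an edge when its length is within z standard deviations of the running mean of the lengths accepted so far.

```python
import math
from collections import deque


class RunningStats:
    """Welford's online mean/variance of the accepted edge lengths."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def knn_graph(points, k):
    """Brute-force k-NN graph: graph[i] is a sorted list of (distance, j)."""
    return [
        sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)[:k]
        for i, p in enumerate(points)
    ]


def expand_clusters(points, z=2.0):
    """Toy variance-aware BFS expansion; expects at least two points."""
    n = len(points)
    k = min(n - 1, max(2, math.ceil(math.log2(n))))  # assumed log-N default
    graph = knn_graph(points, k)
    # Density proxy (assumption): smaller mean k-NN distance means denser.
    order = sorted(range(n), key=lambda i: sum(d for d, _ in graph[i]) / k)
    labels = [-1] * n
    cluster = 0
    for seed in order:
        if labels[seed] != -1:
            continue
        labels[seed] = cluster
        stats = RunningStats()
        queue = deque()
        # Bootstrap: the seed's own edges initialise the distance distribution.
        for d, j in graph[seed]:
            stats.add(d)
            if labels[j] == -1:
                labels[j] = cluster
                queue.append(j)
        while queue:
            u = queue.popleft()
            for d, j in graph[u]:
                # Accept only edges consistent with the running distribution.
                if labels[j] == -1 and d <= stats.mean + z * stats.std():
                    labels[j] = cluster
                    stats.add(d)
                    queue.append(j)
        cluster += 1
    return labels


# Two 3x3 grids placed far apart.
grid = [(x, y) for x in range(3) for y in range(3)]
points = grid + [(x + 100, y) for x, y in grid]
labels = expand_clusters(points)
```

In this sketch a late seed whose neighbors are all taken ends up as a singleton cluster. Such undersized clusters are what the reassignment stage would then merge or mark as noise.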

Installation

pip install spore-clustering

Quick start

from spore_clustering import SPORE

# X is the data matrix, shape (n_samples, n_features)
labels = SPORE().fit_predict(X)

Key parameters

Parameter                Description
expansion                Z-score threshold controlling how aggressively clusters grow
neighborhood_percentile  Bounded alternative to expansion; typical values: 25, 50, 75, 93.75
retention_rate           Fraction of neighbors that must pass the variance filter for expansion to continue
min_cluster_size         Minimum cluster size (int) or exponent for N-relative scaling (float)

See the full API reference for all parameters.
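
As an illustration of the two forms of min_cluster_size, the sketch below resolves either form to an absolute point count. The helper resolve_min_cluster_size is hypothetical, and reading the float form as N raised to that exponent is an assumption; the formula SPORE actually applies is not stated here, so check the API reference before relying on it.

```python
def resolve_min_cluster_size(min_cluster_size, n_samples):
    """Hypothetical helper: turn either form into an absolute point count.

    int   -> used directly as the minimum number of points
    float -> assumed to be an exponent, giving n_samples ** exponent
    """
    if isinstance(min_cluster_size, bool):
        raise TypeError("min_cluster_size must be an int or a float")
    if isinstance(min_cluster_size, int):
        return min_cluster_size
    return max(1, round(n_samples ** min_cluster_size))


absolute = resolve_min_cluster_size(10, n_samples=1_000)
relative = resolve_min_cluster_size(0.5, n_samples=10_000)
```

Under this reading, an exponent lets the threshold grow with the dataset instead of staying fixed.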

Reusing a precomputed neighbor index

dindex = SPORE.DataIndex(
    connectivity=k,
    neighbors=neighbors,
    dists=distances,
    dataset_scale=scale,
)

labels = SPORE(dindex=dindex, retention_rate=0.25).fit_predict(X)

Complexity

With approximate k-NN and default neighbor scaling (k ~ log N):

Phase              Complexity
k-NN construction  O(Nd log N)
BFS expansion      O(N log N)
Reassignment       O(Rd log N), R ≪ N
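
One way to read the BFS bound: with k neighbors per point, the graph has N·k directed edges, and a traversal that examines each edge a bounded number of times does O(N·k) = O(N log N) work. The snippet below just does that arithmetic. The base-2 logarithm and the helper name knn_edge_count are assumptions for illustration; the constant SPORE actually uses for k is not given here.

```python
import math


def knn_edge_count(n):
    """Return (k, total directed edges) for an assumed k = ceil(log2(n))."""
    k = max(1, math.ceil(math.log2(n)))
    return k, n * k


# Growing N by a factor of 1000 only doubles k in this model.
small = knn_edge_count(1_000)
large = knn_edge_count(1_000_000)
```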

scikit-learn compatibility

SPORE follows standard scikit-learn estimator conventions: fit, fit_predict, get_params, set_params.

Download files

Source Distribution

spore_clustering-0.1.2.tar.gz (202.8 kB)

Uploaded Source

Built Distribution

spore_clustering-0.1.2-py3-none-any.whl (19.0 kB)

Uploaded Python 3

File details

Details for the file spore_clustering-0.1.2.tar.gz.

File metadata

  • Download URL: spore_clustering-0.1.2.tar.gz
  • Size: 202.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for spore_clustering-0.1.2.tar.gz
Algorithm Hash digest
SHA256 e200e60d27ad41a68a71cd707e4cfa3d10eb7217cb91845a341c344aab527d0d
MD5 ab01d88ed86318433c611a43c94d7c0c
BLAKE2b-256 6dd5145676e1c2297045934d6f4659ed38a092e615e8cd499d7adef6cf42814f

Provenance

The following attestation bundles were made for spore_clustering-0.1.2.tar.gz:

Publisher: release-and-publish.yml on RandyWAidoo/SPORE

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file spore_clustering-0.1.2-py3-none-any.whl.

File hashes

Hashes for spore_clustering-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 07bfc1a6e772894d3fd9669517f43867115afc9d61caf20990cd589be7da9353
MD5 928e3563a28f77e63e15dab264310213
BLAKE2b-256 c5293a3a8a7236cd08f6c9a1493ac1abd9a6b5036a972a04505f03ada284316c

Provenance

The following attestation bundles were made for spore_clustering-0.1.2-py3-none-any.whl:

Publisher: release-and-publish.yml on RandyWAidoo/SPORE
