The SPORE clustering algorithm.
SPORE
SPORE (Skeleton Propagation Over Recalibrating Expansions) is a graph-based clustering algorithm for nonlinear clusters under heterogeneous density and weak boundary contrast.
The Algorithm
SPORE builds a reusable k-nearest-neighbor graph, then runs two main phases:
- Expansion: clusters are seeded from dense regions and expanded with breadth-first search over the k-NN graph. Candidate neighbors are accepted only when their distances are consistent with the growing cluster's evolving distance statistics. This lets each cluster adapt to its own local density scale while still following nonconvex shapes.
- Small-Cluster Reassignment (SCR): clusters below `min_cluster_size` are treated as fragments. Fragment points are reassigned to established clusters using local k-NN majority voting, with candidate neighbors filtered by cluster size and density compatibility. Any fragments still unresolved after SCR can be labeled as noise or left unchanged, depending on `post_reassignment_policy`.
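The Expansion phase described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the package's implementation. The brute-force `knn_graph` helper, the use of mean k-NN distance as a density proxy, and the acceptance rule `d <= mean + z * std` over accepted edge lengths are all assumptions made for the example. `retention_rate` and SCR are left out.

```python
import math
from collections import deque


def knn_graph(points, k):
    """Brute-force k-NN: for each point, its k nearest (distance, index) pairs."""
    graph = []
    for i, p in enumerate(points):
        pairs = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph.append(pairs[:k])
    return graph


def expand_clusters(points, k=3, z=2.0):
    """Toy expansion: BFS over the k-NN graph with a per-cluster distance filter."""
    graph = knn_graph(points, k)
    # Density proxy (assumption): smaller mean k-NN distance means a denser point.
    order = sorted(range(len(points)), key=lambda i: sum(d for d, _ in graph[i]) / k)
    labels = [-1] * len(points)
    next_label = 0
    for seed in order:
        if labels[seed] != -1:
            continue  # already absorbed by an earlier cluster
        labels[seed] = next_label
        # Start the cluster's distance statistics from the seed's own edges.
        accepted = [d for d, _ in graph[seed]]
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            # Recompute the statistics each time a point is dequeued.
            mean = sum(accepted) / len(accepted)
            std = math.sqrt(sum((d - mean) ** 2 for d in accepted) / len(accepted))
            for d, j in graph[i]:
                # Accept an unlabeled neighbor only if its edge length fits
                # the cluster's current distance distribution.
                if labels[j] == -1 and d <= mean + z * std:
                    labels[j] = next_label
                    accepted.append(d)
                    queue.append(j)
        next_label += 1
    return labels


# Two groups on a line: spacing 1 inside each group, a gap of 2.5 between them.
points = [(float(x), 0.0) for x in range(6)] + [(7.5 + x, 0.0) for x in range(6)]
print(expand_clusters(points, k=3, z=2.0))
```

On this toy data, `z=2.0` rejects the single edge that spans the gap, so the two groups keep separate labels. A very permissive threshold such as `z=10.0` lets the first cluster absorb every point.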
Installation
pip install spore-clustering
Quick Start
from spore_clustering import SPORE
labels = SPORE().fit_predict(X)
Key Parameters
| Parameter | Description |
|---|---|
| `z` | Z-score threshold controlling how aggressively clusters expand |
| `z_percentile` | Percentile-based alternative to `z`; ignored if `z` is provided |
| `retention_rate` | Fraction of neighbors that must pass the expansion filter for traversal to continue |
| `min_cluster_size` | Minimum established-cluster size; ints are absolute counts, floats are interpreted as `N ** min_cluster_size` |
| `max_z` | Maximum z-score allowed for candidate receiving-cluster neighbors during SCR |
| `max_z_percentile` | Percentile-based alternative to `max_z`; ignored if `max_z` is provided |
| `max_scr_rounds` | Maximum number of SCR propagation rounds |
| `post_reassignment_policy` | Whether remaining unresolved small clusters become noise or are left unchanged |
See the full API reference for all parameters.
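To make the two forms of `min_cluster_size` concrete, here is a small sketch of how an int or float value could map to an absolute point count. `resolve_min_cluster_size` is a hypothetical helper written for this example, not part of the package, and rounding to the nearest integer is an assumption; the library may round differently.

```python
def resolve_min_cluster_size(n_samples, min_cluster_size):
    """Hypothetical helper: convert min_cluster_size to an absolute count."""
    if isinstance(min_cluster_size, int):
        # Ints are taken as absolute point counts.
        return min_cluster_size
    # Floats act as an exponent on the dataset size N.
    # Rounding to the nearest integer is an assumption of this sketch.
    return max(1, round(n_samples ** min_cluster_size))


print(resolve_min_cluster_size(10_000, 25))   # absolute count
print(resolve_min_cluster_size(10_000, 0.5))  # N ** 0.5
```

Under these assumptions, a float of `0.5` on 10,000 points gives a threshold of 100 points, so the threshold grows sublinearly with dataset size.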
Reusing a Precomputed Neighbor Index
dindex = SPORE.DataIndex(
    connectivity=k,
    neighbors=neighbors,
    dists=distances,
    dataset_scale=scale,
)
labels = SPORE(dindex=dindex, retention_rate=0.25).fit_predict(X)
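The snippet above uses `k`, `neighbors`, and `distances` without showing where they come from. As a rough sketch, neighbor and distance arrays of shape `(n_samples, k)` could be produced by brute force with NumPy as below. `brute_force_knn` is a hypothetical helper. The array layout `DataIndex` actually expects (including whether a point counts as its own neighbor) and the meaning of `dataset_scale` are not specified here, so check the API reference before passing these arrays in.

```python
import numpy as np


def brute_force_knn(X, k):
    """Return (neighbors, distances), each of shape (n_samples, k), excluding self."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances; O(N^2) memory, so small demos only.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # never select a point as its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    distances = np.take_along_axis(dist, neighbors, axis=1)
    return neighbors, distances


X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
neighbors, distances = brute_force_knn(X, k=1)
print(neighbors.ravel(), distances.ravel())
```

For real datasets an efficient exact or approximate k-NN backend would replace the quadratic distance matrix.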
Time Complexity
With an efficient k-NN backend and default neighbor scaling, where k ~ O(log N):
| Phase | Complexity |
|---|---|
| k-NN graph construction | O(N d log N) |
| Expansion | O(N log N) |
| SCR | O(N log N) |
In the worst case, with a bounded number of SCR rounds, the clustering phases after neighbor construction scale as O(N log N). Including approximate k-NN construction, the practical overall complexity is O(N d log N).
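The O(N log N) figure follows from the size of the graph: a breadth-first pass dequeues each point once and scans its k outgoing edges, so the work is proportional to N * k. The short calculation below illustrates this, assuming k = ceil(log2(N)) purely for the example; the package's actual default scaling of k is not stated here.

```python
import math


def knn_edge_count(n_samples):
    """Directed k-NN edges when k = ceil(log2(N)), an assumed scaling."""
    k = math.ceil(math.log2(n_samples))
    return n_samples * k


for n in (1_024, 1_000_000):
    print(n, knn_edge_count(n))
```

Under that assumption, about a thousandfold increase in N roughly doubles k (10 to 20), so the edge count grows only slightly faster than linearly.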
Scikit-learn Compatibility
SPORE follows standard scikit-learn estimator conventions: fit, fit_predict, get_params, and set_params.
File details
Details for the file spore_clustering-1.0.0.tar.gz.
File metadata
- Download URL: spore_clustering-1.0.0.tar.gz
- Upload date:
- Size: 202.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 63553582d73f49659bac9ca8cee4b41cad534ce1a2f7dc36f66e0338289dcae3 |
| MD5 | e69c75f7ab6654ab2725c250179e26ff |
| BLAKE2b-256 | 3b87d8a075f5224d862541d19e7dc6d3fa27e6d005fa790f8b4ee8c27c0cd67f |
Provenance
The following attestation bundles were made for spore_clustering-1.0.0.tar.gz:
Publisher: release-and-publish.yml on RandyWAidoo/SPORE
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: spore_clustering-1.0.0.tar.gz
  - Subject digest: 63553582d73f49659bac9ca8cee4b41cad534ce1a2f7dc36f66e0338289dcae3
- Sigstore transparency entry: 1704816421
- Sigstore integration time:
- Permalink: RandyWAidoo/SPORE@f9b8d562007c6ced144510a210275b35ec58f5be
- Branch / Tag: refs/heads/main
- Owner: https://github.com/RandyWAidoo
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-and-publish.yml@f9b8d562007c6ced144510a210275b35ec58f5be
- Trigger Event: push
File details
Details for the file spore_clustering-1.0.0-py3-none-any.whl.
File metadata
- Download URL: spore_clustering-1.0.0-py3-none-any.whl
- Upload date:
- Size: 18.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2b9c797001df3947819d21ab6d7a0a514d36c6ceaf64954ecb774e5e45cfe21c |
| MD5 | b75b0074f1d2265537bc36ba48b873c9 |
| BLAKE2b-256 | a857f13100fd9369a1514ebf3f06070be025f4c57d5b63f0a8775cfedbb0d400 |
Provenance
The following attestation bundles were made for spore_clustering-1.0.0-py3-none-any.whl:
Publisher: release-and-publish.yml on RandyWAidoo/SPORE
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: spore_clustering-1.0.0-py3-none-any.whl
  - Subject digest: 2b9c797001df3947819d21ab6d7a0a514d36c6ceaf64954ecb774e5e45cfe21c
- Sigstore transparency entry: 1704816431
- Sigstore integration time:
- Permalink: RandyWAidoo/SPORE@f9b8d562007c6ced144510a210275b35ec58f5be
- Branch / Tag: refs/heads/main
- Owner: https://github.com/RandyWAidoo
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-and-publish.yml@f9b8d562007c6ced144510a210275b35ec58f5be
- Trigger Event: push