Spectral Bridges clustering algorithm
Spectral Bridges
Spectral Bridges is a Python package that implements a novel clustering algorithm combining k-means and spectral clustering techniques. It leverages efficient affinity matrix computation and merges clusters based on a connectivity measure inspired by SVM's margin concept. This package is designed to provide robust clustering solutions, particularly suited for large datasets.
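To make the description above concrete, here is a minimal NumPy-only sketch of a two-stage scheme of this kind: k-means splits the data into nodes, an affinity matrix is computed between nodes, and spectral clustering merges the nodes into the final clusters. It is not the package's implementation. The helper names (`kmeans`, `bridge_affinity`, `spectral_labels`, `two_stage_clustering`), the farthest-point initialization, and the exact affinity formula (a projection of each region's points onto the segment joining two centroids) are all assumptions made for illustration.

```python
import numpy as np

def assign_to_centers(X, centers):
    """Index of the nearest center for every row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    d2 = ((X - centers[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        centers.append(X[d2.argmax()])
        d2 = np.minimum(d2, ((X - centers[-1]) ** 2).sum(axis=1))
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        assign = assign_to_centers(X, centers)
        for j in range(k):
            if np.any(assign == j):  # keep the old center if a region is empty
                centers[j] = X[assign == j].mean(axis=0)
    return centers, assign_to_centers(X, centers)

def bridge_affinity(X, centers, assign):
    """Assumed affinity: how far points of two regions reach toward each other."""
    m = len(centers)
    counts = np.bincount(assign, minlength=m)
    A = np.zeros((m, m))
    for k in range(m):
        Xk = X[assign == k] - centers[k]  # points of region k, centered
        for l in range(m):
            seg = centers[l] - centers[k]
            denom = seg @ seg
            if l == k or counts[k] == 0 or denom == 0:
                continue
            # Positive part of the normalized projection onto the segment k -> l
            t = np.clip(Xk @ seg / denom, 0.0, None)
            A[k, l] = np.sum(t ** 2)
    A = A + A.T  # combine both directions, then average over the two regions
    n_pair = np.maximum(counts[:, None] + counts[None, :], 1)
    return np.sqrt(A / n_pair)

def spectral_labels(A, n_clusters):
    """Cluster the nodes from the bottom eigenvectors of the normalized Laplacian."""
    deg = np.maximum(A.sum(axis=1), 1e-12)
    d = 1.0 / np.sqrt(deg)
    L = np.eye(len(A)) - d[:, None] * A * d[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    emb = vecs[:, :n_clusters]
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    _, labels = kmeans(emb, n_clusters)
    return labels

def two_stage_clustering(X, n_clusters, n_nodes):
    centers, assign = kmeans(X, n_nodes)        # stage 1: many small nodes
    A = bridge_affinity(X, centers, assign)     # affinity between nodes
    node_labels = spectral_labels(A, n_clusters)  # stage 2: merge nodes
    return node_labels[assign]                  # each point inherits its node's label

# Two well-separated Gaussian blobs as toy data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(5.0, 0.3, (50, 2))])
labels = two_stage_clustering(X, n_clusters=2, n_nodes=6)
print(labels)
```

Because the affinity matrix has one row per node rather than one per data point, the spectral step stays small even when the dataset is large, which is the usual motivation for this kind of design.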
Features
- Spectral Bridges Algorithm: Integrates k-means and spectral clustering with efficient affinity matrix calculation for improved clustering results.
- Scalability: Designed to handle large datasets through efficient affinity matrix computation.
- Customizable: Parameters such as number of clusters, iterations, and random state allow flexibility in clustering configurations.
- Model selection: Automatically selects the number of nodes (m) according to a normalized eigengap metric.
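As an illustration of the model selection feature, the sketch below computes one common form of a normalized eigengap from an affinity matrix. The definition used here, `(λ_{k+1} - λ_k) / λ_{k+1}` on the eigenvalues of the symmetric normalized Laplacian with `k = n_clusters`, is an assumption, and the function name `normalized_eigengap` is hypothetical; the package may define its metric differently.

```python
import numpy as np

def normalized_eigengap(affinity, n_clusters):
    """Assumed metric: relative gap after the n_clusters-th Laplacian eigenvalue."""
    A = np.asarray(affinity, dtype=float)
    deg = np.maximum(A.sum(axis=1), 1e-12)
    d = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    L = np.eye(len(A)) - d[:, None] * A * d[None, :]
    eigvals = np.linalg.eigvalsh(L)  # ascending order
    lam_k, lam_next = eigvals[n_clusters - 1], eigvals[n_clusters]
    return 0.0 if lam_next <= 0 else (lam_next - lam_k) / lam_next

# Two tightly connected pairs of nodes with weak links between the pairs
e = 0.01
blocks = np.array([[0, 1, e, e],
                   [1, 0, e, e],
                   [e, e, 0, 1],
                   [e, e, 1, 0]])
# All nodes equally connected: no two-cluster structure
uniform = np.ones((4, 4)) - np.eye(4)

gap_blocks = normalized_eigengap(blocks, n_clusters=2)
gap_uniform = normalized_eigengap(uniform, n_clusters=2)
print(gap_blocks, gap_uniform)
```

Under this definition, a value close to 1 indicates that the affinity matrix separates cleanly into `n_clusters` groups, while a value close to 0 indicates no clear structure at that number of clusters.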
Speed
Starting with version 1.0.0, Spectral Bridges uses FAISS's efficient k-means implementation together with a clone of a scikit-learn method for centroid initialization, which is much faster (over 2x improvement).
Installation
You can install the package via pip:
```
pip install spectral-bridges
```
Usage
Example
```python
import numpy as np
import spectralbridges as sb

# Generate sample data
np.random.seed(0)
X = np.random.rand(100, 10)  # Replace with your dataset

# Initialize Spectral Bridges with the number of clusters and a random seed
# (a number of nodes can also be specified here if needed)
model = sb.SpectralBridges(n_clusters=5, random_state=42)

# Define the range of nodes to evaluate: an iterable or a single int
n_nodes_range = [10, 15, 20]

# Find the optimal number of nodes for the given number of clusters.
# fit_select modifies the instance attributes and returns a dict.
# If n_nodes_range is None, the model uses self.n_nodes when it is not None.
mean_ngaps = model.fit_select(X, n_nodes_range)
print("Optimal number of nodes:", model.n_nodes)
print("Dict of mean normalized eigengaps:", mean_ngaps)

# Predict clusters for new data points
new_data = np.random.rand(20, 10)  # Replace with new data
predicted_clusters = model.predict(new_data)
print("Predicted clusters:", predicted_clusters)

# With a custom number of nodes and a p-bridge affinity
custom_model = sb.SpectralBridges(n_clusters=5, n_nodes=12, p=1)

# Fit the model
custom_model.fit(X)

# Predict the same way
custom_predicted_clusters = custom_model.predict(new_data)
print("Predicted clusters:", custom_predicted_clusters)
```
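The dict returned by `fit_select` can be inspected to see how the candidates compare. The short sketch below uses made-up values in place of a real result, and it assumes that the keys are the candidate node counts and that a larger mean normalized eigengap is better; neither assumption is confirmed by the text above.

```python
# Hypothetical values standing in for the dict returned by fit_select;
# keys are assumed to be the candidate numbers of nodes.
mean_ngaps = {10: 0.41, 15: 0.58, 20: 0.52}

# Candidate with the largest mean normalized eigengap (assumed to be the best)
best_n_nodes = max(mean_ngaps, key=mean_ngaps.get)

# Rank all candidates from best to worst for a quick overview
ranked = sorted(mean_ngaps.items(), key=lambda kv: kv[1], reverse=True)
for n_nodes, gap in ranked:
    print(f"n_nodes={n_nodes:>3}  mean normalized eigengap={gap:.2f}")
print("Best candidate:", best_n_nodes)
```

Comparing the ranked values can help decide whether the selected `n_nodes` is a clear winner or whether several candidates perform similarly.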