Enhancing unsupervised learning with geometry-density interactions via decomposition of data into geometric core-periphery layers and subsequent clustering

cplearn

cplearn is a Python toolkit for unsupervised learning on data with underlying core–periphery-like structures.
The package includes:

  • CoreSPECT – identifies layers of the data, ordered from most to least separable with respect to clustering, and produces a clustering of most of the data.
  • CoreMAP – visualizes the data with respect to the underlying layered structure derived by CoreSPECT, using a novel anchor-based optimization.
  • Visualizer – interactive plots of the core structure and subsequent layers.
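
To make the idea of a dense "core" concrete, here is a toy sketch that ranks points by the distance to their k-th nearest neighbor and keeps the densest fraction. This is only an illustration of a density ranking and is not the ranking CoreSPECT uses. The function name `densest_fraction` and its parameters are hypothetical and not part of cplearn.

```python
import math


def densest_fraction(points, core_fraction=0.15, k=5):
    """Return indices of the densest points (toy stand-in, NOT cplearn's method).

    Density proxy: a smaller distance to the k-th nearest neighbor means
    the point sits in a denser region.
    """
    n = len(points)
    if not 0 < core_fraction <= 1:
        raise ValueError("core_fraction must be in (0, 1]")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    kth_dist = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kth_dist.append(dists[k - 1])

    # Rank from densest (smallest k-th neighbor distance) to sparsest.
    order = sorted(range(n), key=lambda i: kth_dist[i])
    n_core = max(1, round(core_fraction * n))
    return order[:n_core]


# 16 tightly packed points (indices 0-15) plus 4 widely spaced ones (16-19).
dense = [(0.1 * i, 0.1 * j) for i in range(4) for j in range(4)]
sparse = [(5.0 + 2 * i, 5.0 + 2 * j) for i in range(2) for j in range(2)]
points = dense + sparse

core = densest_fraction(points, core_fraction=0.25, k=3)
print(sorted(core))
```

All selected indices fall in the tightly packed group, which is the behavior a core-finding step is meant to have. The brute-force neighbor search is quadratic in the number of points, so it is suitable only for small illustrations.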

Installation

From PyPI:

pip install cplearn
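
After installing, one quick way to confirm that the package is visible to the current interpreter is to query its distribution metadata. The sketch below uses only the standard library. The helper name `installed_version` is an illustrative choice, not something cplearn provides.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("cplearn")
print("cplearn version:", version if version else "not installed")
```

If this prints "not installed", the `pip` that ran the install probably belongs to a different environment than the interpreter running the check.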

Quickstart

import numpy as np
from sklearn.mixture import GaussianMixture

from cplearn.corespect import Corespect
from cplearn.visualizer import Visualizer as viz

# Generate synthetic 10D Gaussian Mixture data
n_samples = 500
n_components = 3
dim = 10
rng = np.random.RandomState(42)

# Define means and covariances
means = np.array([
    np.zeros(dim),
    np.ones(dim) * 2,
    np.ones(dim) * -2
])
cov = np.eye(dim) * 1.5
covariances = np.stack([cov] * n_components)

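# Set up the GMM manually: assign the fitted attributes instead of calling fit()
# Pass rng so that sampling is reproducible
gmm = GaussianMixture(n_components=n_components, covariance_type='full', random_state=rng)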
gmm.weights_ = np.ones(n_components) / n_components
gmm.means_ = means
gmm.covariances_ = covariances
gmm.precisions_cholesky_ = np.linalg.cholesky(np.linalg.inv(covariances))

# Sample data
X, _ = gmm.sample(n_samples)


# Initialize and run CoreSPECT
corespect = Corespect(X)

# 1. Find dense core (e.g. 15% of data)
core = corespect.find_core(core_fraction=0.15)

# 2. Cluster core using Louvain
cluster_core = corespect.cluster_core(
    core,
    cluster_algo="louvain",
    cluster_algo_params={"ng_num": 20, "resolution": 1}
)

# 3. Propagate cluster labels outward
# Alternatively, set propagate_algo="CDNN" to use the algorithm described in the CoreSPECT paper.
propagated_data = corespect.propagate_labels(
    cluster_core,
    propagate_algo="adaptive_majority",
    propagate_algo_params={"ng_num": 20}
)

# 4. Extract layers and labels
layers, labels_for_layer = propagated_data.get_layers_and_labels()

# 5. Visualize (Visualizer internally calls CoreMAP for the embedding)
# Use mode "three_steps" with "adaptive_majority".
# Use mode "layerwise" for a more in-depth, layer-by-layer visualization;
# it is also the mode to use when propagate_algo is "CDNN".
mode_choice = "three_steps"
fig = viz(corespect, mode=mode_choice).fig
fig.show()  # or fig.write_html("corespect_viz.html")
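
The synthetic data in the quickstart can also be drawn directly with NumPy. That avoids assigning fitted attributes to `GaussianMixture` by hand, and it keeps the ground-truth component labels for later evaluation. The sketch below assumes the same setup as above: three equal-weight components with covariance 1.5 times the identity. The function name `sample_isotropic_mixture` is hypothetical, and the random stream differs from `gmm.sample`, so the individual points will not match.

```python
import numpy as np


def sample_isotropic_mixture(means, variance, n_samples, seed=0):
    """Draw points from an equal-weight mixture of isotropic Gaussians.

    Returns (X, y): X has shape (n_samples, dim), y holds component indices.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    n_components, dim = means.shape

    # Pick a component for every sample with equal probability.
    y = rng.integers(0, n_components, size=n_samples)

    # Shift scaled standard-normal noise to the chosen mean (cov = variance * I).
    X = means[y] + np.sqrt(variance) * rng.standard_normal((n_samples, dim))
    return X, y


dim = 10
means = np.stack([np.zeros(dim), np.full(dim, 2.0), np.full(dim, -2.0)])
X, y = sample_isotropic_mixture(means, variance=1.5, n_samples=500, seed=42)
print(X.shape, y.shape)
```

The array `X` can be passed to `Corespect(X)` exactly as in the quickstart, while `y` stays available as a reference labeling.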

References

If you use this package in your research, please cite:

  • CoreSPECT
    Chandra Sekhar Mukherjee, Joonyoung Bae, and Jiapeng Zhang.
    CoreSPECT: Enhancing Clustering Algorithms via an Interplay of Density and Geometry.
    https://arxiv.org/abs/2507.08243

  • Recursive and adaptive majority propagation – papers coming soon

  • CoreMAP – paper coming soon

Other related work

  • Balanced Ranking
    Chandra Sekhar Mukherjee and Jiapeng Zhang.
    Balanced Ranking with Relative Centrality: A Multi-Core Periphery Perspective.
    ICLR 2025.

License

This package is licensed under the GNU General Public License v3.0 (GPLv3).
See the full license in the LICENSE file.

