Skip to main content

Enhancing unsupervised learning with geometry-density interactions via decomposition of data into geometric core-periphery layers and subsequent clustering

Project description

cplearn

cplearn is a Python toolkit for unsupervised learning on data with underlying core–periphery-like structures.
The package includes:

  • CoreSPECT – identifies layers in the data from most-to-least separable parts of clustering, along with a clustering of most of the data
  • CoreMAP – Visualization w.r.t. underlying layered structure as derived by corespect using a novel anchor-based optimization.
  • Visualizer – interactive plots for visualizing core structure and subsequent layers

Installation

From PyPI:

pip install cplearn

Quickstart

import numpy as np
from sklearn.mixture import GaussianMixture

from cplearn.corespect import Corespect
from cplearn.visualizer import Visualizer as viz

# Generate synthetic 10D Gaussian Mixture data
n_samples = 500
n_components = 3
dim = 10
rng = np.random.RandomState(42)

# Define means and covariances
means = np.array([
    np.zeros(dim),
    np.ones(dim) * 2,
    np.ones(dim) * -2
])
cov = np.eye(dim) * 1.5
covariances = np.stack([cov] * n_components)

# Setup GMM manually
gmm = GaussianMixture(n_components=n_components, covariance_type='full')
gmm.weights_ = np.ones(n_components) / n_components
gmm.means_ = means
gmm.covariances_ = covariances
gmm.precisions_cholesky_ = np.linalg.cholesky(np.linalg.inv(covariances))

# Sample data
X, _ = gmm.sample(n_samples)


# Initialize and run CoreSPECT
corespect = Corespect(X)

# 1. Find dense core (e.g. 15% of data)
core = corespect.find_core(core_fraction=0.15)

# 2. Cluster core using Louvain
cluster_core = corespect.cluster_core(
    core,
    cluster_algo="louvain",
    cluster_algo_params={"ng_num": 20, "resolution": 1}
)

# 3. Propagate cluster labels outward
propagated_data = corespect.propagate_labels(
    cluster_core,
    propagate_algo="adaptive_majority", #Alternatively use "CDNN" to use the algorithm described in the CoreSPECT paper.
    propagate_algo_params={"ng_num": 20}
)

# 4. Extract layers and labels
layers, labels_for_layer = propagated_data.get_layers_and_labels()

# 5. Visualize (Visualizer internally calls CoreMAP for embedding)
mode_choice='three_steps' #use this with adaptive_majority
#You can use mode_choice='layerwise' to see a more in-depth layer-by-layer visualization
#Use layerwise if propagate_algo is "CDNN"
fig = viz(corespect,mode=mode_choice).fig
fig.show()   # or fig.write_html("corespect_viz.html")

References

If you use this package in your research, please cite:

  • CoreSPECT
    Chandra Sekhar Mukherjee, Joonyoung Bae, and Jiapeng Zhang.
    CoreSPECT: Enhancing Clustering Algorithms via an Interplay of Density and Geometry. *link: https://arxiv.org/abs/2507.08243 *

  • Recursive and adaptive majority propagation – papers coming soon

  • CoreMAP – paper coming soon

Other related work

  • Balanced Ranking
    Chandra Sekhar Mukherjee and Jiapeng Zhang.
    Balanced Ranking with Relative Centrality: A Multi-Core Periphery Perspective.
    ICLR 2025.

License

This package is licensed under the GNU General Public License v3.0 (GPLv3).
See the full license in the LICENSE file.


Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cplearn-0.1.0.tar.gz (50.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

cplearn-0.1.0-py3-none-any.whl (56.6 kB view details)

Uploaded Python 3

File details

Details for the file cplearn-0.1.0.tar.gz.

File metadata

  • Download URL: cplearn-0.1.0.tar.gz
  • Upload date:
  • Size: 50.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for cplearn-0.1.0.tar.gz
Algorithm Hash digest
SHA256 0278a6c171d0a3b487914f4e5e4045f7158e0d4c003969d22169377c0647c410
MD5 6d6bee2145a2fd56a5231d5e02b34659
BLAKE2b-256 2c2247469c33ea25f78bf8cbaf0bdf7c094367fac7e05a8f0fe43ac1c4c20acb

See more details on using hashes here.

File details

Details for the file cplearn-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: cplearn-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 56.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for cplearn-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 564688be4c8273d72688392bd502d3d37289cbbc0f19aabf8fbbd252292814f8
MD5 1669d3144f6a530c913cea4294193a7d
BLAKE2b-256 b8e6a2d36dc4146c8fbc4bb5d618adbdb1d29ae4bde8ff989f0c4579005b6da8

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page