Active semi-supervised clustering algorithms for scikit-learn

active-semi-supervised-clustering

Algorithms

Semi-supervised clustering

  • Seeded-KMeans
  • Constrained-KMeans
  • COP-KMeans
  • Pairwise constrained K-Means (PCK-Means)
  • Metric K-Means (MK-Means)
  • Metric pairwise constrained K-Means (MPCK-Means)

Active learning of pairwise clustering

  • Explore & Consolidate
  • Min-max
  • Normalized point-based uncertainty (NPU) method

Installation

pip install active-semi-supervised-clustering

Usage

from sklearn import datasets, metrics
from active_semi_clustering.semi_supervised.pairwise_constraints import PCKMeans
from active_semi_clustering.active.pairwise_constraints import ExampleOracle, ExploreConsolidate, MinMax
X, y = datasets.load_iris(return_X_y=True)

First, obtain some pairwise constraints from an oracle.

# TODO: implement your own oracle that, for example, queries a domain expert via a GUI or CLI.
oracle = ExampleOracle(y, max_queries_cnt=10)

active_learner = MinMax(n_clusters=3)
active_learner.fit(X, oracle=oracle)
pairwise_constraints = active_learner.pairwise_constraints_
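
The TODO above asks for a custom oracle. The following is a minimal sketch of one, not the package's own implementation. It assumes the active learner only needs an object with a `query(i, j)` method that returns whether samples `i` and `j` belong together; that method name, the `queries_cnt` counter, and the class name `CallbackOracle` are assumptions for illustration and should be checked against `ExampleOracle` in the package source. The package may also expect a specific exception when the query budget runs out; the generic `RuntimeError` here is only a placeholder.

```python
class CallbackOracle:
    """Hypothetical oracle that forwards each pairwise query to a callable.

    answer_fn(i, j) should return a truthy value when samples i and j
    belong to the same cluster (must-link) and a falsy value otherwise.
    """

    def __init__(self, answer_fn, max_queries_cnt=10):
        self.answer_fn = answer_fn
        self.max_queries_cnt = max_queries_cnt
        self.queries_cnt = 0  # number of queries answered so far

    def query(self, i, j):
        # Refuse to answer once the query budget is spent.
        if self.queries_cnt >= self.max_queries_cnt:
            raise RuntimeError("query budget exhausted")  # placeholder exception
        self.queries_cnt += 1
        return bool(self.answer_fn(i, j))


def ask_on_cli(i, j):
    """Example answer function that asks a person on the command line."""
    reply = input(f"Do samples {i} and {j} belong together? [y/n] ")
    return reply.strip().lower().startswith("y")
```

Under these assumptions, `CallbackOracle(ask_on_cli, max_queries_cnt=10)` would take the place of `ExampleOracle(y, max_queries_cnt=10)`.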

Then, use the constraints to do the clustering.

clusterer = PCKMeans(n_clusters=3)
clusterer.fit(X, ml=pairwise_constraints[0], cl=pairwise_constraints[1])
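
PCK-Means treats pairwise constraints as penalties rather than hard rules, so a fitted clustering can still break some of them. The sketch below counts such violations. It assumes that `ml` and `cl` are sequences of `(i, j)` index pairs, as the usage above suggests; the helper name `count_violations` is hypothetical and not part of the package.

```python
def count_violations(labels, ml, cl):
    """Count constraints that a cluster assignment fails to satisfy.

    labels: sequence of cluster labels, one per sample
    ml:     must-link pairs (i, j), assumed to be index pairs
    cl:     cannot-link pairs (i, j), assumed to be index pairs
    Returns (violated must-links, violated cannot-links).
    """
    # A must-link is violated when the two samples land in different clusters.
    ml_violated = sum(1 for i, j in ml if labels[i] != labels[j])
    # A cannot-link is violated when the two samples share a cluster.
    cl_violated = sum(1 for i, j in cl if labels[i] == labels[j])
    return ml_violated, cl_violated
```

With the objects from the usage above, this could be called as `count_violations(clusterer.labels_, pairwise_constraints[0], pairwise_constraints[1])`.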

Evaluate the clustering against the true labels using the adjusted Rand score.

metrics.adjusted_rand_score(y, clusterer.labels_)
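
For readers unfamiliar with the score, the following is a small pure-Python sketch of the standard pair-counting formula for the adjusted Rand index. It is meant only to show what the number measures; it is not scikit-learn's implementation, and the function name `adjusted_rand_index` is chosen here for illustration. In practice, `metrics.adjusted_rand_score` is the call to use.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index computed from pair counts."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: treat as trivially identical

    # Pairs placed together in both labelings, in the true one, in the predicted one.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    # Agreement expected by chance, and the maximum attainable agreement.
    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2

    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (together_both - expected) / (maximum - expected)
```

The score is 1.0 when the two labelings agree up to a renaming of the clusters, and it is close to 0 when the agreement is no better than chance.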

Download files

Source Distribution

active-semi-supervised-clustering-0.0.1.tar.gz (10.5 kB)

Built Distribution

File details

Details for the file active-semi-supervised-clustering-0.0.1.tar.gz.

File metadata

  • Download URL: active-semi-supervised-clustering-0.0.1.tar.gz
  • Upload date:
  • Size: 10.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.19.1 setuptools/40.3.0 requests-toolbelt/0.8.0 tqdm/4.26.0 CPython/3.6.3

File hashes

Hashes for active-semi-supervised-clustering-0.0.1.tar.gz
  • SHA256: 5ce2b210988560754a3ca1ac33bc20f60174c7b700504418355ea09e6c149efc
  • MD5: b7bf75e99c995593f831865fac6922bf
  • BLAKE2b-256: 84cc8189ebe735cd7b6c53869775969d89c6fe2d68a872ddd1cc24df3a38d1ba

File details

Details for the file active_semi_supervised_clustering-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: active_semi_supervised_clustering-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 40.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.19.1 setuptools/40.3.0 requests-toolbelt/0.8.0 tqdm/4.26.0 CPython/3.6.3

File hashes

Hashes for active_semi_supervised_clustering-0.0.1-py3-none-any.whl
  • SHA256: 754ab7082c5343a74c9f3089928348622bfc52147062049baa79c53aa584a566
  • MD5: 1b5fd0f81a0703f3d0737b11cc35be9d
  • BLAKE2b-256: e5734eb6a2966b94de7ca401d87de4104015bf3c911df0434bd99e1eeac67a84
