
python-cluster is a "simple" package that lets you create several groups (clusters) of objects from a list


DESCRIPTION

python-cluster is a “simple” package that lets you create several groups (clusters) of objects from a list. It is meant to be flexible and able to cluster any kind of object. To make this possible, you supply not only the list of objects but also a function that calculates the similarity between two of those objects, expressed as a distance: the smaller the value, the more similar the objects. For simple datatypes such as integers, this can be as simple as the absolute difference, but more complex calculations are possible. Currently, clusters can be generated with hierarchical clustering or with the popular K-Means algorithm. For the hierarchical algorithm, several “linkage” methods are available: single, complete, average and uclus.

Algorithms are based on the document found at http://www.elet.polimi.it/upload/matteucc/Clustering/tutorial_html/
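To illustrate how linkage methods differ, here is a minimal, self-contained sketch that does not use python-cluster itself. It assumes the textbook definitions of single, complete and average linkage; the function names are hypothetical and are not the package's API, and uclus is left out.

```python
from itertools import product


def abs_diff(x, y):
    """Distance between two numbers, as in the integer example."""
    return abs(x - y)


def pairwise(a, b, distance):
    """All distances between a member of cluster a and a member of cluster b."""
    return [distance(x, y) for x, y in product(a, b)]


def single_linkage(a, b, distance):
    # Distance between the two closest members.
    return min(pairwise(a, b, distance))


def complete_linkage(a, b, distance):
    # Distance between the two farthest members.
    return max(pairwise(a, b, distance))


def average_linkage(a, b, distance):
    # Arithmetic mean of all member-to-member distances.
    distances = pairwise(a, b, distance)
    return sum(distances) / len(distances)


left, right = [12, 13], [23, 34]
print(single_linkage(left, right, abs_diff))    # 10
print(complete_linkage(left, right, abs_diff))  # 22
print(average_linkage(left, right, abs_diff))   # 16.0
```

The same two clusters are 10, 22 or 16 units apart depending on the linkage, which is why the choice of linkage can change which clusters are merged at a given level.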

USAGE

A simple Python program could look like this:

>>> from cluster import HierarchicalClustering
>>> data = [12,34,23,32,46,96,13]
>>> cl = HierarchicalClustering(data, lambda x,y: abs(x-y))
>>> cl.getlevel(10)     # get clusters of items closer than 10
[96, 46, [12, 13, 23, 34, 32]]
>>> cl.getlevel(5)      # get clusters of items closer than 5
[96, 46, [12, 13], 23, [34, 32]]

Note that retrieving a set of clusters immediately starts the clustering process, which is computationally expensive. If you intend to create clusters from a large dataset, consider doing that in a separate thread.
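One possible way to move an expensive call off the main thread is the standard library's concurrent.futures module. The sketch below is an assumption about how this could be wired up, not something the package provides: run_in_background and slow_job are hypothetical names, and slow_job merely stands in for a call such as cl.getlevel(10).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_in_background(func, *args):
    """Start func(*args) in a worker thread and return a Future for its result."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args)
    # Do not block here; the already submitted job still runs to completion.
    executor.shutdown(wait=False)
    return future


def slow_job(data):
    """Hypothetical stand-in for an expensive clustering call."""
    time.sleep(0.1)
    return sorted(data)


future = run_in_background(slow_job, [12, 34, 23, 32, 46, 96, 13])
# ... the main thread is free to do other work here ...
result = future.result()  # blocks only until the worker has finished
print(result)
```

With the real package, the call would presumably become run_in_background(cl.getlevel, 10). In CPython a thread keeps the rest of the program responsive but does not make a pure-Python computation itself faster.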

For K-Means clustering it would look like this:

>>> from cluster import KMeansClustering
>>> cl = KMeansClustering([(1,1), (2,1), (5,3), ...])
>>> clusters = cl.getclusters(2)

The parameter passed to getclusters is the number of clusters to generate.
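For readers unfamiliar with K-Means, the following is a minimal sketch of the general algorithm, not the package's implementation. It assumes Euclidean distance, a naive initialisation that takes the first k points as centroids, and a made-up list of sample points; the kmeans function and its parameters are hypothetical.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def kmeans(points, k, iterations=100):
    """Group points (tuples of numbers) into k clusters."""
    centroids = list(points[:k])  # naive initialisation: the first k points
    for _ in range(iterations):
        # Assignment step: put every point into the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return clusters


sample = [(1, 1), (2, 1), (5, 3), (6, 3)]  # made-up sample points
print(kmeans(sample, 2))  # [[(1, 1), (2, 1)], [(5, 3), (6, 3)]]
```

Because the result depends on the initial centroids, different initialisation strategies can produce different clusters for the same data.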


