DESCRIPTION
python-cluster is a “simple” package that lets you group the objects in a list into several clusters. It is meant to be flexible and able to cluster any kind of object. To make that possible, you supply not only the list of objects but also a function that calculates the distance between two of them (the smaller the result, the more similar the objects). For simple data types such as integers, this can be as simple as the absolute value of a subtraction, but more complex calculations are possible. Two algorithms are currently available: hierarchical clustering and the popular K-Means algorithm. The hierarchical algorithm supports several “linkage” methods: single, complete, average and uclus.
Algorithms are based on the document found at http://www.elet.polimi.it/upload/matteucc/Clustering/tutorial_html/
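To make the linkage idea concrete: a linkage method decides how the distance between two clusters is derived from the distances between their members. The following is a minimal sketch using the common textbook definitions of single, complete and average linkage. It is not the package's own code, the helper names (`pairwise`, `single_linkage`, `complete_linkage`, `average_linkage`) are invented for this illustration, and uclus is left out.

```python
from itertools import product
from statistics import mean


def distance(x, y):
    """Distance between two numbers: the absolute difference."""
    return abs(x - y)


def pairwise(cluster_a, cluster_b, dist):
    """Distances between every member of cluster_a and every member of cluster_b."""
    return [dist(a, b) for a, b in product(cluster_a, cluster_b)]


def single_linkage(cluster_a, cluster_b, dist):
    """Cluster distance = distance between the two closest members."""
    return min(pairwise(cluster_a, cluster_b, dist))


def complete_linkage(cluster_a, cluster_b, dist):
    """Cluster distance = distance between the two farthest members."""
    return max(pairwise(cluster_a, cluster_b, dist))


def average_linkage(cluster_a, cluster_b, dist):
    """Cluster distance = mean of all member-to-member distances."""
    return mean(pairwise(cluster_a, cluster_b, dist))
```

For the clusters `[12, 13]` and `[23, 32, 34]`, single linkage gives 10 (between 13 and 23) and complete linkage gives 22 (between 12 and 34), which shows how the choice of linkage can change which clusters are merged at a given level.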
USAGE
A simple Python program could look like this:
    >>> from cluster import HierarchicalClustering
    >>> data = [12,34,23,32,46,96,13]
    >>> cl = HierarchicalClustering(data, lambda x,y: abs(x-y))
    >>> cl.getlevel(10)     # get clusters of items closer than 10
    [96, 46, [12, 13, 23, 34, 32]]
    >>> cl.getlevel(5)      # get clusters of items closer than 5
    [96, 46, [12, 13], 23, [34, 32]]
Note that retrieving a set of clusters immediately starts the clustering process, which is computationally expensive. If you intend to create clusters from a large dataset, consider doing that in a separate thread.
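One way to follow that advice is to hand the clustering call to a worker thread. The sketch below uses the standard library's `concurrent.futures`. So that it runs without python-cluster installed, the expensive call is replaced by `group_within`, a hypothetical pure-Python stand-in that naively merges items lying within a threshold of each other; it is an assumption made for this example, not the package's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def distance(x, y):
    """Distance between two numbers: the absolute difference."""
    return abs(x - y)


def group_within(data, dist, threshold):
    """Hypothetical stand-in for an expensive clustering call.

    Naively merges clusters while any two of them contain members that are
    at most `threshold` apart. This is NOT python-cluster's implementation.
    """
    clusters = [[item] for item in data]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                closest = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if closest <= threshold:
                    # Fold cluster j into cluster i, then rescan from the start.
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters


data = [12, 34, 23, 32, 46, 96, 13]

with ThreadPoolExecutor(max_workers=1) as executor:
    # submit() returns immediately; the work runs on the worker thread.
    future = executor.submit(group_within, data, distance, 10)
    # ... the main thread can do other work here ...
    clusters = future.result()  # blocks until the worker has finished

print(clusters)  # [[12, 13, 23, 32, 34], [46], [96]]
```

In CPython a thread like this keeps the rest of the program responsive, but it does not make a pure-Python computation itself run faster.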
For K-Means clustering it would look like this:
    >>> from cluster import KMeansClustering
    >>> cl = KMeansClustering([(1,1), (2,1), (5,3), ...])
    >>> clusters = cl.getclusters(2)
The parameter passed to getclusters is the number of clusters to generate.
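For readers unfamiliar with the algorithm, the general K-Means idea can be sketched in a few lines of plain Python: assign every point to its nearest centroid, recompute the centroids, and repeat until nothing changes. This is only an illustration under stated assumptions, not how the package implements it: the function names `centroid` and `kmeans`, the choice to seed the centroids with the first `k` points, and the sample data are all invented for this example.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(points, k, max_iterations=100):
    """Group `points` into `k` lists using a basic K-Means loop."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption for this sketch: seed the centroids with the first k points.
    centroids = list(points[:k])
    groups = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Assignment step: put each point in the group of its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        new_centroids = [
            centroid(g) if g else centroids[i] for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return groups
```

With the made-up input `[(1, 1), (2, 1), (5, 3), (6, 4)]` and `k=2`, this sketch settles on the groups `[(1, 1), (2, 1)]` and `[(5, 3), (6, 4)]`.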