A hierarchical clustering algorithm based on information theory

Project description

Python binding

How to build

The binding uses Cython. To package the library, run python setup.py bdist_wheel. To install the package, run pip install --user info_cluster. Prebuilt binary packages are available for the following platforms:

Platform  py3.6  py3.7
Windows   T      T
MacOS     T      T
Linux     T      T

Demo code

We provide a high-level wrapper for the info-clustering algorithm. After installing info_cluster, you can use it as follows:

from info_cluster import InfoCluster
import networkx as nx
g = nx.Graph() # undirected graph
g.add_edge(0, 1, weight=1)
g.add_edge(1, 2, weight=1)
g.add_edge(0, 2, weight=5)
ic = InfoCluster(affinity='precomputed') # use precomputed graph structure
ic.fit(g)
ic.print_hierachical_tree()

The output looks like this:

      /-0
   /-|
--|   \-2
  |
   \-1

The psp module can also be used directly:

import psp
# cluster the same three-vertex graph as above
g = psp.PyGraph(3, [(0,1,1),(1,2,1),(0,2,5)]) # vertex indices start from zero; the similarity between vertices 0 and 2 is 5
g.run() # use the maximal flow algorithm to cluster the vertices
print(g.get_critical_values()) # [2,5]
print(g.get_category(2)) # labels for the partition with at least 2 categories: [0,1,0]
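
To see where the critical values [2,5] and the labels [0,1,0] can come from, the following brute-force sketch enumerates every partition of the three vertices. It assumes that the critical values are the breakpoints, in lambda, of the minimum over partitions P of (total weight of edges cut by P) - lambda * (|P| - 1). That formulation is an assumption made for illustration and is not stated in this README. The helper names (set_partitions, cut_weight, principal_sequence, to_labels) are hypothetical and are not part of info_cluster or psp. The sketch does not use the maximal flow algorithm and is only practical for very small graphs.

```python
def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partial


def cut_weight(blocks, edges):
    """Total weight of the edges whose endpoints lie in different blocks."""
    block_of = {v: i for i, block in enumerate(blocks) for v in block}
    return sum(w for u, v, w in edges if block_of[u] != block_of[v])


def principal_sequence(n, edges):
    """Brute-force critical values and partitions (assumed formulation)."""
    best = {}  # number of blocks -> (minimum cut weight, partition)
    for blocks in set_partitions(list(range(n))):
        k, c = len(blocks), cut_weight(blocks, edges)
        if k not in best or c < best[k][0]:
            best[k] = (c, blocks)
    critical, partitions = [], [best[1][1]]
    k = 1
    while k < n:
        # Smallest slope to a finer partition is the next breakpoint;
        # ties are broken in favour of the finest partition.
        candidates = [((best[j][0] - best[k][0]) / (j - k), -j)
                      for j in range(k + 1, n + 1)]
        lam, neg_j = min(candidates)
        k = -neg_j
        critical.append(lam)
        partitions.append(best[k][1])
    return critical, partitions


def to_labels(blocks, n):
    """Label vertices 0..n-1 by block, numbering blocks by first appearance."""
    block_of = {v: i for i, block in enumerate(blocks) for v in block}
    relabel, labels = {}, []
    for v in range(n):
        relabel.setdefault(block_of[v], len(relabel))
        labels.append(relabel[block_of[v]])
    return labels


edges = [(0, 1, 1), (1, 2, 1), (0, 2, 5)]
critical, partitions = principal_sequence(3, edges)
labels = to_labels(partitions[1], 3)
print(critical)  # [2.0, 5.0]
print(labels)    # [0, 1, 0]
```

Under these assumptions the sketch agrees with the values quoted above: vertex 1 separates from {0, 2} at 2, and vertices 0 and 2 separate at 5.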

Parametric Dilworth Truncation (PDT) implementation

We also provide an alternative implementation, PyGraphPDT, which can be used in the same way as PyGraph. To make it work, apply the patch preflow.patch to preflow.h, which belongs to the LEMON library 1.3.1, before building; see #625.

import psp
g = psp.PyGraphPDT(3, [(0,1,1),(1,2,1),(0,2,5)]) # vertex indices start from zero; the similarity between vertices 0 and 2 is 5
g.run() # use the maximal flow algorithm to cluster the vertices
print(g.get_critical_values()) # [2,5]
print(g.get_category(2)) # labels for the partition with at least 2 categories: [0,1,0]
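
Both constructors above take the number of vertices and a list of (u, v, weight) triples. If the similarities are stored as a symmetric matrix instead, a small helper can produce that list. The sketch below is only an illustration: edges_from_matrix is a hypothetical name, not part of psp, and skipping zero entries is an assumption about how missing edges should be represented.

```python
def edges_from_matrix(sim):
    """Return (i, j, weight) triples from the upper triangle of a symmetric matrix.

    Zero entries are treated as "no edge" (an assumption of this sketch).
    """
    n = len(sim)
    return [(i, j, sim[i][j])
            for i in range(n)
            for j in range(i + 1, n)
            if sim[i][j] != 0]


# Similarity matrix for the three-vertex example used above.
sim = [
    [0, 1, 5],
    [1, 0, 1],
    [5, 1, 0],
]
edge_list = edges_from_matrix(sim)
print(edge_list)  # [(0, 1, 1), (0, 2, 5), (1, 2, 1)]
```

The resulting list contains the same three edges as the examples above, in a different order, and could be passed as the second argument together with len(sim) as the vertex count.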

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

info_cluster-0.5.post1-cp37-cp37m-win_amd64.whl (91.7 kB), CPython 3.7m, Windows x86-64
info_cluster-0.5.post1-cp37-cp37m-manylinux2010_x86_64.whl (2.3 MB), CPython 3.7m, manylinux: glibc 2.12+ x86-64
info_cluster-0.5.post1-cp37-cp37m-macosx_10_14_x86_64.whl (118.3 kB), CPython 3.7m, macOS 10.14+ x86-64
info_cluster-0.5.post1-cp36-cp36m-win_amd64.whl (91.9 kB), CPython 3.6m, Windows x86-64
info_cluster-0.5.post1-cp36-cp36m-manylinux2010_x86_64.whl (2.3 MB), CPython 3.6m, manylinux: glibc 2.12+ x86-64
