
A hierarchical clustering algorithm based on information theory

Project description


Python binding


How to build

The binding uses the boost-python library. To enable it, run cmake with -DUSE_PYTHON=ON. To make the binding independent of the Boost dynamic library, enable static linking in the CMake configuration. To package the library, run python setup.py bdist_wheel. Install the package with pip install --user info_cluster. If your system's cmake executable is called cmake3, use CMAKE=cmake3 pip install --user info_cluster instead. Prebuilt binary packages are available for the following platforms:

Platform  py3.6  py3.7
Windows   T      T
macOS     T      T
Linux
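
The install step above sets the CMAKE environment variable on the command line. A minimal Python sketch of the same step, for example inside a provisioning script, might look like the following. The helper names pick_cmake and install_command are hypothetical and not part of info_cluster; only the CMAKE variable and the pip arguments come from the instructions above.

```python
import os
import shutil
import subprocess
import sys


def pick_cmake():
    """Return "cmake3" if such an executable is on PATH, else "cmake"."""
    return "cmake3" if shutil.which("cmake3") else "cmake"


def install_command(cmake_name):
    """Build the pip command and its environment; nothing is executed here."""
    env = dict(os.environ, CMAKE=cmake_name)
    cmd = [sys.executable, "-m", "pip", "install", "--user", "info_cluster"]
    return cmd, env


if __name__ == "__main__":
    cmd, env = install_command(pick_cmake())
    print("CMAKE=" + env["CMAKE"], " ".join(cmd[1:]))
    # Uncomment to actually run the installation:
    # subprocess.run(cmd, env=env, check=True)
```

Invoking pip through sys.executable keeps the install tied to the interpreter that runs the script, which avoids picking up a pip that belongs to a different Python version.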

Demo code

We provide a high-level wrapper for the info-clustering algorithm. After installing info_cluster, you can use it as follows:

from info_cluster import InfoCluster
import networkx as nx
g = nx.Graph() # undirected graph
g.add_edge(0, 1, weight=1)
g.add_edge(1, 2, weight=1)
g.add_edge(0, 2, weight=5)
ic = InfoCluster(affinity='precomputed') # use precomputed graph structure
ic.fit(g)
ic.print_hierachical_tree()

The output looks like this:

      /-0
   /-|
--|   \-2
  |
   \-1

The lower-level psp module can also be used directly:

import psp
# cluster the same three data points; vertex indices start from zero,
# and the similarity between vertices 0 and 2 is 5
g = psp.PyGraph(3, [(0,1,1),(1,2,1),(0,2,5)])
g.run() # use the maximal flow algorithm to cluster them
print(g.get_critical_values()) # [2,5]
print(g.get_category(2)) # partition with at least 2 categories: [0,1,0]
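
On a graph this small, the critical values in the comments above can be cross-checked by hand. The following is a brute-force sketch, not the maximal flow routine the library uses. It assumes that a cluster splits at the minimum, over all partitions of its vertices into at least two blocks, of (total weight of edges crossing the partition) / (number of blocks - 1). The function names are hypothetical and exist only for this illustration.

```python
def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Either `first` forms a block of its own ...
        yield [[first]] + partition
        # ... or it joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def normalized_cut(edges, partition):
    """Weight of edges crossing the partition, divided by (blocks - 1)."""
    label = {v: i for i, block in enumerate(partition) for v in block}
    crossing = sum(
        w for u, v, w in edges
        if u in label and v in label and label[u] != label[v]
    )
    return crossing / (len(partition) - 1)


def split_threshold(edges, vertices):
    """Return (threshold, partition) minimizing the normalized cut."""
    candidates = [p for p in set_partitions(list(vertices)) if len(p) >= 2]
    best = min(candidates, key=lambda p: normalized_cut(edges, p))
    return normalized_cut(edges, best), best


edges = [(0, 1, 1), (1, 2, 1), (0, 2, 5)]
top_value, top_partition = split_threshold(edges, [0, 1, 2])
print(top_value, top_partition)       # 2.0 [[1], [0, 2]]
inner_value, _ = split_threshold(edges, [0, 2])
print(inner_value)                    # 5.0
```

Under this assumption the graph first splits into {0, 2} and {1} at threshold 2, and {0, 2} splits at threshold 5, which agrees with the values [2,5] and the labels [0,1,0] quoted above. The enumeration grows very quickly with the number of vertices, so it is only suitable for checking toy inputs.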

Parametric Dilworth Truncation (PDT) implementation

We also provide an alternative implementation, PyGraphPDT, which is used in the same way as PyGraph. To make it work, apply the patch preflow.patch to preflow.h before building. The file preflow.h belongs to the LEMON library 1.3.1; see #625.

import psp
# vertex indices start from zero; the similarity between vertices 0 and 2 is 5
g = psp.PyGraphPDT(3, [(0,1,1),(1,2,1),(0,2,5)])
g.run() # use the maximal flow algorithm to cluster them
print(g.get_critical_values()) # [2,5]
print(g.get_category(2)) # partition with at least 2 categories: [0,1,0]
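
Since both implementations report a partition as a list of per-vertex labels, it can be handy to compare two results without depending on how the clusters happen to be numbered. The sketch below assumes the label-list format shown in the comments above ([0,1,0]); the helper names are hypothetical and are not part of psp.

```python
def canonical_labels(labels):
    """Renumber clusters in order of first appearance, e.g. [1,0,1] -> [0,1,0]."""
    mapping = {}
    return [mapping.setdefault(label, len(mapping)) for label in labels]


def clusters(labels):
    """Group vertex indices by label, e.g. [0,1,0] -> [[0, 2], [1]]."""
    groups = {}
    for vertex, label in enumerate(labels):
        groups.setdefault(label, []).append(vertex)
    return list(groups.values())


def same_partition(a, b):
    """True if two label lists describe the same grouping of vertices."""
    return canonical_labels(a) == canonical_labels(b)


print(clusters([0, 1, 0]))                   # [[0, 2], [1]]
print(same_partition([0, 1, 0], [1, 0, 1]))  # True
print(same_partition([0, 1, 0], [0, 0, 1]))  # False
```

With psp installed, the two arguments of same_partition could be the results of get_category(2) from a PyGraph and a PyGraphPDT instance built on the same edge list.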


Download files


Source Distributions

No source distribution files are available for this release.

Built Distributions

info_cluster-0.4.post2-cp37-cp37m-win_amd64.whl (84.3 kB): CPython 3.7m, Windows x86-64

info_cluster-0.4.post2-cp37-cp37m-manylinux2010_x86_64.whl (2.3 MB): CPython 3.7m, manylinux (glibc 2.12+) x86-64

info_cluster-0.4.post2-cp37-cp37m-macosx_10_14_x86_64.whl (250.5 kB): CPython 3.7m, macOS 10.14+ x86-64

info_cluster-0.4.post2-cp36-cp36m-win_amd64.whl (144.1 kB): CPython 3.6m, Windows x86-64

info_cluster-0.4.post2-cp36-cp36m-manylinux2010_x86_64.whl (2.3 MB): CPython 3.6m, manylinux (glibc 2.12+) x86-64

info_cluster-0.4.post2-cp36-cp36m-macosx_10_13_x86_64.whl (260.4 kB): CPython 3.6m, macOS 10.13+ x86-64
