Order Preserving Hierarchical Agglomerative Clustering

Copyright 2020 Daniel Bakkelund

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Background

The code in this project realises the theory described in https://arxiv.org/abs/2004.12488. It provides order preserving hierarchical agglomerative clustering of partially ordered sets.
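
To make the notion of order preservation concrete, here is a minimal sketch that does not use ophac's API. It assumes the partial order is given as a list of pairs (a, b), meaning a < b, over elements 0..n-1. It treats a merge of two elements as admissible only when they are incomparable in the transitive closure, so that the merge cannot introduce a cycle. The function names are illustrative only and are not part of the library.

```python
import numpy as np

def transitive_closure(n, relations):
    """Boolean matrix with entry [a, b] True iff a < b in the transitive closure."""
    reach = np.zeros((n, n), dtype=bool)
    for a, b in relations:
        reach[a, b] = True
    # Warshall's algorithm: allow element k as an intermediate step.
    for k in range(n):
        reach |= reach[:, k, None] & reach[k, None, :]
    return reach

def can_merge(reach, a, b):
    """Hypothetical helper: a and b may be merged only if they are incomparable."""
    return not (reach[a, b] or reach[b, a])

# Example: 0 < 1 < 2, and 3 is unrelated to everything else.
reach = transitive_closure(4, [(0, 1), (1, 2)])
print(can_merge(reach, 0, 2))  # False: 0 < 2 by transitivity
print(can_merge(reach, 1, 3))  # True: 1 and 3 are incomparable
```

For the precise definition of order preserving clusterings, the article linked above is the authoritative source.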

Dependencies:

  • numpy
  • matplotlib

The library runs on Python 3.x.

The code can be used as-is: just place the src directory on your PYTHONPATH.

However, if you want to have a look at the examples, you should follow the recipe below.

Installing (for developers or if you want to view the examples)

1) Run

init.sh

This script downloads the repository https://bitbucket.org/Bakkelund/upyt containing the unit test library that has been used for the development of ophac.

2) Source the script setPyPath.sh:

>source setPyPath.sh

The script sets the PYTHONPATH environment variable. It is written for Unix-like platforms and may also work on older versions of Cygwin. If you have to set PYTHONPATH manually, add the following directories:

./src
./test
./xlibs/upyt/src

Remember that in PYTHONPATH you must specify these as absolute paths.
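
If setPyPath.sh does not work on your platform, the absolute paths can be produced with a few lines of Python. This is only a sketch and is not part of the repository: the helper name is made up, and it assumes it is run from the repository root. It prints an export line for a POSIX shell and does not preserve any existing PYTHONPATH value.

```python
import os

def pythonpath_entries(repo_root):
    """Return absolute versions of the three directories listed above."""
    relative = ["src", "test", os.path.join("xlibs", "upyt", "src")]
    return [os.path.abspath(os.path.join(repo_root, d)) for d in relative]

# Run from the repository root.
entries = pythonpath_entries(".")
print('export PYTHONPATH="' + os.pathsep.join(entries) + '"')
```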

3) Now, try running

>python -um upyt.discover

The output should look something like this:

>python -um upyt.discover
------------------------------------------------------------------------
Running 21 tests.
------------------------------------------------------------------------
.....................
------------------------------------------------------------------------
Ran 21 tests in 0.017 s.
------------------------------------------------------------------------
SUCCEEDED!!!
------------------------------------------------------------------------

4) Now, try running

>python -u examples/demo/json_demo.py

This should open a window containing three partial dendrograms: the clusterings of the data in Section 6 of the article. The example also shows how to load data from a JSON file.

5) Now, try running

>python -u examples/random/random_demo.py

This may take a while. The program generates random data models and runs order preserving clustering on them using complete linkage. At the end of the run, a 3D plot shows the correlation between set sizes, number of ties and running times.

The above command runs one sample for each configuration. By running

>python -u examples/random/random_demo.py 5

you generate 5 samples for each configuration, at the cost of a running time that is, on average, five times longer.
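
The kind of input described above can be sketched as follows. This is not the code from random_demo.py but an assumed construction. A random strict partial order is drawn by keeping each pair (i, j) with i < j with probability p, which rules out cycles. Dissimilarities are drawn from a small set of integers so that ties occur. All names and parameters are hypothetical, and the tie count used here (values that repeat an earlier value) is just one possible definition.

```python
import numpy as np

def random_model(n, p, levels, rng):
    """Draw a random strict partial order and one dissimilarity per pair."""
    iu, ju = np.triu_indices(n, k=1)  # all pairs (i, j) with i < j
    keep = rng.random(iu.size) < p    # keep each pair with probability p
    relations = list(zip(iu[keep].tolist(), ju[keep].tolist()))
    # Few distinct integer levels means many tied dissimilarities.
    dissim = rng.integers(1, levels + 1, size=iu.size)
    return relations, dissim

def count_ties(dissim):
    """Number of dissimilarities that repeat an already seen value."""
    return dissim.size - np.unique(dissim).size

rng = np.random.default_rng(0)
samples = 5  # samples per configuration, as in the command above
for n in (5, 10):
    ties = [count_ties(random_model(n, 0.2, 10, rng)[1]) for _ in range(samples)]
    print(n, ties)
```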

Data model

For high-level documentation of the data model, see the file datamodel.md, found in the same directory as this README file.
