Order Preserving Hierarchical Agglomerative Clustering
Copyright 2020 Daniel Bakkelund
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License
along with this program. If not, see https://www.gnu.org/licenses/.
Background
This project implements the theory described in https://arxiv.org/abs/2004.12488. It provides order preserving hierarchical agglomerative clustering of partially ordered sets.
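To give a rough feel for what "order preserving" can mean for a clustering, the sketch below checks whether a partition of a strictly ordered set induces a relation on the clusters that is still free of cycles. This is only an illustration under a simplifying assumption (a clustering is acceptable when the induced relation is acyclic); the function names are hypothetical, and this is not the code or the API of this library. See the article for the precise definitions.

```python
def induced_relation(order, blocks):
    """Lift a relation on elements to a relation on block indices.

    order  -- set of pairs (a, b), meaning a < b
    blocks -- list of disjoint sets covering all elements
    """
    block_of = {x: i for i, block in enumerate(blocks) for x in block}
    # Pairs inside a single block become self-loops (i, i), which the
    # cycle check below rejects.
    return {(block_of[a], block_of[b]) for a, b in order}


def is_acyclic(n, edges):
    """Kahn's algorithm: True if the directed graph on n nodes has no cycle."""
    indegree = [0] * n
    successors = [[] for _ in range(n)]
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    ready = [u for u in range(n) if indegree[u] == 0]
    visited = 0
    while ready:
        u = ready.pop()
        visited += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return visited == n


def preserves_order(order, blocks):
    """True if the relation induced on the blocks contains no cycle."""
    return is_acyclic(len(blocks), induced_relation(order, blocks))


# Elements a, b, c, d with a < b < c (d is unrelated to the others).
order = {("a", "b"), ("b", "c")}
print(preserves_order(order, [{"a", "d"}, {"b"}, {"c"}]))  # d joins a
print(preserves_order(order, [{"a", "c"}, {"b"}, {"d"}]))  # a and c merged around b
```

In this toy setting, merging a with the unrelated element d keeps the order intact, while merging a with c produces a cycle between the merged cluster and b.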
Dependencies:
- numpy
- matplotlib
The library is written for Python 3.x.
The code can be used as-is: just add the src directory to your PYTHONPATH.
However, if you want to run the examples, follow the recipe below.
Installing (for developers or if you want to view the examples)
1) Run
init.sh
This script downloads the repository https://bitbucket.org/Bakkelund/upyt, which contains the unit test library used in the development of ophac.
2) Source the script setPyPath.sh:
>source setPyPath.sh
The script sets the PYTHONPATH environment variable. It is written for Unix-like platforms, and may work with older versions of Cygwin as well. In case you have to set PYTHONPATH manually, the directories to add are as follows:
./src
./test
./xlibs/upyt/src
Remember that in PYTHONPATH you must specify these as absolute paths.
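If you do have to set the variable by hand, a small helper can produce the absolute paths for you. The following is a minimal sketch, not part of the project: the function name build_pythonpath is made up for illustration, and it assumes it is given the project root directory (the one containing src, test and xlibs).

```python
import os


def build_pythonpath(project_root):
    """Return a PYTHONPATH value with the three directories as absolute paths."""
    root = os.path.abspath(project_root)
    subdirs = ("src", "test", os.path.join("xlibs", "upyt", "src"))
    # os.pathsep is ':' on Unix-like systems and ';' on Windows.
    return os.pathsep.join(os.path.join(root, d) for d in subdirs)


# Print a value suitable for assigning to PYTHONPATH, relative to the
# current working directory.
print(build_pythonpath("."))
```

Run it from the project root and assign the printed value to PYTHONPATH in your shell.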
3) Now, try running
>python -um upyt.discover
The output should look something like this:
>python -um upyt.discover
------------------------------------------------------------------------
Running 21 tests.
------------------------------------------------------------------------
.....................
------------------------------------------------------------------------
Ran 21 tests in 0.017 s.
------------------------------------------------------------------------
SUCCEEDED!!!
------------------------------------------------------------------------
4) Now, try running
>python -u examples/demo/json_demo.py
This should open a window containing three partial dendrograms. These are the clusterings of the data in Section 6 of the article. The example also shows how to load data from a JSON file.
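As a rough idea of what loading such data from JSON involves, here is a minimal sketch using only the standard json module. The layout shown is hypothetical: the keys "n", "dissimilarities" and "order" are assumptions made for this illustration and are not taken from the demo's actual input file, whose format is documented in datamodel.md.

```python
import json

# Hypothetical input: 3 elements, dissimilarities for the pairs
# (0,1), (0,2), (1,2) in that order, and a single order relation 0 < 1.
EXAMPLE = """
{
  "n": 3,
  "dissimilarities": [1.0, 2.0, 1.5],
  "order": [[0, 1]]
}
"""


def load_model(text):
    """Parse the hypothetical layout above and run a basic sanity check."""
    data = json.loads(text)
    n = data["n"]
    dissimilarities = data["dissimilarities"]
    expected = n * (n - 1) // 2  # number of unordered pairs
    if len(dissimilarities) != expected:
        raise ValueError(
            "expected %d dissimilarities, got %d" % (expected, len(dissimilarities))
        )
    order = {tuple(pair) for pair in data["order"]}
    return n, dissimilarities, order


n, dissimilarities, order = load_model(EXAMPLE)
print(n, dissimilarities, order)
```

To read from a file instead of a string, open the file and pass the result of its read() method to load_model.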
5) Now, try running
>python -u examples/random/random_demo.py
This may take a while. The program generates random data models and runs order preserving clustering on them using complete linkage. At the end of the run, a 3D plot shows the correlation between set sizes, number of ties and running times.
The above command runs one sample for each configuration. By running
>python -u examples/random/random_demo.py 5
you can have 5 samples generated for each configuration, but the running time will be five times longer, on average.
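For readers unfamiliar with the term, complete linkage in its standard form measures the dissimilarity between two clusters as the largest dissimilarity between any pair of their members. The sketch below illustrates that definition only; the function name and the dictionary-based data layout are assumptions for this example and do not reflect how the library stores or computes dissimilarities.

```python
def complete_linkage(dissim, cluster_a, cluster_b):
    """Largest pairwise dissimilarity between members of two clusters.

    dissim -- dict mapping frozenset({x, y}) to the dissimilarity of x and y
    """
    return max(dissim[frozenset((a, b))] for a in cluster_a for b in cluster_b)


# Three elements with pairwise dissimilarities.
dissim = {
    frozenset((0, 1)): 1.0,
    frozenset((0, 2)): 4.0,
    frozenset((1, 2)): 2.0,
}

# The cluster {0, 1} is at distance max(4.0, 2.0) from the cluster {2}.
print(complete_linkage(dissim, {0, 1}, {2}))
```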
Data model
For documentation about the data model on a high level, take a look in the file datamodel.md, found in the same directory as this README file.