Python implementation of the PAN biclustering algorithm

PAN

Python implementation of the PAN biclustering algorithm [1].

PAN analyzes the Pareto set and Pareto front produced by a multi-objective optimization method. PAN is itself a bi-objective evolutionary algorithm: it clusters the solutions in two spaces at the same time, the decision space and the objective space. Because these two clustering objectives generally conflict, PAN returns multiple partitionings (collections of clusters), from which the user can choose their preferred partitioning.

PAN makes a few assumptions about the given data:

  • The best partitioning in the objective space differs from the best partitioning in the decision space. If it does not, a single-objective clustering algorithm can be used on either of the two spaces instead.
  • Each cluster contains at least two solutions. If it does not, outliers need to be removed manually beforehand.

To speed up optimization, PAN uses k-medoids clustering as a local heuristic, implemented here using the fast kmedoids package.
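To give a feel for what a k-medoids step does, here is a minimal pure-Python sketch of the naive alternating variant. It is an illustration only: it is not the algorithm used by the kmedoids package, and it is not how PAN calls that package internally. The function name `k_medoids` and its parameters are hypothetical.

```python
import random


def euclidean_distance(p1, p2):
    return sum((a - b) ** 2 for a, b in zip(p1, p2)) ** 0.5


def k_medoids(points, k, distance, max_iter=100, seed=0):
    """Naive alternating k-medoids (illustrative, hypothetical helper).

    Returns (medoids, labels): the indices of the k medoid points and,
    for every point, the number of the cluster it was assigned to.
    """
    rng = random.Random(seed)
    n = len(points)
    # Pairwise distance matrix, computed once up front.
    dist = [[distance(p, q) for q in points] for p in points]
    medoids = rng.sample(range(n), k)
    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to all other members of that cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster ran empty
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break  # converged: medoids no longer change
        medoids = new_medoids
    return medoids, labels


points = [[1, 2], [2, 1], [14, 14], [13, 13]]
medoids, labels = k_medoids(points, 2, euclidean_distance)
print(medoids, labels)
```

Unlike k-means, the cluster centres are always actual data points, so the method only needs a distance function between solutions.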

Installation

A pre-built package is available on PyPI and can be installed with:

pip install pan_biclustering

Usage

A quick usage example is shown below.

from pan_biclustering.pan import PAN

# Decision space of points 1, 2, 3 and 4
pareto_set = [
    [1, 2], [2, 1], [14, 14], [13, 13],
]
# Objective space of points 1, 2, 3 and 4
pareto_front = [
    [14, 14], [13, 13], [1, 2], [2, 1],
]

def euclidean_distance(p1, p2):
    return sum([(a - b) ** 2 for a, b in zip(p1, p2)]) ** 0.5

pan = PAN(pareto_set, pareto_front, euclidean_distance)
population, population_indices = pan.find_clusters(5, 400)
partitioning = population[0]

# Two clusters, both containing two of the points 1, 2, 3 and 4
print(partitioning)

A second, more elaborate example can be found in the /examples folder.

To pick one of the returned partitionings automatically, a knee-point detection algorithm such as kneed can be applied to the returned population_indices.
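As a rough illustration of knee-point selection, the sketch below picks the point of a two-objective trade-off curve that lies farthest from the straight line joining its two extremes. It rests on an assumption not confirmed by this README: that population_indices can be read as one pair of scores per partitioning, both to be minimized. The helper `knee_index` and the sample scores are hypothetical, and the kneed package is likely the more robust choice in practice.

```python
def knee_index(scores):
    """Return the index of the knee of a two-objective trade-off curve.

    `scores` is assumed to hold one (first score, second score) pair per
    partitioning; this format is an assumption, not the documented one.
    """
    # Order the points along the first objective to find the two extremes.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    (x0, y0), (x1, y1) = scores[order[0]], scores[order[-1]]
    # Guard against a zero range so the normalization below cannot divide by zero.
    dx = (x1 - x0) or 1.0
    dy = (y1 - y0) or 1.0

    def gap(i):
        # After normalization the extremes sit at (0, 0) and (1, 1), so the
        # distance to the chord between them is proportional to |u - v|.
        x, y = scores[i]
        u, v = (x - x0) / dx, (y - y0) / dy
        return abs(u - v)

    return max(order, key=gap)


# Made-up scores for four partitionings, for illustration only.
scores = [(1.0, 10.0), (2.0, 4.0), (5.0, 3.0), (10.0, 1.0)]
best = knee_index(scores)
print(best)
# With PAN, and if the assumed format holds, one would then use:
# partitioning = population[best]
```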

References

[1] Ulrich, T. (2013). Pareto-Set Analysis: Biobjective Clustering in Decision and Objective Spaces. J. Multi-Crit. Decis. Anal., 20: 217-234. https://doi.org/10.1002/mcda.1477
