Python Implementation of the Glimmer algorithm for multidimensional scaling

PyGlimmerMDS

Multidimensional scaling (MDS) for large data sets: a Python implementation of the Glimmer algorithm.
[Glimmer: Multilevel MDS on the GPU - 2009 - IEEE TVCG - Ingram, Munzner, Olano]

Glimmer performs dimensionality reduction on high-dimensional data sets with many instances. It avoids the quadratic runtime of naive MDS implementations by using a multilevel (coarse-to-fine) approach. This implementation does not use the GPU, but it is still considerably faster than naive MDS and makes MDS on large data sets feasible.
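To make the coarse-to-fine idea concrete, the sketch below lists the subset sizes a multilevel scheme would visit when each level is `decimation_factor` times smaller than the next finer one. The helper name `level_sizes` and the `min_size` cutoff of 1000 points are assumptions made for illustration; the package's actual rule for choosing levels may differ.

```python
def level_sizes(n_points, decimation_factor=2, min_size=1000):
    """Return subset sizes from the coarsest to the finest level.

    Hypothetical helper: `min_size` is an assumed cutoff, not a value
    taken from PyGlimmerMDS.
    """
    if decimation_factor < 2:
        raise ValueError("decimation_factor must be at least 2")
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    sizes = [n_points]
    # Keep shrinking while the next coarser level still has min_size points.
    while sizes[-1] // decimation_factor >= min_size:
        sizes.append(sizes[-1] // decimation_factor)
    return sizes[::-1]


print(level_sizes(38_400))  # [1200, 2400, 4800, 9600, 19200, 38400]
```

Under these assumptions a layout would first be computed on the 1,200-point subset, and each finer level would start from the coarser result instead of from scratch.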

Glimmer is a metric MDS method that uses Euclidean distance in the high-dimensional space as the dissimilarity measure. It differs from classical MDS, which has a linear projection solution. Instead, it solves the following optimization problem:

$$\underset{y_1,\dots,y_n}{\mathrm{argmin}} ~ \sum_{i=1}^n \sum_{j=i+1}^n \Big(\lVert x_i-x_j \rVert - \lVert y_i-y_j \rVert\Big)^2 \quad \text{where } x_i \in \mathbb{R}^D \text{ and } y_i \in \mathbb{R}^d,\ d \ll D$$
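The objective can be evaluated directly for a small data set. The following is a minimal sketch that uses only the standard library; the function name `raw_stress` is hypothetical, and the value it returns is the unnormalized sum from the formula, which may not match how the package scales the `stress` it reports.

```python
from itertools import combinations
from math import dist


def raw_stress(high, low):
    """Sum of squared distance differences over all pairs i < j."""
    if len(high) != len(low):
        raise ValueError("high and low must contain the same number of points")
    return sum(
        (dist(high[i], high[j]) - dist(low[i], low[j])) ** 2
        for i, j in combinations(range(len(high)), 2)
    )


# Three points in 3-D and a candidate 2-D layout that drops the z coordinate.
high = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
low = [(x, y) for x, y, _ in high]
print(raw_stress(high, low))  # about 29.29
```

The double loop over all pairs is exactly the quadratic cost that becomes prohibitive for large n.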

Installation

PyGlimmerMDS is available on PyPI and can be installed with pip:

pip install PyGlimmerMDS

To install a specific commit instead, use:

pip install git+https://github.com/hageldave/PyGlimmerMDS@<commit_hash>
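One way to confirm that the installation succeeded is to query the installed distribution's metadata. This is a small sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


print(installed_version("PyGlimmerMDS"))
```

A result of `None` means pip did not install the distribution into the interpreter that runs the script.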

How to use

Very briefly

Performing Glimmer on a data set works like this:

from pyglimmermds import Glimmer

# data: array of shape (n, D); rng: e.g. np.random.default_rng()
mds = Glimmer(decimation_factor=2, stress_ratio_tol=1-1e-5, rng=rng)
projection = mds.fit_transform(data) # alternative: projection, stress = execute_glimmer(data)
print(f"final stress={mds.stress}")

Complete example

This example duplicates the Iris data set (150 points) eight times with added noise, producing 38,400 points, and then runs Glimmer on the result.

from pyglimmermds import Glimmer, execute_glimmer
from sklearn import preprocessing as prep
from sklearn import datasets
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=0xBA0BAB)

# get iris data
dataset = datasets.load_iris()
data = dataset.data
labels = dataset.target
# duplicate data with added noise
for _ in range(8):
  data = np.vstack((data,data+(rng.random((data.shape[0], data.shape[1]))*0.2-.1)))
  labels = np.append(labels,labels)
print(data.shape)
print(labels.shape)

# perform MDS
data = prep.StandardScaler().fit_transform(data)
mds = Glimmer(decimation_factor=2, stress_ratio_tol=1-1e-5, rng=rng)
projection = mds.fit_transform(data) # alternative: projection, stress = execute_glimmer(data)
print(f"final stress={mds.stress}")

# show scatter plot
fig, ax = plt.subplots()
scatter = ax.scatter(projection[:, 0], projection[:, 1], c=labels, s=0.02)
ax.axis('equal')
plt.show()

[Figure: glimmer_iris, the scatter plot produced by the example above]
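With 38,400 points there are 737,260,800 pairs, so checking a layout against the full objective is expensive. A rough alternative is to estimate the error from a random sample of pairs. The sketch below uses only the standard library; the function name `sampled_stress` and its defaults are assumptions for illustration, and the result is a mean over sampled pairs, not the value the package reports as `stress`.

```python
import random
from math import dist


def sampled_stress(high, low, n_pairs=10_000, seed=0):
    """Mean squared distance difference over randomly drawn pairs i != j."""
    if len(high) != len(low):
        raise ValueError("high and low must contain the same number of points")
    if len(high) < 2 or n_pairs < 1:
        raise ValueError("need at least two points and one sampled pair")
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(high)), 2)  # two distinct indices
        total += (dist(high[i], high[j]) - dist(low[i], low[j])) ** 2
    return total / n_pairs


# Planar points embedded in 3-D are reproduced exactly by dropping z.
high = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
low = [(x, y) for x, y, _ in high]
print(sampled_stress(high, low, n_pairs=100))  # 0.0
```

For the example above, the inputs could be passed as lists of coordinate rows, for instance `data.tolist()` and `projection.tolist()` if both are NumPy arrays.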

This video shows how the layout evolves per level and iteration:

https://github.com/user-attachments/assets/aa9f7a8c-1c03-46a3-8ee1-19b3d2d4033e
