Python Implementation of the Glimmer algorithm for multidimensional scaling
PyGlimmerMDS
Multidimensional scaling (MDS) for large data sets - a python implementation of the Glimmer algorithm.
[Glimmer: Multilevel MDS on the GPU - 2009 - IEEE TVCG - Ingram, Munzner, Olano]
Glimmer performs dimensionality reduction on high-dimensional data sets with many instances. It avoids the quadratic runtime of naive MDS implementations by employing a multilevel (coarse-to-fine) approach. This implementation has a GPU switch, but it also gives a considerable speedup on the CPU alone, which makes MDS on large data sets feasible.
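To give a feel for the coarse-to-fine idea, here is a minimal sketch of how a hierarchy of subset sizes could be derived from a decimation factor. It is an illustration only, not PyGlimmerMDS's internal code: the function `level_sizes` and its `min_size` stopping threshold are hypothetical names, and the library's actual stopping rule may differ.

```python
def level_sizes(n, decimation_factor=2, min_size=1000):
    """Sizes of nested point subsets, coarsest first (illustrative only).

    Each coarser level keeps roughly 1/decimation_factor of the points of
    the next finer level. `min_size` is an assumed lower bound for the
    coarsest level, not a documented PyGlimmerMDS parameter.
    """
    if decimation_factor < 2:
        raise ValueError("decimation_factor must be at least 2")
    sizes = [n]
    # keep decimating while the next coarser level is still large enough
    while sizes[-1] // decimation_factor >= min_size:
        sizes.append(sizes[-1] // decimation_factor)
    return sizes[::-1]


print(level_sizes(38400, decimation_factor=2, min_size=1000))
```

Under these assumptions, a layout would first be computed for the smallest subset and then refined level by level, so that most iterations run on far fewer points than the full data set.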
Glimmer is a metric MDS and uses Euclidean distance in the high-dimensional space as the dissimilarity measure. This is not the classical MDS that has a linear projection solution. Instead it solves the following optimization problem:
$$\underset{y_1,\dots,y_n}{\mathrm{argmin}} ~ \sum_{i=1}^n \sum_{j=i+1}^n \Big(\lVert x_i-x_j \rVert - \lVert y_i-y_j \rVert\Big)^2 \quad \text{where } x_i \in \mathbb{R}^D \text{ and } y_i \in \mathbb{R}^d,\ d \ll D$$
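The objective can be evaluated directly with NumPy for small inputs, which is handy for sanity-checking a projection. The sketch below is a naive reference, and the helper name `raw_stress` is made up for this example. Whether the `stress` value reported by PyGlimmerMDS uses the same (unnormalized) form is an assumption that should be checked against the library.

```python
import numpy as np


def raw_stress(x, y):
    """Sum over all pairs i < j of (||x_i - x_j|| - ||y_i - y_j||)^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # full pairwise Euclidean distance matrices, each of shape (n, n)
    dx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    dy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    # strictly upper triangle, so every unordered pair is counted once
    iu = np.triu_indices(len(x), k=1)
    return float(np.sum((dx[iu] - dy[iu]) ** 2))


# two points at distance 5 in 3D, projected to distance 3 in 2D
x = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]]
y = [[0.0, 0.0], [3.0, 0.0]]
print(raw_stress(x, y))
```

This evaluation needs time and memory quadratic in the number of points, which is exactly the cost Glimmer avoids during optimization. For large data sets, apply it to a random subsample only.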
Installation
PyGlimmerMDS is available on PyPI and can be installed with pip.
pip install PyGlimmerMDS
To install a specific commit instead, use:
pip install git+https://github.com/hageldave/PyGlimmerMDS@<commit_hash>
How to use
Very briefly
Performing Glimmer on a data set works like this:
mds = Glimmer(decimation_factor=2, stress_ratio_tol=1-1e-5, rng=rng)
projection = mds.fit_transform(data) # alternative: projection, stress = execute_glimmer(data)
print(f"final stress={mds.stress}")
Enable GPU acceleration
The GPU implementation is based on CuPy, an optional dependency, and is only available when a CuPy package is installed.
Which CuPy package to install depends on the available hardware and driver (e.g. cupy-cuda12x, cupy-cuda13x, or cupy-rocm-7-0 at the time of writing). GPU acceleration can then be used as follows:
mds = Glimmer(gpu=True, ....)
# or
projection, stress = execute_glimmer_gpu(data, ....)
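Because CuPy is optional, a script that should run on machines with and without a GPU can check for it before setting the `gpu` switch. The following is a minimal sketch using only the standard library. The helper name `cupy_available` is made up for this example, and the check only tells you that a CuPy package is installed, not that a working GPU and driver are present.

```python
import importlib.util


def cupy_available():
    """Return True if a CuPy package can be found in this environment."""
    return importlib.util.find_spec("cupy") is not None


use_gpu = cupy_available()
print(f"CuPy found: {use_gpu}")
# The flag could then be passed on, e.g. Glimmer(gpu=use_gpu, ....)
```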
Complete example
This example jitters the Iris data set to produce a data set of 38,400 points and then performs Glimmer on it.
from pyglimmermds import Glimmer, execute_glimmer
from sklearn import preprocessing as prep
from sklearn import datasets
import numpy as np
import matplotlib.pyplot as plt
rng = np.random.default_rng(seed=0xBA0BAB)
# get iris data
dataset = datasets.load_iris()
data = dataset.data
labels = dataset.target
# duplicate data with added noise
for _ in range(8):
    data = np.vstack((data, data + (rng.random(data.shape) * 0.2 - 0.1)))
    labels = np.append(labels, labels)
print(data.shape)
print(labels.shape)
# perform MDS
data = prep.StandardScaler().fit_transform(data)
mds = Glimmer(decimation_factor=2, stress_ratio_tol=1-1e-5, rng=rng)
projection = mds.fit_transform(data) # alternative: projection, stress = execute_glimmer(data)
print(f"final stress={mds.stress}")
# show scatter plot
fig, ax = plt.subplots()
scatter = ax.scatter(projection[:, 0], projection[:, 1], c=labels, s=0.02)
ax.axis('equal')
plt.show()
This video shows how the layout evolves per level and iteration:
https://github.com/user-attachments/assets/aa9f7a8c-1c03-46a3-8ee1-19b3d2d4033e