Source code for Multi-objective Data embeddings
Project description
Python implementation for MoDE (Multi-objective Data Embedding)
Description of scripts in MoDE_embeddings/
MoDE.py
: This file contains the main class that implements MoDE.
metrics.py
: This file contains the functions to compute the three metrics introduced in the paper, i.e., the distance, correlation, and order preservation metrics.
waterfilling_compression.py
: This file contains the implementation of the waterfilling algorithm.
fastgd/
: This directory contains the fast implementation of the gradient descent algorithm in Cython.
Usage
MoDE embeddings can be trained on exact or inexact distance matrices. With inexact distance information, the ranges of the distances should be given to the fit_transform function as separate lower-bound and upper-bound distance matrices. The resulting embeddings are two-dimensional, and the data points are placed in the embedding space such that samples with higher scores lie at larger angles (in polar coordinates).
from MoDE_embeddings.MoDE import MoDE
mode = MoDE(n_neighbor=20, max_iter=100000, tol=0.0001, verbose=True)
x_2d = mode.fit_transform(data, score)
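The polar-coordinate property described above can be checked with a few lines of plain Python. The sketch below is illustrative only: the points and scores are made-up values rather than MoDE output, `polar_angles` is a hypothetical helper, and measuring angles in [0, 2π) is an assumption about the convention.

```python
import math

def polar_angles(points):
    """Return the polar angle (radians, in [0, 2*pi)) of each 2D point."""
    return [math.atan2(y, x) % (2 * math.pi) for x, y in points]

# Hypothetical 2D embedding and scores, for illustration only.
x_2d_demo = [(1.0, 0.1), (0.5, 0.5), (-0.2, 0.9), (-1.0, -0.3)]
score_demo = [0.1, 0.4, 0.7, 0.9]

angles = polar_angles(x_2d_demo)

# Indices of the samples sorted by score and by angle.
order_by_score = sorted(range(len(score_demo)), key=lambda i: score_demo[i])
order_by_angle = sorted(range(len(angles)), key=lambda i: angles[i])

# If higher scores sit at larger angles, the two orderings coincide.
print(order_by_score == order_by_angle)
```

The same comparison could be applied to the rows of x_2d and the entries of score after training, assuming x_2d holds one (x, y) pair per sample.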
Once the MoDE embeddings are trained, you can measure the fidelity of the embedded dataset to the original dataset in terms of preserving distances, correlations and orders. To do so, you can use the metric functions available in "metrics.py".
from MoDE_embeddings.metrics import distance_metric, correlation_metric, order_preservation
R_d = distance_metric(data, x_2d, n_neighbor=20)
R_c = correlation_metric(data, x_2d, n_neighbor=20)
R_o = order_preservation(x_2d, mode.P.squeeze(), n_neighbor=20, score=score.squeeze())
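For intuition about what "preserving distances" means, here is a generic sanity check written with the standard library only. It is not the distance_metric function from metrics.py and does not reproduce the metric defined in the paper: it simply computes the Pearson correlation between all pairwise distances before and after embedding. The helper names and the demo data are hypothetical.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical data: 3D points and a 2D "embedding" that drops the last axis.
data_demo = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.0), (0.0, 2.0, 0.2), (3.0, 1.0, 0.1)]
embedded_demo = [p[:2] for p in data_demo]

# Values close to 1 mean the embedding keeps the pairwise distance structure.
agreement = pearson(pairwise_distances(data_demo), pairwise_distances(embedded_demo))
print(agreement)
```

Unlike this global check, the functions in metrics.py take an n_neighbor argument, so their values should not be expected to match it.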
Waterfilling algorithm (for data compression)
With the waterfilling algorithm you can find tight lower and upper bounds on the pairwise distances between data points that have been compressed using orthonormal transforms, e.g., the Fourier transform. Using the WaterfillingCompression class, you can compress the data by keeping only a small portion of the Fourier transform coefficients. Then, by calling the compute_distance_bounds method, you can compute tight lower and upper bounds on the pairwise distances. For more information on the waterfilling algorithm, see the paper: https://arxiv.org/pdf/1405.5873.pdf
from MoDE_embeddings.waterfilling_compression import WaterfillingCompression
comp = WaterfillingCompression(num_coeffs=4, coeffs_to_keep='optimal')
dm_ub, dm_lb = comp.compute_distance_bounds(data)
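To see why distance bounds can be recovered from compressed data at all, the sketch below derives a simple pair of bounds from the triangle inequality. It is not the waterfilling algorithm and not the WaterfillingCompression implementation: it assumes both signals keep the same leading coefficients of an orthonormal DFT and that the norm of each discarded part is stored. The function names and the demo signals are hypothetical.

```python
import cmath
import math

def ortho_dft(x):
    """Orthonormal DFT: scaled by 1/sqrt(n) so that Euclidean norms are preserved."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / math.sqrt(n)
        for k in range(n)
    ]

def simple_distance_bounds(x, y, num_coeffs):
    """Return (upper, lower) bounds on ||x - y|| from truncated coefficients."""
    cx, cy = ortho_dft(x), ortho_dft(y)
    # Exact contribution of the coefficients kept for both signals.
    kept = sum(abs(a - b) ** 2 for a, b in zip(cx[:num_coeffs], cy[:num_coeffs]))
    # Norm of the discarded part of each signal (one number per signal).
    rx = math.sqrt(sum(abs(a) ** 2 for a in cx[num_coeffs:]))
    ry = math.sqrt(sum(abs(b) ** 2 for b in cy[num_coeffs:]))
    # Triangle inequality on the discarded parts: |rx - ry| <= ||.|| <= rx + ry.
    lower = math.sqrt(kept + (rx - ry) ** 2)
    upper = math.sqrt(kept + (rx + ry) ** 2)
    return upper, lower

# Hypothetical signals, for illustration only.
x_demo = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
y_demo = [0.0, 1.0, 1.0, 2.0, 4.0, 4.0, 2.0, 1.0]

ub, lb = simple_distance_bounds(x_demo, y_demo, num_coeffs=3)
true_dist = math.dist(x_demo, y_demo)
print(lb <= true_dist <= ub)
```

Because the transform is orthonormal, distances between coefficient vectors equal distances between the original signals, which is what makes the split into kept and discarded parts valid. The linked paper describes how the waterfilling algorithm tightens such bounds.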