Measures of projection quality
sortedness
sortedness is a measure of the quality of a data transformation, often dimensionality reduction.
It is less sensitive to irrelevant distortions and returns values in a more meaningful interval than Kruskal stress formula I.
This Python library provides a reference implementation of the functions presented here (the paper is unavailable until publication).
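To make the contrast with Kruskal's stress concrete, the sketch below computes one common form of stress-1 in pure Python and applies it to a uniformly scaled copy of a small point set. Uniform scaling is assumed here as an example of an irrelevant distortion, and the helper names (`pairwise_distances`, `kruskal_stress1`, `neighbor_order`) are hypothetical; none of them belong to the sortedness API.

```python
from itertools import combinations
from math import dist, sqrt


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [dist(p, q) for p, q in combinations(points, 2)]


def kruskal_stress1(X, X_):
    """One common form of stress-1 (assumption): residuals normalised by original distances."""
    d, d_ = pairwise_distances(X), pairwise_distances(X_)
    return sqrt(sum((a - b) ** 2 for a, b in zip(d, d_)) / sum(a**2 for a in d))


def neighbor_order(points, i):
    """Indices of all other points, sorted from nearest to farthest from point i."""
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: dist(points[i], points[j]))


# Hypothetical toy data, chosen so that no two distances from a point are tied.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0), (5.0, 0.5)]
X_scaled = [(2 * x, 2 * y) for x, y in X]  # uniform scaling by 2

stress = kruskal_stress1(X, X_scaled)
same_orderings = all(
    neighbor_order(X, i) == neighbor_order(X_scaled, i) for i in range(len(X))
)
print(stress, same_orderings)
```

In this form, stress-1 comes out close to 1 for the scaled copy even though every point keeps exactly the same neighbor ordering, which is the kind of sensitivity a rank-based measure avoids.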
Overview
Local variants return a value for each provided point. The global variant returns a single value for all points. Any local variant can be used as a global measure by taking the mean value.
Local variants: sortedness(X, X_), pwsortedness(X, X_), rsortedness(X, X_).
Global variant: global_sortedness(X, X_).
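The relationship between local values and a single global value can be sketched without the library. The example below is a simplified stand-in, not the library's algorithm: it assumes an unweighted Kendall tau between each point's distance lists as the local score, and the names `kendall_tau` and `local_rank_agreement` are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import mean


def kendall_tau(a, b):
    """Plain Kendall tau-a between two equal-length sequences (no tie correction)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


def local_rank_agreement(X, X_):
    """One score per point: agreement of its distance orderings in X and X_."""
    scores = []
    for i in range(len(X)):
        d = [dist(X[i], X[j]) for j in range(len(X)) if j != i]
        d_ = [dist(X_[i], X_[j]) for j in range(len(X_)) if j != i]
        scores.append(kendall_tau(d, d_))
    return scores


# Hypothetical toy data, chosen so that no two distances from a point are tied.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0), (5.0, 0.5)]
X_scaled = [(2 * x, 2 * y) for x, y in X]  # keeps every ordering intact
X_1d = [(x,) for x, _ in X]                # drops the second coordinate

local_scaled = local_rank_agreement(X, X_scaled)
local_1d = local_rank_agreement(X, X_1d)

# A local measure becomes a global one by averaging its per-point values.
print(local_scaled, mean(local_scaled))
print(local_1d, mean(local_1d))
```

With the library installed, the same averaging step would apply to the per-point values returned by any of the local variants listed above.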
Python installation
from package through pip
# Set up a virtualenv.
python3 -m venv venv
source venv/bin/activate
# Install from PyPI
pip install -U sortedness
from source
git clone https://github.com/sortedness/sortedness
cd sortedness
poetry install
Examples
Sortedness
import numpy as np
from numpy.random import permutation
from sklearn.decomposition import PCA
from sortedness import sortedness
# Some synthetic data.
mean = (1, 2)
cov = np.eye(2)
rng = np.random.default_rng(seed=0)
original = rng.multivariate_normal(mean, cov, size=12)
projected2 = PCA(n_components=2).fit_transform(original)
projected1 = PCA(n_components=1).fit_transform(original)
np.random.seed(0)
projectedrnd = permutation(original)
# Print `min`, `mean`, and `max` values.
s = sortedness(original, original)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
s = sortedness(original, projected2)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
s = sortedness(original, projected1)
print(min(s), sum(s) / len(s), max(s))
"""
0.432937128932 0.7813889452999166 0.944810120534
"""
s = sortedness(original, projectedrnd)
print(min(s), sum(s) / len(s), max(s))
"""
-0.578096068617 -0.06328160775358334 0.396112816715
"""
Pairwise sortedness
import numpy as np
from numpy.random import permutation
from sklearn.decomposition import PCA
from sortedness import pwsortedness
# Some synthetic data.
mean = (1, 2)
cov = np.eye(2)
rng = np.random.default_rng(seed=0)
original = rng.multivariate_normal(mean, cov, size=12)
projected2 = PCA(n_components=2).fit_transform(original)
projected1 = PCA(n_components=1).fit_transform(original)
np.random.seed(0)
projectedrnd = permutation(original)
# Print `min`, `mean`, and `max` values.
s = pwsortedness(original, original)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
s = pwsortedness(original, projected2)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
s = pwsortedness(original, projected1)
print(min(s), sum(s) / len(s), max(s))
"""
0.730078995423 0.7744573488776667 0.837310352695
"""
s = pwsortedness(original, projectedrnd)
print(min(s), sum(s) / len(s), max(s))
"""
-0.198780473657 -0.0645984203715 0.147224384381
"""
Global pairwise sortedness
import numpy as np
from numpy.random import permutation
from sklearn.decomposition import PCA
from sortedness import global_pwsortedness
# Some synthetic data.
mean = (1, 2)
cov = np.eye(2)
rng = np.random.default_rng(seed=0)
original = rng.multivariate_normal(mean, cov, size=12)
projected2 = PCA(n_components=2).fit_transform(original)
projected1 = PCA(n_components=1).fit_transform(original)
np.random.seed(0)
projectedrnd = permutation(original)
# Print measurement result and p-value.
s = global_pwsortedness(original, original)
print(list(s))
"""
[1.0, 3.6741408919675163e-93]
"""
s = global_pwsortedness(original, projected2)
print(list(s))
"""
[1.0, 3.6741408919675163e-93]
"""
s = global_pwsortedness(original, projected1)
print(list(s))
"""
[0.7715617715617715, 5.240847664048334e-20]
"""
s = global_pwsortedness(original, projectedrnd)
print(list(s))
"""
[-0.06107226107226107, 0.46847188611226276]
"""
**Copyright (c) 2023. Davi Pereira dos Santos and Tacito Neves**