
Measures of projection quality


sortedness

sortedness is a measure of the quality of a data transformation, most often dimensionality reduction. It is less sensitive to irrelevant distortions and returns values in a more meaningful interval than Kruskal stress formula I.
This Python library provides a reference implementation of the functions presented here (paper unavailable until publication).
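
For comparison, the sketch below computes a raw-distance form of Kruskal's stress formula I in plain Python. It is not part of this library: `kruskal_stress1` is a hypothetical helper, and it uses the original distances directly instead of disparities from monotone regression, which is a simplification. Uniform scaling is used here as one example of a distortion that keeps every neighbor ordering intact; treating it as "irrelevant" is an assumption made for illustration.

```python
import math
from itertools import combinations


def kruskal_stress1(original, projected):
    """Raw-distance stress-1: sqrt(sum((d_proj - d_orig)^2) / sum(d_proj^2)).

    Hypothetical helper for illustration only; not the sortedness API.
    """
    num = den = 0.0
    for i, j in combinations(range(len(original)), 2):
        d_orig = math.dist(original[i], original[j])
        d_proj = math.dist(projected[i], projected[j])
        num += (d_proj - d_orig) ** 2
        den += d_proj ** 2
    return math.sqrt(num / den)


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
doubled = [(2 * x, 2 * y) for x, y in points]  # same shape, twice the size

stress_same = kruskal_stress1(points, points)      # identical configurations
stress_scaled = kruskal_stress1(points, doubled)   # orderings kept, stress nonzero
print(stress_same, stress_scaled)
```

The scaled copy keeps every pairwise ordering yet gets a stress of about 0.5, since each projected distance is exactly twice the original one. A rank-based measure would not penalize this change.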

Overview

Local variants return a value for each provided point. The global variant returns a single value for all points. Any local variant can be used as a global measure by taking the mean value.

Local variants: sortedness(X, X_), pwsortedness(X, X_), rsortedness(X, X_).

Global variant: global_sortedness(X, X_).
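
The relationship between local and global values can be sketched without the library. In the example below, `local_rank_agreement` is a hypothetical stand-in for a local variant, not the implementation of `sortedness`: for each point it ranks the other points by distance in both spaces and compares the two rankings with Spearman correlation. The mean of the per-point values then serves as a single global score.

```python
import math
from statistics import mean


def ranks(values):
    """Return the 0-based rank of each value (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman(a, b):
    """Spearman correlation between two rank lists of equal length."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (n * (n * n - 1))


def local_rank_agreement(X, X_):
    """One value per point: how well its neighbor ordering is preserved.

    Hypothetical stand-in for a local variant; not the sortedness API.
    """
    scores = []
    for i in range(len(X)):
        others = [j for j in range(len(X)) if j != i]
        d_orig = [math.dist(X[i], X[j]) for j in others]
        d_proj = [math.dist(X_[i], X_[j]) for j in others]
        scores.append(spearman(ranks(d_orig), ranks(d_proj)))
    return scores


X = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
X_scaled = [(2 * x, 2 * y) for x, y in X]  # uniform scaling keeps every ordering
X_flat = [(x,) for x, _ in X]              # drop the second coordinate

local_scaled = local_rank_agreement(X, X_scaled)
local_flat = local_rank_agreement(X, X_flat)

# Local result: one value per point. Global result: their mean.
print(local_scaled, mean(local_scaled))
print(local_flat, mean(local_flat))
```

Every local value is 1.0 for the scaled copy, so its mean is 1.0 as well. Dropping a coordinate changes some neighbor orderings, which pulls the mean below 1.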

Python installation

from package through pip

# Set up a virtualenv. 
python3 -m venv venv
source venv/bin/activate

# Install from PyPI
pip install -U sortedness

from source

git clone https://github.com/sortedness/sortedness
cd sortedness
poetry install

Examples

Sortedness

import numpy as np
from numpy.random import permutation
from sklearn.decomposition import PCA

from sortedness import sortedness

# Some synthetic data.
mean = (1, 2)
cov = np.eye(2)
rng = np.random.default_rng(seed=0)
original = rng.multivariate_normal(mean, cov, size=12)
projected2 = PCA(n_components=2).fit_transform(original)
projected1 = PCA(n_components=1).fit_transform(original)
np.random.seed(0)
projectedrnd = permutation(original)

# Print `min`, `mean`, and `max` values.
s = sortedness(original, original)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
s = sortedness(original, projected2)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
s = sortedness(original, projected1)
print(min(s), sum(s) / len(s), max(s))
"""
0.432937128932 0.7813889452999166 0.944810120534
"""
s = sortedness(original, projectedrnd)
print(min(s), sum(s) / len(s), max(s))
"""
-0.578096068617 -0.06328160775358334 0.396112816715
"""

Pairwise sortedness

import numpy as np
from numpy.random import permutation
from sklearn.decomposition import PCA

from sortedness import pwsortedness

# Some synthetic data.
mean = (1, 2)
cov = np.eye(2)
rng = np.random.default_rng(seed=0)
original = rng.multivariate_normal(mean, cov, size=12)
projected2 = PCA(n_components=2).fit_transform(original)
projected1 = PCA(n_components=1).fit_transform(original)
np.random.seed(0)
projectedrnd = permutation(original)

# Print `min`, `mean`, and `max` values.
s = pwsortedness(original, original)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
s = pwsortedness(original, projected2)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
s = pwsortedness(original, projected1)
print(min(s), sum(s) / len(s), max(s))
"""
0.730078995423 0.7744573488776667 0.837310352695
"""
s = pwsortedness(original, projectedrnd)
print(min(s), sum(s) / len(s), max(s))
"""
-0.198780473657 -0.0645984203715 0.147224384381
"""

Global pairwise sortedness

import numpy as np
from numpy.random import permutation
from sklearn.decomposition import PCA

from sortedness import global_pwsortedness

# Some synthetic data.
mean = (1, 2)
cov = np.eye(2)
rng = np.random.default_rng(seed=0)
original = rng.multivariate_normal(mean, cov, size=12)
projected2 = PCA(n_components=2).fit_transform(original)
projected1 = PCA(n_components=1).fit_transform(original)
np.random.seed(0)
projectedrnd = permutation(original)

# Print measurement result and p-value.
s = global_pwsortedness(original, original)
print(list(s))
"""
[1.0, 3.6741408919675163e-93]
"""
s = global_pwsortedness(original, projected2)
print(list(s))
"""
[1.0, 3.6741408919675163e-93]
"""
s = global_pwsortedness(original, projected1)
print(list(s))
"""
[0.7715617715617715, 5.240847664048334e-20]
"""
s = global_pwsortedness(original, projectedrnd)
print(list(s))
"""
[-0.06107226107226107, 0.46847188611226276]
"""

Copyright (c) 2022. Davi Pereira dos Santos and Tacito Neves

Grants
